
1. Imported Libraries
pandas : https://en.wikipedia.org/wiki/Pandas_(software)
In computer programming, pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. It is free software released under the three-clause BSD license. The name is derived from the term "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals.
Python Data Analysis Library : pandas.pydata.org
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. pandas is a NumFOCUS sponsored project.
numpy : https://en.wikipedia.org/wiki/NumPy
NumPy (pronounced /ˈnʌmpaɪ/ (NUM-py) or sometimes /ˈnʌmpi/ (NUM-pee)) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. The ancestor of NumPy, Numeric, was originally created by Jim Hugunin with contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating features of the competing Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many contributors.
NumPy : www.numpy.org
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
NumPy is licensed under the BSD license, enabling reuse with few restrictions.
boto3 : Interacting with S3 https://boto3.amazonaws.com/v1/documentation/api/latest/index.html
Boto is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. Boto provides an easy to use, object-oriented API, as well as low-level access to AWS services.
2. numpy.random.seed : https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.seed.html
numpy.random.seed(seed=None)
Seed the generator.
This method is called when RandomState is initialized. It can be called again to re-seed the generator. For details, see RandomState.
Parameters:
seed : int or 1-d array_like, optional
Seed for RandomState. Must be convertible to 32 bit unsigned integers.
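A minimal sketch of what re-seeding does, assuming the legacy global generator that numpy.random.seed controls. The seed value 5 and the variable names are arbitrary choices for illustration, not values taken from the notebook.

```python
import numpy as np

# Seeding the global generator makes the draws that follow repeatable.
np.random.seed(5)  # 5 is an arbitrary example seed
first = np.random.random_sample(3)

# Re-seeding with the same value restarts the same sequence.
np.random.seed(5)
second = np.random.random_sample(3)

print(np.array_equal(first, second))  # True
```

Calling the seed once at the top of a notebook is usually enough to make every later random draw reproducible from run to run.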
3. numpy.random.random_sample(size=None) : https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.random_sample.html?highlight=random%20random_sample#numpy.random.random_sample
Parameters:
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
numpy.random.randint : https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.randint.html
Parameters:
low : int
Lowest (signed) integer to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
high : int, optional
If provided, one above the largest (signed) integer to be drawn.
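The two functions differ mainly in what they return, which a short sketch may make clearer. The seed, shapes, and bounds below are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the run is repeatable

# random_sample draws floats from the half-open interval [0.0, 1.0).
single = np.random.random_sample()       # size=None returns one float
grid = np.random.random_sample((2, 3))   # shape (2, 3), i.e. 2 * 3 samples

# randint draws integers from [low, high): high itself is never returned.
ints = np.random.randint(1, 10, size=5)  # five values between 1 and 9

print(single, grid.shape, ints)
```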
4.
5. df : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
Parameters:
data : ndarray (structured or homogeneous), Iterable, dict, or DataFrame
Dict can contain Series, arrays, constants, or list-like objects. Changed in version 0.23.0: If data is a dict, argument order is maintained for Python 3.6 and later.
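As a rough sketch, a DataFrame can be built from a dict of NumPy arrays like this. The column names x1, x2, and y, the seed, and the value ranges are hypothetical; only the row count of 10 follows the shape noted in step 18.

```python
import numpy as np
import pandas as pd

np.random.seed(5)  # arbitrary example seed
n = 10             # ten rows, as in step 18

# Column names x1, x2 and y are hypothetical, chosen for illustration.
df = pd.DataFrame({
    "x1": np.random.random_sample(n),
    "x2": np.random.randint(1, 100, n),
    "y": np.random.randint(0, 2, n),
})

print(df.shape)          # (10, 3)
print(list(df.columns))  # dict order is kept on Python 3.6 and later
```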


6. df - Print values
7. Save to a file. pandas.DataFrame.to_csv : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
Parameters:
path_or_buf : str or file handle, default None
File path or object; if None is provided, the result is returned as a string. If a file object is passed, it should be opened with newline='', disabling universal newlines.
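The two behaviours of path_or_buf can be sketched as follows. The toy data and the file name demo.csv are assumptions made for illustration, and index=False is a choice made here to keep the row index out of the file.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"x1": [0.1, 0.2], "y": [0, 1]})  # toy data for illustration

# With no path, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)

# With a path, it writes the file and returns None.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")  # hypothetical file name
df.to_csv(path, index=False)

round_trip = pd.read_csv(path)
print(csv_text)
print(round_trip)
```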
8. Function: takes three parameters and saves a file to S3
9. Function: uses boto3 to download the file from the S3 bucket
boto3.Session().resource('s3') : https://boto3.amazonaws.com/v1/documentation/api/latest/guide/session.html
A session manages state about a particular configuration. By default a session is created for you when needed. However it is possible and recommended to maintain your own session(s) in some scenarios. Sessions typically store, among other things, credentials and the region.
10. Run the function from step 8 to save the file to S3
upload_fileobj : https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html?highlight=upload_fileobj#S3.Bucket.upload_fileobj
11. Run the function from step 9 to download the file from S3
Bucket object download_fileobj : https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html?highlight=download_fileobj#S3.Bucket.download_fileobj
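The upload and download helpers from steps 8 to 11 might look roughly like the sketch below. It uses the Bucket.upload_fileobj and Bucket.download_fileobj calls linked above. The function names, and the optional s3 argument that lets a stand-in resource be passed when trying the code without AWS access, are assumptions and not part of the original notebook.

```python
def write_to_s3(filename, bucket, key, s3=None):
    """Upload a local file to s3://bucket/key."""
    if s3 is None:
        import boto3  # imported lazily so the sketch loads without AWS set up
        s3 = boto3.Session().resource("s3")
    with open(filename, "rb") as f:  # S3 transfers expect a binary file object
        return s3.Bucket(bucket).upload_fileobj(f, key)


def download_from_s3(filename, bucket, key, s3=None):
    """Download s3://bucket/key into a local file."""
    if s3 is None:
        import boto3
        s3 = boto3.Session().resource("s3")
    with open(filename, "wb") as f:
        return s3.Bucket(bucket).download_fileobj(key, f)


# Example calls (bucket and key are hypothetical; real AWS credentials needed):
# write_to_s3("demo.csv", "my-example-bucket", "demo/demo.csv")
# download_from_s3("demo_copy.csv", "my-example-bucket", "demo/demo.csv")
```

Both helpers open the local file in binary mode because the transfer methods read and write raw bytes.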
12.
13. Print the first five rows of data
df.head() : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html
Return the first n rows. This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.
Parameters:
n : int, default 5
Number of rows to select.
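A quick illustration of the default and an explicit n, using toy data invented for this example:

```python
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": range(10, 20)})  # toy data

print(df.head())   # first 5 rows by default
print(df.head(3))  # first 3 rows
```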
14. Put the selected columns into a matrix ???
pandas.DataFrame.as_matrix : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.as_matrix.html?highlight=as_matrix#pandas.DataFrame.as_matrix
Parameters:
columns : list, optional, default: None
If None, return all columns; otherwise, return the specified columns.
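Note that DataFrame.as_matrix was deprecated in pandas 0.23.0 and removed in pandas 1.0, so the call fails on current versions. The sketch below shows one way to get the same arrays with to_numpy(), available since pandas 0.24. The column names and toy values are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x1": np.arange(10, dtype=float),
    "x2": np.arange(10, 20, dtype=float),
    "y": np.arange(10) % 2,
})  # toy data; column names are hypothetical

# Older code: X = df.as_matrix(columns=["x1", "x2"])
X = df[["x1", "x2"]].to_numpy()  # feature matrix, shape (10, 2)
y = df[["y"]].to_numpy()         # double brackets keep a 2-D shape (10, 1)

print(X.shape, y.shape)
```

Selecting with a list of column names returns a DataFrame, which is why y comes out as 10 rows by 1 column and not as a flat array.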
15. X values
16. ???
17. Put the y column into a matrix

18. Shape of y: 10 rows, 1 column
19. y values
20. Flatten y into a single row
numpy.ravel : https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html
Parameters:
a : array_like
Input array. The elements in a are read in the order specified by order, and packed as a 1-D array.
order : {'C', 'F', 'A', 'K'}, optional
The elements of a are read using this index order.
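For the 10-by-1 case described in step 18, flattening can be sketched like this, with toy values standing in for the real y column:

```python
import numpy as np

y = np.arange(10).reshape(10, 1)  # column vector: 10 rows, 1 column

flat = np.ravel(y)                # same values packed into a 1-D array

print(y.shape)     # (10, 1)
print(flat.shape)  # (10,)
```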
21. y values
23. Function: takes the file passed in and ????
write_numpy_to_dense_tensor : https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/amazon/common.py
aws/sagemaker-python-sdk (github.com): a library for training and deploying machine learning models on Amazon SageMaker.
read_records
24. Function: takes the file and ?????
25. Run the write_recordio_file function


26. Print only the first 3 rows
27. Run the read_recordio_file function
32. Save the file to S3
33. Download the file from S3