This blog is for organizing the new technologies and information I come across while working in the field as a developer. I have been fortunate to work as a consultant on projects for large companies in the US, so I have many opportunities to encounter new technologies. I would like to share information about the tools used in US IT projects with many people.
솔웅


S3 (Simple Storage Service)


S3 provides developers and IT teams with secure, durable, highly-scalable object storage. Amazon S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web.


S3 is a safe place to store your files.

It is Object based storage.

The data is spread across multiple devices and facilities.


The Basics

- S3 is Object based i.e. allows you to upload files.

- Files can be from 0 Bytes to 5TB

- There is unlimited storage

- Files are stored in Buckets.

- S3 is a universal namespace, that is, names must be unique globally.

- https://s3-eu-west-1.amazonaws.com/acloudguru

- When you upload a file to S3 you will receive an HTTP 200 code if the upload was successful.
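
The universal-namespace URL pattern above can be sketched as a small helper (the path-style format shown in the example; the key name "photo.jpg" is just a placeholder):

```python
def object_url(bucket, key, region="eu-west-1"):
    """Path-style S3 URL, as in the example above. Bucket names must be
    globally unique, so the bucket alone identifies the namespace."""
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key}"

print(object_url("acloudguru", "photo.jpg"))
# https://s3-eu-west-1.amazonaws.com/acloudguru/photo.jpg
```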


Data Consistency Model For S3 (***)

- Read after Write consistency for PUTS of new objects

- Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)


S3 is a simple key, value store

- S3 is Object based. Objects consist of the following

: Key (this is simply the name of the object)

: Value (This is simply the data and is made up of a sequence of bytes)

: Version ID (Important for versioning)

: Metadata (Data about the data you are storing)

: Subresources

  Access Control Lists

  Torrent

: Built for 99.99% availability for the S3 platform

: Amazon guarantees 99.999999999% durability for S3 information (Remember 11X9's)

: Tiered Storage Available

: Lifecycle Management

: Versioning

: Encryption

: Secure your data using Access Control Lists and Bucket Policies
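
A quick back-of-envelope check on what the 11 x 9's durability figure above actually means (a sketch, not an SLA calculation):

```python
# 99.999999999% durability => probability of losing a given object in a year
durability = 0.99999999999
annual_loss_probability = 1 - durability          # about 1e-11

# Even storing ten million objects, the expected number lost per year is tiny.
objects_stored = 10_000_000
expected_losses_per_year = objects_stored * annual_loss_probability
print(expected_losses_per_year)                   # ~0.0001 objects per year
```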


Storage Tiers/Classes

: S3 - 99.99% availability, 99.999999999% durability, stored redundantly across multiple devices in multiple facilities and is designed to sustain the loss of 2 facilities concurrently

: S3 - IA (Infrequently Accessed) For data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3, but you are charged a retrieval fee.

: Reduced Redundancy Storage - Designed to provide 99.99% durability and 99.99% availability of objects over a given year.

: Glacier - Very cheap, but used for archival only. It takes 3-5 hours to restore from Glacier



What is Glacier?


Glacier is an extremely low-cost storage service for data archival. Amazon Glacier stores data for as little as $0.01 per gigabyte per month, and is optimized for data that is infrequently accessed and for which retrieval times of 3 to 5 hours are suitable.
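
At the $0.01/GB-month figure quoted above, the archival cost is easy to estimate (a rough sketch only; actual Glacier pricing varies by region and over time):

```python
price_per_gb_month = 0.01      # the per-GB figure quoted above
archive_gb = 5 * 1024          # a hypothetical 5 TB archive

monthly_cost = price_per_gb_month * archive_gb
print(monthly_cost)            # ~ $51.20 per month to keep 5 TB archived
```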



S3 - Charges

- Charged for:

: Storage

: Requests

: Storage Management Pricing

: Data Transfer Pricing

: Transfer Acceleration



What is S3 Transfer Acceleration?



Amazon S3 Transfer Acceleration enables fast, easy and secure transfers of files over long distances between your end users and an S3 bucket.

Transfer Acceleration takes advantage of Amazon CloudFront's globally distributed edge locations. As the data arrives at an edge location, it is routed to Amazon S3 over an optimized network path.


Exam Tips for S3 101

- Remember that S3 is Object based i.e. allows you to upload files.

- Files can be from 0 Bytes to 5TB

- There is unlimited storage

- Files are stored in Buckets

- S3 is a universal namespace, that is, names must be unique globally.

- https://s3-eu-west-1.amazonaws.com/acloudguru

- Read after Write consistency for PUTS of new Objects

- Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)

- S3 Storage Classes/Tiers

: S3 (durable, immediately available, frequently accessed)

: S3 - IA (durable, immediately available, infrequently accessed)

: S3 - Reduced Redundancy Storage (data that is easily reproducible, such as thumbnails, etc.)

: Glacier - Archived data, where you can wait 3-5 hours before accessing.

- Remember the core fundamentals of an S3 object

: key (name)

: value (data)

: Version ID

: Metadata

: Subresources

  ACL

  Torrent

- object based storage only (for files) (*****)

- Not suitable to install an operating system on. (*****)

- Successful uploads will generate an HTTP 200 status code.


- Read the S3 FAQ before taking the exam. It comes up A LOT! (*****)


================


S3 Essentials


A bucket is just a folder where you can upload files.


- Buckets are a universal namespace

- Uploading an object to S3 returns an HTTP 200 code

- S3, S3 - IA, S3 Reduced Redundancy Storage

- Encryption

: Client Side Encryption

: Server Side Encryption

  Server side encryption with Amazon S3 Managed Keys (SSE-S3)

  Server side encryption with KMS (SSE-KMS)

  Server side encryption with Customer Provided Keys (SSE-C)

- Control access to buckets using either a bucket ACL or using Bucket Policies

- BY DEFAULT BUCKETS ARE PRIVATE AND ALL OBJECTS STORED INSIDE THEM ARE PRIVATE
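
Since buckets start private, access is opened up explicitly. A minimal bucket-policy sketch granting public read on objects (the bucket name "example-bucket" is hypothetical):

```python
import json

# Minimal public-read bucket policy sketch. By default nothing like this
# exists: buckets and their objects are private until you attach a policy.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",                       # anyone
        "Action": "s3:GetObject",               # read objects only
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}
print(json.dumps(policy, indent=2))
```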


===================


Create an S3 Website


Static pages only; no dynamic pages (PHP, etc.)

Format of URL : bucketname.s3-website-region.amazonaws.com
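
The URL format above as a small helper (the bucket name "mybucket" is just a placeholder):

```python
def website_endpoint(bucket, region):
    """Static-website endpoint format from the note above:
    bucketname.s3-website-region.amazonaws.com (HTTP only)."""
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"

print(website_endpoint("mybucket", "us-east-1"))
# http://mybucket.s3-website-us-east-1.amazonaws.com
```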


===================


Cross Origin Resource Sharing (CORS)


Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain. With CORS support in Amazon S3, you can build rich client-side web applications with Amazon S3 and selectively allow cross-origin access to your Amazon S3 resources.
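
A CORS configuration for a bucket might look like the following sketch, which allows GET requests from one origin (the origin "https://www.example.com" is a placeholder):

```python
import json

# Sketch of an S3 CORS configuration: one rule allowing cross-origin GETs
# from a single site, cached by browsers for 3000 seconds.
cors_config = {
    "CORSRules": [{
        "AllowedOrigins": ["https://www.example.com"],
        "AllowedMethods": ["GET"],
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": 3000,
    }]
}
print(json.dumps(cors_config, indent=2))
```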



Lambda -> Create Functions

Triggers of Lambda function - ?? *****

Amazon API Gateway

upload html files to S3

IAM - Role, Policy

Route 53 
- Register Domain

Is it really serverless?
- Vendor takes care of provisioning and management of servers
- Vendor is responsible for capacity provisioning and automated scaling
- Moving away from servers and infrastructure concerns should be your goal
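
A minimal sketch of what one of the Lambda functions above looks like as code. The event shape here is hypothetical; real triggers (SNS, API Gateway, S3) each deliver their own event structure:

```python
def lambda_handler(event, context):
    """Minimal Python Lambda entry point. AWS invokes this function with the
    trigger's payload in `event`; no server provisioning is involved."""
    name = event.get("name", "world")      # hypothetical event field
    return {"statusCode": 200, "body": f"Hello, {name}"}

print(lambda_handler({"name": "Polly"}, None))
```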

=====================







Using Polly to help you pass your exam - A serverless approach

Polly 
- Text-to-Speech : Type statements -> can download it to mp3

Create S3 bucket - 2 buckets

Simple Notification Service (SNS)

DynamoDB table

IAM - create new role : Lambda - Add permissions - attach new policy

Lambda - Create 2 Lambda functions

Add Trigger : SNS 


============================

Using Polly to help you pass your exam - A serverless approach : Part 2

Create 3rd Lambda function (PostReader_GetPosts)

Amazon API Gateway - Create new API (PostReaderAPI)

Go to S3 and deploy the website

=============================


=============================

S3 - Versioning

S3 - Create a Bucket - Enable versioning

Bucket - upload a text file to the bucket - update the file and upload it again
- Click on Latest Version link -> can select a version from dropdown list

Delete the text file - initiate restore => can restore the deleted file
Actions - Delete the Delete Marker

* Stores all versions of an object (including all writes, and even if you delete an object)
* Great backup tool
* Once enabled, Versioning cannot be disabled, only suspended.
* Integrates with Lifecycle rules 
* Versioning's MFA Delete capability, which uses multi-factor authentication, can be used to provide an additional layer of security.
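
The delete-marker behavior in the walkthrough above can be modeled with a toy class (this is an illustration of the semantics, not the real S3 API):

```python
class VersionedBucket:
    """Toy model of S3 versioning: writes append versions, a delete just
    adds a delete marker, and removing that marker restores the object."""
    def __init__(self):
        self.versions = {}                     # key -> list of versions

    def put(self, key, data):
        self.versions.setdefault(key, []).append(data)

    def delete(self, key):
        self.versions.setdefault(key, []).append(None)   # delete marker

    def remove_delete_marker(self, key):
        if self.versions[key] and self.versions[key][-1] is None:
            self.versions[key].pop()           # "restore" the object

    def get(self, key):
        latest = self.versions[key][-1]
        if latest is None:
            raise KeyError(key)                # hidden behind a delete marker
        return latest

b = VersionedBucket()
b.put("file.txt", "v1")
b.put("file.txt", "v2")              # old version is kept, not overwritten
b.delete("file.txt")                 # delete marker added; versions survive
b.remove_delete_marker("file.txt")   # restore by deleting the delete marker
print(b.get("file.txt"))             # latest version is visible again
```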

==============================================

Cross region replication

S3 - Create a new bucket
Existing and new bucket would be in different region

Existing bucket - Management - Replication - Add Rule - Select options - Select Destination (new bucket) - Enable versioning - Change the storage class - Select IAM role - Save 
Replication enabled
Go to new bucket - not replicated yet
Command line - pip install awscli etc.

IAM - Create Group - Attach Policy 
Create a User - Access key ID - Secret....
Terminal - aws configure
Access key ID - 
Secret Access Key - 
default region name - 

aws s3 ls - will show buckets (now there are 2 buckets)

aws s3 cp --recursive s3://existing_bucket s3://new_bucket -> will copy the contents from existing to new bucket

Back to console and check the new bucket - will be the objects from existing bucket

* Versioning must be enabled on both the source and destination buckets.
* Regions must be unique
* Files in an existing bucket are not replicated automatically. All subsequent updated files will be replicated automatically
* You cannot replicate to multiple buckets or use daisy chaining (at this time.)
* Delete markers are replicated
* Deleting individual versions or delete markers will not be replicated
* Understand what Cross Region Replication is at a high level
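
The replication rule configured through the console above corresponds to a payload shaped roughly like this sketch (the role ARN and bucket names are placeholders; versioning must already be enabled on both buckets):

```python
# Sketch of a cross-region replication configuration payload.
replication_config = {
    "Role": "arn:aws:iam::111122223333:role/replication-role",   # hypothetical
    "Rules": [{
        "ID": "replicate-everything",
        "Status": "Enabled",
        "Prefix": "",                          # empty prefix = whole bucket
        "Destination": {
            "Bucket": "arn:aws:s3:::my-destination-bucket",      # hypothetical
            "StorageClass": "STANDARD_IA",     # the "change storage class" step
        },
    }],
}
print(replication_config["Rules"][0]["Destination"]["StorageClass"])
```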

===================================


Glacier - Data Archival

S3 - Create a bucket - Enable Versioning - all default 

Management - Lifecycle - add lifecycle rule - rule name
- Current version: select transition to Standard-IA after 30 days - add transition - select transition to Amazon Glacier after 60 days
- Previous versions: transition to Standard-IA after 30 days - select transition to Amazon Glacier after 60 days
- Configure expiration: current/previous versions expire after 425 days - Save

* Can be used in conjunction with versioning
* Can be applied to current versions and previous versions
* Following actions can now be done
  - Transition to the Standard - Infrequent Access Storage Class (objects must be at least 128KB and 30 days past the creation date)
  - Archive to the Glacier Storage Class (30 days after IA, if relevant)
  - Permanently Delete
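
The rule built in the walkthrough above (Standard-IA after 30 days, Glacier after 60, expire after 425) corresponds to a lifecycle configuration shaped roughly like this sketch:

```python
# Lifecycle configuration sketch matching the console walkthrough above.
lifecycle_config = {
    "Rules": [{
        "ID": "tiering-rule",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},              # apply to the whole bucket
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 60, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 425},           # permanently delete afterwards
    }]
}
print([t["StorageClass"] for t in lifecycle_config["Rules"][0]["Transitions"]])
```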
  
============================================

CloudFront Overview





A content delivery network (CDN) is a system of distributed servers (network) that deliver webpages and other web content to a user based on the geographic locations of the user, the origin of the webpage and a content delivery server.

CloudFront - Key Terminology
* Edge Location - This is the location where content will be cached. This is separate to an AWS Region/AZ
* Origin - This is the origin of all the files that the CDN will distribute. This can be either an S3 Bucket, an EC2 instance, an Elastic Load Balancer or Route 53
* Distribution - This is the name given to the CDN which consists of a collection of Edge Locations

What is CloudFront

Amazon CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and interactive content using a global network of edge locations. Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.

Amazon CloudFront is optimized to work with other Amazon Web Services, like Amazon Simple Storage Service (Amazon S3), Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Load Balancing, and Amazon Route 53. Amazon CloudFront also works seamlessly with any non-AWS origin server, which stores the original, definitive versions of your file.



CloudFront - Key Terminology

* Web Distribution - Typically used for Websites
* RTMP - Used for Media Streaming

CloudFront - Exam Tips

* Edge Location - This is the location where content will be cached. This is separate to an AWS Region/AZ
* Origin - This is the origin of all the files that the CDN will distribute. This can be either an S3 Bucket, an EC2 Instance, an Elastic Load Balancer or Route 53
* Distribution - This is the name given to the CDN which consists of a collection of Edge Locations
  - Web Distribution - Typically used for Websites
  - RTMP - Used for Media Streaming
* Edge locations are not just READ only, you can write to them too. (i.e. put an object on to them)
* Objects are cached for the life of the TTL (Time To Live)
* You can clear cached objects, but you will be charged.
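
The TTL behavior above can be illustrated with a toy edge cache (a simulation of the caching semantics, not CloudFront itself; time is passed in explicitly to keep it deterministic):

```python
class EdgeCache:
    """Toy model of a CloudFront edge location: a cached object is served
    until its TTL expires, then the next request goes back to the origin."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}                        # key -> (value, time_cached)

    def get(self, key, origin_fetch, now):
        entry = self.store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]                    # cache hit: served from edge
        value = origin_fetch(key)              # miss or expired: hit origin
        self.store[key] = (value, now)
        return value

cache = EdgeCache(ttl_seconds=300)
cache.get("index.html", lambda k: "v1", now=0)                    # cached
assert cache.get("index.html", lambda k: "v2", now=100) == "v1"   # TTL live: stale copy served
assert cache.get("index.html", lambda k: "v2", now=400) == "v2"   # TTL expired: refetched
```

This is also why invalidating ("clearing") cached objects is a billable operation: it forces expiry ahead of the TTL.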

=======================================================

Create CDN

S3 - Create a bucket - upload a file - public permission 
Cloud Front - Service - Distribution - get started (web) - fill in fields - Create

Exam Topic - Distribution - Web, RTMP *****
- Restriction Type : Whitelist, Blacklist
- Invalidations : 

S3 - go to Bucket - open the uploaded file ==> Go to CloudFront - Copy domain name - enter the domain name + /uploaded file name ==> loads faster

CloudFront - Paid service

==========================================================



==========================================================

S3 - Security & Encryption

* By default, all newly created buckets are PRIVATE
* You can setup access control to your buckets using
  - Bucket Policies
  - Access Control Lists
* S3 buckets can be configured to create access logs which log all requests made to the S3 bucket. This can be done to another bucket.

Encryption
* In Transit : 
  - SSL/TLS
* At Rest
  - Server Side Encryption
    : S3 Managed Keys - SSE-S3
    : AWS Key Management Service, Managed Keys - SSE-KMS
    : Server Side Encryption with Customer Provided Keys - SSE-C
  - Client Side Encryption
    
==============================================





AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data to the AWS cloud for scalable and cost-effective storage.

AWS Storage Gateway's software appliance is available for download as a virtual machine (VM) image that you install on a host in your datacenter. Storage Gateway supports either VMware ESXi or Microsoft Hyper-V. Once you've installed your gateway and associated it with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that is right for you.

* Four Types of Storage Gateways
- File Gateway (NFS)
- Volume Gateway (iSCSI)
  : Stored Volumes
  : Cached Volumes
- Tape Gateway (VTL)

* File Gateway
Files are stored as objects in your S3 buckets, accessed through a Network File System (NFS) mount point. Ownership, permissions, and timestamps are durably stored in S3 in the user-metadata of the object associated with the file. Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management, and cross-region replication apply directly to objects stored in your bucket.

* Volume Gateway
The volume interface presents your applications with disk volumes using the iSCSI block protocol.
Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.

* Stored Volumes
Stored volumes let you store your primary data locally, while asynchronously backing up that data to AWS. Stored volumes provide your on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups. You can create storage volumes and mount them as iSCSI devices from your on-premises application servers. Data written to your stored volumes is stored on your on-premises storage hardware. This data is asynchronously backed up to Amazon Simple Storage Service (Amazon S3) in the form of Amazon Elastic Block Store (Amazon EBS) snapshots. 1GB - 16 TB in size for stored Volumes.

* Cached Volumes
Cached volumes let you use Amazon Simple Storage Service (Amazon S3) as your primary data storage while retaining frequently accessed data locally in your storage gateway. Cached volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data. You can create storage volumes up to 32 TiB in size and attach to them as iSCSI devices from your on-premises application servers. Your gateway stores data that you write to these volumes in Amazon S3 and retains recently read data in your on-premises storage gateway's cache and upload buffer storage. 1GB-32TB in size for Cached Volumes.

* Tape Gateway
Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS Cloud. The VTL interface it provides lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your tape gateway. Each tape gateway is preconfigured with a media changer and tape drives, which are available to your existing client backup applications as iSCSI devices. You add tape cartridges as you need to archive your data. Supported by NetBackup, Backup Exec, Veeam, etc.

Exam Tips

- File Gateway - For flat files, stored directly on S3.
- Volume Gateway
  : Stored Volumes - Entire Dataset is stored on site and is asynchronously backed up to S3
  : Cached Volumes - Entire Dataset is stored on S3 and the most frequently accessed data is cached on site.
- Gateway Virtual Tape Library (VTL)
  : Used for backup and uses popular backup applications like NetBackup, Backup Exec, Veeam, etc.
  
=======================================


Import/Export Disk

AWS Import/Export Disk accelerates moving large amounts of data into and out of the AWS cloud using portable storage devices for transport. AWS Import/Export Disk transfers your data directly onto and off of storage devices using Amazon's high-speed internal network and bypassing the Internet.

Types of Snowballs
* Snowball
* Snowball Edge
* Snowmobile




* Snowball
Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS. Using Snowball addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns. Transferring data with Snowball is simple, fast, secure, and can be as little as one-fifth the cost of high-speed Internet.

The 80TB Snowball is available in all regions. Snowball uses multiple layers of security designed to protect your data including tamper-resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform Module (TPM) designed to ensure both security and full chain-of-custody of your data. Once the data transfer job has been processed and verified, AWS performs a software erasure of the Snowball appliance.
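
Why ship a physical appliance at all? Rough arithmetic on moving a full 80TB Snowball's worth of data over a dedicated 100 Mbps link makes the point (a sketch that ignores protocol overhead and link contention):

```python
# Time to upload 80 TB over a 100 Mbps connection, running flat out.
terabytes = 80
bits_to_move = terabytes * 8 * 10**12      # decimal TB -> bits
link_bps = 100 * 10**6                     # 100 Mbps

days = bits_to_move / link_bps / 86400
print(round(days, 1))                      # roughly 74 days of continuous upload
```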

* Snowball Edge
AWS Snowball Edge is a 100TB data transfer device with on-board storage and compute capabilities. You can use Snowball Edge to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.

Snowball Edge connects to your existing applications and infrastructure using standard storage interfaces, streamlining the data transfer process and minimizing setup and integration. Snowball Edge can cluster together to form a local storage tier and process your data on-premises, helping ensure your applications continue to run even when they are not able to access the cloud.

* Snowmobile
AWS Snowmobile is an Exabyte-scale data transfer service used to move extremely large amounts of data to AWS. You can transfer up to 100PB per Snowmobile, a 45-foot long ruggedized shipping container, pulled by a semi-trailer truck. Snowmobile makes it easy to move massive volumes of data to the cloud, including video libraries, image repositories, or even a complete data center migration. Transferring data with Snowmobile is secure, fast and cost effective.

Exam Tips

* Understand what Snowball is
* Understand what Import Export is
* Snowball Can
  : Import to S3
  : Export from S3
  
==================================================


S3 Transfer Acceleration utilizes the CloudFront Edge Network to accelerate your uploads to S3. Instead of uploading directly to your S3 bucket, you can use a distinct URL to upload to an edge location, which will then transfer that file to S3 - e.g. acloudguru.s3-accelerate.amazonaws.com

S3 - Create a bucket 
Properties - Transfer acceleration - Enabled - Click on the link in the popup window - Check upload speed (Speed Comparison)

========================================




