
[AWS Certificate] Developer - DynamoDB memo

2017. 11. 14. 09:57 | Posted by 솔웅



DynamoDB from CloudGuru lectures



=====================================================

============= DynamoDB ====================

=====================================================



What is DynamoDB? (***********)






Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications.







Quick facts about DynamoDB


- Stored on SSD storage

- Spread across 3 geographically distinct data centers


- Eventually Consistent Reads (Default)

  : Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data. (Best Read Performance)

  

- Strongly Consistent Reads

  : A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.

  


The Basics


- Tables

- Items (Think of a row of data in a table)

- Attributes (Think of a column of data in a table)



Pricing


- Provisioned Throughput Capacity

  : Write Throughput $0.0065 per hour for every 10 units

  : Read Throughput $0.0065 per hour for every 50 units

  

- First 25 GB stored per month is free

- Storage costs $0.25 per GB per month thereafter.


Pricing Example


Let's assume that your application needs to perform 1 million writes and 1 million reads per day, while storing 28 GB of data.


First, you need to calculate how many writes and reads per second you need. 1 million evenly spread writes per day is equivalent to 1,000,000 (writes) / 24 (hours) / 60 (minutes) / 60 (seconds) = 11.6 writes per second.


A DynamoDB write capacity unit can handle 1 write per second, so you need 12 write capacity units. For write throughput, you are charged $0.0065 per hour for every 10 units.


So ($0.0065/10) * 12 * 24 = $0.1872 per day.


Similarly, to handle 1 million strongly consistent reads per day, you need 12 read capacity units. For read throughput you are charged $0.0065 for every 50 units.


So ($0.0065/50) * 12 * 24 = $0.0374 per day.


Storage costs $0.25 per GB per month. Let's assume our database is 28 GB. We get the first 25 GB for free, so we only pay for 3 GB of storage, which is $0.75 per month.


Total Cost = $0.1872 per day + $0.0374 per day, plus storage of $0.75 per month


(30 X ($0.1872 + $0.0374)) + $0.75 = $7.488 per month
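The arithmetic in this pricing example can be checked with a short Python sketch. The function and constant names are made up for illustration, the prices are the ones quoted in these notes (not current AWS pricing), and a 30-day month is assumed. Because the sketch does not round the daily read cost to $0.0374, it lands on about $7.489 rather than $7.488.

```python
import math

# Prices as quoted in the notes above (assumed, not current AWS pricing).
WRITE_PRICE_PER_10_UNITS_PER_HOUR = 0.0065
READ_PRICE_PER_50_UNITS_PER_HOUR = 0.0065
STORAGE_PRICE_PER_GB_PER_MONTH = 0.25
FREE_STORAGE_GB = 25


def units_needed(requests_per_day):
    """Capacity units for evenly spread requests, rounded up to a whole unit."""
    return math.ceil(requests_per_day / (24 * 60 * 60))


def monthly_cost(writes_per_day, reads_per_day, storage_gb, days=30):
    """Return (write cost per day, read cost per day, storage per month, total per month)."""
    write_units = units_needed(writes_per_day)
    read_units = units_needed(reads_per_day)
    write_per_day = WRITE_PRICE_PER_10_UNITS_PER_HOUR / 10 * write_units * 24
    read_per_day = READ_PRICE_PER_50_UNITS_PER_HOUR / 50 * read_units * 24
    storage = max(0, storage_gb - FREE_STORAGE_GB) * STORAGE_PRICE_PER_GB_PER_MONTH
    total = days * (write_per_day + read_per_day) + storage
    return write_per_day, read_per_day, storage, total


write_day, read_day, storage_month, total_month = monthly_cost(1_000_000, 1_000_000, 28)
print(f"writes/day ${write_day:.4f}, reads/day ${read_day:.4f}, "
      f"storage ${storage_month:.2f}, total/month ${total_month:.3f}")
```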


With free tier you get

25 read capacity units

25 write capacity units


Easiest way to learn DynamoDB?


- Let's start our first Lab


======================================================


Creating a DynamoDB Table


Create a Role - Dynamo full access

Create an instance - Assign the Role to the instance


#!/bin/bash

yum update -y

yum install httpd24 php56 git -y

service httpd start

chkconfig httpd on

cd /var/www/html

echo "<?php phpinfo();?>" > test.php

git clone https://github.com/acloudguru/dynamodb



1178578-C02NW6G1G3QD:AWS_SSH changsoopark$ ssh ec2-user@52.91.230.105 -i EC2KeyPair.pem.txt 

The authenticity of host '52.91.230.105 (52.91.230.105)' can't be established.

ECDSA key fingerprint is SHA256:Zo4LcW4QASmSaf4H4kg5ioPGeqLicxV8TsJ+/JTQVj0.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '52.91.230.105' (ECDSA) to the list of known hosts.


       __|  __|_  )

       _|  (     /   Amazon Linux AMI

      ___|\___|___|


https://aws.amazon.com/amazon-linux-ami/2017.09-release-notes/

[ec2-user@ip-172-31-85-82 ~]$ sudo su

[root@ip-172-31-85-82 ec2-user]# cd /var/www/html

[root@ip-172-31-85-82 html]# ls

dynamodb  test.php

[root@ip-172-31-85-82 html]# curl -sS https://getcomposer.org/installer | php

All settings correct for using Composer

Downloading...


Composer (version 1.5.2) successfully installed to: /var/www/html/composer.phar

Use it: php composer.phar


[root@ip-172-31-85-82 html]# php composer.phar require aws/aws-sdk-php

Do not run Composer as root/super user! See https://getcomposer.org/root for details

Using version ^3.38 for aws/aws-sdk-php

./composer.json has been created

Loading composer repositories with package information

Updating dependencies (including require-dev)

Package operations: 6 installs, 0 updates, 0 removals

  - Installing mtdowling/jmespath.php (2.4.0): Downloading (100%)         

  - Installing psr/http-message (1.0.1): Downloading (100%)         

  - Installing guzzlehttp/psr7 (1.4.2): Downloading (100%)         

  - Installing guzzlehttp/promises (v1.3.1): Downloading (100%)         

  - Installing guzzlehttp/guzzle (6.3.0): Downloading (100%)         

  - Installing aws/aws-sdk-php (3.38.0): Downloading (100%)         

guzzlehttp/guzzle suggests installing psr/log (Required for using the Log middleware)

aws/aws-sdk-php suggests installing aws/aws-php-sns-message-validator (To validate incoming SNS notifications)

aws/aws-sdk-php suggests installing doctrine/cache (To use the DoctrineCacheAdapter)

Writing lock file

Generating autoload files

[root@ip-172-31-85-82 html]# cd dynamodb

[root@ip-172-31-85-82 dynamodb]# ls -l

total 24

-rw-r--r-- 1 root root  4933 Nov  9 00:32 createtables.php

-rw-r--r-- 1 root root    11 Nov  9 00:32 README.md

-rw-r--r-- 1 root root 11472 Nov  9 00:32 uploaddata.php

[root@ip-172-31-85-82 dynamodb]# nano createtables.php

==> Update the Region info in the PHP files (createtables.php and uploaddata.php)




http://52.91.230.105/dynamodb/createtables.php


==> This creates 4 DynamoDB tables


==> 

Creating table ProductCatalog... Creating table Forum... Creating table Thread... Creating table Reply... Waiting for table ProductCatalog to be created. Table ProductCatalog has been created. Waiting for table Forum to be created. Table Forum has been created. Waiting for table Thread to be created. Table Thread has been created. Waiting for table Reply to be created. Table Reply has been created.


Picture : DynamoDBCreated


http://52.91.230.105/dynamodb/uploaddata.php





===============================================


DynamoDB Indexes & Streams


* Primary Keys


Two Types of Primary Keys available

- Single Attribute (think unique ID)

  : Partition Key (Hash Key) composed of one attribute


- Composite (think unique ID and a date range)

  : Partition Key & Sort Key (Hash & Range) composed of two attributes

  

Partition Key

- DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (this is simply the physical location in which the data is stored)

- No two items in a table can have the same partition key value (*****)


Partition Key and Sort key

- DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (this is simply the physical location in which the data is stored)

- Two items can have the same partition key, but they must have a different sort key

- All items with the same partition key are stored together, in sorted order by sort key value
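As a rough mental model of the composite key rules above, here is a toy in-memory table in Python. It is only a sketch: the MD5-based hash, the partition count, and the class and method names are all stand-ins made up for illustration, and DynamoDB's real internal hash function and partition management are not exposed.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4  # hypothetical; DynamoDB decides the partition count itself


def partition_for(partition_key):
    """Map a partition key to a partition number (stand-in hash, not DynamoDB's)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyTable:
    def __init__(self):
        # partition number -> {partition key -> list of (sort key, item), kept sorted}
        self.partitions = {n: {} for n in range(NUM_PARTITIONS)}

    def put(self, partition_key, sort_key, item):
        rows = self.partitions[partition_for(partition_key)].setdefault(partition_key, [])
        sort_keys = [sk for sk, _ in rows]
        pos = bisect.bisect_left(sort_keys, sort_key)
        if pos < len(rows) and rows[pos][0] == sort_key:
            rows[pos] = (sort_key, item)      # same partition key + sort key: overwrite
        else:
            rows.insert(pos, (sort_key, item))  # new sort key: insert in sorted position

    def items_for(self, partition_key):
        rows = self.partitions[partition_for(partition_key)].get(partition_key, [])
        return [item for _, item in rows]


table = ToyTable()
table.put("thread-1", "2017-11-03", "third reply")
table.put("thread-1", "2017-11-01", "first reply")
table.put("thread-1", "2017-11-02", "second reply")
table.put("thread-1", "2017-11-02", "second reply (edited)")
print(table.items_for("thread-1"))
# ['first reply', 'second reply (edited)', 'third reply']
```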


* Indexes (***)


Local Secondary Index

- Has the SAME Partition key, different sort key

- Can ONLY be created when creating a table. They cannot be removed or modified later.


Global Secondary Index

- Has DIFFERENT Partition key and different sort key

- Can be created at table creation or added LATER


DynamoDB Streams


Used to capture any kind of modification of the DynamoDB tables

- If a new item is added to the table, the stream captures an image of the entire item, including all of its attributes

- If an item is updated, the stream captures the "before" and "after" image of any attributes that were modified in the item

- If an item is deleted from the table, the stream captures an image of the entire item before it was deleted
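The three stream cases described in these notes can be sketched as a small Python function. This is a toy model only: the dictionary layout and the field names `event`, `old_image` and `new_image` are invented here, and the layout of a real stream record is different and depends on how the stream is configured.

```python
def stream_record(old_item, new_item):
    """Build a toy change record from an item's state before and after a write."""
    if old_item is None and new_item is None:
        raise ValueError("nothing changed")
    if old_item is None:
        # New item: capture the entire item.
        return {"event": "INSERT", "new_image": dict(new_item)}
    if new_item is None:
        # Deleted item: capture the entire item as it was before deletion.
        return {"event": "REMOVE", "old_image": dict(old_item)}
    # Updated item: capture before/after values of the modified attributes only.
    changed = {k for k in old_item.keys() | new_item.keys()
               if old_item.get(k) != new_item.get(k)}
    return {
        "event": "MODIFY",
        "old_image": {k: old_item[k] for k in changed if k in old_item},
        "new_image": {k: new_item[k] for k in changed if k in new_item},
    }


print(stream_record(None, {"Id": 1, "Price": 10}))
print(stream_record({"Id": 1, "Price": 10}, {"Id": 1, "Price": 12}))
print(stream_record({"Id": 1, "Price": 12}, None))
```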






Practice - Tabs

Overview, Items, Metrics, Alarms, Capacity, Indexes, Triggers, Access control, Tags


=========================================


Scan vs. Query API Calls



What is a Query?


- A Query operation finds items in a table using only primary key attribute values. You must provide a partition key attribute name and a distinct value to search for.


- You can optionally provide a sort key attribute name and value, and use a comparison operator to refine the search results.


- By default, a Query returns all of the data attributes for items with the specified primary key(s); however, you can use the ProjectionExpression parameter so that the Query only returns some of the attributes, rather than all of them


- Query results are always sorted by the sort key. If the data type of the sort key is a number, the results are returned in numeric order. Otherwise, the results are returned in order of ASCII character code values. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.


- By default, a Query is eventually consistent, but it can be changed to be strongly consistent.
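To see how the Query options above fit together, here is a Python sketch that only builds the request parameters and does not call AWS. The helper name `build_query_params` is made up, string-typed keys are assumed, and the table and attribute names in the usage lines (Reply, Id, ReplyDateTime, PostedBy) are assumptions about the sample data. The parameter names themselves (KeyConditionExpression, ProjectionExpression, ScanIndexForward, ConsistentRead) are the ones the Query API uses.

```python
def build_query_params(table_name, pk_name, pk_value, sk_name=None, sk_from=None,
                       attributes=None, newest_first=False, strongly_consistent=False):
    """Build keyword arguments for a low-level DynamoDB Query call (string keys assumed)."""
    names = {"#pk": pk_name}
    values = {":pk": {"S": pk_value}}
    condition = "#pk = :pk"                       # partition key must match exactly
    if sk_name is not None and sk_from is not None:
        names["#sk"] = sk_name
        values[":sk"] = {"S": sk_from}
        condition += " AND #sk >= :sk"            # optional comparison on the sort key
    params = {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ScanIndexForward": not newest_first,     # False reverses the sort order
        "ConsistentRead": strongly_consistent,    # default is eventually consistent
    }
    if attributes:
        for i, attr in enumerate(attributes):
            names[f"#a{i}"] = attr
        # Return only these attributes instead of the whole item.
        params["ProjectionExpression"] = ", ".join(f"#a{i}" for i in range(len(attributes)))
    return params


params = build_query_params("Reply", "Id", "Amazon DynamoDB#DynamoDB Thread 1",
                            sk_name="ReplyDateTime", sk_from="2017-11-01",
                            attributes=["PostedBy"], newest_first=True)
print(params)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```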



What is a Scan?


- A Scan operation examines every item in the table. By default, a Scan returns all of the data attributes for every item. However, you can use the ProjectionExpression parameter so that the Scan only returns some of the attributes, rather than all of them.


What should I use? Query vs. Scan?


Generally, a Query operation is more efficient than a Scan operation.


A Scan operation always scans the entire table, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set. Avoid using a Scan operation on a large table with a filter that removes many results, if possible. Also, as a table grows, the Scan operation slows. The Scan operation examines every item for the requested values, and can use up the provisioned throughput for a large table in a single operation


For quicker response times, design your tables in a way that can use the Query, Get, or BatchGetItem APIs, instead. Alternatively, design your application to use Scan operations in a way that minimizes the impact on your table's request rate.
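The efficiency difference can be illustrated with a toy in-memory table in Python that counts how many items each operation has to examine. This is not how DynamoDB is implemented; the class name and the attribute names ForumName and Subject are hypothetical, and the counts only illustrate the idea that a Scan touches every item while a Query touches only one partition key's items.

```python
from collections import defaultdict


class CountingTable:
    """Toy table that groups items by partition key and counts items examined."""

    def __init__(self, items):
        self.by_partition = defaultdict(list)
        for item in items:
            self.by_partition[item["ForumName"]].append(item)
        self.examined = 0

    def query(self, forum_name):
        # Goes straight to the items stored under one partition key.
        matches = self.by_partition.get(forum_name, [])
        self.examined += len(matches)
        return list(matches)

    def scan(self, forum_name):
        # Examines every item in the table, then filters.
        results = []
        for rows in self.by_partition.values():
            for item in rows:
                self.examined += 1
                if item["ForumName"] == forum_name:
                    results.append(item)
        return results


items = [{"ForumName": f"forum-{i % 10}", "Subject": f"thread-{i}"} for i in range(1000)]
table = CountingTable(items)

hits = table.query("forum-3")
query_cost = table.examined

table.examined = 0
scan_hits = table.scan("forum-3")
scan_cost = table.examined

print(len(hits), query_cost)       # 100 100
print(len(scan_hits), scan_cost)   # 100 1000
```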





Query & Scans Exam Tips


- A Query operation finds items in a table using only primary key attribute values. You must provide a partition key attribute name and a distinct value to search for


- A Scan operation examines every item in the table. By default, a Scan returns all of the data attributes for every item. However, you can use the ProjectionExpression parameter so that the Scan only returns some of the attributes, rather than all of them.


- Query results are always sorted by the sort key in ascending order. Set ScanIndexForward parameter to false to reverse it.


- Prefer a Query operation over a Scan operation, as it is more efficient


=======================================


DynamoDB Provisioned Throughput Calculations (***)


- Unit of Read provisioned throughput

  : All reads are rounded up to increments of 4KB

  : Eventually Consistent Reads (default) consist of 2 reads per second

  : Strongly Consistent Reads consist of 1 read per second

  

- Unit of Write provisioned throughput

  : All writes are 1 KB

  : All writes consist of 1 write per second

  

The Magic Formula


(Size of read rounded up to the nearest 4 KB chunk / 4 KB) X number of items = read throughput


Divide by 2 if eventually consistent, then round up to a whole unit


Question 1 - You have an application that needs to read 10 items of 1 KB per second using eventual consistency. What should you set the read throughput to?


- First we calculate how many read units per item we need


- 1 KB rounded up to the nearest 4 KB increment = 4 KB

- 4 KB / 4KB = 1 read unit per item


- 1 X 10 read items = 10

- Using eventual consistency we get 10 / 2 = 5

- 5 units of read throughput



Question 2

You have an application that needs to read 10 items of 6 KB per second using eventual consistency. What should you set the read throughput to?


- First we calculate how many read units per item we need

- 6 KB rounded up to nearest increment of 4 KB is 8 KB

- 8 KB / 4 KB = 2 read units per item


- 2 X 10 read items = 20

- Using eventual consistency we get 20 / 2 = 10


- 10 units of read throughput



Question 3


You have an application that needs to read 5 items of 10 KB per second using eventual consistency. What should you set the read throughput to?


- First we calculate how many read units per item we need

- 10 KB rounded up to nearest increment of 4 KB is 12 KB

- 12 KB / 4 KB = 3 read units per item.


- 3 X 5 read items = 15

- Using eventual consistency we get 15 / 2 = 7.5


- 7.5 rounded up = 8 units of read throughput



Question 4 - STRONG CONSISTENCY


You have an application that needs to read 5 items of 10 KB per second using strong consistency. What should you set the read throughput to?


- First we calculate how many read units per item we need 

- 10 KB rounded up to nearest increment of 4 KB is 12 KB

- 12 KB / 4 KB = 3 read units per item


- 3 X 5 read items = 15

- Using strong consistency we don't divide by 2


- 15 units of read throughput



Question 5 - WRITE THROUGHPUT


You have an application that needs to write 5 items per second, with each item being 10 KB in size. What should you set the write throughput to?


- Each write unit consists of 1 KB of data. You need to write 5 items per second with each item using 10 KB of data


- 5 X 10 KB = 50 write units


- Write throughput of 50 Units



Question 6 - WRITE THROUGHPUT


You have an application that needs to write 12 items of 100 KB each per second. What should you set the write throughput to?


- Each write unit consists of 1 KB of data. You need to write 12 items per second with each item having 100 KB of data.


- 12 X 100 KB = 1200 write units


- Write throughput of 1200 Units
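The six questions above can be checked with a small Python sketch of the formula. The function names are made up for illustration, and the rounding follows the rules in these notes: reads in 4 KB increments, writes in 1 KB increments, halve for eventual consistency, then round up to a whole unit.

```python
import math


def read_capacity_units(item_size_kb, items_per_second, strongly_consistent=False):
    """Read units needed: 4 KB per unit, halved for eventually consistent reads."""
    units_per_item = math.ceil(item_size_kb / 4)   # round each read up to 4 KB chunks
    total = units_per_item * items_per_second
    if not strongly_consistent:
        total = total / 2                          # eventual consistency costs half
    return math.ceil(total)                        # capacity is set in whole units


def write_capacity_units(item_size_kb, items_per_second):
    """Write units needed: 1 KB per unit, no consistency discount."""
    return math.ceil(item_size_kb) * items_per_second


print(read_capacity_units(1, 10))                            # Question 1 -> 5
print(read_capacity_units(6, 10))                            # Question 2 -> 10
print(read_capacity_units(10, 5))                            # Question 3 -> 8
print(read_capacity_units(10, 5, strongly_consistent=True))  # Question 4 -> 15
print(write_capacity_units(10, 5))                           # Question 5 -> 50
print(write_capacity_units(100, 12))                         # Question 6 -> 1200
```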



Error Code


400 HTTP Status Code - ProvisionedThroughputExceededException


You exceeded your maximum allowed provisioned throughput for a table or for one or more global secondary indexes.





========================================





Using Web Identity Providers with DynamoDB


Web Identity Providers


You can authenticate users using Web Identity providers (such as Facebook, Google, Amazon or any other OpenID Connect-compatible identity provider). This is done using the AssumeRoleWithWebIdentity API.


You will need to create a role first.


Request to AssumeRoleWithWebIdentity (what your app sends):

1. Web Identity Token

2. App ID of provider

3. ARN of Role


Response (what AWS returns):

a. Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken

b. Expiration (time limit)

c. AssumedRoleId

d. SubjectFromWebIdentityToken (the unique ID that appears in an IAM policy variable for this particular identity provider)



Steps taken to authenticate


1. The user authenticates with an ID provider (such as Facebook)

2. They are passed a token by their ID provider

3. Your code calls the AssumeRoleWithWebIdentity API, passing the provider's token and the ARN of the IAM role

4. The app can now access DynamoDB for between 15 minutes and 1 hour (default is 1 hour)
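Step 3 can be sketched in Python as two small helpers: one builds the request, the other pulls the temporary credentials out of the response. Nothing here calls AWS. The helper names, the role ARN and the token value are hypothetical; the 900 to 3600 second check simply mirrors the 15 minute to 1 hour range from these notes, and the provider's app ID listed in the notes is left out of this sketch.

```python
def build_assume_role_request(role_arn, web_identity_token, session_name,
                              duration_seconds=3600):
    """Build keyword arguments for an STS AssumeRoleWithWebIdentity call."""
    if not 900 <= duration_seconds <= 3600:
        raise ValueError("duration must be between 900 and 3600 seconds")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": web_identity_token,
        "DurationSeconds": duration_seconds,
    }


def extract_credentials(response):
    """Pick the temporary credentials and subject out of the STS response."""
    creds = response["Credentials"]
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
        "expiration": creds["Expiration"],
        "subject": response["SubjectFromWebIdentityToken"],
    }


request = build_assume_role_request(
    role_arn="arn:aws:iam::123456789012:role/WebIdentityDynamoRole",  # hypothetical
    web_identity_token="token-from-id-provider",                      # hypothetical
    session_name="mobile-user-session",
)
print(request)
# With boto3 installed, the request could be sent as:
#   response = boto3.client("sts").assume_role_with_web_identity(**request)
#   credentials = extract_credentials(response)
```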


========================================


Other important aspects of DynamoDB


Conditional Writes




If item = $10 then update to $12


Note that conditional writes are idempotent. This means that you can send the same conditional write request multiple times, but it will have no further effect on the item after the first time DynamoDB performs the specified update. For example, suppose you issue a request to update the price of a book item by 10%, with the expectation that the price is currently $20. However, before you get a response, a network error occurs and you don't know whether your request was successful or not. Because a conditional update is an idempotent operation, you can send the same request again, and DynamoDB will update the price only if the current price is still $20.
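The retry behaviour in the book price example can be modelled in a few lines of Python. The `conditional_update` helper is a local stand-in, not an SDK call. The dictionary after it shows roughly what the matching UpdateItem request could look like; the table name ProductCatalog and the attributes Id and Price are assumptions about the sample data.

```python
def conditional_update(item, attribute, expected, new_value):
    """Set item[attribute] to new_value only if it currently equals expected."""
    if item.get(attribute) != expected:
        return False          # condition failed, item left untouched
    item[attribute] = new_value
    return True


book = {"Id": 101, "Price": 20}

first_try = conditional_update(book, "Price", expected=20, new_value=22)
# Network error: the caller never saw the response, so it sends the same request again.
retry = conditional_update(book, "Price", expected=20, new_value=22)

print(first_try, retry, book["Price"])   # True False 22

# Roughly equivalent low-level UpdateItem parameters (names assumed, see above).
update_params = {
    "TableName": "ProductCatalog",
    "Key": {"Id": {"N": "101"}},
    "UpdateExpression": "SET #p = :new",
    "ConditionExpression": "#p = :expected",
    "ExpressionAttributeNames": {"#p": "Price"},
    "ExpressionAttributeValues": {":new": {"N": "22"}, ":expected": {"N": "20"}},
}
```

The retry returns False because the price is no longer $20, so the increase is applied exactly once.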



Atomic Counters


DynamoDB supports atomic counters, where you use the UpdateItem operation to increment or decrement the value of an existing attribute without interfering with other write requests. (All write requests are applied in the order in which they were received.) For example, a web application might want to maintain a counter per visitor to its site. In this case, the application would need to increment this counter regardless of its current value.
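A counter update of this kind might be expressed as the following UpdateItem parameters, built in Python without calling AWS. The helper name, the table name PageVisits and the attributes VisitorId and VisitCount are hypothetical. Note that there is no ConditionExpression: the increment is applied whatever the current value is.

```python
def build_counter_update(table_name, key, counter_attribute, step=1):
    """Build UpdateItem keyword arguments that add `step` to a numeric attribute."""
    return {
        "TableName": table_name,
        "Key": key,
        "UpdateExpression": "ADD #c :step",              # use a negative step to decrement
        "ExpressionAttributeNames": {"#c": counter_attribute},
        "ExpressionAttributeValues": {":step": {"N": str(step)}},
        "ReturnValues": "UPDATED_NEW",                   # ask for the new counter value
    }


counter_params = build_counter_update(
    "PageVisits",                             # hypothetical table
    {"VisitorId": {"S": "visitor-42"}},       # hypothetical key
    "VisitCount",
)
print(counter_params)
# With boto3 installed and credentials configured:
#   boto3.client("dynamodb").update_item(**counter_params)
```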



Batch Operations


If your application needs to read multiple items, you can use the BatchGetItem API. A single BatchGetItem request can retrieve up to 16 MB of data, which can contain as many as 100 items. In addition, a single BatchGetItem request can retrieve items from multiple tables.
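Because one request holds at most 100 items, a longer list of keys has to be split into several requests. A minimal Python sketch of that splitting follows; the helper names are made up, and the table and key names in the usage lines are assumptions. It only builds the request bodies and does not call AWS.

```python
def chunk_keys(keys, batch_size=100):
    """Split a list of keys into batches of at most batch_size."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]


def build_batch_get_requests(table_name, keys):
    """Build one BatchGetItem request body per batch of up to 100 keys."""
    return [{"RequestItems": {table_name: {"Keys": batch}}}
            for batch in chunk_keys(keys)]


keys = [{"Id": {"N": str(n)}} for n in range(250)]       # 250 hypothetical item keys
requests = build_batch_get_requests("ProductCatalog", keys)
print([len(r["RequestItems"]["ProductCatalog"]["Keys"]) for r in requests])
# [100, 100, 50]
# With boto3 installed, each body could be sent as:
#   boto3.client("dynamodb").batch_get_item(**request_body)
```

A real response can also come back with some keys unprocessed, so a complete client would resend those; that handling is left out of this sketch.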



===============================================



