DynamoDB from CloudGuru lectures
=====================================================
============= DynamoDB ====================
=====================================================
What is DynamoDB? (***********)
Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications.
Quick facts about DynamoDB
- Stored on SSD storage
- Spread across 3 geographically distinct data centers
- Eventually Consistent Reads (Default)
: Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data. (Best Read Performance)
- Strongly Consistent Reads
: A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
The Basics
- Tables
- Items (Think a row of data in table)
- Attributes (Think of a column of data in a table)
Pricing
- Provisioned Throughput Capacity
: Write Throughput $0.0065 per hour for every 10 units
: Read Throughput $0.0065 per hour for every 50 units
- First 25 GB stored per month is free
- Storage costs $0.25 per GB per month thereafter.
Pricing Example
Let's assume that your application needs to perform 1 million writes and 1 million reads per day, while storing 28 GB of data.
First, you need to calculate how many writes and reads per second you need. 1 million evenly spread writes per day is equivalent to 1,000,000 (writes) / 24 (hours) / 60 (minutes) / 60 (seconds) = 11.6 writes per second.
A DynamoDB write capacity unit can handle 1 write per second, so you need 12 write capacity units. For write throughput, you are charged $0.0065 per hour for every 10 units.
So ($0.0065/10) * 12 * 24 = $0.1872 per day.
Similarly, to handle 1 million strongly consistent reads per day, you need 12 read capacity units. For read throughput you are charged $0.0065 for every 50 units.
So ($0.0065/50) * 12 * 24 = $0.0374 per day.
Storage costs $0.25 per GB per month. Let's assume our database is 28 GB. We get the first 25 GB for free, so we only pay for 3 GB of storage, which is $0.75 per month.
Total cost = $0.1872 per day + $0.0374 per day, plus storage of $0.75 per month
(30 X ($0.1872 + $0.0374)) + $0.75 = $7.488 per month
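The arithmetic in this pricing example can be reproduced with a short script. This is only a sketch: the prices are the ones quoted in these notes (they may not match current AWS pricing), and the function name `daily_throughput_cost` is made up for illustration.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def daily_throughput_cost(requests_per_day, price_per_hour, units_per_price):
    """Return (capacity units, cost per day) for evenly spread requests."""
    # 1,000,000 / 86,400 = 11.57..., which rounds up to 12 units
    units = math.ceil(requests_per_day / SECONDS_PER_DAY)
    cost_per_day = (price_per_hour / units_per_price) * units * 24
    return units, cost_per_day

# Prices as quoted in the notes above (assumed, not current pricing)
write_units, write_cost = daily_throughput_cost(1_000_000, 0.0065, 10)
read_units, read_cost = daily_throughput_cost(1_000_000, 0.0065, 50)

# First 25 GB are free, then $0.25 per GB per month
storage_cost = max(0, 28 - 25) * 0.25

monthly_total = 30 * (write_cost + read_cost) + storage_cost

print("write units:", write_units, "cost/day:", round(write_cost, 4))
print("read units:", read_units, "cost/day:", round(read_cost, 4))
print("monthly total:", round(monthly_total, 2))
```

The script keeps the unrounded read cost ($0.03744 per day), so its monthly total differs from the hand calculation only in the last digits.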
With free tier you get
25 read capacity units
25 write capacity units
Easiest way to learn DynamoDB?
- Let's start our first Lab
======================================================
Create a Role - Dynamo full access
Create an instance - Assign the Role to the instance
#!/bin/bash
yum update -y
yum install httpd24 php56 git -y
service httpd start
chkconfig httpd on
cd /var/www/html
echo "<?php phpinfo();?>" > test.php
git clone https://github.com/acloudguru/dynamodb
1178578-C02NW6G1G3QD:AWS_SSH changsoopark$ ssh ec2-user@52.91.230.105 -i EC2KeyPair.pem.txt
The authenticity of host '52.91.230.105 (52.91.230.105)' can't be established.
ECDSA key fingerprint is SHA256:Zo4LcW4QASmSaf4H4kg5ioPGeqLicxV8TsJ+/JTQVj0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '52.91.230.105' (ECDSA) to the list of known hosts.
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2017.09-release-notes/
[ec2-user@ip-172-31-85-82 ~]$ sudo su
[root@ip-172-31-85-82 ec2-user]# cd /var/www/html
[root@ip-172-31-85-82 html]# ls
dynamodb test.php
[root@ip-172-31-85-82 html]# curl -sS https://getcomposer.org/installer | php
All settings correct for using Composer
Downloading...
Composer (version 1.5.2) successfully installed to: /var/www/html/composer.phar
Use it: php composer.phar
[root@ip-172-31-85-82 html]# php composer.phar require aws/aws-sdk-php
Do not run Composer as root/super user! See https://getcomposer.org/root for details
Using version ^3.38 for aws/aws-sdk-php
./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)
Package operations: 6 installs, 0 updates, 0 removals
- Installing mtdowling/jmespath.php (2.4.0): Downloading (100%)
- Installing psr/http-message (1.0.1): Downloading (100%)
- Installing guzzlehttp/psr7 (1.4.2): Downloading (100%)
- Installing guzzlehttp/promises (v1.3.1): Downloading (100%)
- Installing guzzlehttp/guzzle (6.3.0): Downloading (100%)
- Installing aws/aws-sdk-php (3.38.0): Downloading (100%)
guzzlehttp/guzzle suggests installing psr/log (Required for using the Log middleware)
aws/aws-sdk-php suggests installing aws/aws-php-sns-message-validator (To validate incoming SNS notifications)
aws/aws-sdk-php suggests installing doctrine/cache (To use the DoctrineCacheAdapter)
Writing lock file
Generating autoload files
[root@ip-172-31-85-82 html]# cd dynamodb
[root@ip-172-31-85-82 dynamodb]# ls -l
total 24
-rw-r--r-- 1 root root 4933 Nov 9 00:32 createtables.php
-rw-r--r-- 1 root root 11 Nov 9 00:32 README.md
-rw-r--r-- 1 root root 11472 Nov 9 00:32 uploaddata.php
[root@ip-172-31-85-82 dynamodb]# nano createtables.php
==> Update the Region info in the PHP files (createtables.php and uploaddata.php)
http://52.91.230.105/dynamodb/createtables.php
==> This creates 4 DynamoDB tables. Output:
Creating table ProductCatalog... Creating table Forum... Creating table Thread... Creating table Reply... Waiting for table ProductCatalog to be created. Table ProductCatalog has been created. Waiting for table Forum to be created. Table Forum has been created. Waiting for table Thread to be created. Table Thread has been created. Waiting for table Reply to be created. Table Reply has been created.
Picture : DynamoDBCreated
http://52.91.230.105/dynamodb/uploaddata.php
===============================================
* Primary Keys
Two types of Primary Keys are available
- Single Attribute (think unique ID)
: Partition Key (Hash Key) composed of one attribute
- Composite (think unique ID and a date range)
: Partition Key & Sort Key (Hash & Range) composed of two attributes
Partition Key
- DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (this is simply the physical location in which the data is stored)
- No two items in a table can have the same partition key value (*****)
Partition Key and Sort key
- DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (this is simply the physical location in which the data is stored)
- Two items can have the same partition key, but they must have a different sort key
- All items with the same partition key are stored together, in sorted order by sort key value
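The behaviour of a composite key can be pictured with a small in-memory model. This is a toy sketch, not how DynamoDB is implemented: the real hash function is internal, so MD5 stands in for it here, and `NUM_PARTITIONS`, `partition_for` and `Table` are hypothetical names.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages partitions itself

def partition_for(partition_key: str) -> int:
    """Map a partition key to a partition number (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

class Table:
    def __init__(self):
        # partition number -> {(partition_key, sort_key): item}
        self.partitions = defaultdict(dict)

    def put(self, partition_key, sort_key, item):
        # Same partition key + same sort key overwrites the existing item
        part = self.partitions[partition_for(partition_key)]
        part[(partition_key, sort_key)] = item

    def items_for(self, partition_key):
        # Items sharing a partition key come back ordered by sort key
        part = self.partitions[partition_for(partition_key)]
        return [part[k] for k in sorted(part) if k[0] == partition_key]

table = Table()
table.put("user#1", "2017-11-02", {"msg": "second"})
table.put("user#1", "2017-11-01", {"msg": "first"})
table.put("user#1", "2017-11-01", {"msg": "first (replaced)"})
print(table.items_for("user#1"))
```

Two items share the partition key `user#1` because their sort keys differ, while the third `put` reuses an existing key pair and replaces that item.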
* Indexes (***)
Local Secondary Index
- Has the SAME Partition key, different sort key
- Can ONLY be created when creating a table. They cannot be removed or modified later.
Global Secondary Index
- Has DIFFERENT Partition key and different sort key
- Can be created at table creation or added LATER
DynamoDB Streams
Used to capture any kind of modification of the DynamoDB tables
- If a new item is added to the table, the stream captures an image of the entire item, including all of its attributes
- If an item is updated, the stream captures the "before" and "after" image of any attributes that were modified in the item
- If an item is deleted from the table, the stream captures an image of the entire item before it was deleted
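The three stream cases in these notes (add, update, delete) can be sketched as a function that builds a simplified record. The event names INSERT, MODIFY and REMOVE and the OldImage/NewImage fields mirror real stream records, but the flat layout and the helper `stream_record` are simplifications assumed for illustration.

```python
def stream_record(old_item, new_item):
    """Build a simplified stream record from the item before and after a write."""
    if old_item is None:
        # New item: capture the entire item
        return {"eventName": "INSERT", "NewImage": dict(new_item)}
    if new_item is None:
        # Deleted item: capture the entire item as it was before deletion
        return {"eventName": "REMOVE", "OldImage": dict(old_item)}
    # Updated item: capture "before" and "after" of the modified attributes only
    changed = {k for k in old_item.keys() | new_item.keys()
               if old_item.get(k) != new_item.get(k)}
    return {
        "eventName": "MODIFY",
        "OldImage": {k: old_item[k] for k in changed if k in old_item},
        "NewImage": {k: new_item[k] for k in changed if k in new_item},
    }

before = {"Id": 1, "Title": "Book 101", "Price": 10}
after = {"Id": 1, "Title": "Book 101", "Price": 12}
print(stream_record(None, before))
print(stream_record(before, after))
print(stream_record(after, None))
```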
Practice - Tabs
Overview, Items, Metrics, Alarms, Capacity, Indexes, Triggers, Access control, Tags
=========================================
What is a Query?
- A Query operation finds items in a table using only primary key attribute values. You must provide a partition key attribute name and a distinct value to search for.
- You can optionally provide a sort key attribute name and value, and use a comparison operator to refine the search results.
- By default, a Query returns all of the data attributes for items with the specified primary key(s); however, you can use the ProjectionExpression parameter so that the Query only returns some of the attributes, rather than all of them
- Query results are always sorted by the sort key. If the data type of the sort key is a number, the results are returned in numeric order. Otherwise, the results are returned in order of ASCII character code values. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.
- By default, a Query is eventually consistent, but it can be changed to be strongly consistent.
What is a Scan?
- A Scan operation examines every item in the table. By default, a Scan returns all of the data attributes for every item. However, you can use the ProjectionExpression parameter so that the Scan only returns some of the attributes, rather than all of them.
What should I use? Query vs. Scan?
Generally, a Query operation is more efficient than a Scan operation.
A Scan operation always scans the entire table, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set. Avoid using a Scan operation on a large table with a filter that removes many results, if possible. Also, as a table grows, the Scan operation slows. The Scan operation examines every item for the requested values, and can use up the provisioned throughput for a large table in a single operation.
For quicker response times, design your tables in a way that can use the Query, Get, or BatchGetItem APIs, instead. Alternatively, design your application to use Scan operations in a way that minimizes the impact on your table's request rate.
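The difference in work between the two operations can be illustrated with a toy table that counts how many items each operation has to examine. This is an in-memory sketch, not the DynamoDB API: `ToyTable` is hypothetical, and the attribute names `ForumName` (partition key) and `Subject` (sort key) are assumed for the example.

```python
from collections import defaultdict

class ToyTable:
    def __init__(self):
        self.partitions = defaultdict(list)  # partition key -> list of items

    def put(self, item):
        self.partitions[item["ForumName"]].append(item)

    def query(self, forum_name, scan_index_forward=True):
        """Read one partition only; results are ordered by the sort key."""
        items = self.partitions.get(forum_name, [])
        ordered = sorted(items, key=lambda i: i["Subject"],
                         reverse=not scan_index_forward)
        return ordered, len(items)  # (results, items examined)

    def scan(self, predicate):
        """Examine every item in the table, then filter."""
        everything = [i for part in self.partitions.values() for i in part]
        matches = [i for i in everything if predicate(i)]
        return matches, len(everything)  # (results, items examined)

table = ToyTable()
for forum in ("DynamoDB", "S3", "EC2"):
    for n in range(3):
        table.put({"ForumName": forum, "Subject": f"Thread {n}"})

query_results, query_examined = table.query("S3", scan_index_forward=False)
scan_results, scan_examined = table.scan(lambda i: i["ForumName"] == "S3")
print("query examined", query_examined, "items")
print("scan examined", scan_examined, "items")
```

Both calls return the same three items, but the scan has to look at all nine items in the table to find them, and the gap widens as the table grows.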
Query & Scans Exam Tips
- A Query operation finds items in a table using only primary key attribute values. You must provide a partition key attribute name and a distinct value to search for
- A Scan operation examines every item in the table. By default, a Scan returns all of the data attributes for every item. However, you can use the ProjectionExpression parameter so that the Scan only returns some of the attributes, rather than all of them.
- Query results are always sorted by the sort key in ascending order. Set the ScanIndexForward parameter to false to reverse it.
- Prefer a Query operation over a Scan operation, as it is more efficient.
=======================================
DynamoDB Provisioned Throughput Calculations (***)
- Unit of Read provisioned throughput
: All reads are rounded up to increments of 4KB
: Eventually Consistent Reads (default) consist of 2 reads per second
: Strongly Consistent Reads consist of 1 read per second
- Unit of Write provisioned throughput
: All writes are rounded up to increments of 1 KB
: All writes consist of 1 write per second
The Magic Formula
(Size of Read rounded up to nearest 4 KB chunk / 4 KB) X no of items = read throughput
Divide by 2 if eventually consistent
Question 1 - You have an application that requires to read 10 items of 1 KB per second using eventual consistency. What should you set the read throughput to?
- First we calculate how many read units per item we need
- 1 KB rounded up to the nearest 4 KB increment = 4 KB
- 4 KB / 4KB = 1 read unit per item
- 1 X 10 read items = 10
- Using eventual consistency we get 10 / 2 = 5
- 5 units of read throughput
Question 2
You have an application that requires to read 10 items of 6 KB per second using eventual consistency. What should you set the read throughput to?
- First we calculate how many read units per item we need
- 6 KB rounded up to nearest increment of 4 KB is 8 KB
- 8 KB / 4 KB = 2 read units per item
- 2 X 10 read items = 20
- Using eventual consistency we get 20 / 2 = 10
- 10 units of read throughput
Question 3
You have an application that requires to read 5 items of 10 KB per second using eventual consistency. What should you set the read throughput to?
- First we calculate how many read units per item we need
- 10 KB rounded up to nearest increment of 4 KB is 12 KB
- 12 KB / 4 KB = 3 read units per item.
- 3 X 5 read items = 15
- Using eventual consistency we get 15 / 2 = 7.5
- 8 units of read throughput
Question 4 - STRONG CONSISTENCY
You have an application that requires to read 5 items of 10 KB per second using strong consistency. What should you set the read throughput to?
- First we calculate how many read units per item we need
- 10 KB rounded up to nearest increment of 4 KB is 12 KB
- 12 KB / 4 KB = 3 read units per item
- 3 X 5 read items = 15
- Using strong consistency we do NOT divide by 2
- 15 units of read throughput
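The read calculation used in Questions 1 to 4 can be written as one small function. A minimal sketch following the formula in these notes; the function name `read_capacity_units` is made up for illustration.

```python
import math

def read_capacity_units(item_size_kb, items_per_second, strongly_consistent=False):
    """Read throughput for items of a given size, per the 4 KB rounding rule."""
    # Round each item up to the next 4 KB increment, then count 4 KB chunks
    units_per_item = math.ceil(item_size_kb / 4)
    total = units_per_item * items_per_second
    # Eventually consistent reads need half the units
    if not strongly_consistent:
        total = total / 2
    # You cannot provision a fraction of a unit, so round up
    return math.ceil(total)

print(read_capacity_units(1, 10))                            # Question 1
print(read_capacity_units(6, 10))                            # Question 2
print(read_capacity_units(10, 5))                            # Question 3
print(read_capacity_units(10, 5, strongly_consistent=True))  # Question 4
```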
Question 5 - WRITE THROUGHPUT
You have an application that requires to write 5 items, with each item being 10 KB in size per second. What should you set the write throughput to?
- Each write unit consists of 1 KB of data. You need to write 5 items per second with each item using 10 KB of data
- 5 X 10 KB = 50 write units
- Write throughput of 50 Units
Question 6 - WRITE THROUGHPUT
You have an application that requires to write 12 items of 100 KB per item each second. What should you set the write throughput to?
- Each write unit consists of 1 KB of data. You need to write 12 items per second with each item having 100 KB of data.
- 12 X 100 KB = 1200 write units
- Write throughput of 1200 Units
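The write calculation in Questions 5 and 6 is simpler, since there is no consistency option. A minimal sketch; the function name `write_capacity_units` is made up for illustration.

```python
import math

def write_capacity_units(item_size_kb, items_per_second):
    """Write throughput: one unit per 1 KB (rounded up) per item per second."""
    return math.ceil(item_size_kb) * items_per_second

print(write_capacity_units(10, 5))    # Question 5
print(write_capacity_units(100, 12))  # Question 6
```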
Error Code
400 HTTP Status Code - ProvisionedThroughputExceededException
You exceeded your maximum allowed provisioned throughput for a table or for one or more global secondary indexes.
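A common way to handle this error is to retry the request with exponential backoff (the AWS SDKs already retry throttled requests on their own). The sketch below shows the idea in isolation: the exception class is a local stand-in rather than the SDK's error type, and `call_with_backoff` and its parameters are hypothetical.

```python
import random
import time

class ProvisionedThroughputExceeded(Exception):
    """Local stand-in for the throttling error (not the SDK's class)."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff when throttled."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ProvisionedThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Wait a random time up to base_delay * 2^attempt (jitter)
            sleep(random.uniform(0, base_delay * 2 ** attempt))

# Simulated request that is throttled twice before succeeding
attempts = {"count": 0}

def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ProvisionedThroughputExceeded()
    return "written"

print(call_with_backoff(flaky_write), "after", attempts["count"], "attempts")
```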
========================================
Using Web Identity Providers with DynamoDB
Web Identity Providers
You can authenticate users using Web Identity providers (such as Facebook, Google, Amazon or any other OpenID Connect-compatible identity provider). This is done using the AssumeRoleWithWebIdentity API.
You will need to create a role first.
The request to AssumeRoleWithWebIdentity includes:
1. Web Identity Token
2. App ID of provider
3. ARN of Role
The response returns:
a. AccessKeyID, SecretAccessKey, SessionToken (temporary security credentials)
b. Expiration (time limit)
c. AssumedRoleId
d. SubjectFromWebIdentityToken (the unique ID that appears in an IAM policy variable for this particular identity provider)
Steps taken to authenticate
1. User authenticates with the ID provider (such as Facebook)
2. They are passed a token by their ID provider
3. Your code calls the AssumeRoleWithWebIdentity API, provides the provider's token, and specifies the ARN for the IAM Role
4. The app can now access DynamoDB for between 15 minutes and 1 hour (default is 1 hour)
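Step 3 can be sketched as a helper that assembles the request parameters. The keys `RoleArn`, `RoleSessionName`, `WebIdentityToken` and `DurationSeconds` are parameter names of the STS AssumeRoleWithWebIdentity call; the helper `build_assume_role_request`, the example ARN and the token value are hypothetical, and the 15 minute to 1 hour range is the one given in these notes.

```python
def build_assume_role_request(role_arn, web_identity_token, session_name,
                              duration_seconds=3600):
    """Assemble parameters for an AssumeRoleWithWebIdentity call."""
    # Range taken from the notes: 15 minutes (900 s) to 1 hour (3600 s)
    if not 900 <= duration_seconds <= 3600:
        raise ValueError("duration must be between 900 and 3600 seconds")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": web_identity_token,
        "DurationSeconds": duration_seconds,
    }

params = build_assume_role_request(
    role_arn="arn:aws:iam::123456789012:role/ExampleDynamoDBRole",  # made-up ARN
    web_identity_token="token-from-identity-provider",              # placeholder
    session_name="web-user-session",
)
print(params)
# With boto3 installed and configured, these could be passed as:
#   boto3.client("sts").assume_role_with_web_identity(**params)
```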
========================================
Other important aspects of DynamoDB
Conditional Writes
Example: if the item price = $10, then update it to $12.
Note that conditional writes are idempotent. This means that you can send the same conditional write request multiple times, but it will have no further effect on the item after the first time DynamoDB performs the specified update. For example, suppose you issue a request to update the price of a book item by 10%, with the expectation that the price is currently $20. However, before you get a response, a network error occurs and you don't know whether your request was successful or not. Because a conditional update is an idempotent operation, you can send the same request again, and DynamoDB will update the price only if the current price is still $20.
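The retry scenario in this paragraph can be simulated in memory. This is a sketch of the idea, not the DynamoDB API: `conditional_update` is a hypothetical helper, and the exception is a local stand-in named after DynamoDB's ConditionalCheckFailedException.

```python
class ConditionalCheckFailed(Exception):
    """Local stand-in for DynamoDB's ConditionalCheckFailedException."""

def conditional_update(item, attribute, expected, new_value):
    """Set item[attribute] to new_value only if it currently equals expected."""
    if item.get(attribute) != expected:
        raise ConditionalCheckFailed(
            f"{attribute} is {item.get(attribute)!r}, expected {expected!r}")
    item[attribute] = new_value

book = {"Title": "Book 101", "Price": 20}

# First request: the price is $20, so it is raised by 10% to $22
conditional_update(book, "Price", expected=20, new_value=22)

# Same request sent again after a network error: the condition no longer
# holds, so the item is left unchanged instead of being updated twice
try:
    conditional_update(book, "Price", expected=20, new_value=22)
except ConditionalCheckFailed as err:
    print("retry had no effect:", err)

print(book)
```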
DynamoDB supports atomic counters, where you use UpdateItem operation to increment or decrement the value of an existing attribute without interfering with other write requests. (All write requests are applied in the order in which they were received.) For example, a web application might want to maintain a counter per visitor to their site. In this case, the application would need to increment this counter regardless of its current value.
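An atomic counter update might be expressed as an UpdateItem request like the one built below. The parameter keys follow the low-level UpdateItem request format, while the table name `PageVisits`, the attributes `VisitorId` and `Visits`, and the helper `build_counter_update` are made up for this sketch.

```python
def build_counter_update(visitor_id, amount=1):
    """Build UpdateItem parameters that add `amount` to a numeric attribute."""
    return {
        "TableName": "PageVisits",                 # hypothetical table
        "Key": {"VisitorId": {"S": visitor_id}},   # hypothetical key attribute
        # ADD increments the number regardless of its current value
        "UpdateExpression": "ADD Visits :inc",
        "ExpressionAttributeValues": {":inc": {"N": str(amount)}},
        "ReturnValues": "UPDATED_NEW",
    }

params = build_counter_update("visitor-42")
print(params)
# With boto3 installed and configured, these could be passed as:
#   boto3.client("dynamodb").update_item(**params)
```

Unlike a conditional write, sending this request twice increments the counter twice, so it suits counters where a slight over- or under-count is acceptable.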
If your application needs to read multiple items, you can use the BatchGetItem API. A single BatchGetItem request can retrieve up to 16 MB of data, which can contain as many as 100 items. In addition, a single BatchGetItem request can retrieve items from multiple tables.
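Because one BatchGetItem request can contain at most 100 items, a longer list of keys has to be split into several requests. A minimal sketch of that splitting; the helper `chunk_keys` and the key shape are assumptions for illustration.

```python
def chunk_keys(keys, batch_size=100):
    """Split a list of keys into batches of at most batch_size."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]

# 250 hypothetical primary keys
keys = [{"Id": {"N": str(n)}} for n in range(250)]
batches = chunk_keys(keys)
print([len(batch) for batch in batches])
```

Each batch would then go into its own BatchGetItem request.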
===============================================