You are the new IT architect in a company that operates a mobile sleep tracking application.
When activated at night, the mobile app sends collected data points of 1 kilobyte every 5 minutes to your
backend.
The backend takes care of authenticating the user and writing the data points into an Amazon DynamoDB
table.
Every morning, you scan the table to extract and aggregate last night’s data on a per user basis, and store the
results in Amazon S3.
Users are notified via Amazon SNS mobile push notifications that new data is available, which is parsed and
visualized by the mobile app. Currently you have around 100k users who are mostly based out of North
America.
You have been tasked to optimize the architecture of the backend system to lower cost. What would you
recommend? (Choose 2 answers)
A.
Create a new Amazon DynamoDB table each day and drop the one for the previous day after its data is on
Amazon S3.
B.
Have the mobile app access Amazon DynamoDB directly instead of JSON files stored on Amazon S3.
C.
Introduce an Amazon SQS queue to buffer writes to the Amazon DynamoDB table and reduce provisioned
write throughput.
D.
Introduce Amazon ElastiCache to cache reads from the Amazon DynamoDB table and reduce provisioned
read throughput.
E.
Write data directly into an Amazon Redshift cluster replacing both Amazon DynamoDB and Amazon S3.
A and C
Courtesy of Laurentiu V.
A,C
A: You store around 1.2 GB/hour (100,000 * 1 KB * 60/5). With most customers being in the US, that data arrives mostly over about 10 hours, so that’s roughly 12 GB/day. Keeping all of it in DynamoDB would be expensive, so we drop the previous day’s table once its data is stored in S3.
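The arithmetic behind A can be checked with a short sketch. The 10-hour collection window is the commenter's estimate, decimal units (1 GB = 1,000,000 KB) are assumed, and all constant names are made up for illustration.

```python
# Rough nightly data volume, using the figures from the question.
USERS = 100_000        # "around 100k users"
POINT_KB = 1           # each data point is 1 kilobyte
INTERVAL_MIN = 5       # one data point every 5 minutes
NIGHT_HOURS = 10       # assumption taken from the comment above

points_per_user_per_hour = 60 // INTERVAL_MIN              # 12 points
kb_per_hour = USERS * POINT_KB * points_per_user_per_hour  # 1,200,000 KB
kb_per_night = kb_per_hour * NIGHT_HOURS                   # 12,000,000 KB

gb_per_hour = kb_per_hour / 1_000_000    # 1.2 GB/hour
gb_per_night = kb_per_night / 1_000_000  # 12.0 GB/night

print(f"{gb_per_hour} GB/hour, {gb_per_night} GB/night")
```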
C: The second most costly factor is your write units. Using an SQS queue would split that in half (most customers being in North America).
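Why a queue can lower provisioned write throughput can be sketched as follows. Without a buffer, the table has to absorb the busiest second; with a buffer, a consumer can drain at a steady rate near the average. The traffic pattern below is invented purely for illustration (the question gives no per-second distribution), and the function names are hypothetical. It relies on one DynamoDB fact: a write of an item up to 1 KB consumes one write capacity unit.

```python
import math

def wcu_without_buffer(arrivals_per_sec):
    # Writes go straight to the table, so capacity must cover the peak second.
    return max(arrivals_per_sec)

def wcu_with_buffer(arrivals_per_sec):
    # A queue consumer drains at a steady rate; the average rate is the floor.
    return math.ceil(sum(arrivals_per_sec) / len(arrivals_per_sec))

def max_backlog(arrivals_per_sec, drain_rate):
    # Largest number of messages waiting in the queue at the end of any second.
    backlog = peak = 0
    for arrived in arrivals_per_sec:
        backlog = max(0, backlog + arrived - drain_rate)
        peak = max(peak, backlog)
    return peak

# Hypothetical 5-minute window: 100,000 writes arriving in bursts of
# 1,000 every third second (NOT measured data).
arrivals = [1000, 0, 0] * 100

peak_wcu = wcu_without_buffer(arrivals)       # 1000
steady_wcu = wcu_with_buffer(arrivals)        # 334
backlog = max_backlog(arrivals, steady_wcu)   # 666 messages at most

print(peak_wcu, steady_wcu, backlog)
```

How much is actually saved depends on how bursty the real traffic is; if writes are already evenly spread, the peak and the average are close and the queue saves little.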
C and D
From reading the whitepapers:
https://d0.awsstatic.com/whitepapers/performance-at-scale-with-amazon-elasticache.pdf
ElastiCache isn’t going to help you with DynamoDB in this scenario. You basically only read the data once before it’s stored to S3. The cache would only be useful if you read things multiple times. I would probably also go A and C in this question…
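The read-once argument can be illustrated with a toy hit-ratio calculation. This is a simplified sketch assuming an unbounded cache that is filled on first read; `cache_hit_ratio` is a hypothetical helper, not an ElastiCache API.

```python
def cache_hit_ratio(access_sequence):
    # Fraction of reads served from cache, assuming every key is
    # cached on its first read and never evicted.
    if not access_sequence:
        return 0.0
    seen = set()
    hits = 0
    for key in access_sequence:
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(access_sequence)

# Morning scan: every item is read exactly once, so nothing is ever a hit.
scan_once = list(range(1000))
print(cache_hit_ratio(scan_once))

# Contrast: the same two items read repeatedly benefit from caching.
repeated = [1, 2, 1, 2]
print(cache_hit_ratio(repeated))
```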
A and C are the right answers.
B is wrong because it doesn’t help reduce costs. You would still need to parse the files, and storing raw files in S3 is cheaper than storing them in DynamoDB.
I think A & C