Which service should you use?

You are deploying an application to collect votes for a very popular television show. Millions of users will
submit votes using mobile devices. The votes must be collected into a durable, scalable, and highly available
data store for real-time public tabulation. Which service should you use?

A. Amazon DynamoDB
B. Amazon Redshift
C. Amazon Kinesis
D. Amazon Simple Queue Service



JM

Amazon DynamoDB, a managed NoSQL database that offers extremely fast performance, seamless scalability and reliability, low cost and more.

Could be A; however, I don’t feel like an expert here.

Harsh Wardhan

C: “real-time public tabulation”

Frank

I’m pretty sure it’s C as well.

JK

I agree that C is the best answer.

One of the use cases on the Kinesis page (https://aws.amazon.com/kinesis/streams/) matches this use case in the question.

“Mobile Data Capture
You can have your mobile applications push data to Amazon Kinesis Streams from hundreds of thousands of devices, making the data available to you as soon as it is produced on the mobile devices.”

mr_tienvu

I have the same idea. A

Muhammad Soliman

I agree with C. Amazon Kinesis

From “real-time public tabulation” I infer the following: insights should be gained in minutes rather than days.

Real-time data analytics: With Amazon Kinesis Streams, you can run real-time streaming data analytics. For example, you can add clickstreams to your Amazon Kinesis stream and have your Amazon Kinesis Application run analytics in real-time, enabling you to gain insights out of your data at a scale of minutes instead of hours or days.

Also, it is accepting hundreds of thousands of inputs from Mobile devices at the same time.
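To make the collection step concrete, here is a minimal sketch of what a single vote pushed into a Kinesis stream might look like. The stream name, the JSON field names, and the choice of the voter ID as the partition key are all assumptions for illustration; only the `StreamName`, `Data`, and `PartitionKey` parameter names come from the Kinesis `PutRecord` API.

```python
import json


def build_vote_record(stream_name: str, voter_id: str, candidate: str) -> dict:
    """Build keyword arguments for a Kinesis PutRecord call carrying one vote."""
    payload = {"voter_id": voter_id, "candidate": candidate}  # hypothetical schema
    return {
        "StreamName": stream_name,
        # Kinesis treats the payload as an opaque blob, so serialize it to bytes.
        "Data": json.dumps(payload).encode("utf-8"),
        # Keying on the voter ID (an assumption) spreads votes across shards.
        "PartitionKey": voter_id,
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("kinesis").put_record(**build_vote_record("votes", "user-123", "alice"))`, where "votes" is a made-up stream name.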

Sanjeev

Found the answer on another website. I think this clarifies why Kinesis is not the right answer.

Redshift is for data warehousing, not real-time data.
Kinesis keeps data for 7 days maximum, which IMO does not fit the use case.
SQS is for, well, queueing items; it’s not a data store.

Bones Cisco

Amazon Kinesis Streams
Amazon Kinesis Streams enables you to build custom applications that process or analyze streaming data for specialized needs. Amazon Kinesis Streams can continuously capture and store terabytes of data per hour from hundreds of thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs, and location-tracking events. With Amazon Kinesis Client Library (KCL), you can build Amazon Kinesis Applications and use streaming data to power real-time dashboards, generate alerts, implement dynamic pricing and advertising, and more. You can also emit data from Amazon Kinesis Streams to other AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon Elastic Map Reduce (Amazon EMR), and AWS Lambda.

Rj

“The votes must be collected into a durable, scalable, and highly available data store for real-time public tabulation. Which service should you use?”

“Data store for real-time public tabulation” is the part you have to watch out for.

It’s DynamoDB, as Kinesis isn’t a data store.

The right answer is DynamoDB.

Vishnu Konatham

I will go with A.

Wasil

Amazon Kinesis Streams manages the infrastructure, storage, networking, and configuration needed to stream your data at the level of your data throughput. You do not have to worry about provisioning, deployment, ongoing-maintenance of hardware, software, or other services for your data streams. In addition, Amazon Kinesis Streams synchronously replicates data across three facilities in an AWS Region, providing high availability and data durability.

So, I go with C.

Ref: https://aws.amazon.com/kinesis/streams/faqs/

vladam

The main function of DynamoDB is to store data, whereas the main function of Kinesis is to analyze data in real time. The requirement in the question is to find an AWS service that provides a highly available “data store”. Also, Kinesis keeps data for 7 days maximum, which does not fit the use case: https://aws.amazon.com/kinesis/streams/faqs/

In my opinion C is the right answer.

vladam

Apologies, A is the right answer.

kirrim

A, DynamoDB.

While it’s very likely that you would want to use something like Kinesis to collect the votes from the mobile devices (this is a great use case for Kinesis), you still need a data store to drop the votes into and from which you can tabulate the votes. And the question is specifically asking about the data store, not about the means of collecting the incoming votes from the mobile devices.

Possible candidate services for the data store could be RDS, DynamoDB, RedShift, or ElastiCache. (The question asks about which service you would want to use for the data store, which eliminates running a distributed database or cache on EC2 instances.)

RDS is not a good fit because these would be incoming writes from the Kinesis stream. RDS is not very scalable from a write standpoint (other than building a much larger instance, which only gets you so far with a scenario of this scale), nor is it as low-latency as DynamoDB, and you need low-latency for “real-time” tabulation.

DynamoDB is a great option here, since it fits the requirements for durability, scalability, and high availability, and can operate with low-latency to meet the “real-time” tabulation requirement. It can easily scale out to handle the large amount of incoming writes to increment the vote values for candidate names in a table as you pull data out of the Kinesis stream.
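As a rough sketch of the "increment the vote values" step described above, the snippet below collapses a batch of votes (for example, one batch pulled from the stream) into a single atomic `ADD` update per candidate. The key name `candidate`, the attribute name `vote_count`, and the helper names are assumptions for illustration, not anything stated in the question. `table` is expected to behave like a boto3 DynamoDB `Table` resource.

```python
from collections import Counter


def build_vote_update(candidate: str, votes: int = 1) -> dict:
    """Build keyword arguments for an UpdateItem call that atomically adds
    `votes` to a candidate's running total."""
    return {
        "Key": {"candidate": candidate},          # hypothetical partition key name
        "UpdateExpression": "ADD vote_count :n",  # ADD starts from 0 if the attribute is missing
        "ExpressionAttributeValues": {":n": votes},
    }


def tally_batch(table, candidates) -> Counter:
    """Count the votes in a batch, then issue one update per distinct candidate."""
    counts = Counter(candidates)
    for candidate, votes in counts.items():
        table.update_item(**build_vote_update(candidate, votes))
    return counts
```

Aggregating before writing means a batch of thousands of votes for a dozen candidates turns into a dozen writes, which keeps the table small and the write rate modest. With boto3, `table` could be `boto3.resource("dynamodb").Table("Votes")`, where "Votes" is a made-up table name.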

RedShift is probably not an ideal fit, since you’re not looking to do complex analytics across many records or tables taking up terabytes/petabytes of data here; you’re looking to just take incoming votes and increment the vote values for your candidate names as they come in. A few records in a single table should suffice to capture the max dozen or so possible TV show candidates voters would be choosing from. (I wouldn’t want to watch a TV show with so many candidates that you would need millions of records to track all of their names!) So while Kinesis Firehose ingesting to RedShift is an option here, it’s not as ideal an option as using DynamoDB for the data store.

ElastiCache would definitely be low latency, and could even do the vote tabulation for you in real time upon data insertion into the cache if you used Redis with sorted sets. And selecting Redis would also get you the durability option with backups to S3, which you couldn’t get with Memcached. Really the only knock on ElastiCache running Redis for this scenario would be that you can’t scale it horizontally for writes with sorted sets, which eliminates it as an option for something of this scale. ElastiCache with Memcached might meet the scalability requirements because you could shard it horizontally for writes, but Memcached doesn’t have the option to back up its data to S3 like Redis does, so it wouldn’t meet the durability requirement. If you lose a Memcached node, the data on it is gone.

So for the possible data stores, RDS, RedShift, and ElastiCache are out. That just leaves DynamoDB. Which is a great candidate, and is one of the possible answers.

Paul

It’s C

A

Millions of votes and real-time data tabulation are the requirements, so it’s Kinesis. See the FAQ: https://aws.amazon.com/kinesis/streams/faqs/

“Amazon Kinesis Streams synchronously replicates data across three facilities in an AWS Region, providing high availability and data durability”

And building analysis / dashboarding in real time is one of the AWS suggested use cases

The other reason not to go with DynamoDB is that you would be getting millions of writes in a short time frame, meaning you’d need a lot of Write Capacity Units, which is expensive. You could throttle the writes, but then you wouldn’t have real-time analysis.
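To put a number on the Write Capacity Unit concern, here is a back-of-the-envelope sketch. It relies on the provisioned-capacity rule that one WCU covers one standard write per second for an item up to 1 KB, with larger items rounded up to the next KB. The vote rate and item size in the usage note are made-up inputs, not figures from the question.

```python
import math


def required_wcus(writes_per_second: int, item_size_bytes: int) -> int:
    """Estimate provisioned WCUs for standard (non-transactional) writes."""
    # Each write consumes one WCU per 1 KB of item size, rounded up, minimum 1.
    kb_per_item = max(1, math.ceil(item_size_bytes / 1024))
    return writes_per_second * kb_per_item
```

For example, `required_wcus(50_000, 200)` gives 50,000 WCUs for a hypothetical peak of 50,000 one-row-per-vote writes per second. If votes are aggregated before they are written, the write rate, and therefore the WCU count, drops accordingly.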

Rekha

Amazon Kinesis for real-time tabulation

Halloween

I’m sure it’s C, but the question should be worded a little differently.
It should say “which service should be used for collection”.

The way it’s worded, it makes us think “which service should be used for storage”.

Parmod Kumar

DynamoDB is NoSQL, so no table. So no “real-time public tabulation”. I would go with C.