Your company stores millions of sensitive transactions across thousands of 100-GB files that must be
encrypted in transit and at rest. Analysts concurrently work on subsets of these files, which can consume up to 5
TB of space, to generate simulations used to steer business decisions. You are required to design
an AWS solution that cost-effectively accommodates both the long-term storage and the in-flight subsets of data.
A.
Use Amazon Simple Storage Service (S3) with server-side encryption, and run simulations on subsets in
ephemeral drives on Amazon EC2.
B.
Use Amazon S3 with server-side encryption, and run simulations on subsets in-memory on Amazon EC2.
C.
Use HDFS on Amazon EMR, and run simulations on subsets in ephemeral drives on Amazon EC2.
D.
Use HDFS on Amazon Elastic MapReduce (EMR), and run simulations on subsets in-memory on Amazon
Elastic Compute Cloud (EC2).
E.
Store the full data set in encrypted Amazon Elastic Block Store (EBS) volumes, and regularly capture
snapshots that can be cloned to EC2 workstations.
In-memory simulations? With data that can reach 5 TB? I’d go for C instead.
For long-term storage, I think S3 with SSE is suitable.
I agree – A is the best option: it gives at-rest and in-transit encryption for the main datastore, and ephemeral (instance store) drives on EC2 can hold the 5 TB working subsets.
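For what it’s worth, requesting SSE on the S3 side is a one-flag change per upload; here is a minimal sketch using the AWS CLI, with a hypothetical bucket name (transfers over the CLI use HTTPS, covering encryption in transit):

```shell
# Upload one 100-GB transaction file with S3-managed server-side encryption (SSE-S3).
# "example-transactions-archive" is a placeholder bucket name.
aws s3 cp part-0001.dat s3://example-transactions-archive/transactions/part-0001.dat \
    --sse AES256

# Copy a subset back down to an EC2 instance's ephemeral drive for simulation work.
aws s3 cp s3://example-transactions-archive/transactions/ /mnt/ephemeral/subset/ \
    --recursive --exclude "*" --include "part-000*.dat"
```

The `--sse AES256` flag asks S3 to encrypt the object at rest with S3-managed keys; a bucket default encryption policy would achieve the same without the per-command flag.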