Which of these is the best system architectures for thi…

You work for a company that automatically tags photographs using artificial neural networks (ANNs), which run on GPUs
using C++. You receive millions of images at a time, but only 3 times per day on average. The customer loads these images
in a batch into an AWS S3 bucket you control, and then publishes a JSON-formatted manifest into another S3 bucket that
you also control. Each image takes 10 milliseconds to process using a full GPU. Your neural network
software requires 5 minutes to bootstrap. Image tags are JSON objects, and you must publish them to an S3 bucket.
Which of these is the best system architecture for this system?

A.
Create an OpsWorks Stack with two Layers. The first contains lifecycle scripts for launching and bootstrapping an HTTP API on G2
instances for ANN image processing, and the second has an always-on instance which monitors the S3 manifest bucket for new files.
When a new file is detected, request instances to boot on the ANN layer. When the instances are booted and the HTTP APIs are up,
submit processing requests to individual instances.

B.
Make an S3 notification configuration which publishes to AWS Lambda on the manifest bucket. Make the Lambda create a
CloudFormation Stack which contains the logic to construct an autoscaling worker tier of EC2 G2 instances with the ANN code on
each instance. Create an SQS queue of the images in the manifest. Tear the stack down when the queue is empty.

C.
Deploy your ANN code to AWS Lambda as a bundled binary for the C++ extension. Make an S3 notification configuration on the
manifest, which publishes to another AWS Lambda running controller code. This controller code publishes all the images in the
manifest to AWS Kinesis. Your ANN code Lambda Function uses the Kinesis as an Event Source. The system automatically scales
when the stream contains image events.

D.
Create an Auto Scaling, Load Balanced Elastic Beanstalk worker tier Application and Environment. Deploy the ANN code to G2
instances in this tier. Set the desired capacity to 1. Make the code periodically check S3 for new manifests. When a new manifest is
detected, push all of the images in the manifest into the SQS queue associated with the Elastic Beanstalk worker tier.

Explanation:
The Elastic Beanstalk option (D) is incorrect because it requires a constantly polling instance, which may break and costs
money even when no batch is waiting. The Lambda fleet option (C) is incorrect because AWS Lambda does not support GPU usage.
The OpsWorks stack option (A) requires a constantly polling instance and also requires complex timing and capacity planning
logic. The CloudFormation option (B) requires no polling, has no always-on instances, and allows arbitrarily fast processing
by simply setting the instance count as high as needed.
http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
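The "instance count as high as needed" point can be made concrete with some rough arithmetic. The sketch below uses the 10 ms per image and 5-minute bootstrap figures from the question; the batch sizes and the 30-minute deadline are assumptions chosen only for illustration, and `instances_needed` is a hypothetical helper, not part of any AWS API.

```python
import math


def instances_needed(num_images, ms_per_image=10, bootstrap_min=5, deadline_min=30):
    """Estimate how many single-GPU instances finish a batch before the deadline.

    ms_per_image and bootstrap_min come from the question; deadline_min is an
    assumed target, since the question does not state one.
    """
    work_seconds = num_images * ms_per_image / 1000
    # Every instance spends the bootstrap period doing no useful work.
    usable_seconds = (deadline_min - bootstrap_min) * 60
    if usable_seconds <= 0:
        raise ValueError("deadline must be longer than the bootstrap time")
    return max(1, math.ceil(work_seconds / usable_seconds))


# One million images is 10,000 GPU-seconds, or a little under 3 GPU-hours.
gpu_hours = 1_000_000 * 10 / 1000 / 3600
print(f"GPU-hours for 1M images: {gpu_hours:.2f}")
print("Instances for 1M images in 30 min:", instances_needed(1_000_000))
print("Instances for 5M images in 30 min:", instances_needed(5_000_000))
```

With roughly three batches a day, a fleet sized this way would sit idle most of the time if left running, which is the cost argument against the always-on options.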



BT

B is wrong. I imagine the flow of B as:
Put files to S3 => Notify Lambda => Create Stack => Run EC2 G2 instances (this step takes at least 5 minutes to start, plus 10 ms/image) => Lambda tears down Stack.
Wrong because:
– Maximum execution time of Lambda is 5 minutes
– Lambda will trigger the creation of a new stack for every new file on S3
C is wrong because the maximum execution time of Lambda is 5 minutes, and because of the C++ and GPU requirements
D is wrong because of Elastic Beanstalk
A seems the best in this question

bcw

I read B as saying the manifest bucket triggers the Lambda function, and the manifest is published only once.
I agree the Lambda session will only survive 5 minutes, but the teardown could be initiated by logic in the EC2 instances in the Stack, which check the queue anyway: each must request work items from the queue as part of its function.
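
A minimal sketch of that worker-side logic, assuming the teardown decision lives in the instances rather than in the Lambda. The `receive`, `process`, and `teardown` callables are hypothetical stand-ins for the SQS receive call, the ANN tagging step, and the stack deletion request; an in-memory deque replaces the real queue so the example runs on its own.

```python
from collections import deque


def drain_queue(receive, process, teardown, max_empty_polls=3):
    """Process work items until the queue looks empty, then request teardown.

    The queue is treated as empty only after max_empty_polls consecutive
    polls return nothing, so one slow poll does not end the run early.
    """
    processed = 0
    empty_polls = 0
    while empty_polls < max_empty_polls:
        item = receive()
        if item is None:
            empty_polls += 1
            continue
        empty_polls = 0
        process(item)
        processed += 1
    teardown()
    return processed


# In-memory stand-ins (assumptions for illustration only).
queue = deque(["img-001.jpg", "img-002.jpg", "img-003.jpg"])
tags = []
torn_down = []

count = drain_queue(
    receive=lambda: queue.popleft() if queue else None,
    process=lambda key: tags.append({"image": key, "tags": []}),
    teardown=lambda: torn_down.append(True),
)
print(count, "images processed; teardown requested:", bool(torn_down))
```

With several workers, one instance seeing an empty queue does not prove the others have finished, so a real version would also need to account for messages still in flight before deleting the stack.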