Which naming scheme would give optimal performance on S3?

Brian Smith

Probably D

Sandeep

I agree with D.

Thousands of Instance IDs + Hourly logs seems like the most random sequence option.

Abdul

Yes, you are correct. Your explanation is right.

seenagape

I choose C

Vijay

I think B is the correct choice

Martin

The answer should be B. See http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html

Balaji

B looks correct to me. See:

http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html

zz

B

venkat sai

Yes, B is the right option. The main reason is its random prefix, which should give higher performance.

A – Doesn’t make sense.
C – Starts with YYYY, which is the same for every key, so good performance would be hard to achieve.
D & E – The instance ID always starts with the same two characters (i-).
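
The prefix argument can be checked with a rough sketch. The exact option strings are not quoted in this thread, so the two key layouts below are assumptions, and the instance IDs are made-up stand-ins. The sketch only counts how many distinct leading substrings appear among the keys written in a single hour; it does not model how S3 actually partitions keys.

```python
import hashlib


def fake_instance_id(n: int) -> str:
    """Hypothetical stand-in for an EC2 instance ID such as i-0a1b2c3d."""
    return "i-" + hashlib.sha1(str(n).encode("utf-8")).hexdigest()[:8]


def hour_first_key(instance_id: str, hh="13", dd="07", mm="03", yyyy="2015") -> str:
    # Assumed layout: HH-DD-MM-YYYY-log_instanceID
    return f"{hh}-{dd}-{mm}-{yyyy}-log_{instance_id}"


def instance_first_key(instance_id: str, hh="13", dd="07", mm="03", yyyy="2015") -> str:
    # Assumed layout: instanceID_log-HH-DD-MM-YYYY
    return f"{instance_id}_log-{hh}-{dd}-{mm}-{yyyy}"


def distinct_prefixes(keys, length: int) -> int:
    """Count the distinct leading substrings of the given length."""
    return len({key[:length] for key in keys})


# One hour of logs from 1000 hypothetical instances.
ids = [fake_instance_id(n) for n in range(1000)]
hour_first = [hour_first_key(i) for i in ids]
instance_first = [instance_first_key(i) for i in ids]

for length in (2, 4, 6):
    print(
        f"first {length} chars:",
        "hour-first =", distinct_prefixes(hour_first, length),
        "| instance-first =", distinct_prefixes(instance_first, length),
    )
```

Within one hour, every hour-first key shares the same date and time prefix, while the instance-first keys all share "i-" and only start to differ from the third character on.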

Support

Agree!

Dev

B

Ashish Chaturvedi

D

Niranjana HK

D

Ankit Shah

D

Max

D. Thousands of keys sharing the same “HH-” prefix within one hour does not look like a performance-optimized case.

Duck Bro

D
Even if the first couple of characters are “i-”, the first 3-4 characters provide a more random prefix than HH-DD.

BDA

D. The random hostname prevents hammering a specific partition, and the HH-DD following the hostname is more random than in E.

B will hammer a partition once per day at HH-DD.

A changes the I/O pattern, so it does not apply.

C is just as bad as A.

E is almost as good as D, but YYYY will not be as random as in D.

Ryan

D is the answer.
A,B,C are all sequential.
E is less random than D.

joe

C

basant

d

VK

C is still sequential. The answer is D.

sam

D

@dynadml

I think the answer is C, because you will most likely search for logs by date and time across instances, but the word “log” should be at the end.

dickloveqdd

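The correct answer is B. See the S3 performance optimization chapter: C, D, and E are all the counterexamples given in the original text. Choose B, one hundred percent.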

certified

Anyone who understands how S3 stores data knows that B is the option if you want performance. The key thing to remember here is that the more random or changing you can make the prefix, the more distributed your objects will be across the stack.
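
A commonly described way to get a more random prefix is to prepend a few characters of a hash of the key itself. The sketch below illustrates that idea only; the function name, the 4-character length, and the example key are assumptions and are not taken from the question's options.

```python
import hashlib


def with_hash_prefix(key: str, length: int = 4) -> str:
    """Prepend the first `length` hex characters of the key's MD5 digest.

    Hypothetical helper: the hash is used only to spread key names,
    not for security.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{key}"


# Example with a made-up log key name.
print(with_hash_prefix("2015-03-07-13-log_i-0a1b2c3d"))
```

Because the prefix is derived from the key, the same key always maps to the same name, but listing objects by date then needs one request per hash prefix.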

CrazzyFrog

I guess D is correct

PowerCram

NONE of these answers is correct. To partition data stored on S3, the key needs to use one or more slashes (/), so the best way in this scenario would be to use instanceID_log/YYYY/MM/DD/HH (the order of YYYY, MM, DD, HH essentially doesn’t matter). This would cause the log file from each instance to be written to a different S3 partition, because the instance IDs are unique and would therefore be an effective hash key.

The way the keys (i.e., file names) are written above, they would all be written to the same partition in S3, no matter how the names are jumbled. Effectively, there is no difference (performance-wise) among the listed options.

(Had to repost because “instanceID” isn’t displayed.)
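
For illustration, the slash-separated layout proposed above could be built as follows. This is a sketch only: the function name and sample values are made up, and whether slashes affect S3 partitioning is the claim made in this comment, not something the code demonstrates.

```python
from datetime import datetime


def slash_key(instance_id: str, ts: datetime) -> str:
    """Build a key of the form instanceID_log/YYYY/MM/DD/HH (hypothetical helper)."""
    return f"{instance_id}_log/{ts:%Y/%m/%d/%H}"


# Example with a made-up instance ID and timestamp.
print(slash_key("i-0a1b2c3d", datetime(2015, 3, 7, 13)))
```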
