So many of the web services we build and use today store and deliver their data using AWS S3. When we first began designing Takipi, we decided to test upload/download speeds between AWS regions to get a feel for how much performance varies when transmitting data between EC2 instances and S3 buckets in different regions. We were particularly interested in the penalty physical distance imposes on performance.
We set up a test using EC2 instances and S3 buckets in all AWS regions currently available: three in the U.S. (Oregon/California/Virginia), one in South America (Brazil), one in Europe (Ireland), two in Asia (Singapore/Japan) and one in Australia.
We also wanted to check how much benefit we would get from uploading data from an EC2 instance to an S3 bucket in the same region.
Below are the results of the first run we did in January 2012 when we just began developing and the current results from March 2013. A fun thing we did around this test was have everyone at the office write down their guesses. Tal won in January and I won this time around (no cheating — promise!).
In the following charts you can see our average results for the 10MB upload test (numbers are seconds).
The X-axis shows the destination S3 bucket and the Y-axis shows the originating EC2 region.
Highlights (the good, the bad and the ugly)
- As expected, the best upload time was achieved when the EC2 instance and S3 bucket shared the same location.
- When an EC2 instance and S3 bucket were in the same location, the fastest upload time was achieved in Oregon and the slowest in Virginia.
- California’s combined upload time was the best.
- Singapore’s combined upload time was the worst.
- The slowest upload speed was when uploading from Australia to Brazil.
- Uploading from Singapore to Ireland was 4X slower than from Japan to Ireland.
What has changed since 2012?
- Singapore’s combined upload time was the best, a surprising finding!
- Europe’s combined upload time was the slowest.
- When the EC2 instance and S3 bucket were in the same location, the best upload time was achieved in Singapore and the slowest in California.
- California struggled much more with traffic to Brazil than the other U.S. regions.
- The slowest upload speed was from Ireland to Japan.
The Java code used for the test is available on GitHub: https://github.com/takipi/aws-s3-speed. Fork, clone, review or simply run the experiment yourself. Let us know if you experienced different results.
Takipi is cross-platform and is written in Java and C++. We first tested some C++ vs. Java networking. We ended up dropping the C++ communication modules altogether and went on implementing our HTTP communication only in Java. Some of the reasons for dropping C++ were cross-platform issues, ease of debugging, reconnection/timeout handling and security.
Diving into Java, we had two approaches to choose from. The first was using the open source AWS Java SDK to deal with everything (it uses Apache HTTP components internally). The second was signing URLs and uploading to them using plain old Java HTTP(S) URLConnection. We soon found that the two methods didn’t differ significantly in their results, so for the rest of the experiment and this post we’ll be talking about the SDK code. Mark Rasmussen has a great post about pushing S3 upload speeds to the maximum.
The code creates/deletes buckets in all regions (using prefixes). Once the buckets are set, the test code uploads a few chunks (10MB, 100MB, etc.) to all regions over several rounds and averages the time each upload took from the initial call to the end.
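For readers curious about the second approach, here is a minimal sketch of what a timed upload to an already signed URL might look like with plain HttpURLConnection. It is not the code from the repository linked above: the class name PresignedPutTimer, the method timedPut, the timeouts and the content type are all assumptions made for illustration, and the URL is assumed to have been signed elsewhere for a PUT with a matching content type.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, for illustration only.
class PresignedPutTimer {

    // Sends the payload with HTTP PUT and returns the elapsed wall-clock time
    // in milliseconds, measured from the initial call until the response arrives.
    // Assumption: presignedUrl was signed elsewhere and permits this PUT.
    static long timedPut(URL presignedUrl, byte[] payload) throws IOException {
        long start = System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) presignedUrl.openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            // Assumed content type; it may need to match what was used for signing.
            conn.setRequestProperty("Content-Type", "application/octet-stream");
            // Stream the body with a known length instead of buffering it first.
            conn.setFixedLengthStreamingMode(payload.length);
            conn.setConnectTimeout(10_000);  // assumed timeout values
            conn.setReadTimeout(60_000);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            int status = conn.getResponseCode(); // blocks until the server answers
            if (status / 100 != 2) {
                throw new IOException("Upload failed with HTTP " + status);
            }
        } finally {
            conn.disconnect();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Calling timedPut(url, new byte[10 * 1024 * 1024]) would time a single 10MB upload, assuming url is a valid signed URL for the target bucket.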
The code shuffles the regions at each round, and never sends chunks concurrently. We went for a dozen rounds, removing best and worst scores before averaging. This left us with 10 valid scores to average, which seems to us like a fair number.
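The round logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the GitHub repository: the class RegionBenchmark, its methods and the timedUpload callback (which stands in for whatever actually uploads a chunk to a region and returns the elapsed milliseconds) are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch of the measurement loop, for illustration only.
class RegionBenchmark {

    // Drops the single best and single worst score, then averages the rest.
    static double trimmedAverage(List<Long> scores) {
        if (scores.size() < 3) {
            throw new IllegalArgumentException("need at least 3 scores");
        }
        List<Long> sorted = new ArrayList<>(scores);
        Collections.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.size() - 1; i++) {
            sum += sorted.get(i);
        }
        return (double) sum / (sorted.size() - 2);
    }

    // Runs the given number of rounds. Each round visits the regions in a
    // freshly shuffled order and uploads to them one at a time, never concurrently.
    static Map<String, Double> run(List<String> regions, int rounds,
                                   ToLongFunction<String> timedUpload) {
        Map<String, List<Long>> samples = new LinkedHashMap<>();
        for (String region : regions) {
            samples.put(region, new ArrayList<>());
        }
        List<String> order = new ArrayList<>(regions);
        for (int round = 0; round < rounds; round++) {
            Collections.shuffle(order);
            for (String region : order) {
                samples.get(region).add(timedUpload.applyAsLong(region));
            }
        }
        Map<String, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> entry : samples.entrySet()) {
            averages.put(entry.getKey(), trimmedAverage(entry.getValue()));
        }
        return averages;
    }
}
```

With 12 rounds, trimmedAverage is left with 10 scores per region, which matches the setup described above. Shuffling the order each round keeps any one region from always being measured first or last.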
- Europe was pretty slow last year. This year it showed the same speeds as the U.S., which is a good thing for both users and developers.
- We’ve learned that the newest region, Sydney, Australia, still has a lot of catching up to do.
- We’ve confirmed our original assumption that staying inside a region is hands down the best way to guarantee maximum performance.
Let me know if you run the test and experience different results or have any questions. I’ll be happy to hear your feedback.
P.S. Takipi has a new Twitter account. Follow us: @takipid