FiftyThree’s Load Testing: An Unexpected Journey
FiftyThree creates Paper using data stored in a high-availability Neo4j cluster
FiftyThree, creators of the award-winning iPad app Paper, recently started blogging about the engineering process behind the app. Aseem Kishore, frequent speaker at GraphConnect conferences and developer at FiftyThree, co-wrote a blog post with fellow FiftyThree developer Dave Stern on developing the Made With Paper stream in the app. Through fine-tuning their Neo4j configuration, they were able to achieve close to 2000 requests per second across the cluster.
Watch Aseem present "Betting the Company on a Graph Database" at GraphConnect Boston 2013.
Blog post below written by Aseem Kishore and Dave Stern, both developers at FiftyThree
At FiftyThree, we love to inspire. Currently we do this by featuring the great work that our community creates on platforms like Facebook, Twitter and Tumblr. But we wanted to bring inspiration even closer to the creative process, so we built a Made With Paper stream directly into Paper.
After working in stealth mode for several months to build and test the stream, we still had to ensure that it could scale. This is the story of our load testing adventure, through the peaks and valleys of endless graphs, to the stream of new creations that you see in Paper every day.
When we began, we had no idea how much traffic we would get. So we did what any experienced engineering team would do when faced with potentially huge load and millions of dollars on the line: we guessed.
Based on the total number of Paper users, downloads per month, average time spent in the app, and our API polling frequency, we estimated that we could face peak traffic of around 1000 requests per second, and meltdown traffic of over 5000 requests per second. These numbers became our goal.
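The kind of back-of-the-envelope arithmetic behind such a guess can be sketched as follows. This is only an illustration: the function name, the model (average concurrent users scaled by a peak factor, each polling once per interval) and every input value are assumptions made for the example, not the figures or formula FiftyThree actually used.

```python
def estimate_peak_rps(daily_active_users, avg_session_seconds,
                      peak_to_average_ratio, poll_interval_seconds):
    """Rough peak request rate for a polling API (hypothetical model)."""
    seconds_per_day = 24 * 60 * 60
    # Average number of users with the app open at any given moment.
    avg_concurrent = daily_active_users * avg_session_seconds / seconds_per_day
    # Traffic is not uniform, so scale the average up to an assumed peak.
    peak_concurrent = avg_concurrent * peak_to_average_ratio
    # Each concurrent user issues one request per polling interval.
    return peak_concurrent / poll_interval_seconds


# Hypothetical inputs, chosen only to show the arithmetic.
estimate = estimate_peak_rps(
    daily_active_users=864_000,
    avg_session_seconds=300,
    peak_to_average_ratio=3,
    poll_interval_seconds=10,
)
print(f"Estimated peak: {estimate:.0f} requests/sec")
```

Multiplying such an estimate by a generous safety factor is one way to arrive at a separate "meltdown" target.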
Fortunately, our initial stream required a minimal, read-only API that serves a single, infrequently changing set of data. This simplified things for us considerably, but we still had our work cut out for us.
Smooth Start: The Database and Load Balancer
We began by testing our database to establish baseline performance for our foundational layer. Using open-source tools like siege and httperf, we initially recorded dismal numbers: under 200 reqs/sec. After working with the Neo4j support team to fine-tune our Neo4j configuration, however, we were able to achieve close to 2000 reqs/sec across the cluster.
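For readers who want a feel for what such a baseline measurement involves, here is a minimal sketch of a throughput probe. It is not a substitute for siege or httperf, and it is not the setup we used: the helper names `measure_throughput` and `fetch` are hypothetical, and the URL in the usage comment is an assumption (a Neo4j server listening on its default HTTP port on localhost).

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(request_fn, total_requests, concurrency):
    """Call request_fn total_requests times from `concurrency` threads
    and return the observed rate in requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() waits for every call and re-raises any exception.
        list(pool.map(lambda _: request_fn(), range(total_requests)))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


def fetch(url):
    """Issue one GET request and read the full response body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        response.read()


# Example usage (assumed URL; adjust to the endpoint under test):
# rps = measure_throughput(lambda: fetch("http://localhost:7474/"), 1000, 50)
# print(f"{rps:.0f} reqs/sec")
```

Because the client itself can become the bottleneck, a probe like this is best treated as a sanity check; dedicated tools give more trustworthy numbers at high request rates.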
We were encouraged by a quick but important discovery: for our type of traffic, high-CPU instances were much more effective than high-memory ones. Our database fits into memory and will continue to for quite a while, even at scale. Neo4j runs in a JVM with a memory size fixed at service start, so we had initially maximized the memory settings to use all available RAM. Our experiments showed, however, that we could get the same effect with less memory and more CPU.
Once we knew the baseline numbers for requests directly to our database, we tested with HAProxy in front and saw no performance loss. Since we knew our caching layer would only improve our ability to handle traffic, we were satisfied with our core components.
Special thanks to Neo4j engineer Max De Marzi for lending us his expertise.