How Nouvola Helped Us Engineer A System That Scaled To 3,000 Req/s In Less Than Two Weeks
Posted by alex on June 21, 2017 12:30am
This is an account of how Nouvola helped scale Validated’s infrastructure without ambiguity and with the confidence that it would perform at peak efficiency during a nationwide U.S. rollout that coincided with Validated’s appearance on Shark Tank.
According to past Shark Tank survivors, it looked like we could expect up to 130,000 requests per minute on the corporate web page and roughly 1,500 app downloads per minute. With less than five weeks to move our infrastructure and fortify it to scale, we were in a bit of a time crunch. Furthermore, given the high stakes of our national rollout, we wanted to do more than withstand the surge in traffic; we wanted to capitalize on it.
- Limited time: Five weeks’ notice that Validated would be featured on ABC’s Shark Tank (S8 E21).
- Inestimable surge: The Shark Tank episode was scheduled to coincide with our nationwide rollout to ten additional cities across the US.
- Infrastructure behavior: Common shared hosting is not adequate for major national TV spikes, and with such high stakes we could not risk being shut down. If your app is slow, nobody cares what it does.
The Solution: Nouvola
Performance Testing & Optimization
We decided to host our infrastructure ourselves and use a PaaS on our own AWS cloud so that we would be in complete control of scaling it, leveraging AWS EC2 and RDS services. On top of untold improvements and optimizations to the app and backend, we updated our app with social login integration and security hardening, and honed our website to capture as many leads as possible with a sweepstakes, live chat, and B2B dashboard. And while the infrastructure move was successful, there was no guarantee it could operate at scale.
We needed to make absolutely sure our infrastructure and software would withstand the traffic for both our products (web & app). But how could we simulate the anticipated loads on our infrastructure quickly and efficiently?
Nouvola offers real-world performance & load testing for web, mobile and API. We chose Nouvola because we were impressed by its turnkey simplicity and its ability to run large loads seamlessly, model realistic scenarios, and provide a dynamic, comprehensive dashboard. With Nouvola, we were able to identify otherwise unforeseen cliffs, tipping points, and bottlenecks, and fix them right away.
Through Nouvola’s continuous deployment and testing, we were able to assess the impact of every change and optimize our capacity accordingly. For example, WordPress opens a new DB connection every time a user visits the page, which hurts page load times and strains resources. To fix this, we turned to caching and, thanks to Nouvola, were able to confirm it as a viable solution in real time. (Nouvola saves your tests so you can conveniently rerun them.)
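The caching idea can be sketched in a few lines of Python. This is a generic illustration only, not WordPress's own mechanism or the caching layer we deployed; `TTLCache`, `render_page`, and the 60-second lifetime are hypothetical names and values picked for the example.

```python
import time


class TTLCache:
    """Keep rendered pages in memory so repeat visits skip the database."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be exercised
        self._entries = {}          # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit: no database work
        value = compute()           # cache miss: do the expensive work once
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


db_connections_opened = 0


def render_page(path):
    """Hypothetical stand-in for a page render that opens a DB connection."""
    global db_connections_opened
    db_connections_opened += 1
    return f"<html>content for {path}</html>"


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):               # 1,000 visits to the same page
    cache.get_or_compute("/", lambda: render_page("/"))

print(db_connections_opened)
```

In this sketch a thousand visits to the same page trigger a single render, which is the effect a load test lets you confirm under realistic traffic rather than assume.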
Capacity Planning & Lessons Learned
It pays to be prepared, but you need credible data to make decisions, and Google Analytics is not enough. By the same token, you don’t want to over-provision or spend money on things you don’t need.
Because we could continuously test and vary specific parameters, Nouvola allowed us to home in on the exact problem and identify the needed solution. Just three of the countless invaluable lessons learned include:
Test duration matters.
Thanks to a cascading effect, problems compound over time. So while everything might be fine after 3-5 minutes, the system could fail at 10-15 minutes under load. In our case, things started failing seven minutes into the test. Nouvola enabled us to foresee such cliffs and preclude resulting problems.
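To make the point about duration concrete, here is a small Python sketch that scans per-minute error rates for the first minute that crosses a threshold. The function name, the 1% threshold, and the sample numbers are all hypothetical; they are not our real test data, only shaped to resemble a failure that begins around minute seven.

```python
def first_failing_minute(error_rates, threshold=0.01):
    """Return the 1-based minute at which the error rate first exceeds
    the threshold, or None if it never does."""
    for minute, rate in enumerate(error_rates, start=1):
        if rate > threshold:
            return minute
    return None


# Hypothetical per-minute error rates for a 10-minute run (not real data).
error_rates = [0.0, 0.0, 0.0, 0.0, 0.0, 0.002, 0.04, 0.25, 0.6, 0.9]

print(first_failing_minute(error_rates[:5]))   # a 5-minute test: None
print(first_failing_minute(error_rates))       # the full run: 7
```

A test that stops at five minutes reports a clean bill of health, while the same system run for ten minutes shows exactly where it falls over.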
Don’t overlook database index and sizing.
When the AWS RDS instance’s CPU started to choke, DB connections spiked and the system became unresponsive. In our case, we knew our queries were already optimized, so it came down to increasing the DB instance size.
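One way to see why connections spike when the database CPU saturates is Little's law: the average number of connections in use equals the query rate multiplied by the average time each query holds a connection. The Python sketch below uses a hypothetical query rate and hypothetical latencies, not measurements from our RDS instance.

```python
def connections_in_use(queries_per_second, avg_query_seconds):
    """Little's law: average concurrency = arrival rate x time in system."""
    return queries_per_second * avg_query_seconds


QPS = 500  # hypothetical steady query rate

# As a saturated CPU slows each query, the same traffic holds more connections.
for latency_ms in (10, 50, 200):
    in_use = connections_in_use(QPS, latency_ms / 1000)
    print(f"{latency_ms} ms per query -> ~{in_use:.0f} connections in use")
```

With the traffic held constant, a query that takes twenty times longer needs twenty times as many open connections, so a CPU-bound database can exhaust its connection limit even when the queries themselves are well optimized.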
Continuous Testing and Continuous Performance.
Thanks to Nouvola, we also discovered AWS CodePipeline. AWS CodePipeline is a pipeline management tool, essentially a hosted CD service. CodePipeline is a fantastic tool if your stack is already on AWS. With AWS CodePipeline, it is very easy to model the entire release process. We now use CodePipeline as our CD solution.
Before & After
Our first test:
We ran one of our first tests with 3,000 users/sec and 3-second think time, over a period of 10 minutes. We had to stop the test halfway through as it was failing and the server stopped responding.
By the third test:
By the third test, all was well in paradise. Everything went smoothly while running 3,000 concurrent users; the response time was an amazing 60 ms at peak; and our infrastructure handled an average of 32,000 requests per minute with 10x t2.medium instances. Not too shabby.
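For capacity planning it helps to convert that per-minute figure into per-second and per-instance rates. The short Python sketch below does the arithmetic; the helper names are made up for the example, and it assumes the load was spread evenly across the ten instances, which we did not measure.

```python
def per_second(requests_per_minute):
    """Convert a per-minute request rate to requests per second."""
    return requests_per_minute / 60


def per_instance(requests_per_second, instance_count):
    """Average share per instance, assuming an even spread of load."""
    return requests_per_second / instance_count


measured_rpm = 32_000   # average from the third test
instances = 10          # t2.medium instances in that test

rps = per_second(measured_rpm)
print(f"{rps:.0f} req/s overall")                                # 533 req/s overall
print(f"{per_instance(rps, instances):.1f} req/s per instance")  # 53.3 req/s per instance
```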
Final Outcome & Future
Nouvola made it stupidly easy to test our infrastructure and stack, and ensured that we didn’t have any blind spots. You read a lot online about command-line tools and nightmare stories about setting up test cases that need entire teams, but Nouvola greatly exceeded my expectations.
On the big day, the East Coast broadcast kicked off and everything went off without a hitch. We had three big waves, one hour apart, and experienced no issues whatsoever.
Proper performance testing doesn’t have to be a dauntingly complicated process. Good planning, the right tools, and real-world scenarios in your testing process will go a long way. You need the right data to know that your app and website are ready every day, and Nouvola gives you just that. The ease and utility of Nouvola are unparalleled. Read a more detailed account of Validated’s experience with Nouvola on the Validated blog.
Alex Wilhelm is co-founder and CTO of Validated. You can also find him on Twitter at @khaosalex.
We were so impressed with the outcome that Nouvola is now a part of our development cycle, included in our default deployment and testing workflow. On top of that, we are now using Nouvola-generated data to optimize our AWS infrastructure cost-wise and test AWS autoscaling triggers reliably.