blitz.io – CouchDB in production

We launched blitz.io a couple of days ago, fully powered by CouchDB. In the last couple of days, we’ve generated over 1,000,000 hits (through our load and performance testing) against various servers and transfered over 2GBytes of data in and out EC2. Unlike massive terrabyte read-only CouchDB clusters, ours is inherently write-heavy. When each user either through the UI or through the API runs a sprint or a rush, we have multiple writes to CouchDB, constantly.

Architecture

What we have globally on EC2 are multiple scale engines that listen to regional CouchDB’s on a filtered _changes feed. Here’s kinda how it looks. Give the geographical nature of testing, we wanted to ensure that there were enough CouchDB’s near our scale engines to minimize latency.

blitzio.png

This is the classic east-west conflict (literally!). blitz.io is an Heroku app which proxies out of Seattle and so we found that running a CouchDB cluster on the east and and west of US gave us the smallest latency between the app, the database and the engines. This, in short, makes everybody happy. We are, though, looking to push out CouchDB’s closer to each EC2 region as much as possible, so the _changes_ feeds are as spiffy as possible.

Filtered _changes

When you run a sprint or rush in blitz.io, we end up posting a job to a region-specific CouchDB with something call job affinity. For example, when you specify:

--pattern 1-1000:60 --region japan http://my/restful/api

we post the job to the CouchDB geographically closest to Japan. At this point, the filtered _changes kicks in and only the scale engines in Japan are notified that there’s a new job. Each one of them (there are multiple) try and update the document to grab it and CouchDB’s conflict management helps us automatically pick the winner. Darwinism at its best! If one of the scale engines is busy, then the other ones win the job and proceed to now run the test. We had to monkey-patch the CouchRest gem so updating the job doesn’t always GET the document before updating it. This cuts down on a number of HTTP requests between the engine and the CouchDB that it’s assigned to.

Scale engines imply no local storage

Each scale engine in every region around the world uses CouchDB itself as its checkpointing mechanism for the _changes feed. We do this by using the db’s update_seq as well as the instance_id of each EC2 instance. This is cool because we don’t have to worry about any persistent storage on each scale engine. These are purely CPU, memory and network hogs and don’t rely on any persistent storage to resume where they left off. This is also why our scale engines have no EBS storage, ‘cos they don’t need any. If for some reason, a scale engine crashes, it comes back and knows exactly where the checkpoint is so it can resume without trouble.

Self Healing Engines

The first implementation (a couple of months ago) had all the engines hard-code their own databases. We wanted the ability to dynamically change the regional-Couch associated with each region. Also when upgrading CouchDB or extending the EBS storage, we didn’t want to shut off any regions. At any point if there’s any kind of exception, each regional engine restarts, exponentially backs-off and attempts to refetch the CouchDB that it needs to connect to. To do this, each engine uses the EC2 meta-data to report up to blitz.io to find out which DB it should listen on. It does this by using EC2′s meta-data:

curl http://169.254.169.254/latest/meta-data/placement/availability-zone

So the master (which is just a git push away on Heroku) assigns the appropriate CouchDB based on the region in which the scale engine belongs to. As I mentioned in the last blog, each scale engine is insanely scalable in that it can handle

50,000+ virtual users

on a single small EC2 instance. In most cases, we run out of TCP ports (about 60K usable ports for each instance) before we exhaust the CPU or memory.

Delayed Workers and stale=ok

Every modern web app uses delayed workers and jobs to keep the user interaction to be less than 250ms. While we explored Heroku’s workers using rake tasks, what we noticed was that we already had dozens of our scale engines worldwide. So we repurposed these scale engines to also do other jobs. For example, we have the scale engines periodically _compact the CouchDB’s that they are connected to. These engines also refresh the views at the end of each job. We extensively use in-memory caching and stale=ok on CouchDB views, especially the scoreboard, which is updated internally at the end of each test, but refreshed once every 30 seconds.

Scaling it out

CouchDB’s master-to-master replication simply rocks. It’s easy to setup, takes care of itself (unless you restart CouchDB, though this is getting fixed) with little or no attention from the devops folks. Bringing up a new CouchDB into this multi-master-replicated-cluster is also super easy because the RESTful-ness of it helps in major automation. We can bring up a CouchDB on a new EC2 instance without ever manually logging in.

Auto scaling engines

Given the insane scalability of our scale engines, bringing up a new engine in a region is as simple as:

rake engine:new region=japan

We’ve got all the automation to elastically scale the capacity of blitz.io without any manual intervention. The lack of storage on each engine plus the _changes feed makes it super easy for us to pretend we are running an IRC channel where these scale engines come and go.

So if you haven’t checked out blitz.io, please do so! It’s the load and performance testing solution for devops that rely on the RESTful-ness of their cloud apps.

Bookmark and Share