At blitz.io, for a while there, we were only relying on CouchDB clusters as the primary NoSQL database with some in-memory caching. As we grow (rapidly) and scale out, there are aspects of what we collect and store that are transient and real-time. While CouchDB is awesome for the map/reduce, replication and incremental view indexes, the real-time queues (emails, counters, stats, etc) natural lend themselves to, yup, redis. We are in the process of rolling out geo-located redis instances as part of our global infrastructure.
I’ve mentioned this in the past that we use CouchDB multi-master replication between the US east/west. In addition there are numerous scale engines attached to these CouchDB clusters from all around the world (California, Virginia, Oregon, Japan, Singapore and Ireland).
When you run a load/performance test from a given location, say, Singapore, we find the closest CouchDB cluster (California) and route the job there. Then the scale engines in Singapore try and acquire the job and then execute it. Very similar to resque, only global with replication. We’ve now added two sets of redis instances one on us-east and one on us-west on AWS. These two sets of redis instances also happen to be next to our CouchDB clusters.
Redis Transactions and Lazy CouchDB saves
Each time someone runs a load test, the overall queue size per-engine (and hence region) is updated. This gives us a sense for the overall pressure on the system so we can automatically spin up additional instances as necessary. In addition we also update site stats (scoreboard, number of hits etc) at the end of each test. We’ve now effectively moved the engine status information completely into redis with periodic RDB snapshots.
The scoreboard was interesting since we snapshot daily metrics into CouchDB to analyze overall trends and who did what how many times each day. So this information had to move into CouchDB, but we decided to do this lazily, thanks to redis transactions. Here’s the pseudo code:
The watch/exec combination effectively allows us to do atomic operations. More importantly this allows us to regulate writes into CouchDB in a lazy way as this information is really not required instantly.
CouchDB Document Caching
Finally, we’ve also completely switched over to redis as our caching layer by simply monkey patching CouchRest like below:
This localizes the cache-aware code and was a pretty simple way to add caching across all our apps. This also means that all writes go to both redis and CouchDB and redis always has the recent document for a certain period of time. This almost completely eliminates reads directly from CouchDB unless there’s a cache miss. We are actively tracking cache misses using the INFO command to see how much benefit we are really getting.
We got some awesome things cooking in blitz.io for this year as we start tackling problems beyond just load testing and the Redis/CouchDB combination is going to take us far!