It’s been a year since we launched blitz.io, an awesome multi-tenant application performance testing/monitoring platform running on AWS and Heroku. Looking back at the year, it’s been an amazing ride and we’ve helped a pretty diverse class of users who have no intention of becoming performance experts to really understand the difference between concurrency and hits. But I’m pretty disillusioned right now. Not to mention bored. And I think it’s the cloud.
This week there’s been much talk about NoOps with posts from @adrianco, @allspaw and @krishnan, to name a few. It all started with this infographic from @appfog. The challenge is that the combination of the words No and Ops is wide open for [mis]-interpretation.
At blitz.io, for a while there, we were relying only on CouchDB clusters as the primary NoSQL database with some in-memory caching. As we grow (rapidly) and scale out, there are aspects of what we collect and store that are transient and real-time. While CouchDB is awesome for the map/reduce, replication and incremental view indexes, the real-time queues (emails, counters, stats, etc.) naturally lend themselves to, yup, redis. We are in the process of rolling out geo-located redis instances as part of our global infrastructure.
This is a repost of my guest post on Atlassian’s blog, announcing a Bamboo plugin for blitz.io.
The pig of a problem
We all know what happens when your app performs like a pig. You lose users, customers and revenue. Your app is slow, the failing pigs don’t amuse your customers and you hear about it as the trending topic on Twitter. In most cases you don’t even know that it’s slow until you push the app into production, multiple times a day. How can you identify performance bottlenecks earlier in the cycle? And, if you don’t discover them, how do you find and fix them as fast as possible?
Enter Blitz – a performance testing tool, built by Nerds who were angry at how the existing tools weren’t keeping pace with the new Application Development Lifecycle that has Continuous Integration as its centerpiece.
Okay, not the greatest, ground-breaking, coolest, earth-shattering feature ever. Let’s just get that out of the way. But, in the process of troubleshooting various latency issues for our customers, we found ourselves logging on to various EC2 instances of blitz.io to run traceroutes to our users’ sites/apps to diagnose problems. We are developers, hanging out in TextMate, vim and our terminals, and the ability to take a local Unix command and run it remotely while staying in our zone (shell) was important. So …
We are super excited to bring blitz.io to CloudFlare’s users. We’ve been slow rolling this over the course of the week and it has been pretty amazing to see CloudFlare users using blitz.io against their direct domain/origin server to see the benefits of performance and security provided by CloudFlare. CloudFlare is now the 7th blitz.io partner, in a growing list of ecosystem partnerships. In the era of PaaS, DevOps and Continuous Deployment, blitz.io makes load and performance testing a fun sport with no scripting and affordable self-service, utility pricing.
Over the weekend, I was experimenting with CouchDB to see if it can pass the C10K barrier. Some of the performance optimizations I made along the way are really OS-level optimizations that affect MochiWeb (the Erlang web server) and are fairly well documented in many blogs. This one by @metabrew in particular is a pretty good read, since it focuses on Erlang and MochiWeb. While I am a performance junkie, I am not an Erlang hacker. So this is a call for help to the CouchDB hackers for recommendations on scaling out CouchDB.
blitz.io went down for a short duration yesterday morning. It was an interesting day uncovering and identifying issues we hadn’t encountered before with multi-region CouchDB clusters that are doing multi-master continuous replication. In a lot of ways, we are path-finding and pushing CouchDB to its limits given that we are a write-heavy app. In the process, we are making up our own best practices and working around issues. Some of these issues are already addressed in trunk, but I wanted to document what we went through today and what we can do about it. Anyway, if you are running a large CouchDB cluster in production, I would love to hear from you.
PaaS providers like Heroku, CloudFoundry, RedHat and Joyent are all supporting node.js apps that you can simply git push and scale out. node.js is unlike anything you’ve encountered before. As Ryan Dahl puts it:
node.js helps you maintain connections
We are super excited to announce that Mu Dynamics has partnered with Acquia to bring blitz.io into the Acquia Network. This is immediately available to all Acquia customers to instantly and continuously integrate load testing as part of their Drupal deployment. Load testing web sites used to be a once-a-year, pre-holiday-season undertaking that cost significant $$$, time and resources. Given the complexity of existing tools and solutions, these types of tests could only be run by performance experts. blitz.io changes all of that to make load and performance testing a fun, affordable sport!