We launched xtractr earlier this week for network forensics, troubleshooting and handling support escalations involving large packet captures. xtractr is a 4-tier app (more on that below) that combines the best of Web 2.0 with looking at packets in a new light. Looking beyond the “unleash the power of packets” message, I wanted to write a little about what’s under the hood and how we use CouchDB-style Map/Reduce to uncover all sorts of information inside large packet captures.
Technology Stack
xtractr is a single Linux executable that you download from pcapr. This executable uses Ferret for searching, Mongoose for a RESTful API, SQLite for flow classification and a persistent store for various packet fields and labels, and V8 for reporting. xtractr uses tshark for getting at the various field values tucked away in those pesky packets. We purpose-built all of the flow classification and content extraction capabilities in addition to bridging these diverse technologies in a seamless manner using a RESTful API.
The xtractr application (delivered from pcapr) runs in your browser and uses jQuery, Flot and Sammy. This application, written entirely in JavaScript, uses JSONP (a cross-domain way of making Ajax calls, at least until HTML5 is mainstream) to communicate with your instance of xtractr and makes it super easy to find the needle in your packet stack. Given that search queries are king, we wanted to build the application so that as you click around, you can see the search queries constantly being built. This learn-by-example mode of the UI combines the best of Web 2.0 ease of use with the powerful and open command-line kungfu that most people are used to.
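For anyone who hasn't seen JSONP up close, here is a minimal hand-rolled sketch of the mechanism: the page injects a script tag whose URL names a callback, and the server wraps its JSON in a call to that callback. The base URL, port, the `q` and `callback` parameter names, and the helper names `buildJsonpUrl` and `jsonp` are all assumptions for illustration, not xtractr's actual API.

```javascript
// Hypothetical JSONP helper. The endpoint, the parameter names and the
// function names here are illustrative assumptions, not xtractr's real API.
let jsonpCounter = 0;

// Build the script URL: the regular query parameters plus the callback name.
function buildJsonpUrl(baseUrl, params, callbackName) {
  const query = Object.keys(params).map(function (name) {
    return encodeURIComponent(name) + '=' + encodeURIComponent(params[name]);
  });
  query.push('callback=' + encodeURIComponent(callbackName));
  return baseUrl + (baseUrl.indexOf('?') === -1 ? '?' : '&') + query.join('&');
}

// Browser-only: inject a <script> tag and wait for the server to invoke
// the temporary global callback with the decoded data.
function jsonp(baseUrl, params, onData) {
  const callbackName = 'jsonp_cb_' + (jsonpCounter++);
  const script = document.createElement('script');
  window[callbackName] = function (data) {
    delete window[callbackName];           // clean up the temporary global
    script.parentNode.removeChild(script); // and the injected tag
    onData(data);
  };
  script.src = buildJsonpUrl(baseUrl, params, callbackName);
  document.head.appendChild(script);
}

// URL building works anywhere; jsonp() itself needs a browser DOM.
const exampleUrl = buildJsonpUrl(
  'http://localhost:8080/api/flows', // assumed local address, not documented
  { q: 'flow.service:HTTP' },
  'jsonp_cb_0'
);
```

In practice jQuery already wraps this pattern through `$.ajax` with `dataType: 'jsonp'`, so the sketch is only meant to show what happens underneath.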
One of our primary mandates when building xtractr was this:
the packet data never leaves your system!
Obviously pcaps contain a wealth of information (packets never lie) including usernames and passwords and we wanted to ensure that the index and the original pcaps stay with you. Besides, do you really want to upload a gig of data to the cloud?
One of the most common questions on forensics and packet-related mailing lists is “How do I look for foo in my pcap?” The collaborative aspect of xtractr comes from the fact that users can explicitly share complex search queries with the rest of the community. These queries are stored in CouchDB on pcapr. This allows the collective intelligence of the packet-geek community to help out those who are just trying to solve everyday problems. These community-contributed queries are called Nuggets. When you use a nugget, we just fetch the search query from pcapr, but then apply it against your xtractr index.
Using Map/Reduce for Reporting
One of the huge challenges in packet forensics is that packets carry incredibly rich information at many different layers, each of which might be interesting on its own. Now, we didn’t want to build crazy SQL joins (I’m personally JOIN-challenged) across 90,000+ Wireshark fields. So we ended up using Map/Reduce, very much inspired by CouchDB.
The simplest way to understand how this works is from the Interactive CouchDB tutorial that we published a while back. The basic idea is this. Each flow or packet in the index is conceptually a JSON document that looks like this:
{
 "id":169,
 "offset":16516,
 "length":496,
 "pcap":1,
 "flow":12,
 "time":28.9294,
 "dir":0,
 "src":"192.168.30.132",
 "dst":"192.168.40.234",
 "service":"HTTP",
 "title":"GET /index.html HTTP/1.1 ",
 "eth.src": "00:01:02:03:04:05",
 "eth.dst": "06:05:04:03:02:01",
 ...
}
Fields that have multiple values are conceptually stored as JSON arrays. Given this, let’s say you want to find the ‘Top Bandwidth Hoggers for HTTP’. The query string that generates a nice little chart looks like this:
flow.service:HTTP > sum('flow.src', 'flow.bytes')
The first part identifies all the flows that are HTTP. The second part is where the Map/Reduce comes in. Each conceptual flow is passed into the following JavaScript code. Here ‘flow.src’ becomes the _kfield and ‘flow.bytes’ becomes the _vfield. At a very high level, we are building a hash table with the concrete value of flow.src as the key and the sum of all the bytes as the value.
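{
  // _kfield and _vfield are bound from the query: 'flow.src' and 'flow.bytes' here.
  map: function(flow) {
    var _key = flow[_kfield];
    if (_key) {
      // flow.values() hands each value of the field to the callback.
      flow.values(_vfield, function(_val) {
        if (typeof(_val) === 'number') {
          emit(_key, _val);
        }
      });
    }
  },
  // Collapse the values emitted for a key into a single total.
  reduce: function(key, values) {
    return _sum(values);
  }
}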
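To make the moving parts concrete, here is a small self-contained sketch of a harness that could drive a map/reduce pair like the one above. Everything in it is an assumption for illustration: `makeFlow`, `runReport`, the way `emit` groups values, the behavior of `flow.values` and `_sum`, and the sample documents keyed by 'flow.src' and 'flow.bytes'. It is not how xtractr implements these internally.

```javascript
// Hypothetical mini-harness; makeFlow, runReport and the bodies of emit,
// flow.values and _sum are illustrative assumptions, not xtractr internals.

// Wrap a plain document so flow.values(field, cb) visits every value,
// whether the field holds a single scalar or a JSON array.
function makeFlow(doc) {
  const flow = Object.assign({}, doc);
  Object.defineProperty(flow, 'values', {
    enumerable: false,
    value: function (field, callback) {
      const raw = flow[field];
      if (raw === undefined || raw === null) return;
      (Array.isArray(raw) ? raw : [raw]).forEach(callback);
    }
  });
  return flow;
}

function _sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

function runReport(docs, kfield, vfield) {
  const groups = new Map(); // key -> list of emitted values
  function emit(key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const report = {
    map: function (flow) {
      const key = flow[kfield];
      if (key) {
        flow.values(vfield, function (val) {
          if (typeof val === 'number') emit(key, val);
        });
      }
    },
    reduce: function (key, values) { return _sum(values); }
  };
  // Map phase: every document gets a chance to emit key/value pairs.
  docs.forEach(function (doc) { report.map(makeFlow(doc)); });
  // Reduce phase: one reduce call per distinct key.
  const result = {};
  groups.forEach(function (values, key) {
    result[key] = report.reduce(key, values);
  });
  return result;
}

// Made-up sample flows, assumed to have already passed the search filter.
const sampleFlows = [
  { 'flow.src': '192.168.30.132', 'flow.bytes': 496 },
  { 'flow.src': '192.168.30.132', 'flow.bytes': [1000, 24] },
  { 'flow.src': '192.168.40.234', 'flow.bytes': 300 },
  { 'flow.bytes': 999 } // no source address, so map() skips it
];
const totals = runReport(sampleFlows, 'flow.src', 'flow.bytes');
console.log(totals);
```

With these sample flows, `totals` maps 192.168.30.132 to 1520 and 192.168.40.234 to 300, which is exactly the kind of key/value table a chart can be drawn from.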
When you sprinkle some jQuery magic on the result data, you get a chart of the top HTTP bandwidth hoggers by source address.

Now wasn’t that easy? xtractr comes with a few different map/reduce functions which allow you to generate all sorts of cool reports with just a few clicks. While xtractr is a powerful standalone application for forensics, a lot of our customers use it directly with Mu Studio to statefully replay the problem traffic to very rapidly resolve escalations. Besides, Mu Studio can also auto generate all the fuzz tests for you based on the flow you pulled out from xtractr.
So check out xtractr. You don’t have to be a packet geek to use it, but you get to benefit from the collective intelligence of those that are.
