Michael Kjellman | Software Engineer
What does Barracuda Networks do?
While best known for their security solutions, Barracuda Networks offers products across three areas of IT: Content Security; Networking and Application Delivery; and Data Storage, Protection and Disaster Recovery.
How are you using Cassandra?
The Barracuda Central Research Database is using Cassandra to battle the Zombies. Before adopting Cassandra, we could not monitor every malicious site and IP forever – the data volumes were just too great. We would monitor a site or IP for a while, and once we saw that it was no longer alive we would stop monitoring it or have to truncate our history. The big problem, however, was that once we stopped monitoring a site or domain, it would frequently come back to life – hence the Zombie moniker.
We had data coming in from multiple databases and flat files, and now we use Cassandra to consolidate all that data. Before, it could take us as long as 3 or 4 hours to mark a site or IP; now, with Cassandra, we are able to do that in real time and not worry about losing history.
You’ve been using Cassandra since the early days – what made you choose it?
Initially, around version 0.8, we were using it as a key-value store, but around 1.0 we looked at it to replace MySQL. In the past, taking down one botnet or IP would drastically reduce spam to our customers, but today spammers are smarter; the attacks change constantly. We had a scale problem and MySQL could not handle it, whereas Cassandra is designed to scale and be highly available. We needed a highly scalable system that could work in real time. No other database was ready for what we needed to do. The thriving community was also a reason for us to choose C*.
You’ve recently upgraded to Cassandra 1.2 and are the first company we know of to be in production with it – what features were you most excited about?
Collections and CQL3 are compelling for us. With collections, we are able to maintain associations between IPs, domains, and full URIs. With CQL3 and collections, we will be able to pull back all the data we need with one call.
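As a rough illustration of the idea (not Barracuda's actual schema), the sketch below builds CQL3 statements for a hypothetical table, ip_associations, that keeps one row per IP with set<text> columns for its domains and URIs. The table name, column names, and helper functions are all assumptions made for this example. Because the associations live in collection columns on a single row, one SELECT by IP returns everything.

```python
# Minimal sketch: build CQL3 statements for a HYPOTHETICAL table that uses
# set<text> collections. Table and column names are assumptions, not taken
# from the interview.

def cql_quote(value: str) -> str:
    """Quote a string as a CQL text literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


# One row per IP; domains and URIs are stored as set collections on that row.
CREATE_TABLE = (
    "CREATE TABLE ip_associations ("
    "ip text PRIMARY KEY, "
    "domains set<text>, "
    "uris set<text>)"
)


def add_association(ip: str, domain: str, uri: str) -> str:
    """Return an UPDATE that adds one domain and one URI to the IP's sets."""
    return (
        "UPDATE ip_associations "
        f"SET domains = domains + {{{cql_quote(domain)}}}, "
        f"uris = uris + {{{cql_quote(uri)}}} "
        f"WHERE ip = {cql_quote(ip)}"
    )


def fetch_associations(ip: str) -> str:
    """Return a single SELECT that pulls back every association for the IP."""
    return f"SELECT domains, uris FROM ip_associations WHERE ip = {cql_quote(ip)}"


if __name__ == "__main__":
    print(CREATE_TABLE)
    print(add_association("192.0.2.1", "example.com", "http://example.com/a"))
    print(fetch_associations("192.0.2.1"))
```

The statements are assembled as strings only to keep the sketch self-contained; a real client would normally pass values as bound parameters through its driver rather than quoting them by hand.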
What does your data center make-up look like?
At the Barracuda Central Research Database, our configuration is 2 spindles, no RAID, 2 data directories (one directory per spindle), and an SSD for small “hot” column families. 12 cores, 32GB of RAM.
You wrote a client driver for Cassandra – which one and why?
I wrote Perlcassa; no other Perl drivers were up to snuff. Link to Perlcassa: http://planetcassandra.org/DownLoad/ViewDownLoadType/perlcassa
Any advice to anyone looking to upgrade to Cassandra 1.2?
Cassandra 1.2.1 is a much more stable release, so if you are looking to put it into production, I recommend upgrading.