Michael Nelson, Software Engineering Lead at AOL
AOL does many things, but mostly content generation, content value-add (geotagging, entity tagging, personalization), and advertising. I am director of technology for a targeted, get-things-done team within the content group.
Cassandra at AOL
We are using Cassandra 1.2 as an article index for several AOL technologies. It serves as a service layer for storing and retrieving many millions of articles, and it processes 5 million articles an hour.
We adopted Cassandra mainly for linear scaling. We expected a performance improvement as well, but nothing like what we ended up with. We are currently (temporarily) dual-writing to Cassandra and MySQL. Both clusters contain the same number of machines at roughly the same spec. Cassandra is on average 8X more efficient than MySQL for the exact same writes, even though Cassandra should be under more load because it is serving reads as well. Switching to Cassandra was a big win for us.
We chose Cassandra because it was stable, uncomplicated, and a good fit for our use case compared with other big data technologies, which are mostly analytics driven. We evaluated it against HBase, Riak, MongoDB, and clustered MySQL.
Overall, our top reasons for choosing Cassandra include: always on, linear scalability, latency, compression, durability, and community support.
We are currently running 6 nodes in a single data center. We originally had 2/4 node clusters but reduced to 1 with RAID 10. We have 6 other machines in reserve to add to the cluster as we increase retention and/or add data.
Advice on getting started
Make sure the language you choose has a good client with async support and an awareness of Cassandra's hashing strategies, so that it picks the right node as the coordinator for each request.
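To make the coordinator advice concrete, here is a minimal Python sketch of what a hashing-aware, async client does under the hood. It is a toy model, not the code of any real driver: the `TokenRing` class, the node names, the evenly spaced tokens, and the MD5-based `token_for` function are all assumptions made for illustration (a real client uses the cluster's configured partitioner and real network calls).

```python
import asyncio
import bisect
import hashlib


class TokenRing:
    """Toy token ring: each node owns the key range that ends at its token."""

    def __init__(self, node_tokens):
        # node_tokens maps a (hypothetical) node name to its integer token.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key):
        # Illustrative hash only: MD5 folded into a 32-bit token space.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % 2**32

    def owner(self, key):
        # First node whose token is >= the key's token, wrapping past the end.
        idx = bisect.bisect_left(self._tokens, self.token_for(key))
        return self._ring[idx % len(self._ring)][1]


async def write_article(ring, article_id):
    # Pick the owning node as coordinator instead of a random node.
    node = ring.owner(article_id)
    await asyncio.sleep(0)  # stands in for a non-blocking network write
    return node


async def write_many(ring, article_ids):
    # Issue all writes concurrently rather than waiting on each in turn.
    nodes = await asyncio.gather(*(write_article(ring, a) for a in article_ids))
    return dict(zip(article_ids, nodes))


# Six hypothetical nodes with evenly spaced tokens.
ring = TokenRing({f"node{i}": i * (2**32 // 6) for i in range(6)})
routes = asyncio.run(write_many(ring, ["article-1", "article-2", "article-3"]))
print(routes)
```

Sending each request straight to a node that owns the key saves a network hop, because the coordinator does not have to forward the request, and the async calls keep the client from stalling on any single write.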
Cassandra has a great community. They are passionate about the project and willing to help out people looking to adopt the technology. I should mention Aaron Morton in particular, who exchanged emails with me when I was evaluating Cassandra for our use case.