Eric Lubow | CTO & Co-Founder
Brady Gentile: Community Manager at DataStax
Eric Lubow: CTO & Co-Founder of SimpleReach
Brady: What does SimpleReach do?
Eric: SimpleReach is creating a measurement layer for the social web. What that means is content creators, such as People, Time, and USA Today, create hundreds and sometimes thousands of pieces of content per day. They push it out to the web and they don’t have a good way to track social engagement. From the instant they post the article, we’re tracking the number of tweets, likes, pins, every social metric in real time, and we’re tracking that engagement on a very granular level: at the URL or content level. That’s something that nobody has previously been able to see.
Brady: How is SimpleReach using Cassandra?
Eric: So, as you can imagine, there are many pieces of content being published constantly. There aren’t too many database options out there that can ingest the high volumes of data that we need to be analyzing. Just at the bare minimum, we look at millions of URLs and hundreds of millions of social actions around each of those URLs every single day. Just to keep up with that volume, we needed something that can ingest data at high volumes and high velocity; that’s where Cassandra comes into our architecture.
Brady: Were you using another database technology prior to Cassandra, or have you always used Cassandra?
Eric: So, we started out with MongoDB and, you know, Mongo was great for testing our assumptions, but it just didn’t handle everything that we wanted it to and it didn’t allow us to query our data the way we wanted to. We even looked at MongoDB a little more in depth and realized it just wasn’t going to be sufficient for our purpose. We looked at Cassandra, HBase and the other characters in the NoSQL space and, after further investigation, we decided to choose Cassandra.
Brady: Any tips, tricks or advice for a new user?
Eric: I’d say that the biggest thing you can do for yourself as a new user is to take the easiest route possible to installation; this probably means using CCA (Cassandra Cluster Admin) and building a ready-made application. Build something like a blog, even though that’s a slightly less obvious use case for it. This will teach you the internals, the data structures and how the data’s handled once it leaves your application and touches the data storage layer.
Brady: Thank you very much for the interview, Eric!