I’m Jesse Young. I’m the Vice President of Software Development. At Zonar we offer a safety inspection system and telematics for heavy fleet vehicles. Ultimately our offering is GPS tracking of vehicles that weigh over 10,000 pounds or carry more than 8 passengers.
We make sure users know exactly where those vehicles are going, and we collect a lot of engine diagnostics information so that we know exactly what’s going on with the vehicle in real time. Today we’re tracking over 350,000 vehicles across the United States and Canada; we’re quickly growing and expect to be at 500,000 devices by the end of next year. We’re a leader in our industry space.
We’re over 100TB in our data stores right now. We’ve maxed out our RDBMS solution and have been looking at how we can quickly store data and retrieve it as fast as possible.
Many different factors motivated us to start looking for a better solution. Again, we knew that we had a lot of data that we could potentially start storing. We needed to be able to quickly expand our storage and store this data in real time without any bottlenecks.
At the same time, our users require us to report on that data very quickly, and we didn’t really have the desire to run both OLTP-type databases and data warehousing, as they became very expensive. From a systems perspective, we needed a system that had built-in multi-data center replication.
We get very heavily into GPS data, but we’re also collecting a lot of information off of the engine computer itself, such as oil temperatures, cooling temperatures, cruise control state, fault codes, check engine lights, and stop engine light information.
We collect data around every 18 seconds; if you imagine that, it stacks up pretty quickly across 300,000 vehicles. The vehicles are running anywhere from 6 to 12 hours a day, so do the math. It’s quite a bit of data.
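To put rough numbers on that math, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (one sample about every 18 seconds, roughly 300,000 vehicles, 6 to 12 running hours a day) and assumes, for illustration only, that every vehicle runs every day and that each sample counts as a single reading. The function and constant names are hypothetical, not part of any Zonar system.

```python
# Rough estimate of daily reading volume from the figures quoted above.
# Assumptions (illustrative only): every vehicle runs every day, and each
# 18-second sample counts as one reading.

SAMPLE_INTERVAL_S = 18   # one sample about every 18 seconds
VEHICLES = 300_000       # approximate fleet size quoted above


def readings_per_day(hours_running: int,
                     vehicles: int = VEHICLES,
                     interval_s: int = SAMPLE_INTERVAL_S) -> int:
    """Return the estimated number of readings collected in one day."""
    per_vehicle = (hours_running * 3600) // interval_s
    return per_vehicle * vehicles


low = readings_per_day(6)    # vehicles running 6 hours a day
high = readings_per_day(12)  # vehicles running 12 hours a day
print(f"{low:,} to {high:,} readings per day")
# prints "360,000,000 to 720,000,000 readings per day"
```

Under those assumptions, the fleet produces somewhere between 360 million and 720 million readings a day, before counting how many individual engine values ride along with each sample.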
We plan on supporting elevation data. We have a digital elevation model that we received from USGS that we store in Cassandra as well. We tried using it in the relational system, but it pretty much fell on its face. Cassandra was a perfect fit.
My favorite part of Cassandra has to be the scaling aspect, to be honest. It’s so much easier working with a whole cluster of nodes that’s one big mesh. In our Postgres systems you have to work on them individually; if you want to run any jobs, you have to connect to each one, one by one, run the job, wait for it to finish, and then go to the next.
Cassandra just gets rid of a lot of that and lets us hit the cluster and use it that way. The performance is amazing. One of the things that I love about it too is the community; there’s a giant field of experts out there that are willing to help people for free, whether it be on Twitter, IRC, Planet Cassandra or all the meetups happening or even the summit events.