Vijesh Mehta, CTO & Founder at CallFire
CallFire is a cloud telephony company that provides voice and text connectivity to over 50,000 businesses. We handle over a million calls and texts every day. These include routing incoming calls, sending outbound notifications, text message conversations, and much more.
How are you using Cassandra?
CallFire’s system has handled more than 1.5 billion calls and stores 80 million sound files to date, and we are growing exponentially. We quickly became overwhelmed with managing NFS and keeping uptime high. Our goal was to find a solution that works across multiple datacenters, scales on demand, is fault tolerant, can store 80 million sound files, and isn’t too expensive. After studying the landscape, we chose Cassandra. Currently we have two uses for Cassandra within our system.
The first use was to solve our storage problem by storing the millions of sound files in Cassandra. This isn’t an ideal scenario, but it fit our system very well. Sound files aren’t a great fit because they can be large (up to 120 MB). Because Cassandra lacks streaming capabilities, things are inefficient when users try to listen to their files through the website. We talked to Jonathan Ellis early in our process, and he recommended chunking the files into 2-5 MB pieces so that we could simulate streaming. We never implemented that solution, but for now things are running to our satisfaction. We have a good strategy for growing our cluster as data increases, and the added stability over NFS has been a great improvement for our customers.
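The chunking idea mentioned above was never implemented at CallFire, so the following is only a minimal sketch of what it could look like, not a description of their system. It assumes a 4 MB chunk size (inside the suggested 2-5 MB range) and uses a plain dictionary keyed by (file_id, chunk_index) as a hypothetical stand-in for a Cassandra table; the names store_file and stream_file are illustrative.

```python
from typing import Dict, Iterator, Tuple

# Assumed chunk size: 4 MB, inside the suggested 2-5 MB range.
CHUNK_SIZE = 4 * 1024 * 1024

# Hypothetical stand-in for a Cassandra table keyed by (file_id, chunk_index).
ChunkStore = Dict[Tuple[str, int], bytes]


def store_file(store: ChunkStore, file_id: str, data: bytes,
               chunk_size: int = CHUNK_SIZE) -> int:
    """Split data into fixed-size chunks and write each under its own key.

    Returns the number of chunks written.
    """
    count = 0
    for offset in range(0, len(data), chunk_size):
        store[(file_id, count)] = data[offset:offset + chunk_size]
        count += 1
    return count


def stream_file(store: ChunkStore, file_id: str) -> Iterator[bytes]:
    """Yield chunks in order, stopping at the first missing chunk index."""
    index = 0
    while (file_id, index) in store:
        yield store[(file_id, index)]
        index += 1
```

A web handler could pass each yielded chunk to the client as soon as it is read, so playback can begin before the whole file has been fetched.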
Along the lines of scaling, we started to realize that we needed to partition our user base to improve performance and provide a long-term growth strategy. Having a cross-shard database is an important piece in handling universal functions such as authentication and routing calls. We use Cassandra for cross-shard information and a traditional relational database for each shard. With this kind of configuration, we easily make universal data available to all datacenters and shards.
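As a rough illustration of this layout, the sketch below keeps a global user-to-shard directory (the role the cross-shard store plays) and resolves each user to the relational database for that user's shard. Everything here is hypothetical: the class names, the dsn field, and the in-memory dictionary standing in for Cassandra are assumptions made for the example, not CallFire's actual schema.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Shard:
    shard_id: int
    dsn: str  # hypothetical connection string for the shard's relational database


class ShardDirectory:
    """Global lookup from user to shard, visible to every datacenter.

    The in-memory dict is a stand-in for the cross-shard store.
    """

    def __init__(self, shards: Dict[int, Shard]) -> None:
        self._shards = shards
        self._user_to_shard: Dict[str, int] = {}

    def assign(self, user_id: str, shard_id: int) -> None:
        """Record which shard owns a user; reject unknown shards."""
        if shard_id not in self._shards:
            raise KeyError(f"unknown shard {shard_id}")
        self._user_to_shard[user_id] = shard_id

    def shard_for(self, user_id: str) -> Shard:
        """Return the shard whose relational database holds this user's data."""
        return self._shards[self._user_to_shard[user_id]]
```

An authentication or call-routing step would first call shard_for with the user's identifier, then run the rest of its queries against that shard's database.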
What made you choose C*?
As mentioned earlier, having made 1.5 billion calls and growing exponentially, we needed something to grow with us. Our analysis went through many options, including Hadoop and Gluster. In the end, we felt that the ease of scaling and setup of Cassandra made it the best fit for our system. We achieved our goal of finding scalable, fault-tolerant storage. We also expanded the scope of Cassandra when we decided to partition our database.
What tips do you have for someone getting started with C*?
One of the difficulties a small team has is dealing with operational issues. If you get started with Cassandra and move it to production in a fast-growing environment, take extra care to understand the management tasks. Make sure the drives don’t get full, monitor outages, and use good strategies in designing the token space for your cluster. Take the time to look at others’ use cases and your own rollout will be a lot smoother.
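One common starting point for token space design is to space the initial tokens evenly around the ring. The sketch below assumes the RandomPartitioner (token range 0 to 2^127), a single datacenter, and one token per node; the interview does not say which partitioner or topology CallFire uses, so treat these as assumptions. The function name balanced_tokens is illustrative.

```python
from typing import List

# Size of the RandomPartitioner token range (an assumption about the cluster).
RANDOM_PARTITIONER_RANGE = 2 ** 127


def balanced_tokens(node_count: int) -> List[int]:
    """Return evenly spaced initial tokens, one per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RANDOM_PARTITIONER_RANGE // node_count
            for i in range(node_count)]
```

Each value would be set as one node's initial token. Adding nodes later changes the even spacing, so the tokens generally need to be recalculated and nodes moved to keep the ring balanced.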
Are you running C* in the cloud or your own DC?
Check out Vijesh’s slides, presenting their Apache Cassandra use case.