Comcast has redefined the way television is delivered with the X1 cloud-based platform. To ensure uptime and performance at scale, they chose Apache Cassandra over traditional technologies like Oracle to transform TV into an interactive, integrated experience.
For the past several months I have spent most of my time working on an infrastructure project called the CMB, the Comcast Message Bus. It’s a generic message queue and publish/subscribe infrastructure, and we implemented it on a Cassandra backend.
We have a new set-top box that is entirely cloud based, the X1. The design allows you to run apps on it, and one of those apps is a sports app that gives you more detailed background information in real time about live sporting events. We use this message bus as the backend for that app to send out notifications of live sporting events as they happen.
When we figured out that we needed a generic messaging infrastructure, the first thing we did was look at off-the-shelf offerings for messaging, but we decided that none of them could meet all our requirements. These were the three requirements we were looking for:
1. Linear, or near-linear horizontal scalability
2. Availability and robustness, by which I mean cross-datacenter availability: if one datacenter goes down, you seamlessly fail over to another datacenter
3. What we call Active-Active: say you have one message queue, you send to it from datacenter A while clients receive messages from it in datacenter B, and also the other way around. So it needs to be able to read and write in two datacenters at the same time.
These requirements are really hard to find in off-the-shelf offerings, so we decided to build our own. We chose Cassandra because it naturally gives us the availability and cross-datacenter replication, and we didn’t see that in any of the other solutions we evaluated at the time.
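To make the cross-datacenter replication concrete, here is a minimal sketch in Python that builds the kind of CQL statement Cassandra uses to replicate a keyspace to more than one datacenter. The keyspace name `cmb_demo`, the datacenter names `dc_a` and `dc_b`, and the replication factor of 3 are illustrative assumptions, not the CMB's actual settings.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that keeps replicas in every listed datacenter."""
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    # NetworkTopologyStrategy takes one replication factor per datacenter.
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in replicas_per_dc.items():
        parts.append(f"'{dc}': {int(rf)}")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        "WITH replication = {" + ", ".join(parts) + "};"
    )


# Hypothetical names and replication factors, for illustration only.
statement = create_keyspace_cql("cmb_demo", {"dc_a": 3, "dc_b": 3})
print(statement)
```

With a keyspace defined this way, a write accepted in either datacenter is replicated to the other, which is the property the Active-Active requirement depends on.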
There are many groups across Comcast using Cassandra in one way or another. The ones I’ve directly been involved in are the X1 Sports App and X1 DVR.
The CMB deployment for DVR sits on top of a Cassandra ring spanning two data centers with 8 nodes in each data center. The nodes are physical servers backed by SSDs and are specifically tuned as high-performance Cassandra nodes (this is why we get away with just 8 nodes in each data center).
We are quite far along with the project, and we have open-sourced it on GitHub. But as far as users go, we are just rolling it out, so we have that one deployment I just talked about, the sports app. That one spans two datacenters of our own, and then has the infrastructure to run the message queue on top of the Cassandra ring. We are also experimenting with third-party cloud-based providers, and will look to roll it out there as we gain more users.
If you are coming from the relational database world like I did, or if you are completely new to the NoSQL world, then it is important to understand how to design your data model in Cassandra. For example, having a denormalized data model that contains a lot of redundant data might not be a bad thing; in fact, it might be a good thing.
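As a rough illustration of why redundancy can help, the following Python sketch models two query-shaped "tables" as in-memory dictionaries. It is not Cassandra code, and the table and field names are hypothetical; it only shows the idea of writing the same data once per query pattern so that every read is a single-key lookup.

```python
from collections import defaultdict

# Two "tables", each shaped for exactly one query (hypothetical names).
messages_by_queue = defaultdict(list)   # query: all messages in a queue
messages_by_sender = defaultdict(list)  # query: all messages from a sender


def publish(queue, sender, body):
    """Write the same message to both tables; the duplication is deliberate."""
    row = {"queue": queue, "sender": sender, "body": body}
    messages_by_queue[queue].append(row)
    messages_by_sender[sender].append(dict(row))  # independent redundant copy


publish("scores", "feed-1", "goal")
publish("scores", "feed-2", "foul")
publish("alerts", "feed-1", "kickoff")

# Each read is a lookup by one key; no join is needed.
scores = [m["body"] for m in messages_by_queue["scores"]]
from_feed_1 = [m["body"] for m in messages_by_sender["feed-1"]]
print(scores, from_feed_1)
```

The cost is extra writes and storage, which is usually an acceptable trade in a system where writes are cheap and reads need to stay within one partition.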
Then you need to understand how Cassandra works internally, because that will allow you to come up with a design that works well. That includes things like tombstones, compaction, and issues with wide rows. If you are aware of things like this, you will be more successful at designing systems that scale well with Cassandra straight away.
It’s been a very smooth transition for me personally and my team for a couple of reasons. We were lucky to have some in-house expertise, and we could rely on some folks who could answer questions and give us a basic idea of how it works. My experience was that once I actually got into it, I found it easier and faster to develop Cassandra applications than to work with relational databases.
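To show why tombstones matter for something like a message queue, here is a deliberately simplified toy model in Python. It is an assumption-laden sketch, not how Cassandra stores data: the `ToyPartition` class is hypothetical, and it only captures the idea that a delete leaves a marker behind, so reads must skip over deleted cells until compaction removes them.

```python
class ToyPartition:
    """Toy model of one partition: deletes leave tombstones until compaction."""

    def __init__(self):
        self.cells = []  # each cell is [key, value, deleted], in insertion order

    def insert(self, key, value):
        self.cells.append([key, value, False])

    def delete(self, key):
        # A delete does not remove data; it marks the cell with a tombstone.
        for cell in self.cells:
            if cell[0] == key:
                cell[2] = True

    def read_first_live(self):
        """Return (value, cells_scanned) for the oldest live cell."""
        scanned = 0
        for _key, value, deleted in self.cells:
            scanned += 1
            if not deleted:
                return value, scanned
        return None, scanned

    def compact(self):
        # Compaction drops tombstoned cells
        # (real Cassandra only does so once gc_grace_seconds has passed).
        self.cells = [cell for cell in self.cells if not cell[2]]


queue = ToyPartition()
for i in range(1000):
    queue.insert(i, f"msg-{i}")
for i in range(999):
    queue.delete(i)  # consume all but the last message

before = queue.read_first_live()  # scans past 999 tombstones first
queue.compact()
after = queue.read_first_live()   # only the live cell remains
print(before, after)
```

In this toy run the same read touches 1000 cells before compaction and 1 cell after it, which is the kind of effect a queue-like workload on a wide row has to be designed around.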