Rick Branson, Infrastructure Engineer at Instagram: "Adopt a technology by understanding what it's best at and letting it do that first, then expand…"
Cassandra is a critical part of Instagram's large-scale site infrastructure, which supports more than 100 million active users. Instagram recently switched from Redis to Cassandra, and this talk is a practical deep dive into the data models, systems architecture, and challenges encountered during the implementation.
The Good with Redis
- It's easy to prototype
- You don't need to worry about how fast you put data in and take data out because it's in memory.
The Bad with Redis
- Redis is an in-memory datastore and memory is expensive
- It falls apart when you store data in it that you aren't reading all the time
- In-memory degrades poorly
- It's a cliff: once you hit the wall, getting back out is nearly impossible
- Flat namespace
- You don't know what's in there
- Heap fragmentation
- Single-threaded
Why We Initially Chose Cassandra
- Centralized logging with online reads
- Our workload is heavily skewed toward writes (1,000:1 writes to reads)
- The absolute ideal use case for Cassandra
- Ever-growing data set
- Needed durability
- Very high availability
How It Expanded
- The cluster for the initial use case started at 3 nodes and is now 12
- Upgraded to Cassandra 1.2 with no downtime
- Adopted Cassandra for storing inbox notifications: 23K writes per second and 16K reads per second on a separate 12-node EC2 cluster
- Logging cluster stores ~20 billion records (1.2TB)
- Notification cluster stores ~10 billion records (550GB)
- 99.9999% availability since we started using Cassandra
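
To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It is not from the talk: it assumes decimal units (1 TB = 10^12 bytes), that the quoted sizes and record counts describe the same data, and that load is spread evenly across nodes. The function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the numbers quoted in the list above.
# Assumptions (not stated in the talk): decimal units (1 TB = 10**12 bytes),
# evenly distributed load, and a 365-day year.

SECONDS_PER_YEAR = 365 * 24 * 3600


def bytes_per_record(total_bytes, records):
    """Average stored bytes per record."""
    return total_bytes / records


def per_node_rate(ops_per_second, nodes):
    """Average operations per second handled by each node."""
    return ops_per_second / nodes


def downtime_seconds_per_year(availability_percent):
    """Unavailable seconds per year implied by an availability figure."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


# Logging cluster: ~20 billion records in ~1.2 TB.
logging_avg = bytes_per_record(1.2e12, 20e9)

# Notification cluster: ~10 billion records in ~550 GB, 12 nodes.
notification_avg = bytes_per_record(550e9, 10e9)
writes_per_node = per_node_rate(23_000, 12)
reads_per_node = per_node_rate(16_000, 12)

# 99.9999% availability.
downtime = downtime_seconds_per_year(99.9999)

print(f"logging: {logging_avg:.0f} bytes/record")
print(f"notifications: {notification_avg:.0f} bytes/record")
print(f"per node: {writes_per_node:.0f} writes/s, {reads_per_node:.0f} reads/s")
print(f"implied downtime: {downtime:.1f} s/year")
```

Under these assumptions the records average roughly 60 and 55 bytes each, every notification node handles on the order of 1,900 writes and 1,300 reads per second, and the availability figure corresponds to about half a minute of downtime per year.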
Instagram Fun Fact
~10% of transactions on Instagram are "undos": unlikes, deleted comments, deleted pictures, etc.
To learn about how Instagram implemented Apache Cassandra, check out Rick Branson's presentation from Cassandra Summit 2013 and the accompanying slide deck found below: