Seastar is a managed platform for Apache Cassandra that spans hardware infrastructure, a hosting environment, a self-service API and dashboard, and a support team. It’s everything you need to get clusters up and running quickly and cost-effectively. This platform is being built by a team of engineers at Network Redux who have decades of combined experience in the enterprise hosting industry and have supported a wide variety of database deployments.
Why we’re building Seastar
My team decided early in the product development lifecycle that we wanted to build something that would allow companies to manage growing data sets effectively. Part of this task is choosing the right database and then operating it well. We have seen many companies build out the same essential operational components around their data-driven applications. Time that could have been spent on domain-specific tasks was lost to building and deploying domain-independent services that process, store, and enable the analysis of data. Services for log aggregation, metrics collection and display, and backups are often required, and implementing them is not a trivial task. To make matters harder, these support services need to be present from the beginning, a time when many teams' knowledge of a database’s intricacies is still nascent. Our goal is to free teams from the burden of building and managing redundant data infrastructure so they can start making use of their data sooner.
I have been following Cassandra’s development from the sidelines since version 0.6, and I spent some time at a previous company learning about and operating Cassandra. Coming from an RDBMS background, I admired that things like data distribution and replication are first-class citizens in the architecture. Cassandra was a young project back then, but it was getting a lot of things right from the beginning. Several years later at Seastar, we considered building around other databases. However, we felt that Cassandra was the best fit because it offers a shared-nothing architecture, scalable performance, and a high-level query language. We also liked that Cassandra doesn’t require operating an additional distributed system for communication. The quality of documentation and the abundance of support channels available to Cassandra users were also important.
Provisioning for Cassandra
Over the past year we have spoken with hardware vendors and with operators running a variety of Cassandra deployments, and we have drawn on our own experience to find hardware that will complement Cassandra workloads. Specifically, we’ve sought out fast CPU clocks, SSDs with longer life expectancies and excellent write performance, and data centers with single-digit-millisecond latency to AWS and other major public cloud providers.
We are writing services that enable automated deployment of Cassandra nodes running in Linux containers. These containers will be mapped to dedicated disks, run a minimal number of processes, and have no shell access. Communication between nodes will be encrypted with strong SSL ciphers. Encrypting communication adds a small amount of overhead, but by using containers and avoiding the hypervisor found in a typical cloud environment, we expect users to see a net performance gain.
Our backend services are primarily written in Go and will enable automated provisioning, log retrieval, metrics retrieval, and other critical functionality like backup and repair. This is a massive undertaking, but we believe that the automation of these common tasks is key to operating Cassandra successfully at scale. Administrative functionality, log messages, and metrics data will be exposed through a public API that will also be consumed and presented by our dashboard.
The Seastar Dashboard
Our dashboard is written in EmberJS and provides a beautiful and simple interface on top of our public API. We’ve thought carefully about creating views that support common workflows for database administrators and developers using Cassandra, such as monitoring a cluster and adding capacity. When a problem arises, the dashboard can suggest how to resolve it, or let you open a support request with one of our engineers; the request automatically includes pointers that help us quickly understand the issue.
On Metrics and Alerting
We are applying some past, painful experiences with busy dashboards and noisy alerting systems to create a service that emphasizes meaningful metrics and actionable alerts specific to Cassandra. An example of an unhelpful metric is average request latency. An average can hide many slow requests and poor user experiences; it is far more useful for a team to know their 95th, 99th, or even 99.9th percentile latencies. Providing this data helps to ensure that SLAs are being met.
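To make the point about averages concrete, here is a minimal sketch in Go. It is not Seastar code: the function names, the nearest-rank percentile method, and the sample latencies are all assumptions chosen for illustration. With 95 fast requests and 5 slow ones, the mean looks unremarkable while the 99th percentile exposes the slow tail.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic average of the samples (0 if there are none).
func mean(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	var sum float64
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

// percentile returns the p-th percentile (0 < p <= 100) of the samples
// using the nearest-rank method. The input slice is not modified.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	// Nearest rank: the smallest rank covering at least p percent of samples.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical data: 95 requests at 2 ms and 5 requests at 400 ms.
	latencies := make([]float64, 0, 100)
	for i := 0; i < 95; i++ {
		latencies = append(latencies, 2)
	}
	for i := 0; i < 5; i++ {
		latencies = append(latencies, 400)
	}

	fmt.Printf("mean: %.1f ms\n", mean(latencies))
	fmt.Printf("p95:  %.1f ms\n", percentile(latencies, 95))
	fmt.Printf("p99:  %.1f ms\n", percentile(latencies, 99))
}
```

In this made-up data set the mean sits between the two groups and describes no real request, which is why the percentiles are the more useful figures to track against an SLA.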
Another meaningful metric for Cassandra is the number of blocked and pending tasks in the thread pools at each execution stage. These numbers tell us when and where work is not getting done in Cassandra. Compaction performance and disk utilization are also key factors that alert an operator that it may be time to take action and add capacity to their cluster or revisit their compaction strategies.
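As a rough illustration of turning thread pool numbers into an actionable alert, the sketch below flags pools that have blocked tasks or a pending backlog above a limit. The `PoolStats` type, the `backedUp` function, and the threshold are hypothetical and do not describe Seastar's API; the pool names follow those Cassandra reports for its stages, and the counts are invented.

```go
package main

import "fmt"

// PoolStats is a hypothetical snapshot of one Cassandra thread pool,
// holding the kind of counters the paragraph above describes.
type PoolStats struct {
	Name    string
	Active  int
	Pending int
	Blocked int
}

// backedUp returns the pools that have any blocked tasks or more than
// maxPending pending tasks. maxPending is an assumed, tunable threshold.
func backedUp(pools []PoolStats, maxPending int) []PoolStats {
	var out []PoolStats
	for _, p := range pools {
		if p.Blocked > 0 || p.Pending > maxPending {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Invented sample values for illustration only.
	pools := []PoolStats{
		{Name: "ReadStage", Active: 4, Pending: 0, Blocked: 0},
		{Name: "MutationStage", Active: 32, Pending: 250, Blocked: 0},
		{Name: "CompactionExecutor", Active: 2, Pending: 12, Blocked: 1},
	}

	for _, p := range backedUp(pools, 100) {
		fmt.Printf("%s: pending=%d blocked=%d\n", p.Name, p.Pending, p.Blocked)
	}
}
```

A real alert would likely require the condition to persist over several samples before firing, so that a brief burst of pending work does not page anyone.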
Joining the Community
While we are still building the Seastar platform, we want to share its design with the wider community. We’ve put a lot of thought into it and value feedback.
We think that Cassandra is an amazing database with an outstanding community, and we hope that our platform will be a contribution to the larger ecosystem. We want to encourage community growth by making it easier to get started with Cassandra. We firmly believe that our blend of infrastructure, software services, and talented team members will do that by offering a platform that is approachable, inexpensive, and a joy to use. If you’re interested in providing feedback, please email us or join our beta program!