i2O Water provides intelligent solutions that help water utility companies around the world reduce the pressure in their water networks and so save water. This helps them reduce leakages and bursts on their networks. We currently save over 100 million litres of water per day for our customers across the world.
I’m the Software and IT Director of i2O Water, so I’m responsible for leading the teams that write the software for our intelligent devices, both the embedded software and the platform software. Cassandra’s role is to hold the data from the devices that interact with the platform, which is the data our intelligent algorithms run against.
We use Apache Cassandra as our predominant column store for time series data within our solution. We record time series data for multiple physical channels from our devices out in the field, which arrives over the GPRS mobile phone network and then the Internet. We also record how the water company’s network topology changes over time, as it evolves when new zones and devices are created and added. In addition, we store large numbers of spot events over time, such as alarms, pieces of equipment that are going faulty, and so on.
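As an illustration only, and not i2O’s actual schema, one common way to lay out per-channel time series in a column store is to group readings into partitions keyed by device, channel and day, with the rows inside each partition kept in timestamp order. The Python sketch below mimics that layout in memory; the class name, the key choice and the daily bucket size are all assumptions made for the example.

```python
from bisect import insort
from collections import defaultdict
from datetime import datetime, timezone


class TimeSeriesStore:
    """Hypothetical in-memory stand-in for a time series table.

    Partition key: (device_id, channel, day). Rows within a partition
    are kept sorted by timestamp, like a clustering column.
    """

    def __init__(self):
        self._partitions = defaultdict(list)

    @staticmethod
    def _partition_key(device_id, channel, ts):
        # Assumed bucket size: one partition per device, channel and day.
        return (device_id, channel, ts.date().isoformat())

    def write(self, device_id, channel, ts, value):
        key = self._partition_key(device_id, channel, ts)
        # insort keeps the (timestamp, value) rows ordered by timestamp.
        insort(self._partitions[key], (ts, value))

    def read_day(self, device_id, channel, day):
        # A single-partition read: all readings for one channel on one day.
        return list(self._partitions.get((device_id, channel, day), []))


store = TimeSeriesStore()
late = datetime(2013, 10, 1, 12, 0, tzinfo=timezone.utc)
early = datetime(2013, 10, 1, 11, 0, tzinfo=timezone.utc)
next_day = datetime(2013, 10, 2, 9, 0, tzinfo=timezone.utc)
store.write("dev-1", "pressure", late, 42.0)
store.write("dev-1", "pressure", early, 40.5)  # arrives out of order
store.write("dev-1", "pressure", next_day, 39.0)
```

Bucketing by day keeps each partition bounded in size, and a query for one channel over one day touches a single partition. The right bucket size depends on the sampling rate, which the interview does not state.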
Prior to using Cassandra we had a traditional analysis technology based on Microsoft SQL Server. That architecture was rather slow, so we changed the platform over to an event-driven architecture. As the business became more successful we were collecting more and more time series data, so we looked for a store that would optimize the storage and retrieval of time series.
The other technologies we looked at were other column stores, both open-source and commercial. Cassandra had by far the best reputation and gave the best performance in the testing that we did.
Our deployment is hosted by Rackspace; it’s a software as a service (SaaS) solution that we provide to the water utilities. We have a virtualized environment comprising about 16 virtual machines of various flavors that run our system. Several of those virtual servers run Apache Cassandra as individual nodes, and our production system has three nodes.
Presently we’re migrating our customers off our old platform to the new one, so what we store on Cassandra today is not really representative of what we need to store. We currently have about 1.5 terabytes of data in our old platform, which we’ll be looking to move over soon. The migration is done customer by customer, which suits our architecture because it supports the ability to replay history. We effectively replay the events that occurred in the past, as though the devices were talking to the new platform. We’re not actually doing a data migration as such, from one database technology to another.
Our architecture is built around the ability to add services to the ecosystem whilst the ecosystem is running, and to let those services behave as though they had been there from the beginning of time, so that they catch up by replaying anything they may have missed.
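The replay idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not i2O’s implementation: the `EventLog` and `AlarmCounter` names, the event shape, and the in-memory list standing in for a durable event history are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical append-only event history with replay on subscribe."""

    events: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)

    def publish(self, event):
        # Record the event first, then deliver it to every running service.
        self.events.append(event)
        for handler in self.subscribers:
            handler(event)

    def subscribe(self, handler, replay=True):
        # A service added late first receives the full history in order,
        # then joins the live stream with the same handler.
        if replay:
            for event in list(self.events):
                handler(event)
        self.subscribers.append(handler)


class AlarmCounter:
    """Example service: counts alarm events, whether replayed or live."""

    def __init__(self):
        self.count = 0

    def __call__(self, event):
        if event["type"] == "alarm":
            self.count += 1


log = EventLog()
log.publish({"type": "alarm", "device": "dev-1"})
log.publish({"type": "reading", "device": "dev-1", "value": 40.5})

counter = AlarmCounter()
log.subscribe(counter)  # added after the fact, catches up on history
log.publish({"type": "alarm", "device": "dev-2"})
```

Because the service handles replayed and live events with the same code path, migrating a customer amounts to feeding that customer’s historical events through the same interface the devices use.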
Over here in Europe it’s been a bit of a slow start for Cassandra; it’s just kicking off now, really. DataStax has opened an office here, which is helping to promote the community, and there are more community events starting up, such as meetups, which is useful. There are none in our local area here in Southampton, but there are some in London, which we’ve gone to. Of course there’s also the annual Cassandra Summit EU conference, which a couple of our developers and I will be attending. As far as the online community is concerned, it’s been really great. We’ve had quite a lot of help from asking questions on forums and getting responses.
I encourage people to take a look at our website and look at what we do, because I think we’re helping the world save water, which is a very laudable target for an innovative and commercial business. We’re using quite a lot of innovative technology to help us do that, of which Cassandra plays a major role.