We’ve been talking about migrating workloads to the cloud for a long time, but a look at the application portfolios of many IT organizations shows that there’s still a lot of work to be done. In many cases, challenges with maintaining and moving data in the cloud remain the major limiting factor slowing down cloud adoption, despite the fact that databases in the cloud have been available for years.
For this reason, there has been a recent surge of interest in data infrastructure designed to make the most of the benefits cloud computing offers. A cloud-native database is one that achieves the goals of scalability, elasticity, flexibility, observability, and automation. The K8ssandra project is a great example: it packages Apache Cassandra and supporting tools into a production-ready Kubernetes deployment.
This raises an interesting question: should databases running on Kubernetes be considered cloud-native? While Kubernetes was originally designed for stateless workloads, recent improvements such as StatefulSets and Persistent Volumes have made it possible to run stateful workloads as well. Even DevOps practitioners who have long been skeptical of running databases on Kubernetes are starting to come around, and best practices are beginning to emerge.
But, of course, merely tolerating databases on Kubernetes is not our goal. If we are not pushing for greater maturity in cloud-native databases, we are missing out on a huge opportunity. To make a database as cloud-native as possible, we need to embrace everything Kubernetes has to offer. A truly cloud-native approach means adopting key elements of the Kubernetes design paradigm, so that the database is built to run effectively on Kubernetes. Let’s explore some of the Kubernetes design principles that lead the way.
Kubernetes design principles
Leverage compute, network and storage as commodity APIs
One of the keys to the success of cloud computing is the commoditization of compute, networking and storage, as resources we can provision through simple APIs. Consider this sample of AWS services:
- Compute: We allocate virtual machines through EC2 and Auto Scaling Groups (ASGs).
- Network: We manage traffic using Elastic Load Balancers (ELB), Route 53 and VPC Peering.
- Storage: We persist data using options such as Simple Storage Service (S3) for long-term object storage or Elastic Block Store (EBS) volumes for our compute instances.
Kubernetes offers its own APIs that provide similar services for the world of containerized applications:
- Compute: Pods, Deployments and ReplicaSets manage the scheduling and lifecycle of containers on computing hardware.
- Network: Services and Ingress expose a container's networked interfaces.
- Storage: Persistent Volumes and StatefulSets enable flexible association of containers with storage.
Kubernetes resources promote the portability of applications across Kubernetes distributions and service providers. What does this mean for databases? They are simply applications that use compute, network and storage resources to provide data persistence and retrieval:
- Compute: A database requires sufficient processing power to handle incoming data and queries. Each database node is deployed as a pod, and pods are grouped into StatefulSets, enabling Kubernetes to manage scaling out and scaling in.
- Network: A database needs to expose interfaces for data and control. We can use Kubernetes Services and Ingress Controllers to expose these interfaces.
- Storage: A database uses persistent volumes of a specified storage class to store and retrieve data.
Thinking about databases in terms of their compute, network and storage needs removes a lot of the complexity involved in deploying to Kubernetes.
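To make the mapping concrete, here is a minimal sketch in Python that builds the two Kubernetes objects a simple database deployment would need: a headless Service (network) and a StatefulSet with a volume claim template (compute and storage). This is a hand-written illustration, not how K8ssandra deploys Cassandra. The names `demo-db`, `standard` and `10Gi`, the image tag, and the helper `database_manifests` are all assumptions made for the example.

```python
import json


def database_manifests(name, replicas, storage_class, storage_size):
    """Build a headless Service and a StatefulSet for a hypothetical database."""
    labels = {"app": name}

    # Network: a headless Service gives each pod a stable DNS identity.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "clusterIP": "None",
            "selector": labels,
            "ports": [{"name": "cql", "port": 9042}],  # default CQL port
        },
    }

    container = {
        "name": name,
        "image": "cassandra:4.0",  # illustrative image tag
        "ports": [{"name": "cql", "containerPort": 9042}],
        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/cassandra"}],
    }

    # Storage: one PersistentVolumeClaim per pod, from the given storage class.
    volume_claim = {
        "metadata": {"name": "data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": storage_size}},
        },
    }

    # Compute: the StatefulSet schedules and scales the database pods.
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
            "volumeClaimTemplates": [volume_claim],
        },
    }
    return service, stateful_set


service, stateful_set = database_manifests("demo-db", 3, "standard", "10Gi")
print(json.dumps(stateful_set, indent=2))
```

The printed JSON can be saved to a file and inspected or applied with the usual Kubernetes tooling. A real Cassandra deployment needs considerably more configuration (seeds, probes, resource limits), which is the gap an operator fills.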
Separate control and data planes
Kubernetes promotes the separation of control and data planes. The Kubernetes API server is the key data plane interface used to request computing resources, while the control plane manages the details of mapping those requests onto an underlying IaaS platform.
We can apply this same pattern to the database. For example, Cassandra’s data plane consists of the port each node exposes for clients using the Cassandra Query Language (CQL) and the port used for internode communication. The control plane includes the Java Management Extensions (JMX) interface provided by each Cassandra node. JMX is a standard that is showing its age and has had some security vulnerabilities, but it is a relatively simple task to take a more cloud-native approach. In K8ssandra, Cassandra is deployed in a custom container image that adds a RESTful Management API, bypassing the JMX interface.
The remainder of the control plane consists of logic that uses the Management API to manage Cassandra nodes. This logic is implemented using the Kubernetes operator pattern. Operators define custom resources and provide control loops that observe the state of those resources and take action to steer them toward the desired state, helping to extend Kubernetes with domain-specific logic.
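The control loop at the heart of the operator pattern can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not code from any real operator: the `Cluster` class is an in-memory stand-in for the system being managed, and `reconcile` is a hypothetical function that makes one step of progress per call.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """In-memory stand-in for the real system an operator would manage."""
    nodes: list = field(default_factory=list)


def reconcile(desired_size, cluster):
    """Run one pass of the control loop and return the action taken."""
    observed = len(cluster.nodes)
    if observed < desired_size:
        name = f"node-{observed}"
        cluster.nodes.append(name)  # steer toward the desired state
        return f"added {name}"
    if observed > desired_size:
        name = cluster.nodes.pop()
        return f"removed {name}"
    return "in sync"


cluster = Cluster()
actions = []
while True:
    action = reconcile(3, cluster)
    actions.append(action)
    if action == "in sync":
        break

print(actions)
```

Each pass compares observed state with desired state and takes at most one corrective action, so the loop stays safe to repeat. A real operator runs the same comparison whenever a watched resource changes, rather than in a tight loop.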
The K8ssandra project uses the cass-operator to automate Cassandra operations. The cass-operator defines a “CassandraDatacenter” custom resource, through a custom resource definition (CRD), to represent each top-level failure domain of a Cassandra cluster. This creates a higher-level abstraction built on StatefulSets and Persistent Volumes.
Figure: A sample K8ssandra deployment including Apache Cassandra and cass-operator.
Make observability easy
The three pillars of observable systems are logging, metrics and tracing. Kubernetes provides a great starting point by exposing each container’s logs to third-party log aggregation solutions. Implementing metrics and tracing requires a little more effort, but there are several solutions available.
The K8ssandra project supports metrics collection using the kube-prometheus-stack. The Metrics Collector for Apache Cassandra (MCAC) is deployed as an agent on each Cassandra node, providing a dedicated metrics endpoint. A ServiceMonitor from the kube-prometheus-stack configures Prometheus to pull metrics from each agent and store them for use by Grafana or other visualization and analysis tools.
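To give a feel for what a scrape of a metrics endpoint returns, the sketch below parses a small sample in the Prometheus text exposition format. It handles only a simplified subset (no timestamps, no escaping in label values), and the metric name `demo_pending_compactions` is invented for illustration; it is not a name taken from MCAC.

```python
def parse_metrics(text):
    """Parse a simplified subset of the Prometheus text exposition format.

    Returns a dict mapping each series (name plus labels) to its float value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field (no timestamps assumed).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical scrape output from two database nodes.
SCRAPE = """\
# HELP demo_pending_compactions Hypothetical metric name for illustration.
# TYPE demo_pending_compactions gauge
demo_pending_compactions{node="node-0"} 4
demo_pending_compactions{node="node-1"} 0
"""

samples = parse_metrics(SCRAPE)
busiest = max(samples, key=samples.get)
print(busiest, samples[busiest])
```

In practice Prometheus does this parsing and storage itself; the point is only that each agent exposes plain, labeled samples that any collector can read.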
Make the default configuration secure
Kubernetes networking is secure by default: ports must be explicitly exposed before a pod can be accessed externally. This sets a useful precedent for database deployment, forcing us to think carefully about how each control plane and data plane interface will be exposed, and which interfaces should be exposed as Kubernetes Services.
In K8ssandra, CQL access is exposed as a service for each CassandraDatacenter resource, while the APIs for management and metrics are accessed on individual Cassandra nodes by the cass-operator and the Prometheus ServiceMonitor, respectively.
Kubernetes also provides features for secret management, including sharing encryption keys and configuring administrative accounts. The K8ssandra deployment replaces Cassandra’s default administrator account with a new administrator username and password.
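As an illustration of the idea, the following sketch builds a Kubernetes Secret manifest holding a generated administrator credential. It is not the mechanism K8ssandra itself uses; the secret name `demo-superuser`, the username `demo-admin`, and the helper `superuser_secret` are assumptions for the example.

```python
import base64
import secrets


def superuser_secret(name, username):
    """Build a Secret manifest with a freshly generated random password."""
    password = secrets.token_urlsafe(24)

    def encode(value):
        # Secret "data" values must be base64-encoded strings.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {"username": encode(username), "password": encode(password)},
    }


secret = superuser_secret("demo-superuser", "demo-admin")
# Print only the key names so the generated password is not echoed.
print(secret["kind"], sorted(secret["data"]))
```

Note that base64 is an encoding, not encryption, so access to Secrets still needs to be restricted within the cluster.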
Prefer declarative configuration
In the Kubernetes declarative approach, you specify the desired state of resources, and controllers manipulate the underlying infrastructure to achieve that state. The cass-operator lets you specify the desired number of nodes in a cluster; it then manages the details of adding new nodes to scale up and selecting which nodes to remove to scale down.
Next-generation operators should let us specify rules based on stored data size, the number of transactions per second, or both. Perhaps we will be able to specify the maximum and minimum cluster size, and when to move less frequently used data to object storage.
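The sketch below shows what "declarative" means in practice for a CassandraDatacenter resource: scaling is a one-field change to the desired state, with no instructions about how to get there. The field names follow my reading of published cass-operator examples and should be checked against the CRD of the version in use; the cluster name, datacenter name, version and storage values are placeholders.

```python
import copy
import json

# Desired state of one datacenter, written as the manifest a user would apply.
# Field names are an assumption based on cass-operator examples.
datacenter = {
    "apiVersion": "cassandra.datastax.com/v1beta1",
    "kind": "CassandraDatacenter",
    "metadata": {"name": "dc1"},
    "spec": {
        "clusterName": "demo-cluster",  # placeholder
        "serverType": "cassandra",
        "serverVersion": "4.0.0",       # placeholder
        "size": 3,                      # desired number of Cassandra nodes
        "storageConfig": {
            "cassandraDataVolumeClaimSpec": {
                "storageClassName": "standard",
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": "10Gi"}},
            }
        },
    },
}


def with_size(manifest, size):
    """Return a copy of the manifest that declares a new desired node count."""
    updated = copy.deepcopy(manifest)
    updated["spec"]["size"] = size
    return updated


scaled = with_size(datacenter, 6)
print(json.dumps(scaled["spec"], indent=2))
```

Applying the updated manifest is all the user does; deciding where new nodes go and in what order they start is left to the operator.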
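To illustrate what such a rule might look like, here is a purely hypothetical sketch: no current operator is claimed to work this way, and `ScalingPolicy`, its fields, and `desired_nodes` are names invented for the example. The policy derives a node count from per-node limits on data size and throughput, then clamps it to the configured minimum and maximum.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical declarative scaling rule for a database cluster."""
    min_nodes: int
    max_nodes: int
    max_data_gb_per_node: float
    max_tps_per_node: float


def desired_nodes(policy, total_data_gb, total_tps):
    """Return the node count that satisfies both limits, within the bounds."""
    by_data = math.ceil(total_data_gb / policy.max_data_gb_per_node)
    by_tps = math.ceil(total_tps / policy.max_tps_per_node)
    needed = max(by_data, by_tps)  # the tighter constraint wins
    return max(policy.min_nodes, min(policy.max_nodes, needed))


policy = ScalingPolicy(
    min_nodes=3,
    max_nodes=12,
    max_data_gb_per_node=500,
    max_tps_per_node=10_000,
)
print(desired_nodes(policy, total_data_gb=2_600, total_tps=35_000))
```

With these sample numbers, data size is the binding constraint: 2,600 GB at 500 GB per node requires six nodes, while throughput alone would need only four.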
Conclusion: the best designs draw on the knowledge of the community
Hopefully, I’ve convinced you that Kubernetes is a great source of best practices for cloud-native database implementation, and innovation continues. Solutions for federated Kubernetes clusters are still maturing, but it will soon become much easier to manage multi-datacenter Cassandra clusters in Kubernetes. In the Cassandra community, we can work to make extensions for management and metrics part of the core Apache project so that Cassandra is inherently cloud-native for everyone.