Kubernetes has quickly become an industry standard, with up to 94% of organizations deploying their services and applications on the container orchestration platform, according to a survey. One of the main reasons companies deploy on Kubernetes is standardization, which allows advanced users to see a twofold increase in productivity.
Standardizing on Kubernetes gives organizations the ability to deploy any workload, anywhere. But there has been one missing piece: the technology assumes that workloads are ephemeral, which meant that only stateless workloads could be safely deployed on Kubernetes. However, the community has recently changed the paradigm and introduced features such as StatefulSets and Storage Classes, making it possible to run stateful workloads on Kubernetes.
While it is possible to run stateful workloads on Kubernetes, there are still many challenges. In this article, I explain how to make it happen and why it's worth it.
Do it gradually
Kubernetes is on its way to becoming as ubiquitous as Linux and the de facto way to run any application, anywhere, in a distributed fashion. Using Kubernetes involves learning many technical concepts and a lot of vocabulary. For instance, newcomers may struggle with the many Kubernetes logical units, such as containers, pods, nodes, and clusters.
If you’re not already running Kubernetes in production, don’t jump straight into data workloads. Instead, start with migrating stateless apps to avoid data loss when things go sideways.
If you can't find a Kubernetes operator that suits your needs, don't worry: most operators are open source.
Understand limitations and peculiarities
Once you're familiar with general Kubernetes concepts, it's time to dive into the concepts specific to stateful workloads. For example, because applications may have different storage needs, such as performance or capacity requirements, you must provide the correct underlying storage system.
What the industry often refers to as storage "profiles" are known as Storage Classes in Kubernetes. They provide a way to describe the different types of storage that a Kubernetes cluster can access. Storage classes can have different quality-of-service levels, such as I/O operations per second per GiB, backup policies, or arbitrary policies, such as volume binding modes and allowed topologies.
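To make the storage class idea more concrete, here is a minimal sketch that builds a StorageClass manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The class name `fast-ssd`, the provisioner `example.com/hypothetical-csi`, the `type` parameter, and the zone names are all hypothetical placeholders; real provisioners define their own parameter names.

```python
import json


def build_storage_class(name, provisioner, parameters, zones):
    """Return a StorageClass manifest as a plain dict."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # The provisioner decides how volumes are actually created.
        "provisioner": provisioner,
        # Provisioner-specific settings (performance tier, disk type, ...).
        "parameters": parameters,
        # Delay volume binding until a pod using the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
        # Allow existing claims of this class to be grown later.
        "allowVolumeExpansion": True,
        # Restrict provisioning to the listed zones.
        "allowedTopologies": [
            {
                "matchLabelExpressions": [
                    {"key": "topology.kubernetes.io/zone", "values": list(zones)}
                ]
            }
        ],
    }


# Hypothetical values, for illustration only.
manifest = build_storage_class(
    name="fast-ssd",
    provisioner="example.com/hypothetical-csi",
    parameters={"type": "ssd"},
    zones=["zone-a", "zone-b"],
)
print(json.dumps(manifest, indent=2))
```

Generating the manifest from code rather than writing it by hand is only one option; it can help when several classes differ in a handful of parameters.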
Another important component to understand is the StatefulSet. It is a Kubernetes API object used to manage stateful applications and provides key features such as:
- A unique, stable network identifier that lets you keep track of volumes and detach and reattach them at will;
- Stable, persistent storage to keep your data safe;
- Ordered, graceful deployment and scaling, which many Day 2 operations need.
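The stable identities in the list above follow a predictable naming scheme, which the sketch below reproduces. It assumes the default cluster domain `cluster.local` and a headless governing Service; the names `db`, `db-headless`, and `data` are hypothetical.

```python
def stateful_identities(sts_name, service, namespace, claim_template, replicas):
    """List the pod name, DNS name, and PVC name for each replica.

    Pods are named <statefulset>-<ordinal>, and each claim created from a
    volumeClaimTemplate is named <template>-<pod>.
    """
    identities = []
    for ordinal in range(replicas):
        pod = f"{sts_name}-{ordinal}"
        identities.append(
            {
                "pod": pod,
                # Assumes the default cluster domain, cluster.local.
                "dns": f"{pod}.{service}.{namespace}.svc.cluster.local",
                "pvc": f"{claim_template}-{pod}",
            }
        )
    return identities


# Hypothetical StatefulSet "db" with three replicas.
for identity in stateful_identities("db", "db-headless", "default", "data", 3):
    print(identity)
```

Because the claim name is derived from the pod name, a rescheduled pod finds its own volume again instead of receiving a fresh one.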
While StatefulSet has been a successful replacement for the infamous (now deprecated) PetSet, it is far from perfect and has limitations. For example, the StatefulSet controller has no built-in support for resizing volumes (PersistentVolumeClaims, or PVCs), which is a big challenge if your application's dataset is about to outgrow the currently allocated storage. There are workarounds, but these limitations must be understood in advance so that the engineering team knows how to deal with them.
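One possible workaround for the resizing limitation is to patch each PersistentVolumeClaim directly. The sketch below only generates the `kubectl patch` commands and does not run them. It assumes the storage class has `allowVolumeExpansion: true` and that the storage driver supports expansion; the names `db` and `data` and the size `20Gi` are hypothetical.

```python
import json


def pvc_resize_commands(sts_name, claim_template, replicas, new_size, namespace="default"):
    """Build one `kubectl patch pvc` command per StatefulSet replica."""
    # The same patch body is applied to every claim.
    patch = json.dumps({"spec": {"resources": {"requests": {"storage": new_size}}}})
    commands = []
    for ordinal in range(replicas):
        # Claims are named <template>-<statefulset>-<ordinal>.
        pvc = f"{claim_template}-{sts_name}-{ordinal}"
        commands.append(f"kubectl patch pvc {pvc} -n {namespace} -p '{patch}'")
    return commands


# Hypothetical StatefulSet "db" with two replicas, grown to 20Gi.
for command in pvc_resize_commands("db", "data", 2, "20Gi"):
    print(command)
```

Patching the claims does not change the size declared in the StatefulSet's `volumeClaimTemplates`, so replicas added later would still request the original size. That gap is the kind of detail the team needs to plan for.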