Kubernetes

The Stackable Data Platform runs on Kubernetes, so a Kubernetes cluster is a prerequisite for running the platform. This page lists the Kubernetes distributions supported for production and explains how to set up a local test installation to try out parts of the platform right away.

Supported production distributions

The Stackable Data Platform requires an existing Kubernetes cluster to install into. How you set up Kubernetes and create a cluster depends on the distribution you choose.

The following distributions are supported for a production setup of the Stackable Data Platform:

If a Kubernetes provider needs special tuning, or we have tips for it, it has a subpage below this page.

In this version of the SDP, the following Kubernetes versions are supported:

  • 1.28

  • 1.27

  • 1.26

Consult the release notes to find out which specific versions are supported for the Stackable Data Platform release you are using.
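To compare a cluster against the list above, something like the following could be used. It is only a minimal sketch: the function names minor_version and is_supported are made up for this example, and the version list is simply copied from this page, so adjust it to the release notes of the SDP release you are using.

```shell
# Supported Kubernetes minor versions, copied from the list on this page.
SUPPORTED_VERSIONS="1.26 1.27 1.28"

# Reduce a version string such as "v1.27.3+k3s1" to "1.27".
minor_version() {
  local v="${1#v}"               # drop a leading "v" if present
  local major="${v%%.*}"         # text before the first dot
  local rest="${v#*.}"           # text after the first dot
  local minor="${rest%%[!0-9]*}" # leading digits of the minor part
  echo "${major}.${minor}"
}

# Succeed if the given version belongs to SUPPORTED_VERSIONS.
is_supported() {
  local wanted s
  wanted="$(minor_version "$1")"
  for s in $SUPPORTED_VERSIONS; do
    if [ "$s" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}
```

Against a live cluster, the server version could be passed in with is_supported "$(kubectl version -o json | jq -r .serverVersion.gitVersion)", assuming jq is installed and the cluster is reachable.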

Installing a testing/development Kubernetes instance locally

Stackable’s control plane is built around Kubernetes. This section gives brief examples of how to install Kubernetes on your machine.

Installing kubectl

Stackable operators and their services are managed by applying manifest files to the Kubernetes cluster. For this, you need the kubectl tool installed. Follow the official kubectl installation instructions for your platform.
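After installing, a quick check along these lines can confirm that kubectl is on your PATH. This is a sketch, not part of the official instructions; check_tool is a hypothetical helper name introduced for this example.

```shell
# Report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

# Print the client version only when kubectl is actually installed.
# --client avoids contacting a cluster, so this works before one exists.
if check_tool kubectl; then
  kubectl version --client
fi
```

The same helper can be reused for the other tools mentioned on this page, for example check_tool kind or check_tool docker.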

Installing Kubernetes using Kind

Kind offers a very quick and easy way to bootstrap your Kubernetes infrastructure in Docker. The big advantage of this is that you can simply remove the Docker containers when you’re finished and clean up easily, making it great for testing and development.

If you don’t already have Docker, visit the Docker website to find out how to install it. Kind is a single executable that installs and configures Kubernetes for you within Docker containers. The Kind website has instructions for installing Kind on your system.

Once both are installed, you can build a Kubernetes cluster in Docker. We’re going to create a simple, single-node cluster to test out Stackable, with the one node hosting both the Kubernetes control plane and the Stackable services.

kind create cluster --name quickstart
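Once the cluster exists, you may want to confirm it is reachable and, later, remove it again. The sketch below wraps the relevant commands in functions so nothing runs until you call them. The function names are hypothetical; it relies on Kind naming the kubectl context "kind-" followed by the cluster name.

```shell
CLUSTER_NAME="quickstart"   # same name as in the command above

# Kind names the kubectl context "kind-<cluster name>".
kind_context() {
  echo "kind-$1"
}

# Show cluster information for the Kind cluster.
verify_kind_cluster() {
  kubectl cluster-info --context "$(kind_context "$CLUSTER_NAME")"
}

# Remove the cluster and its Docker containers when you are finished.
delete_kind_cluster() {
  kind delete cluster --name "$CLUSTER_NAME"
}
```

Call verify_kind_cluster right after creating the cluster, and delete_kind_cluster when you are done testing.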

Installing Kubernetes using K3s

K3s provides a quick way of installing Kubernetes. On your control node, run the following command to install K3s:

curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644

As long as you have an Internet connection, K3s will download and automatically configure a simple Kubernetes environment.

Create a symlink to the Kubernetes configuration from your home directory to allow tools like Helm to find the correct configuration.

mkdir -p ~/.kube
ln -s /etc/rancher/k3s/k3s.yaml ~/.kube/config

Testing your Kubernetes installation

To check whether everything worked as expected, use kubectl cluster-info to retrieve the cluster information. The output should look similar to:

Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

If you set up your cluster using K3s, you will additionally see the metrics server:

Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/https:metrics-server:https/proxy
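If you would rather script this check than read the output by eye, a sketch like the following could work. The helper names are hypothetical; the pattern is based on the sample output shown above, and kubectl wait is used to block until all nodes report Ready.

```shell
# Read `kubectl cluster-info` output on stdin and succeed if the
# control plane is reported as running.
control_plane_running() {
  grep -q 'Kubernetes control plane.*is running at'
}

# Wait until every node reports the Ready condition.
# The optional first argument is the timeout (default: 120s).
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-120s}"
}
```

A possible invocation is kubectl cluster-info | control_plane_running && wait_for_nodes 60s.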

Requirements

To install and use Stackable operators, your Kubernetes cluster needs to meet a few requirements. You, as the person installing the operators, also need certain permissions to be able to install them.

RBAC

The operators need far-reaching permissions. They must be able to create ClusterRoles and the corresponding bindings, which makes them very powerful.

What exactly is needed?

As a user installing the operators, you need get, list and create permissions for CustomResourceDefinitions, ClusterRoles, ClusterRoleBindings, StorageClasses and CSIDrivers, as well as for the Stackable custom resource SecretClass.
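One way to check these permissions before installing is kubectl auth can-i. The following is a sketch under assumptions: the function names are made up for this example, and the fully qualified resource name secretclasses.secrets.stackable.tech is assumed rather than taken from this page, so verify it against your installation.

```shell
VERBS="get list create"
# The last entry is an assumed name for the Stackable SecretClass resource.
RESOURCES="customresourcedefinitions clusterroles clusterrolebindings storageclasses csidrivers secretclasses.secrets.stackable.tech"

# Print one "verb resource" pair per line.
required_checks() {
  local verb resource
  for resource in $RESOURCES; do
    for verb in $VERBS; do
      echo "$verb $resource"
    done
  done
}

# Ask the API server about each pair; return non-zero if any is denied
# (or if kubectl cannot reach the cluster).
check_permissions() {
  local verb resource missing=0
  for resource in $RESOURCES; do
    for verb in $VERBS; do
      if kubectl auth can-i "$verb" "$resource" >/dev/null 2>&1; then
        echo "ok      $verb $resource"
      else
        echo "MISSING $verb $resource"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

Run check_permissions with the same kubeconfig and user that will perform the installation. The answer for SecretClass may not be meaningful before its custom resource definition has been applied.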

Why exactly?

Every operator comes with a custom resource that it manages, so the corresponding custom resource definition needs to be applied. Every operator also gets its own ClusterRole, which then needs to be bound to the operator Pods. The secret operator and the listener operator create StorageClasses, which both use as a way to bind Pods and mount information.

Then, the operators themselves need extensive permissions.

Network policies

securityContext requirements

What does that mean? Why?

runAsUser, runAsGroup

TODO

Root paths must be read-write

readOnlyRootFilesystem

  • we might be able to fix this in some instances

  • in some cases the upstream software makes this hard to fix

StorageClass and CSI driver requirements

The secret operator essentially poses as a CSI driver from which you can request volumes with certain labels. It has its own StorageClass.

This mechanism is used to mount secrets into Pods, across namespaces. The secret operator can also dynamically update secrets, which is useful for example to renew certificates.

The secret operator is a core part of the Stackable Data Platform, and the platform does not function without it.
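To see whether the secret operator's CSI driver and StorageClass are registered in a cluster, a sketch like the following could be used. The name secrets.stackable.tech is an assumption, not something stated on this page, and the function names are hypothetical. It relies on kubectl get with -o name printing entries in the form "kind.group/name".

```shell
SECRET_CSI_NAME="secrets.stackable.tech"   # assumed name, check your installation

# Read `kubectl get <kind> -o name` output on stdin and succeed if an
# entry whose name (the part after the last "/") equals $1 is present.
has_named_entry() {
  local name="$1" line
  while IFS= read -r line; do
    if [ "${line##*/}" = "$name" ]; then
      return 0
    fi
  done
  return 1
}

# Succeed only if both the CSIDriver and the StorageClass exist.
check_secret_operator_storage() {
  kubectl get csidrivers -o name | has_named_entry "$SECRET_CSI_NAME" \
    && kubectl get storageclasses -o name | has_named_entry "$SECRET_CSI_NAME"
}
```

Calling check_secret_operator_storage after installing the secret operator gives a simple yes/no answer through its exit status.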