Kubernetes deployment#

This guide explains how to deploy CLP on Kubernetes using Helm. Helm deployment is an alternative to Docker Compose and works on Kubernetes clusters ranging from local development setups to production environments.

Note

For a detailed overview of CLP’s services and their dependencies, see the deployment orchestration design doc.


Requirements#

The following tools are required to deploy CLP on Kubernetes:

  • kubectl >= 1.30

  • Helm >= 4.0

  • A Kubernetes cluster (see Setting up a cluster below)

  • When not using S3 storage, a shared filesystem accessible by all worker pods (e.g., NFS, SeaweedFS) or local storage for single-node deployments
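
Before proceeding, you can confirm the client tools are installed and meet the version requirements (output formats vary between versions):

# Check client tool versions
kubectl version --client
helm version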


Setting up a cluster#

You can deploy CLP on either a local development cluster or a production Kubernetes cluster.

Option 1: Local development with kind#

kind (Kubernetes in Docker) is ideal for testing and development. It runs a Kubernetes cluster inside Docker containers on your local machine.

For single-host kind deployments, see the quick-start guides, which cover creating a kind cluster and installing the Helm chart.
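
If you just want a throwaway local cluster, the commands below are a minimal sketch; the cluster name clp is arbitrary, and the quick-start guides may use additional options (e.g., port mappings or host mounts):

# Create a local kind cluster named "clp"
kind create cluster --name clp

# Confirm the cluster is reachable
kubectl cluster-info --context kind-clp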

Option 2: Production Kubernetes cluster#

For production deployments, you can use any Kubernetes distribution:

Setting up a cluster with kubeadm#

kubeadm is the official Kubernetes tool for bootstrapping clusters. You can follow the official kubeadm installation guide to install the prerequisites, container runtime, and kubeadm on all nodes. Then follow the steps below to create a cluster.

  1. Initialize the control plane (on the control-plane node only):

    sudo kubeadm init --pod-network-cidr=10.244.0.0/16
    

    Tip

    Save the kubeadm join command printed at the end of the output. You’ll need it to join worker nodes later.

    Note

    The --pod-network-cidr flag specifies the IP range for pods. If 10.244.0.0/16 conflicts with your network, use a different RFC 1918 private range (e.g., 192.168.0.0/16, 172.16.0.0/16, or 10.200.0.0/16).

    To set up kubectl for your user:

    mkdir -p "$HOME/.kube"
    sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
    sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
    
  2. Install a CNI plugin (on the control-plane node):

    A CNI plugin is required for pod-to-pod networking. The following installs Cilium, a high-performance CNI that uses eBPF:

    helm repo add cilium https://helm.cilium.io/
    helm repo update
    helm install cilium cilium/cilium --namespace kube-system \
      --set ipam.operator.clusterPoolIPv4PodCIDRList=10.244.0.0/16
    

    Note

    The clusterPoolIPv4PodCIDRList must match the --pod-network-cidr used in kubeadm init.

  3. Join worker nodes (on each worker node):

    Run the kubeadm join command you saved from step 1. It should look something like:

    sudo kubeadm join <control-plane-ip>:6443 \
      --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash>
    

    If you need to regenerate the command, on the control-plane node, run:

    kubeadm token create --print-join-command
    

Installing the Helm chart#

Once your cluster is ready, you can install CLP using the Helm chart.

Getting the chart#

The CLP Helm chart is located in the repository at tools/deployment/package-helm/.

# Clone the repository (if you haven't already)
git clone --branch main https://github.com/y-scope/clp.git
cd clp/tools/deployment/package-helm

Production cluster requirements (optional)#

The following configurations are optional but recommended for production deployments. You can skip this section for testing or development.

  1. Storage for CLP package services’ data and logs (optional, for centralized debugging):

    The Helm chart creates static PersistentVolumes using local host paths by default, so no StorageClass configuration is required for basic deployments. For easier debugging, you can configure a centralized storage backend for the following directories:

    • data_directory - where CLP stores runtime data

    • logs_directory - where CLP services write logs

    • tmp_directory - where temporary files are stored

    Note

    We aim to improve the logging infrastructure so mapping log volumes will not be required in the future. See y-scope/clp#1760 for details.

  2. Shared storage for workers (required for multi-node clusters using filesystem storage):

    Tip

    S3 storage is strongly recommended for multi-node clusters as it does not require shared local storage between workers. If you use S3 storage, you can skip this section.

    For multi-node clusters using filesystem storage, the following directories must be accessible from all worker nodes at the same paths. Without shared storage, compressed logs created by one worker cannot be searched by other workers.

    • archive_output.storage.directory - where compressed archives are stored

    • stream_output.storage.directory - where stream files are stored

    • logs_input.directory - where input logs are read from

    Set up NFS, SeaweedFS, or another shared filesystem to provide this access (an example NFS mount is sketched after this list). See the SeaweedFS section in the Docker Compose deployment guide for setup instructions.

  3. External databases (recommended for production):
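
As mentioned in item 2 above, when using filesystem storage on a multi-node cluster, the shared filesystem must be mounted at the same path on every worker node. The following is a minimal NFS sketch; the server address, export path, and mount point are placeholders for your own environment:

# On each worker node: mount a shared NFS export at a common path
# (requires an NFS client, e.g., the nfs-common package on Debian/Ubuntu)
sudo mkdir -p /mnt/clp
sudo mount -t nfs nfs-server.example.com:/exports/clp /mnt/clp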

Basic installation#

Create the required directories on all worker nodes:

export CLP_HOME="/tmp/clp"

mkdir -p \
  "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \
  "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \
  "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \
  "$CLP_HOME/var/tmp"

Then, on the control-plane node, create the required directories, set your credentials, and install CLP:

export CLP_HOME="/tmp/clp"

mkdir -p \
  "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \
  "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \
  "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \
  "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \
  "$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \
  "$CLP_HOME/var/tmp"

# Credentials (change these for production)
export CLP_DB_PASS="pass"
export CLP_DB_ROOT_PASS="root-pass"
export CLP_QUEUE_PASS="pass"
export CLP_REDIS_PASS="pass"

# Worker replicas (increase for multi-node clusters)
export CLP_COMPRESSION_WORKER_REPLICAS=1
export CLP_QUERY_WORKER_REPLICAS=1
export CLP_REDUCER_REPLICAS=1

helm install clp . \
  --set clpConfig.data_directory="$CLP_HOME/var/data" \
  --set clpConfig.logs_directory="$CLP_HOME/var/log" \
  --set clpConfig.tmp_directory="$CLP_HOME/var/tmp" \
  --set clpConfig.archive_output.storage.directory="$CLP_HOME/var/data/archives" \
  --set clpConfig.stream_output.storage.directory="$CLP_HOME/var/data/streams" \
  --set credentials.database.password="$CLP_DB_PASS" \
  --set credentials.database.root_password="$CLP_DB_ROOT_PASS" \
  --set credentials.queue.password="$CLP_QUEUE_PASS" \
  --set credentials.redis.password="$CLP_REDIS_PASS" \
  --set compressionWorker.replicas="$CLP_COMPRESSION_WORKER_REPLICAS" \
  --set queryWorker.replicas="$CLP_QUERY_WORKER_REPLICAS" \
  --set reducer.replicas="$CLP_REDUCER_REPLICAS"
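
After the command returns, you can confirm the release was created and check its status:

# List Helm releases in the current namespace
helm list

# Show the release's deployment status and notes
helm status clp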

Multi-node deployment#

For multi-node clusters with shared storage mounted on all nodes (e.g., NFS/CephFS via /etc/fstab), enable distributed storage mode and configure multiple worker replicas:

helm install clp . \
  --set distributedDeployment=true \
  --set compressionWorker.replicas=3 \
  --set queryWorker.replicas=3 \
  --set reducer.replicas=3

Installation with custom values#

For highly customized deployments, create a values file instead of using many --set flags:

custom-values.yaml#
# Use a custom image. For local images, import to each node's container runtime first.
image:
  clpPackage:
    repository: "clp-package"
    pullPolicy: "Never"  # Use "Never" for local images, "IfNotPresent" for remote
    tag: "latest"

# Adjust worker concurrency
workerConcurrency: 16

# Configure CLP settings
clpConfig:
  # Use clp-text instead of clp-json (the default)
  package:
    storage_engine: "clp"  # Use "clp-s" for clp-json, "clp" for clp-text
    query_engine: "clp"  # Use "clp-s" for clp-json, "clp" for clp-text, "presto" for Presto

  # Configure archive output
  archive_output:
    target_archive_size: 536870912  # 512 MB
    compression_level: 6
    retention_period: 43200  # (in minutes) 30 days

  # Enable MCP server
  mcp_server:
    port: 30800
    logging_level: "INFO"

  # Configure results cache
  results_cache:
    retention_period: 120  # (in minutes) 2 hours

# Override credentials (use secrets in production!)
credentials:
  database:
    username: "clp-user"
    password: "your-db-password"
    root_username: "root"
    root_password: "your-db-root-password"
  queue:
    username: "clp-user"
    password: "your-queue-password"
  redis:
    password: "your-redis-password"

Install with custom values:

helm install clp . -f custom-values.yaml

Tip

To preview the generated Kubernetes manifests before installing, use helm template:

helm template clp . -f custom-values.yaml
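
You can also lint the chart against your values file to catch structural mistakes before installing:

helm lint . -f custom-values.yaml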

Worker scheduling#

You can control where workers are scheduled using standard Kubernetes scheduling primitives (nodeSelector, affinity, tolerations, topologySpreadConstraints).

Dedicated node pools#

To run compression workers, query workers, and reducers in separate node pools:

  1. Label your nodes:

    # Label compression nodes
    kubectl label nodes node1 node2 yscope.io/nodeType=compression
    
    # Label query nodes
    kubectl label nodes node3 node4 yscope.io/nodeType=query
    
  2. Configure scheduling:

    dedicated-scheduling.yaml#
    compressionWorker:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: compression
    
    queryWorker:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: query
    
    reducer:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: query
    
  3. Install:

    helm install clp . -f dedicated-scheduling.yaml --set distributedDeployment=true
    

Shared node pool#

To run all worker types in the same node pool:

  1. Label your nodes:

    kubectl label nodes node1 node2 node3 node4 yscope.io/nodeType=compute
    
  2. Configure scheduling:

    shared-scheduling.yaml#
    compressionWorker:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: compute
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: "kubernetes.io/hostname"
            whenUnsatisfiable: "DoNotSchedule"
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: compression-worker
    
    queryWorker:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: compute
    
    reducer:
      replicas: 2
      scheduling:
        nodeSelector:
          yscope.io/nodeType: compute
    
  3. Install:

    helm install clp . -f shared-scheduling.yaml --set distributedDeployment=true
    

Verifying the deployment#

After installing the Helm chart, you can verify that all components are running correctly as follows.

Check pod status#

Wait for all pods to be ready:

# Watch pod status
kubectl get pods -w

# Wait for all pods to be ready
kubectl wait pods --all --for=condition=Ready --timeout=300s

The output should show all pods are in the Running state:

NAME                                        READY   STATUS    RESTARTS   AGE
clp-api-server-...                          1/1     Running   0          2m
clp-compression-scheduler-...               1/1     Running   0          2m
clp-compression-worker-...                  1/1     Running   0          2m
clp-database-0                              1/1     Running   0          2m
clp-garbage-collector-...                   1/1     Running   0          2m
clp-query-scheduler-...                     1/1     Running   0          2m
clp-query-worker-...                        1/1     Running   0          2m
clp-queue-0                                 1/1     Running   0          2m
clp-reducer-...                             1/1     Running   0          2m
clp-redis-0                                 1/1     Running   0          2m
clp-results-cache-0                         1/1     Running   0          2m
clp-webui-...                               1/1     Running   0          2m

Check initialization jobs#

CLP runs initialization jobs on first deployment. Check that these jobs completed successfully:

# Check job completion
kubectl get jobs

# Expected output:
# NAME                              COMPLETIONS   DURATION   AGE
# clp-db-table-creator              1/1           5s         2m
# clp-results-cache-indices-creator 1/1           3s         2m
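
If a job reports 0/1 completions, its logs usually explain why. For example, using one of the job names from the output above:

kubectl logs job/clp-db-table-creator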

Access the Web UI#

Once all pods are ready, you can access the CLP Web UI at http://<node-ip>:30000 (the value of clpConfig.webui.port).
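
To find a node's IP address, list the nodes with wide output and use the INTERNAL-IP (or EXTERNAL-IP, if assigned) column:

kubectl get nodes -o wide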


Using CLP#

With CLP deployed on Kubernetes, you can compress and search logs using the same workflows as Docker Compose deployments. Refer to the quick-start guide for your chosen flavor:

  • Using clp-json (quick-start/clp-json) - how to compress and search JSON logs.

  • Using clp-text (quick-start/clp-text) - how to compress and search unstructured text logs.

Note

By default (allowHostAccessForSbinScripts: true), the database and results cache are exposed on NodePorts, allowing you to use sbin/ scripts from the CLP package. Download a release matching the chart’s appVersion, then update the following configurations in etc/clp-config.yaml:

database:
  port: 30306  # Match `clpConfig.database.port` in Helm values
results_cache:
  port: 30017  # Match `clpConfig.results_cache.port` in Helm values
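
To find the chart's appVersion (so you can download a matching CLP release), you can inspect the chart metadata from the chart directory:

helm show chart .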

Alternatively, use the Web UI (clp-json or clp-text) to compress logs and search interactively, or the API server to submit queries and view results programmatically.


Monitoring and debugging#

To check the status of pods:

kubectl get pods

To view logs for a specific pod:

kubectl logs -f <pod-name>

To execute commands in a pod:

kubectl exec -it <pod-name> -- /bin/bash
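
To inspect a pod's events (useful for diagnosing scheduling or image-pull failures):

kubectl describe pod <pod-name>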

To debug Helm chart issues:

helm install clp . --dry-run --debug

Managing releases#

This section covers how to manage your CLP Helm release.

Note

Upgrade and rollback are not yet supported. We plan to add support as we finalize the migration mechanism.

Uninstall CLP#

helm uninstall clp

Warning

Uninstalling the Helm release will delete all CLP pods and services. However, PersistentVolumes with the Retain policy will preserve your data. To completely remove all data, delete the PVs and the data directories manually.
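
For example, to list and remove leftover PersistentVolumes after uninstalling (the backing directories on the nodes are not deleted automatically):

kubectl get pv
kubectl delete pv <pv-name>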


Cleaning up#

To tear down a kubeadm cluster:

  1. Uninstall Cilium (on the control-plane):

    helm uninstall cilium --namespace kube-system
    
  2. Reset each node (run on all worker nodes first, then the control-plane):

    sudo kubeadm reset -f
    sudo rm -rf /etc/cni/net.d/*
    sudo umount /var/run/cilium/cgroupv2/
    sudo rm -rf /var/run/cilium
    
  3. Clean up kubeconfig (on the control-plane):

    rm -rf ~/.kube