Kubernetes deployment#
This guide explains how to deploy CLP on Kubernetes using Helm. This provides an alternative to Docker Compose and enables deployment on Kubernetes clusters ranging from local development setups to production environments.
Note
For a detailed overview of CLP’s services and their dependencies, see the deployment orchestration design doc.
Requirements#
The following tools are required to deploy CLP on Kubernetes:
kubectl >= 1.30
Helm >= 4.0
A Kubernetes cluster (see Setting up a cluster below)
When not using S3 storage, a shared filesystem accessible by all worker pods (e.g., NFS, SeaweedFS) or local storage for single-node deployments
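To confirm the client tools are installed and meet the version requirements:
kubectl version --client
helm version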
Setting up a cluster#
You can deploy CLP on either a local development cluster or a production Kubernetes cluster.
Option 1: Local development with kind#
kind (Kubernetes in Docker) is ideal for testing and development. It runs a Kubernetes
cluster inside Docker containers on your local machine.
For single-host kind deployments, see the quick-start guides, which cover creating
a kind cluster and installing the Helm chart.
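As a minimal sketch, a local kind cluster can be created and checked as follows (the cluster name clp is arbitrary; the quick-start guides cover CLP-specific configuration):
# Create a single-node cluster named "clp"
kind create cluster --name clp
# Confirm kubectl is pointed at the new cluster
kubectl cluster-info --context kind-clp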
Option 2: Production Kubernetes cluster#
For production deployments, you can use any Kubernetes distribution:
Managed Kubernetes services: Amazon EKS, Azure AKS, Google GKE
Setting up a cluster with kubeadm#
kubeadm is the official Kubernetes tool for bootstrapping clusters. You can follow the
official kubeadm installation guide to install the prerequisites, container runtime,
and kubeadm on all nodes. Then follow the steps below to create a cluster.
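The official guide covers the prerequisites in full; two that commonly trip up new clusters are disabling swap and enabling IPv4 forwarding. A typical sketch on each node:
# Disable swap for the current boot (also remove or comment out swap entries
# in /etc/fstab so this persists across reboots)
sudo swapoff -a
# Enable IPv4 packet forwarding, which pod networking relies on
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system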
Initialize the control plane (on the control-plane node only):
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
Tip
Save the kubeadm join command printed at the end of the output. You’ll need it to join worker nodes later.
Note
The --pod-network-cidr flag specifies the IP range for pods. If 10.244.0.0/16 conflicts with your network, use a different private (RFC 1918) range (e.g., 192.168.0.0/16, 172.16.0.0/16, or 10.200.0.0/16).
To set up kubectl for your user:
mkdir -p "$HOME/.kube"
sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
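At this point kubectl can talk to the cluster; the node will report NotReady until a CNI plugin is installed in the next step:
kubectl get nodes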
Install a CNI plugin (on the control-plane node):
A CNI plugin is required for pod-to-pod networking. The following installs Cilium, a high-performance CNI that uses eBPF:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --namespace kube-system \
  --set ipam.operator.clusterPoolIPv4PodCIDRList=10.244.0.0/16
Note
The clusterPoolIPv4PodCIDRList value must match the --pod-network-cidr used in kubeadm init.
Join worker nodes (on each worker node):
Run the kubeadm join command you saved from step 1. It should look something like:
sudo kubeadm join <control-plane-ip>:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
If you need to regenerate the command, on the control-plane node, run:
kubeadm token create --print-join-command
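Once the workers have joined, you can watch them register and become Ready from the control-plane node:
kubectl get nodes --watch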
Installing the Helm chart#
Once your cluster is ready, you can install CLP using the Helm chart.
Getting the chart#
The CLP Helm chart is located in the repository at
tools/deployment/package-helm/.
# Clone the repository (if you haven't already)
git clone --branch main https://github.com/y-scope/clp.git
cd clp/tools/deployment/package-helm
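Optionally, you can sanity-check the chart before installing:
# Print the chart's metadata, including its appVersion
helm show chart .
# Run Helm's static checks on the chart
helm lint .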
Production cluster requirements (optional)#
The following configurations are optional but recommended for production deployments. You can skip this section for testing or development.
Storage for CLP package services’ data and logs (optional, for centralized debugging):
The Helm chart creates static PersistentVolumes using local host paths by default, so no StorageClass configuration is required for basic deployments. For easier debugging, you can configure a centralized storage backend for the following directories:
data_directory - where CLP stores runtime data
logs_directory - where CLP services write logs
tmp_directory - where temporary files are stored
Note
We aim to improve the logging infrastructure so mapping log volumes will not be required in the future. See y-scope/clp#1760 for details.
Shared storage for workers (required for multi-node clusters using filesystem storage):
Tip
S3 storage is strongly recommended for multi-node clusters as it does not require shared local storage between workers. If you use S3 storage, you can skip this section.
For multi-node clusters using filesystem storage, the following directories must be accessible from all worker nodes at the same paths. Without shared storage, compressed logs created by one worker cannot be searched by other workers.
archive_output.storage.directory - where compressed archives are stored
stream_output.storage.directory - where stream files are stored
logs_input.directory - where input logs are read from
Set up NFS, SeaweedFS, or another shared filesystem to provide this access. See the SeaweedFS section in the Docker Compose deployment guide for setup instructions.
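As an illustration, a basic NFS client mount might look like the following on each worker node (the server address, export path, and mount point are placeholders; adapt them to your environment and to the directories listed above):
# Install the NFS client utilities (Debian/Ubuntu package name shown)
sudo apt-get install -y nfs-common
# Mount the shared export at the same path on every worker node
sudo mkdir -p /mnt/clp-shared
sudo mount -t nfs <nfs-server>:/exports/clp /mnt/clp-shared
# Persist the mount across reboots
echo '<nfs-server>:/exports/clp /mnt/clp-shared nfs defaults,_netdev 0 0' | sudo tee -a /etc/fstab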
External databases (recommended for production):
See the external database setup guide for instructions on using external MariaDB/MySQL and MongoDB databases.
Basic installation#
Create the required directories on all worker nodes:
export CLP_HOME="/tmp/clp"
mkdir -p \
"$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \
"$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \
"$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \
"$CLP_HOME/var/tmp"
Then on the control-plane node, create the required directories, generate credentials, and install CLP:
export CLP_HOME="/tmp/clp"
mkdir -p \
"$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \
"$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \
"$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \
"$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \
"$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \
"$CLP_HOME/var/tmp"
# Credentials (change these for production)
export CLP_DB_PASS="pass"
export CLP_DB_ROOT_PASS="root-pass"
export CLP_QUEUE_PASS="pass"
export CLP_REDIS_PASS="pass"
# Worker replicas (increase for multi-node clusters)
export CLP_COMPRESSION_WORKER_REPLICAS=1
export CLP_QUERY_WORKER_REPLICAS=1
export CLP_REDUCER_REPLICAS=1
helm install clp . \
--set clpConfig.data_directory="$CLP_HOME/var/data" \
--set clpConfig.logs_directory="$CLP_HOME/var/log" \
--set clpConfig.tmp_directory="$CLP_HOME/var/tmp" \
--set clpConfig.archive_output.storage.directory="$CLP_HOME/var/data/archives" \
--set clpConfig.stream_output.storage.directory="$CLP_HOME/var/data/streams" \
--set credentials.database.password="$CLP_DB_PASS" \
--set credentials.database.root_password="$CLP_DB_ROOT_PASS" \
--set credentials.queue.password="$CLP_QUEUE_PASS" \
--set credentials.redis.password="$CLP_REDIS_PASS" \
--set compressionWorker.replicas="$CLP_COMPRESSION_WORKER_REPLICAS" \
--set queryWorker.replicas="$CLP_QUERY_WORKER_REPLICAS" \
--set reducer.replicas="$CLP_REDUCER_REPLICAS"
Multi-node deployment#
For multi-node clusters with shared storage mounted on all nodes (e.g., NFS/CephFS via
/etc/fstab), enable distributed storage mode and configure multiple worker replicas:
helm install clp . \
--set distributedDeployment=true \
--set compressionWorker.replicas=3 \
--set queryWorker.replicas=3 \
--set reducer.replicas=3
Installation with custom values#
For highly customized deployments, create a values file instead of using many --set flags:
# Use a custom image. For local images, import to each node's container runtime first.
image:
clpPackage:
repository: "clp-package"
pullPolicy: "Never" # Use "Never" for local images, "IfNotPresent" for remote
tag: "latest"
# Adjust worker concurrency
workerConcurrency: 16
# Configure CLP settings
clpConfig:
  # Use clp-text instead of clp-json (the default)
package:
storage_engine: "clp" # Use "clp-s" for clp-json, "clp" for clp-text
query_engine: "clp" # Use "clp-s" for clp-json, "clp" for clp-text, "presto" for Presto
# Configure archive output
archive_output:
target_archive_size: 536870912 # 512 MB
compression_level: 6
retention_period: 43200 # (in minutes) 30 days
# Enable MCP server
mcp_server:
port: 30800
logging_level: "INFO"
# Configure results cache
results_cache:
retention_period: 120 # (in minutes) 2 hours
# Override credentials (use secrets in production!)
credentials:
database:
username: "clp-user"
password: "your-db-password"
root_username: "root"
root_password: "your-db-root-password"
queue:
username: "clp-user"
password: "your-queue-password"
redis:
password: "your-redis-password"
Install with custom values:
helm install clp . -f custom-values.yaml
Tip
To preview the generated Kubernetes manifests before installing, use helm template:
helm template clp . -f custom-values.yaml
Worker scheduling#
You can control where workers are scheduled using standard Kubernetes scheduling primitives
(nodeSelector, affinity, tolerations, topologySpreadConstraints).
Dedicated node pools#
To run compression workers, query workers, and reducers in separate node pools:
Label your nodes:
# Label compression nodes
kubectl label nodes node1 node2 yscope.io/nodeType=compression
# Label query nodes
kubectl label nodes node3 node4 yscope.io/nodeType=query
Configure scheduling:
dedicated-scheduling.yaml#
compressionWorker:
  replicas: 2
  scheduling:
    nodeSelector:
      yscope.io/nodeType: compression
queryWorker:
  replicas: 2
  scheduling:
    nodeSelector:
      yscope.io/nodeType: query
reducer:
  replicas: 2
  scheduling:
    nodeSelector:
      yscope.io/nodeType: query
Install:
helm install clp . -f dedicated-scheduling.yaml --set distributedDeployment=true
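If you want the dedicated nodes to run only CLP workers, you can additionally taint them. Note that the chart must expose matching tolerations (e.g., under each component's scheduling values) for the workers to schedule onto tainted nodes, so check the chart's values.yaml before relying on this:
# Taint the compression nodes so that only pods tolerating the taint are scheduled on them
kubectl taint nodes node1 node2 yscope.io/nodeType=compression:NoSchedule
# Confirm where the worker pods were placed
kubectl get pods -o wide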
Verifying the deployment#
After installing the Helm chart, you can verify that all components are running correctly as follows.
Check pod status#
Wait for all pods to be ready:
# Watch pod status
kubectl get pods -w
# Wait for all pods to be ready
kubectl wait pods --all --for=condition=Ready --timeout=300s
The output should show all pods are in the Running state:
NAME READY STATUS RESTARTS AGE
clp-api-server-... 1/1 Running 0 2m
clp-compression-scheduler-... 1/1 Running 0 2m
clp-compression-worker-... 1/1 Running 0 2m
clp-database-0 1/1 Running 0 2m
clp-garbage-collector-... 1/1 Running 0 2m
clp-query-scheduler-... 1/1 Running 0 2m
clp-query-worker-... 1/1 Running 0 2m
clp-queue-0 1/1 Running 0 2m
clp-reducer-... 1/1 Running 0 2m
clp-redis-0 1/1 Running 0 2m
clp-results-cache-0 1/1 Running 0 2m
clp-webui-... 1/1 Running 0 2m
Check initialization jobs#
CLP runs initialization jobs on first deployment. Check that these jobs completed successfully:
# Check job completion
kubectl get jobs
# Expected output:
# NAME COMPLETIONS DURATION AGE
# clp-db-table-creator 1/1 5s 2m
# clp-results-cache-indices-creator 1/1 3s 2m
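If a job has not completed, its pod logs usually explain why, e.g.:
kubectl logs job/clp-db-table-creator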
Access the Web UI#
Once all pods are ready, you can access the CLP Web UI at http://<node-ip>:30000 (the value of
clpConfig.webui.port).
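If you're unsure which address to use for <node-ip>, each node's address is listed in the INTERNAL-IP (and, if set, EXTERNAL-IP) column of:
kubectl get nodes -o wide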
Using CLP#
With CLP deployed on Kubernetes, you can compress and search logs using the same workflows as Docker Compose deployments. Refer to the quick-start guide for your chosen flavor:
Using clp-json
How to compress and search JSON logs.
Using clp-text
How to compress and search unstructured text logs.
Note
By default (allowHostAccessForSbinScripts: true), the database and results cache are exposed on
NodePorts, allowing you to use sbin/ scripts from the CLP package. Download a
release matching the chart’s appVersion, then update the following configurations
in etc/clp-config.yaml:
database:
port: 30306 # Match `clpConfig.database.port` in Helm values
results_cache:
port: 30017 # Match `clpConfig.results_cache.port` in Helm values
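To confirm the NodePorts your release actually exposes:
kubectl get svc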
Alternatively, use the Web UI (clp-json or clp-text) to compress logs and search interactively, or the API server to submit queries and view results programmatically.
Monitoring and debugging#
To check the status of pods:
kubectl get pods
To view logs for a specific pod:
kubectl logs -f <pod-name>
To execute commands in a pod:
kubectl exec -it <pod-name> -- /bin/bash
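Two more commands that are useful when a pod is stuck in Pending or is crash-looping:
# Show container states, scheduling decisions, and recent events for a pod
kubectl describe pod <pod-name>
# List recent cluster events, oldest first
kubectl get events --sort-by=.metadata.creationTimestamp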
To debug Helm chart issues:
helm install clp . --dry-run --debug
Managing releases#
This section covers how to manage your CLP Helm release.
Note
Upgrade and rollback are not yet supported. We plan to add support as we finalize the migration mechanism.
Uninstall CLP#
helm uninstall clp
Warning
Uninstalling the Helm release will delete all CLP pods and services. However, PersistentVolumes
with the Retain policy will preserve your data. To completely remove all data, delete the PVs and
the data directories manually.
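If you do want to remove the retained data, list and delete the PersistentVolume objects, then remove the host directories yourself:
# List PersistentVolumes left behind by the release
kubectl get pv
# Delete a retained PersistentVolume object (the underlying host directories must still be removed manually)
kubectl delete pv <pv-name>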
Cleaning up#
To tear down a kubeadm cluster:
Uninstall Cilium (on the control-plane):
helm uninstall cilium --namespace kube-system
Reset each node (run on all worker nodes first, then the control-plane):
sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d/*
sudo umount /var/run/cilium/cgroupv2/
sudo rm -rf /var/run/cilium
Clean up kubeconfig (on the control-plane):
rm -rf ~/.kube