# Multi-node deployment
A multi-node deployment allows you to run CLP across a distributed set of hosts.
## Requirements
- Docker
  - If you're not running as root, ensure `docker` can be run without superuser privileges (see the sketch after this list).
- Python 3.8 or higher
- One or more hosts networked together
- A distributed filesystem (e.g., SeaweedFS) accessible by all worker hosts through a filesystem mount
  - See below for how to set up a simple SeaweedFS cluster.
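One common way to satisfy the Docker requirement without running as root is to add your user to the `docker` group. This is a minimal sketch, assuming a standard Docker Engine installation on Linux:

```bash
# Add the current user to the docker group (log out and back in for it to take effect).
sudo usermod -aG docker "$USER"

# Verify that docker runs without superuser privileges.
docker info
```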
## Cluster overview
The CLP package is composed of several components: controller components and worker components. In a cluster, there should be a single instance of each controller component and one or more instances of each worker component. The tables below list the components and their functions.
**Controller components**

| Component | Description |
|---|---|
| `database` | Database for archive metadata, compression jobs, and query jobs |
| `queue` | Task queue for schedulers |
| `redis` | Task result storage for workers |
| `compression_scheduler` | Scheduler for compression jobs |
| `query_scheduler` | Scheduler for search/aggregation jobs |
| `results_cache` | Storage for the workers to return search results to the UI |
| `webui` | Web server for the UI |
**Worker components**

| Component | Description |
|---|---|
| `compression_worker` | Worker processes for compression jobs |
| `query_worker` | Worker processes for search/aggregation jobs |
| `reducer` | Reducers for performing the final stages of aggregation jobs |
> **Note:** Running additional workers increases the parallelism of compression and search/aggregation jobs.
## Configuring CLP
1. Copy `etc/credentials.template.yml` to `etc/credentials.yml`.
2. Edit `etc/credentials.yml`:
   - Uncomment the file.
   - Choose an appropriate username and password.
     - Note that these are new credentials that will be used by the components.
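   As a purely hypothetical sketch of what the uncommented file might look like (the actual keys come from `etc/credentials.template.yml`; the usernames and passwords below are placeholders):

   ```yaml
   # Placeholder values; substitute your own username and a strong password.
   database:
     user: "clp-user"
     password: "<choose-a-strong-password>"
   queue:
     user: "clp-user"
     password: "<choose-a-strong-password>"
   ```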
3. Choose which hosts you would like to use for the controller components.
   - You can use a single host for all controller components.
4. Edit `etc/clp-config.yml`:
   - Uncomment the file.
   - Set the `host` config of each controller component to the host that you'd like to run it on.
     - If desired, you can run different controller components on different hosts.
   - Change any of the controller components' ports that conflict with services you already have running.
   - Set `archive_output.directory` to a directory on the distributed filesystem.
     - Ideally, the directory should be empty or should not yet exist (CLP will create it), since CLP will write several files and directories directly to the given directory.
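   A hypothetical sketch of the relevant settings (the actual keys follow the template in `etc/clp-config.yml`; the hostnames, port, and directory below are placeholders):

   ```yaml
   # Placeholder hostnames; point each controller component at its host.
   database:
     host: "controller-host"
     port: 3306  # change if it conflicts with an existing service
   queue:
     host: "controller-host"
   redis:
     host: "controller-host"
   results_cache:
     host: "controller-host"

   # A directory on the distributed filesystem (e.g., under a SeaweedFS mount).
   archive_output:
     directory: "/mnt/seaweedfs/clp-archives"
   ```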
5. Download and extract the package on all nodes.
6. Copy the `credentials.yml` and `clp-config.yml` files that you created above into `etc/` on all the hosts where you extracted the package.
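   A minimal sketch of one way to distribute the files with `scp`, assuming the package was extracted to `~/clp-package` on hosts named `host1` through `host3` (all hypothetical):

   ```bash
   # Copy the shared configs into each host's etc/ directory.
   for host in host1 host2 host3; do
       scp etc/credentials.yml etc/clp-config.yml "$host:clp-package/etc/"
   done
   ```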
## Starting CLP
Before starting each CLP component, note that some components must be started before others. We organize the components into groups below, where components in a group can be started in any order, but all components in a group must be started before starting a component in the next group.
Group 1 components:

- `database`
- `queue`
- `redis`
- `results_cache`

Group 2 components:

- `compression_scheduler`
- `query_scheduler`

Group 3 components:

- `compression_worker`
- `query_worker`
- `reducer`
For each component, on the host where you want to run the component, run:

```bash
sbin/start-clp.sh <component>
```

where `<component>` is the name of the component in the groups above.
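As an illustration, here is a minimal sketch that starts everything in group order on a single host; in a real multi-node cluster you would run each command on the host assigned to that component:

```bash
# Group 1: storage and queueing components; order within a group doesn't matter.
for component in database queue redis results_cache; do
    sbin/start-clp.sh "$component"
done

# Group 2: schedulers; require all of group 1 to be running.
for component in compression_scheduler query_scheduler; do
    sbin/start-clp.sh "$component"
done

# Group 3: workers and the reducer; require groups 1 and 2 to be running.
for component in compression_worker query_worker reducer; do
    sbin/start-clp.sh "$component"
done
```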
## Using CLP
Check out the compression and search guides to compress and search your logs.
## Stopping CLP
If you need to stop the cluster, run:
```bash
sbin/stop-clp.sh
```
## Setting up SeaweedFS
The instructions below are for running a simple SeaweedFS cluster on a set of hosts. For other use cases, see the SeaweedFS docs.
1. Install SeaweedFS.
2. Start the master and a filer on one of the hosts:

   ```bash
   weed master -port 9333
   weed filer -port 8888 -master "localhost:9333"
   ```

   Each command starts a long-running server, so run them in separate terminals or background them.
3. Start one or more volume servers on one or more hosts:

   1. Create a directory where you want SeaweedFS to store data.
   2. Start the volume server:

      ```bash
      weed volume -mserver "<master-host>:9333" -dir <storage-dir> -max 0
      ```

      where:

      - `<master-host>` is the hostname/IP of the master host.
      - `<storage-dir>` is the directory where you want SeaweedFS to store data.
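   For example, a sketch with hypothetical values, using `10.0.0.1` as the master host and `/srv/seaweedfs` as the storage directory:

   ```bash
   # Create the storage directory, then start the volume server against the master.
   mkdir -p /srv/seaweedfs
   weed volume -mserver "10.0.0.1:9333" -dir /srv/seaweedfs -max 0
   ```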
4. Start a FUSE mount on every host where you want to run a CLP worker:

   ```bash
   weed mount -filer "<master-host>:8888" -dir <mount-path>
   ```

   where:

   - `<master-host>` is the hostname/IP of the master host.
   - `<mount-path>` is the path where you want the mount to be.
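   For example, with the same hypothetical master host and `/mnt/seaweedfs` as the mount path:

   ```bash
   # Create the mount point (may require root for a path under /mnt) and mount it.
   sudo mkdir -p /mnt/seaweedfs
   weed mount -filer "10.0.0.1:8888" -dir /mnt/seaweedfs

   # From another terminal, verify that the mount is live.
   df -h /mnt/seaweedfs
   ```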