clp#
For unstructured (plain text) logs, you can compress, decompress, and search them using the clp and clg binaries described below.
Compression#
Usage:
./clp c [<options>] <archives-dir> <input-path> [<input-path> ...]
archives-dir is the directory that archives should be written to.
  clp will create a number of files and directories within, so it’s best if this directory is empty.
  You can use the same directory repeatedly and clp will add to the compressed logs within.
input-path is any plain-text log file or directory containing such files.
options allow you to specify things like a path to a custom schema file (--schema-path <file-path>).
  For a complete list, run ./clp c --help
Examples#
Compress /mnt/logs/log1.log and output archives to /mnt/data/archives1:
./clp c /mnt/data/archives1 /mnt/logs/log1.log
Compress /mnt/logs/log1.log using a custom schema specified in /mnt/conf/schemas.txt:
./clp c --schema-path /mnt/conf/schemas.txt /mnt/data/archives1 /mnt/logs/log1.log
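Because clp writes a number of files and directories into archives-dir, a wrapper script might check that the directory is new or empty before the first compression run. The following is a minimal sketch of such a check; the function name archives_dir_is_fresh is hypothetical and not part of CLP.

```shell
# Hypothetical helper (not part of CLP): succeeds if the directory does not
# exist yet, or exists and contains no entries (hidden ones included).
archives_dir_is_fresh() {
    dir="$1"
    if [ ! -e "$dir" ]; then
        return 0
    fi
    [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Possible usage, compressing only into a new or empty directory:
#   archives_dir_is_fresh /mnt/data/archives1 \
#       && ./clp c /mnt/data/archives1 /mnt/logs/log1.log
```

Skip the check when you intend to add to the compressed logs already in the directory.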
Decompression#
Usage:
./clp x [<options>] <archives-dir> <output-dir> [<file-path>]
archives-dir is a directory containing archives.
output-dir is the directory that decompressed logs should be written to.
file-path is an optional path to a specific file to decompress.
Examples#
Decompress all logs from /mnt/data/archives1 into /mnt/data/archives1-decomp:
./clp x /mnt/data/archives1 /mnt/data/archives1-decomp
Decompress just /mnt/logs/file1.log:
./clp x /mnt/data/archives1 /mnt/data/archives1-decomp /mnt/logs/file1.log
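The usage above shows a single optional file-path, so decompressing a handful of specific files could be scripted as a sequential loop. This is only a sketch: decompress_files and the CLP_BIN override are hypothetical names, not part of CLP.

```shell
# Hypothetical helper (not part of CLP). CLP_BIN is an assumed override for
# the path to the clp binary; it defaults to ./clp.
decompress_files() {
    archives_dir="$1"
    output_dir="$2"
    shift 2
    for file_path in "$@"; do
        # Run one instance at a time and stop at the first failure.
        "${CLP_BIN:-./clp}" x "$archives_dir" "$output_dir" "$file_path" || return 1
    done
}

# Possible usage:
#   decompress_files /mnt/data/archives1 /mnt/data/archives1-decomp \
#       /mnt/logs/file1.log /mnt/logs/log1.log
```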
Search#
Usage:
Note
Search uses a different executable (clg) than compression (clp).
./clg [<options>] <archives-dir> <wildcard-query> [<file-path>]
archives-dir is a directory containing archives.
wildcard-query is a wildcard query where:
  the * wildcard matches 0 or more characters;
  the ? wildcard matches any single character.
options allow you to specify things like a time-range filter.
  For a complete list, run ./clg --help
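To get a feel for the two wildcards, the sketch below uses the shell's own pattern matching, which gives * and ? the same meanings described above. It does not call clg, and it says nothing about how clg anchors a query within a log message; wildcard_match is a hypothetical name used only for illustration. Shell patterns also treat [...] specially, so the comparison holds only for * and ?.

```shell
# Hypothetical helper (not part of CLP): succeeds when the whole of $1
# matches the wildcard pattern $2 under shell pattern-matching rules.
wildcard_match() {
    case "$1" in
        $2) return 0 ;;
        *) return 1 ;;
    esac
}

# "*" may match zero characters, while "?" must match exactly one:
#   wildcard_match "ac"  "a*c"   succeeds
#   wildcard_match "abc" "a?c"   succeeds
#   wildcard_match "ac"  "a?c"   fails
```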
Examples#
Search /mnt/data/archives1 for specific ERROR logs and ignore case distinctions:
./clg --ignore-case /mnt/data/archives1 " ERROR * container "
Search for logs in a time range:
./clg /mnt/data/archives1 --tge 1546344654321 --tle 1546344912345 " user1 "
Note
Currently, timestamps must be specified as milliseconds since the UNIX epoch.
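Since the time-range options take milliseconds since the UNIX epoch, a small conversion helper can make queries easier to write. The sketch below assumes GNU date (its -d option is not available in every date implementation) and interprets the input as UTC; to_epoch_ms is a hypothetical name, not part of CLP.

```shell
# Hypothetical helper (not part of CLP). Assumes GNU date, whose -d option
# parses a date string; -u makes the input be interpreted as UTC.
to_epoch_ms() {
    seconds="$(date -u -d "$1" +%s)" || return 1
    # Second-level precision only: the millisecond part is always 000.
    echo "$((seconds * 1000))"
}

# Possible usage with the time-range options shown above:
#   ./clg /mnt/data/archives1 \
#       --tge "$(to_epoch_ms '2019-01-01 12:10:54')" \
#       --tle "$(to_epoch_ms '2019-01-01 12:15:12')" " user1 "
```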
Search a single file:
./clg /mnt/data/archives1 " session closed " /mnt/logs/file1
Parallel Compression#
By default, clp uses an embedded SQLite database, so each directory containing archives can only be accessed by a single clp instance.
To enable parallel compression to the same archives directory, clp/clg can be configured to use a MySQL-type database (e.g., MariaDB) as follows:
1. Install and configure MariaDB using the instructions for your platform.
2. Create a user that has privileges to create databases, create tables, insert records, and delete records.
3. Copy and change config/metadata-db.yml, setting the type to mysql and uncommenting the MySQL parameters.
4. Install the MariaDB and PyYAML Python packages:
   pip3 install mariadb PyYAML
   This is necessary to run the database initialization script. If you prefer, you can run the SQL statements in tools/scripts/db/init-db.py directly.
5. Run tools/scripts/db/init-db.py with the updated config file. This will initialize the database CLP requires.
6. Run clp or clg as before, with the addition of the --db-config-file option pointing at the updated config file.
7. To compress in parallel, simply run another instance of clp concurrently.
Note that currently, decompression (clp x) and search (clg) can only be run with a single instance. We are in the process of open-sourcing parallelized versions of these as well.
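Once the database is configured, running several clp instances concurrently against the same archives directory could be scripted roughly as follows. This is a sketch under assumptions: compress_in_parallel and the CLP_BIN override are hypothetical names, and the --db-config-file option is placed among the options after the c command, following the usage line in the Compression section.

```shell
# Hypothetical helper (not part of CLP). CLP_BIN is an assumed override for
# the path to the clp binary; it defaults to ./clp.
compress_in_parallel() {
    db_config="$1"
    archives_dir="$2"
    shift 2
    for input_path in "$@"; do
        # Start one background clp instance per input path.
        "${CLP_BIN:-./clp}" c --db-config-file "$db_config" \
            "$archives_dir" "$input_path" &
    done
    # Block until every instance has exited. A bare wait does not report
    # individual failures, so check each instance's output separately.
    wait
}

# Possible usage (paths are illustrative):
#   compress_in_parallel config/metadata-db.yml /mnt/data/archives1 \
#       /mnt/logs/log1.log /mnt/logs/file1.log
```

Only compression is parallelized this way; per the note above, clp x and clg should still be run as a single instance.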