CLP for JSON logs#
You can compress, decompress, and search JSON logs using the clp-s binary described
below.
Compression#
Usage:
./clp-s c [<options>] <archives-dir> <input-path> [<input-path> ...]
archives-dir is the directory that archives should be written to.
input-path is any newline-delimited JSON (ndjson) log file or directory containing such files.
options allow you to specify things like which field should be considered as the log event’s timestamp (--timestamp-key <field-path>), or whether to fully parse array entries and encode them into dedicated columns (--structurize-arrays).
For a complete list, run:
./clp-s c --help
Examples#
Compress /mnt/logs/log1.json and output archives to /mnt/data/archives1:
./clp-s c /mnt/data/archives1 /mnt/logs/log1.json
Treat the field {"d": {"@timestamp": "..."}} as each log event’s timestamp:
./clp-s c --timestamp-key 'd.@timestamp' /mnt/data/archives1 /mnt/logs/log1.json
Tip
Specifying --timestamp-key creates a range index for the timestamp column, which can improve compression ratio and search performance.
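To make the key path concrete, here is a minimal sketch that writes a two-line ndjson file in which the timestamp sits under d.@timestamp. The events, field values, and timestamp format are made up for illustration and are not taken from any real log; whether clp-s parses this particular timestamp format is an assumption.

```shell
# Made-up sample events; field names and values are illustrative only.
sample=$(mktemp)
printf '%s\n' \
  '{"d": {"@timestamp": "2022-04-14T07:57:17"}, "level": "INFO", "message": "job started"}' \
  '{"d": {"@timestamp": "2022-04-14T07:57:18"}, "level": "ERROR", "message": "job failed"}' \
  > "$sample"

# One JSON object per line, so the file has two lines.
wc -l < "$sample"

# The key path 'd.@timestamp' names the "@timestamp" field nested under "d".
# Left commented out because it needs a built clp-s binary:
# ./clp-s c --timestamp-key 'd.@timestamp' /mnt/data/archives1 "$sample"
```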
Set the target encoded size to 1 GiB and the compression level to 6 (3 by default):
./clp-s c \
--target-encoded-size 1073741824 \
--compression-level 6 \
/mnt/data/archives1 \
/mnt/logs/log1.json
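The value 1073741824 above is 1 GiB expressed in bytes. As a small sketch, shell arithmetic can compute it instead of hard-coding the number; the variable name target_size is only illustrative.

```shell
# 1 GiB = 1024 * 1024 * 1024 bytes
target_size=$((1024 * 1024 * 1024))
echo "$target_size"   # prints 1073741824

# The computed value can then be passed to the flag shown above
# (commented out because it needs a built clp-s binary):
# ./clp-s c --target-encoded-size "$target_size" /mnt/data/archives1 /mnt/logs/log1.json
```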
Decompression#
Usage:
./clp-s x [<options>] <archives-dir> <output-dir>
archives-dir is a directory containing archives.
output-dir is the directory that decompressed logs should be written to.
options allow you to specify things like a specific archive (from within archives-dir) to decompress (--archive-id <archive-id>).
For a complete list, run:
./clp-s x --help
Examples#
Decompress all logs from /mnt/data/archives1 into /mnt/data/archives1-decomp:
./clp-s x /mnt/data/archives1 /mnt/data/archives1-decomp
Search#
Usage:
./clp-s s [<options>] <archives-dir> <kql-query>
archives-dir is a directory containing archives.
kql-query is a KQL query.
options allow you to specify things like a specific archive (from within archives-dir) to search (--archive-id <archive-id>).
For a complete list, run:
./clp-s s --help
Examples#
Find all log events within a time range:
./clp-s s /mnt/data/archives1 'ts >= 1649923037 AND ts <= 1649923038'
or
./clp-s s /mnt/data/archives1 \
'ts >= date("2022-04-14T07:57:17") AND ts <= date("2022-04-14T07:57:18")'
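The two forms above describe the same range if the date strings are read as UTC and ts holds seconds since the Unix epoch; both are assumptions inferred from the examples rather than documented behavior. A minimal sketch of the conversion, assuming GNU date (BSD/macOS date uses different flags):

```shell
# GNU date: interpret each string as UTC and print seconds since the epoch.
start=$(date -u -d '2022-04-14T07:57:17' +%s)
end=$(date -u -d '2022-04-14T07:57:18' +%s)

# Build the numeric form of the query.
query="ts >= $start AND ts <= $end"
echo "$query"   # prints: ts >= 1649923037 AND ts <= 1649923038

# Commented out because it needs a built clp-s binary and existing archives:
# ./clp-s s /mnt/data/archives1 "$query"
```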
Find log events with a given key-value pair:
./clp-s s /mnt/data/archives1 'id: 22149'
Find ERROR log events containing a substring:
./clp-s s /mnt/data/archives1 'level: ERROR AND message: "job*"'
Find FATAL or ERROR log events and ignore case distinctions between values in the query and the compressed data:
./clp-s s --ignore-case /mnt/data/archives1 'level: FATAL OR level: ERROR'
Current limitations#
clp-s currently only supports valid JSON logs; it does not handle JSON logs with trailing commas or other JSON syntax errors.
Time zone information is not preserved.
The order of log events is not preserved.
The input directory structure is not preserved; during decompression, the contents of all input files are written to a single file.
In addition, there are a few limitations, related to querying arrays, described in the search syntax reference.
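Because only valid JSON is supported, it may help to check an ndjson file before compressing it. The sketch below is not part of clp-s: check_ndjson is a hypothetical helper that assumes python3 is on the PATH and uses its json module, which rejects trailing commas. Its notion of validity may differ from the clp-s parser in edge cases (for example, Python accepts NaN and Infinity).

```shell
# Hypothetical pre-check, not part of clp-s: report lines that fail JSON parsing.
# Prints "<file>:<line>: invalid JSON" per bad line; exits non-zero if any are found.
check_ndjson() {
  python3 -c '
import json, sys
bad = 0
with open(sys.argv[1], encoding="utf-8") as f:
    for n, line in enumerate(f, 1):
        if not line.strip():
            continue  # skip blank lines
        try:
            json.loads(line)
        except ValueError:
            print(f"{sys.argv[1]}:{n}: invalid JSON")
            bad += 1
sys.exit(1 if bad else 0)
' "$1"
}

# Usage, with the path from the examples above:
# check_ndjson /mnt/logs/log1.json && ./clp-s c /mnt/data/archives1 /mnt/logs/log1.json
```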