glt#
GLT (Group-by Log Type) is a version of CLP specialized for enhanced search performance, at the cost of higher memory usage during compression. During compression, log events with the same log type are grouped together into tables that can be compressed and searched more efficiently. In our benchmarks, compared to CLP, we found that GLT’s compression ratio is 1.24x higher and searches are 7.8x faster on average.
You can use GLT to compress, decompress, and search unstructured (plain-text) logs using the glt
binary described below.
Compression#
Usage:
./glt c [<options>] <archives-dir> <input-path> [<input-path> ...]
- archives-dir is the directory that archives should be written to.
  - glt will create a number of files and directories within, so it’s best if this directory is empty.
  - You can use the same directory repeatedly and glt will add to the compressed logs within.
- input-path is any plain-text log file or directory containing such files.
- options allow you to specify things like the level of compression to apply.
  - For a complete list, run ./glt c --help
Examples#
Compress /mnt/logs/log1.log and output archives to /mnt/data/archives1:
./glt c /mnt/data/archives1 /mnt/logs/log1.log
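Since it is best to start with an empty archives directory, a script might check that before the first compression run. The following is only a minimal sketch: the `dir_is_empty` helper is hypothetical (it is not part of glt), and a temporary directory stands in for a real archives directory.

```shell
# dir_is_empty <dir>
# Hypothetical helper: succeeds only if <dir> exists and has no entries
# (hidden files count as entries).
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# A temporary directory stands in for a real archives directory here.
archives_dir=$(mktemp -d)

if dir_is_empty "$archives_dir"; then
    echo "Starting a fresh set of archives in $archives_dir"
else
    echo "Adding to the compressed logs already in $archives_dir"
fi

# The compression command itself uses the syntax documented above, e.g.:
#   ./glt c "$archives_dir" /mnt/logs/log1.log
```

Either branch is valid input for glt; the check only makes it explicit whether a run starts fresh or appends to existing archives.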
Decompression#
Usage:
./glt x [<options>] <archives-dir> <output-dir> [<file-path>]
- archives-dir is a directory containing archives.
- output-dir is the directory that decompressed logs should be written to.
- file-path is an optional path to a specific file to decompress.
Examples#
Decompress all logs from /mnt/data/archives1 into /mnt/data/archives1-decomp:
./glt x /mnt/data/archives1 /mnt/data/archives1-decomp
Decompress just /mnt/logs/file1.log:
./glt x /mnt/data/archives1 /mnt/data/archives1-decomp /mnt/logs/file1.log
Search#
Usage:
./glt s [<options>] <archives-dir> <wildcard-query> [<file-path>]
- archives-dir is a directory containing archives.
- wildcard-query is a wildcard query where:
  - the * wildcard matches 0 or more characters;
  - the ? wildcard matches any single character.
- options allow you to specify things like a time-range filter.
  - For a complete list, run ./glt s --help
Tip
Adding spaces (when possible) at the beginning and the end of the wildcard-query can improve GLT’s search performance, since GLT won’t need to consider implicit wildcards during query processing. For example, the query " ERROR * container " is preferred to "ERROR * container".
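To get a feel for the wildcard semantics described above, they can be mimicked with ordinary shell pattern matching. This is an illustration only and says nothing about how glt evaluates queries internally. The `wildcard_match` helper and the sample log line are made up, and the sketch assumes a query is matched as a substring of the log line, as if it were wrapped in implicit leading and trailing `*` wildcards.

```shell
# wildcard_match <query> <line>
# Hypothetical helper: succeeds if <line> contains a match for <query>, where
# `*` matches zero or more characters and `?` matches exactly one character.
# Assumption: the query is matched as a substring of the line.
wildcard_match() {
    local query="$1" line="$2"
    # $query is left unquoted in the pattern so that `*` and `?` act as
    # wildcards. Caveat: the shell also treats `[...]` and `\` specially here.
    case $line in
        *$query*) return 0 ;;
        *) return 1 ;;
    esac
}

# Made-up log line for demonstration.
line='2019-01-01 12:10:54 ERROR failed to start container c42'

wildcard_match ' ERROR * container ' "$line" && echo "matched: ' ERROR * container '"
wildcard_match 'container c??' "$line" && echo "matched: 'container c??'"
wildcard_match 'container c???' "$line" || echo "not matched: 'container c???'"
```

The last query does not match because each `?` must consume exactly one character and only two characters follow `container c` in the sample line.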
Examples#
Search /mnt/data/archives1 for specific ERROR logs:
./glt s /mnt/data/archives1 " ERROR * container "
Search for logs in a time range:
./glt s /mnt/data/archives1 --tge 1546344654321 --tle 1546344912345 " user1 "
Note
Currently, timestamps must be specified as milliseconds since the UNIX epoch.
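Because the time-range options take milliseconds since the UNIX epoch, it can help to convert human-readable times in the shell first. The sketch below assumes GNU date (for its `-d` option; BSD and macOS date use different flags), and `to_epoch_ms` is a hypothetical helper, not part of glt. It works at whole-second resolution and only prints the resulting search command.

```shell
# to_epoch_ms <utc-datetime>
# Hypothetical helper: prints the given UTC date and time as milliseconds
# since the UNIX epoch. Assumes GNU date for the -d option.
to_epoch_ms() {
    # date prints whole seconds; multiply by 1000 to get milliseconds.
    echo "$(( $(date -u -d "$1" +%s) * 1000 ))"
}

begin_ms=$(to_epoch_ms '2019-01-01 12:10:54')
end_ms=$(to_epoch_ms '2019-01-01 12:15:12')

# Print the search command using the --tge and --tle options shown above.
echo "./glt s /mnt/data/archives1 --tge $begin_ms --tle $end_ms \" user1 \""
```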
Search a single file:
./glt s /mnt/data/archives1 " session closed " /mnt/logs/file1
Current limitations#
- Timestamp format information is not preserved in search results. Instead, all search results use a default timestamp format. 
- Search results are not output in the same order that they were in the original log files. 
