Using CLP with AWS S3
To compress logs from AWS S3, follow the steps in the section below. For all other operations (e.g., searching or viewing logs in the Web UI), use CLP as described in the clp-json quick-start guide.
Compressing logs from AWS S3
To compress logs from AWS S3, use the sbin/compress-from-s3.sh script. The script supports two
modes of operation:
s3-object mode: Compress S3 objects specified by their full S3 URLs.
s3-key-prefix mode: Compress all S3 objects under a given S3 key prefix.
s3-object compression mode
The s3-object mode allows you to specify individual S3 objects to compress by using their full
URLs. To use this mode, call the sbin/compress-from-s3.sh script as follows, and replace the
fields in angle brackets (<>) with the appropriate values:
sbin/compress-from-s3.sh \
  --timestamp-key <timestamp-key> \
  --dataset <dataset-name> \
  s3-object \
  <object-url> [<object-url> ...]
<object-url> is a URL identifying the S3 object to compress. It can be written in either of two formats:
https://<bucket-name>.s3.<region-code>.amazonaws.com/<object-key>
https://s3.<region-code>.amazonaws.com/<bucket-name>/<object-key>
The fields in <object-url> are as follows:
<bucket-name> is the name of the S3 bucket containing your logs.
<region-code> is the AWS region code for the S3 bucket containing your logs.
<object-key> is the object key of the log file object you wish to compress.
Warning
There must be no duplicate object keys across all <object-url> arguments.
For a description of other fields, see the clp-json quick-start guide.
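Both accepted URL forms can be assembled mechanically from the bucket name, region code, and object key. The following is a minimal sketch of that assembly. The function names virtual_hosted_url and path_style_url are hypothetical helpers, not part of CLP, and the bucket, region, and key are placeholder values.

```shell
# Hypothetical helpers (not part of CLP) that print an S3 object URL in each
# of the two formats described above.
# Arguments for both: <bucket-name> <region-code> <object-key>

# Virtual-hosted style: https://<bucket-name>.s3.<region-code>.amazonaws.com/<object-key>
virtual_hosted_url() {
    printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$1" "$2" "$3"
}

# Path style: https://s3.<region-code>.amazonaws.com/<bucket-name>/<object-key>
path_style_url() {
    printf 'https://s3.%s.amazonaws.com/%s/%s\n' "$2" "$1" "$3"
}

# Placeholder values; substitute your own bucket, region, and key.
virtual_hosted_url my-log-bucket us-east-1 logs/app/2024-01-01.jsonl
path_style_url my-log-bucket us-east-1 logs/app/2024-01-01.jsonl
```

Either printed URL could then be supplied as an <object-url> argument.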
Instead of specifying input object URLs explicitly in the command, you may specify them in a text
file and then pass the file into the command using the --inputs-from flag, like so:
sbin/compress-from-s3.sh \
  --timestamp-key <timestamp-key> \
  --dataset <dataset-name> \
  s3-object \
  --inputs-from <input-file>
<input-file> is a path to a text file containing one S3 object URL per line. The URLs must follow the same format as described above for <object-url>.
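Before passing a file to --inputs-from, it can help to confirm that no object key appears twice. The sketch below writes a small input file and lists repeated keys. It assumes the duplicate-key restriction applies to URLs listed in the file just as it does to command-line arguments. The object_keys helper, the bucket, and the keys are hypothetical, and nothing here invokes CLP itself.

```shell
# Write one S3 object URL per line. The bucket, region, and keys are
# placeholder values for illustration.
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/2024-01-01.jsonl
https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/2024-01-02.jsonl
EOF

# Hypothetical helper: strip the scheme, host, and (for path-style URLs) the
# bucket segment, leaving only the object key of each URL in the file.
object_keys() {
    sed -E \
        -e 's#^https://s3\.[^/]+\.amazonaws\.com/[^/]+/##' \
        -e 's#^https://[^/]+\.s3\.[^/]+\.amazonaws\.com/##' \
        "$1"
}

# uniq -d prints only the keys that occur more than once in the sorted list.
duplicate_keys=$(object_keys "$input_file" | sort | uniq -d)

if [ -n "$duplicate_keys" ]; then
    echo "Duplicate object keys found:" >&2
    echo "$duplicate_keys" >&2
else
    echo "No duplicate keys; pass the file with: --inputs-from $input_file"
fi
```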
Note
The s3-object mode requires the input object keys to share a non-empty common prefix. If the input
object keys do not share a common prefix, they will be rejected and no compression job will be
created. This limitation will be addressed in a future release.
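A set of object keys can be checked for a shared prefix before a job is submitted. The sketch below assumes that "common prefix" means a plain character-wise prefix of the key strings, which may not match CLP's own check exactly. The common_prefix function and the example keys are hypothetical.

```shell
# Hypothetical helper: print the longest common character-wise prefix of the
# lines read from stdin. The body runs in a subshell so its variables do not
# leak into the caller.
common_prefix() (
    IFS= read -r prefix || exit 0
    while IFS= read -r line; do
        # Shorten the candidate until the current line starts with it.
        while [ -n "$prefix" ] && [ "${line#"$prefix"}" = "$line" ]; do
            prefix=${prefix%?}
        done
    done
    printf '%s\n' "$prefix"
)

# Placeholder object keys for illustration.
prefix=$(printf '%s\n' \
    'logs/app/2024-01-01.jsonl' \
    'logs/web/2024-01-01.jsonl' | common_prefix)

if [ -n "$prefix" ]; then
    echo "Common prefix: $prefix"
else
    echo "No common prefix; these keys would be rejected" >&2
fi
```

Under this assumption, keys such as app/a.log and web/b.log yield an empty prefix and would need to be submitted as separate jobs.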
s3-key-prefix compression mode
The s3-key-prefix mode allows you to compress all objects under a given S3 key prefix. To use this
mode, call the sbin/compress-from-s3.sh script as follows, and replace the fields in angle
brackets (<>) with the appropriate values:
sbin/compress-from-s3.sh \
  --timestamp-key <timestamp-key> \
  --dataset <dataset-name> \
  s3-key-prefix \
  <key-prefix-url>
<key-prefix-url> is a URL identifying the S3 key prefix to compress. It can be written in either of two formats:
https://<bucket-name>.s3.<region-code>.amazonaws.com/<key-prefix>
https://s3.<region-code>.amazonaws.com/<bucket-name>/<key-prefix>
The fields in <key-prefix-url> are as follows:
<bucket-name> is the name of the S3 bucket containing your logs.
<region-code> is the AWS region code for the S3 bucket containing your logs.
<key-prefix> is the prefix of all logs you wish to compress and must begin with the <all-logs-prefix> value from the compression IAM policy.
Note
s3-key-prefix mode only accepts a single <key-prefix-url> argument. This limitation will be
addressed in a future release.
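Because each call accepts one <key-prefix-url>, compressing several prefixes means invoking the script once per prefix. The sketch below is a dry run that only prints the commands it would issue. The print_prefix_jobs wrapper is hypothetical and not part of CLP, and the timestamp key, dataset name, bucket, and prefixes are placeholder values. Whether separate jobs into the same dataset suit your setup is an assumption to verify.

```shell
# Hypothetical wrapper (not part of CLP): prints one compress-from-s3.sh
# command per key-prefix URL, since s3-key-prefix mode takes a single URL.
# Arguments: <timestamp-key> <dataset-name> <key-prefix-url>...
print_prefix_jobs() {
    timestamp_key=$1
    dataset=$2
    shift 2
    for url in "$@"; do
        printf 'sbin/compress-from-s3.sh --timestamp-key %s --dataset %s s3-key-prefix %s\n' \
            "$timestamp_key" "$dataset" "$url"
    done
}

# Placeholder values; substitute your own timestamp key, dataset, and URLs.
print_prefix_jobs timestamp default \
    https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/ \
    https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/web/
```

After reviewing the printed commands, each one can be run from the CLP package directory.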