Cycler is a tool for rapidly iterating over and modifying Google Storage buckets. It uses object prefixes to parallelize listing far beyond what the simple GS tools offer. Typically the prefix delimiter is ‘/’. It evaluates a Rego policy against each object's attributes as returned from the list call, and it can also pass various runtime and calculated values to the policy engine. It gathers statistics as it runs and produces a report in JSON or text.
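To illustrate the prefix idea, here is a minimal sketch of delimiter-based fan-out. It is not Cycler's implementation: `FAKE_BUCKET`, `list_one_level` and `walk_bucket` are hypothetical names, and the bucket is an in-memory dictionary standing in for real list calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a bucket: object name -> size in bytes.
FAKE_BUCKET = {
    "a/1.txt": 10,
    "a/b/2.txt": 20,
    "a/b/3.txt": 30,
    "c/4.txt": 40,
    "top.txt": 5,
}


def list_one_level(prefix, delimiter="/"):
    """Mimic a delimiter-based list call: return (objects, sub_prefixes)
    found directly under `prefix`, without descending any further."""
    objects, sub_prefixes = [], set()
    for name in FAKE_BUCKET:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Deeper object: report only the next-level prefix.
            sub_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(name)
    return sorted(objects), sorted(sub_prefixes)


def walk_bucket(root="", max_workers=4):
    """List every object by issuing one list call per discovered prefix."""
    found = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frontier = [root]
        while frontier:
            # All prefixes at the current depth are listed concurrently.
            results = list(pool.map(list_one_level, frontier))
            frontier = []
            for objects, sub_prefixes in results:
                found.extend(objects)
                frontier.extend(sub_prefixes)
    return sorted(found)


if __name__ == "__main__":
    print(walk_bucket())
```

Each discovered prefix becomes an independent unit of listing work, which is why buckets with a deep, wide hierarchy benefit most from this approach.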
Its configuration is specified as a protobuf (or the corresponding JSON), and several actions are implemented.
Logs are produced with a stat entry for each object that is touched. They are uploaded to Google Storage or stored locally in compressed JSONL format. A simple audit of a storage bucket can be achieved by using the Noop effect with the true.rego policy.
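As a rough sketch of how such logs could be consumed in an audit, the snippet below writes and reads compressed JSONL. The compression format and log schema are not stated here, so gzip and the field names `name` and `size_bytes` are assumptions made purely for illustration.

```python
import gzip
import io
import json


def write_jsonl_gz(records):
    """Serialize records as gzip-compressed JSONL and return the bytes."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for record in records:
            gz.write((json.dumps(record) + "\n").encode("utf-8"))
    return buf.getvalue()


def read_jsonl_gz(data):
    """Yield one dict per non-empty line of gzip-compressed JSONL."""
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as gz:
        for line in gz:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(data):
    """Return (object_count, total_bytes) for one log chunk.

    Assumes each entry carries a hypothetical 'size_bytes' field."""
    count, total = 0, 0
    for entry in read_jsonl_gz(data):
        count += 1
        total += entry.get("size_bytes", 0)
    return count, total


if __name__ == "__main__":
    chunk = write_jsonl_gz([
        {"name": "a/1.txt", "size_bytes": 10},
        {"name": "a/b/2.txt", "size_bytes": 20},
    ])
    print(summarize(chunk))
```

Because JSONL keeps one record per line, a log chunk can be streamed and aggregated without loading it into memory all at once.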
Currently Cycler contains the following actions:
Additionally planned actions include (potentially):
    ./cycler --help

    Cycler is a tool for rapid iteration of google storage buckets.

    It is more effective in buckets that utilize a delimiter to indicate a
    hierarchical topology. The most common example is unix like path names
    where the delimiter is '/'.

    It provides an interface for generic effects to be mapped on to each
    discovered object. For instance, to find the 'du' like tree of object
    size, or to set acls, or even copy the object into another bucket.

      -alsologtostderr
            log to standard error as well as files
      -bucket string
            override the bucket name to operate on
      -iterJobs int
            max number of object iterator jobs (default 2000)
      -jsonOutFile string
            set if output should be written to a json file instead of plain text to stdout.
      -log_backtrace_at value
            when logging hits line file:N, emit a stack trace
      -log_dir string
            If non-empty, write log files in this directory
      -logtostderr
            log to standard error instead of files
      -mutationAllowed
            Must be set if the effect specified mutates objects.
      -prefixChannelDepth int
            Size of the object prefix channel. (default 125000000)
      -prefixRoot string
            the root prefix to iterate as path from root without decorations (e.g. asubdir/anotherone), defaults to root of bucket (the empty string)
      -retryCount int
            Number of retries for an operation on any given object. (default 5)
      -runConfigPath string
            the RunConfig input path (in binary or json representation).
      -stderrthreshold value
            logs at or above this threshold go to stderr
      -v value
            log level for V logs
      -vmodule value
            comma-separated list of pattern=N settings for file-filtered logging
      -workUnitChannelDepth int
            Size of the work unit channel. (default 4194304)
      -workerJobs int
            number of object consumer jobs (default 2000)
This invocation moves every object in a bucket whose name matches a regex, and which has reached a certain age, to another bucket.
    {
      "run_log_configuration": {
        "destination_url": "gs://engeg-testing-chromeos-releases-2/logs",
        "chunk_size_bytes": 104857600,
        "channel_size": 10000,
        "persist_retries": 100,
        "max_unpersisted_logs": 10
      },
      "policy_effect_configuration": {
        "move": {
          "destination_bucket": "engeg-testing-chromeos-releases-2",
          "destination_prefix": "last_change/"
        },
        "policy_document_path": "examples/policies/unlikely_object_name.rego"
      },
      "stats_configuration": {
        "prefix_report_max_depth": 1,
        "age_days_histogram_options": {
          "num_buckets": 16,
          "growth_factor": 1.0,
          "base_bucket_size": 1.0,
          "min_value": 0
        },
        "size_bytes_histogram_options": {
          "num_buckets": 16,
          "growth_factor": 4.0,
          "base_bucket_size": 1.0,
          "min_value": 0
        }
      },
      "mutation_allowed": true,
      "bucket": "engeg-testing-chromeos-releases"
    }
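How the histogram options combine is not spelled out here, so the following is only one plausible reading, offered as a sketch: bucket `i` is assumed to have width `base_bucket_size * growth_factor**i`, starting at `min_value`. The helper names `bucket_bounds` and `bucket_index` are hypothetical.

```python
import bisect


def bucket_bounds(num_buckets, growth_factor, base_bucket_size, min_value):
    """Return the num_buckets + 1 boundaries under the assumed semantics:
    each bucket is growth_factor times wider than the previous one."""
    bounds = [float(min_value)]
    width = float(base_bucket_size)
    for _ in range(num_buckets):
        bounds.append(bounds[-1] + width)
        width *= growth_factor
    return bounds


def bucket_index(value, bounds):
    """Index of the bucket holding `value`.

    Returns -1 for values below the first boundary and len(bounds) - 1
    (one past the last finite bucket) for values at or above the last."""
    return bisect.bisect_right(bounds, value) - 1


if __name__ == "__main__":
    # growth_factor 1.0 gives equal-width buckets (one per day of age).
    print(bucket_bounds(16, 1.0, 1.0, 0))
    # growth_factor 4.0 gives buckets that widen quickly (object sizes).
    print(bucket_bounds(4, 4.0, 1.0, 0))
```

Under this reading, a growth factor of 1.0 suits a narrow linear range such as age in days, while 4.0 lets sixteen buckets span object sizes from a single byte to more than a gigabyte.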
not_firmware_policy.rego
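    # The cycler executable will always load the package data.cycler
    package cycler

    # Defaults to false so that 'not skipped_prefix' below is always defined.
    default skipped_prefix = false

    # True for firmware test-ap objects, which this policy leaves alone.
    # Matching the second pattern against input.attr.Name is an assumption.
    skipped_prefix := true {
      re_match(sprintf("gs://%s-firmware", [input.Bucket]), input.attr.Name)
      re_match("(.*-test-ap|.*-test-ap-tryjob)", input.attr.Name)
    }

    # 'act' binding is the ultimate bool determination of if we should trigger
    # the configuration supplied effect.
    act := true {
      not skipped_prefix
      input.ageDays > 180
    }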
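For readers less familiar with Rego, the decision this policy appears to encode can be mirrored in Python. This is a sketch of one reading suggested by the file name (act on objects older than 180 days unless they match the skipped firmware pattern). The input document shape, and applying both patterns to the object name, are assumptions.

```python
import re

AGE_THRESHOLD_DAYS = 180


def skipped_prefix(doc):
    """True when the object name matches both firmware patterns.

    re.search is used because the patterns are treated as unanchored."""
    name = doc["attr"]["Name"]
    firmware = "gs://" + re.escape(doc["Bucket"]) + "-firmware"
    test_ap = r"(.*-test-ap|.*-test-ap-tryjob)"
    return bool(re.search(firmware, name) and re.search(test_ap, name))


def act(doc):
    """True when the configured effect should be applied to the object."""
    return (not skipped_prefix(doc)) and doc["ageDays"] > AGE_THRESHOLD_DAYS


if __name__ == "__main__":
    # Hypothetical input document for a single listed object.
    example = {
        "Bucket": "my-bucket",
        "attr": {"Name": "builds/R90/image.bin"},
        "ageDays": 365,
    }
    print(act(example))
```

An object is acted on only when both conditions hold, so a young object or one matching the skipped pattern is left untouched.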
./cycler --runConfigPath ./examples/move_to_prefix.json --workerJobs 20000 --mutationAllowed -v 2