## Overview
`lucicfg` is a tool for generating low-level LUCI configuration files based on a
high-level configuration given as a [Starlark] script that uses APIs exposed by
`lucicfg`. In other words, it takes a \*.star file (or files) as input and
spits out a bunch of \*.cfg files (such as `cr-buildbucket.cfg` and
`luci-scheduler.cfg`) as outputs. A single entity (such as a {{$builder_ref}}
definition) in the input is translated into multiple entities (such as
Buildbucket's builder{...} and Scheduler's job{...}) in the output. This ensures
internal consistency of all low-level configs.

Using Starlark makes it possible to further reduce duplication and enforce invariants in
the configs. A common pattern is to use Starlark functions that wrap one or
more basic rules (e.g. {{$builder_ref}} and {{$console_view_entry_ref}}) to
define more "concrete" entities (for example "a CI builder" or "a Try builder").
The rest of the config script then uses such functions to build up the actual
configuration.
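
The wrapper pattern can be sketched roughly as follows. This is a minimal illustration, not taken from any real config: `ci_builder`, the `"ci"` bucket, the `"main"` console and the recipe placeholder are hypothetical, and the `luci` object here is a tiny stand-in that only records calls so the snippet runs as plain Python (in a real script `luci` is provided by `lucicfg`).

```python
from types import SimpleNamespace

# Stand-in for the `luci` object that lucicfg normally provides. It only
# records which rules were called and with what arguments.
calls = []
luci = SimpleNamespace(
    builder=lambda **kw: calls.append(("builder", kw)),
    console_view_entry=lambda **kw: calls.append(("console_view_entry", kw)),
)


def ci_builder(name, category):
    """Hypothetical wrapper: one call defines a builder and its console entry."""
    luci.builder(
        name=name,
        bucket="ci",  # invariant: every CI builder lives in the "ci" bucket
        executable="my-recipe",  # placeholder; a real config passes a recipe
    )
    luci.console_view_entry(
        builder="ci/" + name,
        console_view="main",  # hypothetical console name
        category=category,
    )


# The rest of the script only uses the wrapper.
ci_builder("linux-rel", category="linux")
ci_builder("mac-rel", category="mac")
```

Each `ci_builder(...)` call expands into two rule calls, so the bucket and console conventions live in one place instead of being repeated for every builder.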
### Getting lucicfg
`lucicfg` is distributed as a single self-contained binary as part of
[depot_tools], so if you use them, you already have it. Additionally it is
available in PATH on all LUCI builders. The rest of this doc also assumes that
`lucicfg` is in PATH.
If you don't use depot_tools, `lucicfg` can be installed through CIPD. The
package is [infra/tools/luci/lucicfg/${platform}], and the canonical stable
version can be looked up in the depot_tools [CIPD manifest].
Finally, you can always try to build `lucicfg` from the source code. However,
the only officially supported distribution mechanism is CIPD packages.
### Getting started with a simple config
*** note
More examples of using `lucicfg` can be found [here](../examples).
***

Create a new directory, create a new `` file there with the following
content:

    #!/usr/bin/env lucicfg

    luci.project(
        name = "hello-world",
        buildbucket = "",
        swarming = "",
    )

    luci.bucket(name = "my-bucket")

    luci.builder(
        name = "my-builder",
        bucket = "my-bucket",
        executable = luci.recipe(
            name = "my-recipe",
            cipd_package = "recipe/bundle/package",
        ),
    )

Now run `lucicfg generate`. It will create a new directory `generated`
side-by-side with `` file. This directory contains `project.cfg` and
`cr-buildbucket.cfg` files, generated based on the script above.
Equivalently, make the script executable (`chmod a+x`) and then just
execute it (`./`). This is exactly the same as running the `generate`
subcommand.
Now make some change in `` (for example, rename the builder), but do
not regenerate the configs yet. Instead run `lucicfg validate`. It
will produce an error, telling you that files on disk (in `generated/*`) are
stale. Regenerate them (`./`), and run the validation again.
If you have never done this before or haven't used any other LUCI tools, you are
now asked to authenticate by running `lucicfg auth-login`. This is because
`lucicfg validate` in addition to checking configs locally also sends them for a
more thorough validation to the LUCI Config service, and this requires you to be
authenticated. Do `lucicfg auth-login` and re-run `lucicfg validate`.
It should succeed now. If it still fails with permissions issues, you are
probably not in the `config-validation` group (this should be rare, please
contact if this is happening).
`lucicfg validate` is meant to be used from presubmit tests. If you use
depot_tools' ``, there's a [canned check] that wraps
`lucicfg validate`.
This is it, your first generated config! It is not very functional yet (e.g.
builders without Swarming dimensions are useless), but a good place to start.
Keep iterating on it, modifying the script, regenerating configs, and examining
the output in `generated` directory. Once you are satisfied with the result,
commit **both** Starlark scripts and generated configs into the repository, and
then configure LUCI Config service to pull configuration from `generated`
directory (how to do it is outside the scope of this doc).
### Migrating from existing configs to lucicfg
This process is mostly manual, but it is aided by `lucicfg semantic-diff`
command that can be used to verify the generated configs match the original
ones. Roughly, the idea is to start with broad strokes, and then refine details
until old and new configs match:
1. Create new ``, add {{$project_ref}} and all {{$bucket_ref}}
definitions there. We assume `` is located in the same directory
as existing configs (like `cr-buildbucket.cfg`), and generated configs are
stored in `generated` subdirectory, which is not yet really used for
anything.
1. Add rough definitions of all existing builders, focusing on identifying
common patterns in the existing configs and representing them as Starlark
functions. At this stage we want to make sure the generated
`cr-buildbucket.cfg` contains all builders (but their details are not
necessarily correct yet).
1. Run `lucicfg semantic-diff cr-buildbucket.cfg`. It will normalize
the original and the generated Buildbucket configs (by expanding all
mixins, sorting fields, etc) and run `git diff ...` to compare them. Our
goal is to reduce this diff to zero.
1. Keep iterating by modifying Starlark configs or, if appropriate, original
configs until the diff to `cr-buildbucket.cfg` is zero.
1. Do the same for the rest of the configs: `luci-scheduler.cfg`,
`luci-milo.cfg`, `commit-queue.cfg`, etc.
1. Eventually, all generated configs in `generated` directory are semantically
identical to the existing configs. Switch LUCI Config to use `generated` as
the source of configs, and delete the old configs.
[CIPD manifest]:
[canned check]:
## Concepts
*** note
Most of the information in this section is specific to `lucicfg`, **not** a
generic Starlark interpreter. Also this is **advanced stuff**. A full
understanding of it is not required to use `lucicfg` effectively.
***

### Modules and packages {#modules_and_packages}
Each individual Starlark file is called a module. Several modules under the same
root directory form a package. Modules within a single package can refer to each
other (in {{$load_ref}} and {{$exec_ref}}) using their root-relative paths that
start with `//`. The root of the main package is taken to be the directory that
contains the entry point script (usually ``) passed to `lucicfg`, i.e.
`` itself can be referred to as `//`.
TODO(vadimsh): Document existence of @stdlib package (and @<alias> syntax)
when @stdlib starts exposing public API.
Modules can either be "library-like" (executed via {{$load_ref}} statement) or
"script-like" (executed via {{$exec_ref}} function). Library-like modules can
load other library-like modules via {{$load_ref}}, but may not call
{{$exec_ref}}. Script-like modules may use both {{$load_ref}} and {{$exec_ref}}.
Dicts of modules loaded via {{$load_ref}} are reused, e.g. if two different
scripts load the exact same module, they'll get the exact same symbols as a
result. The loaded code always executes only once. The interpreter *may* load
modules in parallel in the future, so libraries must not rely on their loading
order and must not have side effects.
On the other hand, modules executed via {{$exec_ref}} are guaranteed to be
processed sequentially, and only once. Thus 'exec'-ed scripts essentially form
a tree, traversed exactly once in depth-first order.
### Rules, state representation
All entities manipulated by `lucicfg` are represented by nodes in a directed
acyclic graph. One entity (such as a builder) can internally be represented by
multiple nodes. A function that adds nodes and edges to the graph is called
**a rule** (e.g. {{$builder_ref}} is a rule).
Each node has a unique hierarchical key, usually constructed from entity's
properties. For example, a builder name and its bucket name are used to
construct a unique key for this builder (roughly `<bucket>/<builder>`). These
keys are used internally by rules when adding edges to the graph.
To refer to entities from the public API, one usually just uses strings (e.g.
a builder name to refer to the builder). Rule implementations usually have
enough context to construct correct node keys from such strings. Sometimes they
need some help, see [Resolving naming ambiguities](#resolving_ambiguities).
Other times entities have no meaningful global names at all (for example,
{{$console_view_entry_ref}}). For such cases, one uses a return value of the
corresponding rule: rules return opaque pointer-like objects that can be passed
to other rules as an input in place of string identifiers. This allows
"chaining" definitions, e.g.

    luci.console_view(
        ...
        entries = [
            luci.console_view_entry(...),
        ],
    )

It is strongly preferred to either use string names to refer to entities **or**
define them inline where they are needed. Please **avoid** storing return values
of rules in variables to refer to them later. Using string names is as powerful
(`lucicfg` verifies referential integrity), and it offers additional advantages
(like referring to entities across file boundaries).
To aid in using inline definitions where it makes sense, many rules allow
entities to be defined multiple times as long as all definitions are identical
(this is internally referred to as "idempotent nodes"). This allows the
following usage style:

    def my_recipe(name):
        return luci.recipe(
            name = name,
            cipd_package = 'my/recipe/bundle',
        )

    luci.builder(
        name = 'builder 1',
        executable = my_recipe('some-recipe'),
        ...
    )

    luci.builder(
        name = 'builder 2',
        executable = my_recipe('some-recipe'),
        ...
    )

Here `some-recipe` is formally defined twice, but both definitions are
identical, so it doesn't cause ambiguities. See the documentation of individual
rules to see whether they allow such redefinitions.
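
The redefinition rule can be illustrated with a small toy model in plain Python. It is only a sketch of the idea described above, not `lucicfg`'s implementation: the `Graph` class, the `add_node` method and the `idempotent` flag are hypothetical names.

```python
class Graph:
    """Toy node registry that tolerates identical redefinitions."""

    def __init__(self):
        self.nodes = {}  # node key -> properties

    def add_node(self, key, props, idempotent=False):
        existing = self.nodes.get(key)
        if existing is not None:
            # An identical redefinition of an idempotent node is a no-op.
            if idempotent and existing == props:
                return key
            raise ValueError("%s is redeclared" % (key,))
        self.nodes[key] = dict(props)
        return key


graph = Graph()
recipe_key = ("recipe", "some-recipe")

# Defining the same recipe twice with identical properties is fine.
graph.add_node(recipe_key, {"cipd_package": "my/recipe/bundle"}, idempotent=True)
graph.add_node(recipe_key, {"cipd_package": "my/recipe/bundle"}, idempotent=True)

# A conflicting definition is rejected.
try:
    graph.add_node(recipe_key, {"cipd_package": "other/bundle"}, idempotent=True)
    conflict_detected = False
except ValueError:
    conflict_detected = True
```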
### Execution stages
There are 3 stages of `lucicfg generate` execution:
1. **Building the state** by executing the given entry `` code and
all modules it exec's. This builds a graph in memory (via calls to rules),
and registers a bunch of generator callbacks (via {{$generator_ref}}) that
will traverse this graph in stage 3.
- Validation of the format of parameters happens during this stage (e.g.
checking types, ranges, regexps, etc). This is done by rules'
implementations. A frozen copy of validated parameters is put into
the added graph nodes to be used in stage 3.
- Rules can mutate the graph, but **may not** examine or traverse it.
- Nodes and edges can be added out of order, e.g. an edge may be added
before the nodes it connects. Together with the previous constraint, it
makes most lucicfg statements position independent.
- The stage ends after reaching the end of the entry `` code. At
this point we have a (potentially incomplete) graph and a list of
registered generator callbacks.
2. **Checking the referential consistency** by verifying all edges of the
graph actually connect existing nodes. Since we have a lot of information
about the graph structure, we can emit helpful error messages here, e.g.
`luci.builder("name") refers to undefined luci.bucket("bucket") at <stack
trace of the corresponding luci.builder(...) definition>`.
- This stage is performed purely by `lucicfg` core code, not touching
Starlark at all. It doesn't need to understand the semantics of graph
nodes, and thus can be used for all sorts of configs (LUCI configs are just
one specific application).
- At the end of the stage we have a consistent graph with no dangling
edges. It still may be semantically wrong.
3. **Checking the semantics and generating actual configs** by calling all
registered generator callbacks sequentially. They can examine and traverse
the graph in whatever way they want and either emit errors or emit
generated configs. They **may not** modify the graph at this stage.
Presently all this machinery is mostly hidden from the end user. It will become
available in future versions of `lucicfg` as an API for **extending**
`lucicfg`, e.g. for adding new entity types that are related to LUCI, or for
repurposing `lucicfg` for generating non-LUCI configs.
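
The three stages can be modeled with a short toy program in plain Python. This is a sketch under simplifying assumptions, not the real engine: `State`, `bucket`, `builder`, `check_references`, `gen_builders` and `run` are hypothetical names, and node keys are plain tuples.

```python
class State:
    """Toy stand-in for the in-memory graph plus registered generators."""

    def __init__(self):
        self.nodes = {}       # node key -> properties
        self.edges = []       # (parent key, child key, description of the call)
        self.generators = []  # callbacks to run in stage 3


# Stage 1: rules only add nodes and edges; they never read the graph.
def bucket(state, name):
    state.nodes[("bucket", name)] = {"name": name}


def builder(state, name, bucket):
    key = ("builder", bucket, name)
    state.nodes[key] = {"name": name, "bucket": bucket}
    state.edges.append((("bucket", bucket), key, 'builder("%s")' % name))


# Stage 2: generic check that every edge connects two existing nodes.
def check_references(state):
    errors = []
    for parent, child, where in state.edges:
        for key in (parent, child):
            if key not in state.nodes:
                errors.append("%s refers to undefined %s" % (where, key))
    return errors


# Stage 3: generators read the (now consistent) graph and emit output.
def gen_builders(state, output):
    lines = sorted(
        "%s/%s" % (key[1], key[2]) for key in state.nodes if key[0] == "builder"
    )
    output["builders.txt"] = "\n".join(lines)


def run(state):
    errors = check_references(state)
    if errors:
        raise ValueError("; ".join(errors))
    output = {}
    for gen in state.generators:
        gen(state, output)
    return output


state = State()
state.generators.append(gen_builders)
builder(state, "my-builder", "my-bucket")  # edge added before its bucket node
bucket(state, "my-bucket")
result = run(state)
```

Because the rules never inspect the graph, declaring the builder before its bucket works: the dangling edge is only checked once the whole script has been executed.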
## Common tasks
### Resolving naming ambiguities {#resolving_ambiguities}
Builder names are scoped to buckets. For example, it is possible to have the
following definitions:

    # Runs pre-submit tests on Linux.
    luci.builder(
        name = 'Linux',
        bucket = 'try',
        ...
    )

    # Runs post-submit tests on Linux.
    luci.builder(
        name = 'Linux',
        bucket = 'ci',
        ...
    )

Here the name `Linux` by itself is ambiguous and can't be used to refer to the
builder. E.g. the following chunk of code will cause an error:

    builder = 'Linux',  # but which one?...

The fix is to prepend the bucket name:

    builder = 'ci/Linux',  # ah, the CI one

It is always correct to use "full" name like this. But in practice the vast
majority of real world configs do not have such ambiguities and requiring full
names everywhere is a chore. For that reason `lucicfg` allows omitting the
bucket name if the resulting reference is unambiguous. In the example above, if
we remove one of the builders, the `builder = 'Linux'` reference becomes valid.
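
The resolution rule can be sketched in plain Python as below. This is an illustration of the behavior described above, not `lucicfg`'s code: `resolve_builder` is a hypothetical helper, and it assumes builder names themselves contain no `/`.

```python
def resolve_builder(ref, builders):
    """Resolves 'name' or 'bucket/name' against a list of (bucket, name) pairs."""
    if "/" in ref:
        bucket, name = ref.split("/", 1)
        matches = [(b, n) for (b, n) in builders if b == bucket and n == name]
    else:
        matches = [(b, n) for (b, n) in builders if n == ref]
    if not matches:
        raise ValueError("undefined builder %r" % ref)
    if len(matches) > 1:
        options = ", ".join(sorted("%s/%s" % m for m in matches))
        raise ValueError("ambiguous reference %r, could be: %s" % (ref, options))
    return "%s/%s" % matches[0]


builders = [("try", "Linux"), ("ci", "Linux"), ("ci", "Mac")]

full = resolve_builder("ci/Linux", builders)  # full names always work
short = resolve_builder("Mac", builders)      # unambiguous short name

try:
    resolve_builder("Linux", builders)        # exists in both buckets
    ambiguous = False
except ValueError:
    ambiguous = True
```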
### Defining cron schedules {#schedules_doc}
{{$builder_ref}} and {{$gitiles_poller_ref}} rules have a `schedule` field that
defines how often the builder or poller should run. Schedules are given as
strings. Supported kinds of schedules (illustrated via examples):
- `* 0 * * * *`: a crontab expression, in a syntax supported by (see its docs for full reference).
LUCI will attempt to start the job at specified moments in time (based on
**UTC clock**). Some examples:
- `0 */3 * * * *` - every 3 hours: at 12:00 AM UTC, 3:00 AM UTC, ...
- `0 */3 * * *` - the exact same thing (the last field is optional).
- `0 1/3 * * *` - every 3 hours but starting 1:00 AM UTC.
- `0 2,10,18 * * *` - at 2 AM UTC, 10 AM UTC, 6 PM UTC.
- `0 7 * * *` - at 7 AM UTC, once a day.
If a previous invocation is still running when triggering a new one,
an overrun is recorded and the new scheduled invocation is skipped. The next
attempt to start the job happens based on the schedule (not when the
currently running invocation finishes).
- `with 10s interval`: run the job in a loop, waiting 10s after finishing
an invocation before starting a new one. Moments when the job starts aren't
synchronized with the wall clock at all.
- `with 1m interval`, `with 1h interval`: same format, just using minutes and
hours instead of seconds.
- `continuously` is an alias for `with 0s interval`, meaning to run the job in
a loop without any pauses at all.
- `triggered` schedule indicates that the job is only started via some
external triggering event (e.g. via LUCI Scheduler API), not periodically.
- in {{$builder_ref}} this schedule is useful to make lucicfg set up a
scheduler job associated with the builder (even if the builder is not
triggered by anything else in the configs). This exposes the builder in
LUCI Scheduler API.
- in {{$gitiles_poller_ref}} this is useful to set up a poller that polls
only on manual requests, not periodically.
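
To make the kinds of schedule strings concrete, here is a rough classifier in plain Python. It is a sketch based only on the formats listed above, not the parser LUCI actually uses: `classify_schedule` is a hypothetical helper, it accepts only the `s`, `m` and `h` units shown in the examples, and it treats any other 5- or 6-field string as a crontab expression without validating the fields.

```python
import re

_INTERVAL_RE = re.compile(r"^with (\d+)([smh]) interval$")
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def classify_schedule(schedule):
    """Returns (kind, detail) for a schedule string."""
    if schedule == "triggered":
        return ("triggered", None)
    if schedule == "continuously":
        # Alias for "with 0s interval".
        return ("interval", 0)
    match = _INTERVAL_RE.match(schedule)
    if match:
        seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        return ("interval", seconds)
    fields = schedule.split()
    if len(fields) in (5, 6):
        # Crontab expression; the sixth field is optional.
        return ("cron", fields)
    raise ValueError("unrecognized schedule: %r" % schedule)


examples = [
    classify_schedule("0 */3 * * *"),
    classify_schedule("with 10s interval"),
    classify_schedule("with 1h interval"),
    classify_schedule("continuously"),
    classify_schedule("triggered"),
]
```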
## Interfacing with lucicfg internals
