CopyBot

CopyBot is a tool that automates commit copying from a third-party repository into Gerrit. Currently, it's used by the Zephyr and Coreboot projects.

CopyBot vs. Copybara

Google already has Copybara, a very complex piece of infrastructure for copying code from one place to another. So why does CopyBot exist? Copybara works at the “tree level” and does not know how to cherry-pick code between sources. Understandably so: supporting cherry-picks brings additional complexity, such as figuring out how to handle merge conflicts.

However, many of our teams need the ability to either maintain patches on a temporary basis, or land commits in a different order to integrate dependencies. Thus, we need the ability to do cherry-picks, even if it does mean extra complexity.

Should Copybara support this in the future in a manner which works well for CopyBot's users, we should work to deprecate our usages of CopyBot in favor of unified infrastructure.

CopyBot's design

CopyBot is a single Python script: copybot.py.

Run copybot.py to copy code from one repository to another. Its general usage is:

./copybot.py [options...] <upstream_repo>:<upstream_branch>:<upstream_subtree> <downstream_repo>:<downstream_branch>:<downstream_subtree>

Run ./copybot.py --help for a complete list of options supported.

copybot.py will then:

  1. Search for the first commit in the downstream repo which has GitOrigin-RevId or Original-Commit-Id specified in the footers, or has the exact same hash as an upstream commit.

  2. Identify the commits that need to be cherry-picked from that commit up to the upstream limit (keeping in mind that file filtering patterns may cause certain commits to be skipped).

  3. Cherry-pick those commits onto the downstream repo.

  4. Push the changes to Gerrit.
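
Step 1 above can be illustrated with a minimal sketch. The regular expression and the helper names `find_origin_hashes` and `find_sync_point` are hypothetical and not taken from copybot.py. The sketch also assumes that "first" means the newest matching commit when walking downstream history from the tip.

```python
import re

# Footer keys named in step 1. The pattern itself is an illustrative
# assumption about how such footers could be matched.
_ORIGIN_FOOTER = re.compile(
    r"^(?:GitOrigin-RevId|Original-Commit-Id):[ \t]*([0-9a-f]{40})[ \t]*$",
    re.MULTILINE,
)


def find_origin_hashes(commit_message):
    """Return the upstream hashes named by origin footers, in order."""
    return _ORIGIN_FOOTER.findall(commit_message)


def find_sync_point(downstream_log, upstream_hashes):
    """Return the upstream hash matched by the newest downstream commit.

    downstream_log: list of (commit_hash, message) pairs, newest first.
    upstream_hashes: set of commit hashes that exist upstream.
    Returns None when no downstream commit references upstream.
    """
    for commit_hash, message in downstream_log:
        # A footer that points at a known upstream commit wins.
        for origin in find_origin_hashes(message):
            if origin in upstream_hashes:
                return origin
        # Otherwise the commit may literally be an upstream commit.
        if commit_hash in upstream_hashes:
            return commit_hash
    return None
```

Commits after the returned hash in upstream history would then be the candidates for step 2.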

Use --dry-run to test changes without pushing to the downstream repository.
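
As a rough illustration, a dry run could be scripted as below. The helper name `build_dry_run_command` is hypothetical, and the repository URLs, branches, and subtrees are placeholders; only `--dry-run` and the positional argument form come from the usage line above.

```python
import shlex


def build_dry_run_command(upstream, downstream, extra_options=()):
    """Build the argv list for a CopyBot dry run.

    upstream and downstream use the <repo>:<branch>:<subtree> form
    from the usage line.
    """
    return ["./copybot.py", "--dry-run", *extra_options, upstream, downstream]


# Placeholder values; substitute your real repositories.
command = build_dry_run_command(
    "https://example.com/upstream.git:main:src",
    "https://example.com/downstream.git:main:third_party/src",
)
print(shlex.join(command))
# To execute it from the directory containing copybot.py:
#   subprocess.run(command, check=True)
```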

Using CopyBot

CopyBot is intended to be run daily as a cron job. The Chromium OS deployment of CopyBot runs nightly at ~4:30 AM Mountain Time.

Your job as a downstreamer is to:

  • CR+2 and CQ+2 the commits uploaded by CopyBot.

  • Watch for CQ failures.

  • Manually handle commits with merge conflicts (if required). You’ll know this has happened because you’ll get an email about the CI failure: click the link for the output, and the list of commits with merge conflicts will be at the bottom of the page.

  • Move commits with an external dependency (e.g., a Cq-Depend, or an API change that requires rework of other code) out of the CL stack and merge them manually, as required. See Managing Conflicted Commits below.

Conflicts

CopyBot Conflict Behavior

CopyBot supports the following conflict behaviors, specified on the command line with the --merge-conflict-behavior option:

  • FAIL: Fail the CopyBot run immediately; do not upload any changes.
  • SKIP: Skip the conflicted CL, continue cherry-picking, and upload the resulting CL stack.
  • STOP: Stop at the first conflicted CL, and upload the CLs which were cherry-picked before encountering the conflict. CopyBot will exit with failure status.
  • ALLOW_CONFLICT: After failing to cherry-pick a CL, commit the CL with unresolved conflicts. CopyBot will exit with failure status.
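
The four behaviors can be compared with a small model of which commits end up in the uploaded stack. This is a sketch, not code from copybot.py: the names `MergeConflictBehavior` and `plan_upload` are hypothetical. Two points are assumptions because the text above does not state them: that SKIP does not by itself make the run fail, and that ALLOW_CONFLICT keeps cherry-picking after the conflicted CL.

```python
import enum


class MergeConflictBehavior(enum.Enum):
    FAIL = "FAIL"
    SKIP = "SKIP"
    STOP = "STOP"
    ALLOW_CONFLICT = "ALLOW_CONFLICT"


def plan_upload(commits, conflicted, behavior):
    """Model the uploaded stack for a given conflict behavior.

    commits: upstream hashes in cherry-pick order.
    conflicted: set of hashes whose cherry-pick conflicts.
    Returns (stack, exit_ok).
    """
    stack = []
    exit_ok = True
    for commit in commits:
        if commit not in conflicted:
            stack.append(commit)
            continue
        if behavior is MergeConflictBehavior.FAIL:
            # Nothing is uploaded at all.
            return [], False
        if behavior is MergeConflictBehavior.SKIP:
            # Drop this commit and keep going (exit status: assumption).
            continue
        if behavior is MergeConflictBehavior.STOP:
            # Upload only what was picked before the conflict.
            return stack, False
        # ALLOW_CONFLICT: keep the commit with its unresolved conflicts
        # and continue (continuing is an assumption).
        stack.append(commit)
        exit_ok = False
    return stack, exit_ok
```

For example, with commits `["a", "b", "c"]` and a conflict on `"b"`, STOP yields the stack `["a"]` while SKIP yields `["a", "c"]`.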

Managing Conflicted Commits

On rare occasions, it may be necessary to skip or modify certain commits, either to manage conflicts or so that they can be merged later. Skipping and preserving can be accomplished by applying the corresponding Gerrit hashtag to a pending change. You may need to click the “More” button on the left panel to see and add the hashtags. The only requirement is that you include the full upstream commit hash somewhere in the commit message.

If you later change your mind about how a commit should be handled, just remove the corresponding hashtag.

Skipping Commits

To skip a commit, apply the Gerrit hashtag copybot-skip to a pending change in the Gerrit UI.

On CopyBot's next run, it will respect your wishes, and no longer upload any commits which came from this upstream revision.

Preserving Commits

To preserve a commit, apply the Gerrit hashtag copybot-preserve to a pending change in the Gerrit UI.

On CopyBot's next run, it will cherry-pick the pending change from the GoB instance associated with your downstream repo instead of overwriting it with a change from the upstream repo.

Ignore Listing Commits

It is sometimes necessary to ignore a change for the lifetime of a repository. In an effort to minimize the number of pending changes left in the downstream repository, an ignore list CL can be used to list all of the hashes which should be ignored within a repository. To accomplish this:

  • Upload a CL to the corresponding downstream repo (if subtrees are used, it must also be within the desired subtree) with the list of commit hashes to be skipped in the CL.
  • Add the downstream topic and Copybot-Skip hashtag as you normally would, following the instructions in Skipping Commits.

On GoB, it is highly encouraged to add Commit: false to the commit message to prevent the CL from merging.

See Ignore List Example.
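
Separately from the linked example, here is a hypothetical sketch of how the hashes in such an ignore list could be collected and applied. It assumes the hashes are full 40-character SHA-1 values written in the CL's commit message; the helper names are illustrative and not from copybot.py.

```python
import re

# A full SHA-1 hash: exactly 40 lowercase hex digits.
_FULL_HASH = re.compile(r"\b[0-9a-f]{40}\b")


def collect_ignored_hashes(cl_message):
    """Return every full commit hash mentioned in the ignore list CL."""
    return set(_FULL_HASH.findall(cl_message))


def filter_ignored(upstream_commits, cl_message):
    """Drop upstream commits that the ignore list CL names."""
    ignored = collect_ignored_hashes(cl_message)
    return [commit for commit in upstream_commits if commit not in ignored]
```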

Triggering CopyBot Manually

You can trigger CopyBot jobs from the LUCI Scheduler UI.

Adding a CopyBot Configuration

CopyBot jobs are run and managed by LUCI. To add or modify a job configuration, modify the corresponding configuration object in infra/config/misc_builders/copybot.star.

CopyBot support for repositories

CopyBot supports local and external Git repositories.

Subtree paths

CopyBot supports the selective filtering of files by explicit inclusion, exclusion, and subtree paths. Any of these options can be applied to the upstream or downstream branches and can be used to keep files or paths in sync in repositories that are otherwise unrelated.

Processing limits

CopyBot supports limiting both the upstream and downstream histories when evaluating which changes to cherry-pick. This can be especially useful when configuring repositories which have previously been synced manually. Setting the upstream limit ignores CLs in the upstream repo beyond the limit.

CopyBot will automatically increase the upstream limit to find the historical relationship between two repositories.

CopyBot also supports a maximum number of CLs to downstream in a single stack. This can be useful when a CLI/UI is used to trigger review/commit and the downstream CI limits the number of changes which can be processed simultaneously.

Preserving fields in the commit message

By default, CopyBot prepends Original- to all pseudoheaders of the format Key: Value found in the original commit message. To keep a key unprefixed, use the --keep-pseudoheader command line argument to tell CopyBot which pseudoheaders should be preserved. Some examples of pseudoheaders which are often preserved are:

  • Cq-Depend
  • Change-Id

Change-Id is preserved by default when both the upstream and downstream are detected to be a GoB server.
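
The renaming described above can be sketched roughly as follows. The function name `rewrite_pseudoheaders` is hypothetical, and the sketch assumes that pseudoheaders are the `Key: Value` lines in the final paragraph of the commit message; copybot.py may decide this differently. The GoB default for Change-Id is not modeled.

```python
import re

# A "Key: Value" line. The exact key syntax is an assumption.
_PSEUDOHEADER = re.compile(r"^([A-Za-z0-9][A-Za-z0-9-]*): (.+)$")


def rewrite_pseudoheaders(message, keep=()):
    """Prefix footer keys with Original- unless they are listed in keep."""
    keep_set = set(keep)
    text = message.rstrip("\n")
    head, sep, footer = text.rpartition("\n\n")
    if not sep:
        # Subject only: there is no footer paragraph to rewrite.
        return text
    rewritten = []
    for line in footer.splitlines():
        match = _PSEUDOHEADER.match(line)
        if match and match.group(1) not in keep_set:
            line = f"Original-{match.group(1)}: {match.group(2)}"
        rewritten.append(line)
    return head + sep + "\n".join(rewritten)
```

Calling `rewrite_pseudoheaders(message, keep=["Cq-Depend"])` would leave a `Cq-Depend` line as is while a `Signed-off-by` line becomes `Original-Signed-off-by`.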

Skipping CLs cherry-picked by another CopyBot job

CopyBot supports adding pseudoheaders through the --add-pseudoheader command line option. By default, a Copybot-Job-Name pseudoheader is added to CLs cherry-picked by the LUCI CopyBot jobs. If, for example, a CopyBot job is configured whose upstream is also the downstream of another CopyBot job, the CLs that the other job cherry-picked into the upstream repository can be skipped by passing --skip-job-name with the value found in the Copybot-Job-Name pseudoheader of that other job.
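
A minimal sketch of this filtering, assuming the job name is a single whitespace-free token on a `Copybot-Job-Name:` line. The helper name `should_skip` and the job names in the usage note are hypothetical.

```python
import re

_JOB_NAME = re.compile(
    r"^Copybot-Job-Name:[ \t]*(\S+)[ \t]*$",
    re.MULTILINE,
)


def should_skip(commit_message, skip_job_names):
    """Return True if the commit was cherry-picked by a job to be skipped."""
    names = set(_JOB_NAME.findall(commit_message))
    return bool(names & set(skip_job_names))
```

With a hypothetical job called `job-a` feeding the upstream repository, `should_skip(message, ["job-a"])` is true only for commits whose message carries `Copybot-Job-Name: job-a`.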

Additional CL Behaviors

Some additional features have been requested by groups using CopyBot and may be of use:

  • --add-signed-off-by: Sign-off the commit message. This entirely defeats the purpose of Signed-off-by as robots cannot legally sign anything, but you do you.
  • --prepend-subject: Prepend a string to the commit message subject.
  • --push-option: Can be passed multiple times; updates the push refspec.
  • Maintaining the Change-Id to show CL relationships.
    • This is the default for GoB and is reflected in the Gerrit UI.
  • Adding Hashtags.
  • Adding labels.
  • Skipping, rebasing, and preserving pending changes.
  • Adding reviewers.
  • Adding CC.

Establishing historical relationships with CopyBot

As alluded to in CopyBot's design, a downstream CL must reference the commit hash of an upstream CL. This can be accomplished by adding the hash to the commit message (preferably using the GitOrigin-RevId pseudoheader) and then running CopyBot. CopyBot considers downstream pending changes as part of the repository history and will overwrite the commit and commit message with the change as it would cherry-pick it. See Preserving Commits if this is not the desired behavior.

CopyBot Status

To see the status of an individual CopyBot job, go to the corresponding LUCI builder in the LUCI Scheduler UI. For a more composite view, visit the CopyBot Status Dashboard.

Contributing to CopyBot

First of all, thanks! Your improvements are very much welcomed!

Python source code should be auto-formatted by black and isort. Run black . and isort . to do the formatting.

To run tests, use ./run_tests.sh.

Future Improvements

A number of improvements could be made to CopyBot in the future to make it more friendly and help keep the bus number high. Possible ideas include:

  • Better email notifications on merge conflicts.

  • Automated copying and updates of FROMPULL PRs from GitHub when changes are pushed.

Tutorial Video

A video showing how to run CopyBot manually: copybot - manual run demo (available only to Googlers for the moment).