v1.1.0~rc1 -- "He who controls the spice controls the universe."

This release is the first release candidate for the next minor release
following runc 1.0. It contains all of the bugfixes included in runc 1.0
patch releases (up to and including 1.0.3).

A fair few new features have been added, and several features have been
deprecated (with plans for removal in runc 1.2). At the moment we only
plan to do a single release candidate for runc 1.1, and once 1.1.0 is
released we will not continue updating the 1.0.z runc branch.

Deprecated:
 * runc run/start now warns if a new container cgroup is non-empty or frozen;
   this warning will become an error in runc 1.2. (#3132, #3223)
 * runc can only be built with Go 1.16 or later from this release onwards.
   (#3100, #3245)

Removed:
 * `cgroup.GetHugePageSizes` has been removed entirely and replaced with
   `cgroup.HugePageSizes`, which is more efficient. (#3234)
 * `intelrdt.GetIntelRdtPath` has been removed. Users who were using this
   function to get the intelrdt root should use the new `intelrdt.Root`
   instead. (#2920, #3239)

Added:
 * Add support for RDMA cgroup added in Linux 4.11. (#2883)
 * runc exec now produces exit code of 255 when the exec failed.
   This may help in distinguishing between runc exec failures
   (such as invalid options, non-running container or non-existent
   binary etc.) and failures of the command being executed. (#3073)
 * runc run: new `--keep` option to skip removal of exited containers'
   artefacts. This might be useful to check the state (e.g. of cgroup
   controllers) after the container has exited. (#2817, #2825)
 * seccomp: add support for `SCMP_ACT_KILL_PROCESS` and `SCMP_ACT_KILL_THREAD`
   (the latter is just an alias for `SCMP_ACT_KILL`). (#3204)
 * seccomp: add support for `SCMP_ACT_NOTIFY` (seccomp actions). This allows
   users to create sophisticated seccomp filters where syscalls can be
   efficiently emulated by privileged processes on the host. (#2682)
 * checkpoint/restore: add an option (`--lsm-mount-context`) to set
   a different LSM mount context on restore. (#3068)
 * runc releases are now cross-compiled for several architectures. Static
   builds for said architectures will be available for all future releases.
   (#3197)
 * intelrdt: support ClosID parameter. (#2920)
 * runc exec --cgroup: an option to specify a (non-top) in-container cgroup
   to use for the process being executed. (#3040, #3059)
 * cgroup v1 controllers now support hybrid hierarchy (i.e. when on a cgroup v1
   machine a cgroup2 filesystem is mounted to /sys/fs/cgroup/unified, runc
   run/exec now adds the container to the appropriate cgroup under it). (#2087,
   #3059)
 * sysctl: allow slashes in sysctl names, to better match `sysctl(8)`'s
   behaviour. (#3254, #3257)
 * mounts: add support for bind-mounts which are inaccessible after switching
   the user namespace. Note that this does not permit the container any
   additional access to the host filesystem; it simply allows containers to
   have bind-mounts configured for paths that the user can access but that
   have restrictive access control settings for other users. (#2576)
 * Add support for recursive mount attributes using `mount_setattr(2)`. These
   have the same names as the proposed `mount(8)` options -- just prepend `r`
   to the option name (such as `rro`). (#3272)
 * Add `runc features` subcommand to allow runc users to detect what features
   runc has been built with. This includes critical information such as
   supported mount flags, hook names, and so on. Note that the output of this
   command is subject to change and will not be considered stable until runc
   1.2 at the earliest. The runtime-spec specification for this feature is
   being developed in opencontainers/runtime-spec#1130. (#3296)
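
The exit code 255 behaviour of runc exec described above can be consumed from
Go roughly as follows. This is a minimal sketch, not code from runc:
`describeCode` and `classify` are hypothetical helper names, and only the
standard `os/exec` API is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// execFailureCode is the exit code that, per the note above, runc exec
// produces when the exec itself failed.
const execFailureCode = 255

// describeCode is a hypothetical helper that turns an exit code into a
// human-readable diagnosis.
func describeCode(code int) string {
	switch code {
	case 0:
		return "command succeeded"
	case execFailureCode:
		return "runc exec failed, or the command itself exited with 255"
	default:
		return fmt.Sprintf("command failed with exit code %d", code)
	}
}

// classify maps the error returned by (*exec.Cmd).Run to a diagnosis.
func classify(err error) string {
	if err == nil {
		return describeCode(0)
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return describeCode(exitErr.ExitCode())
	}
	// No exit status at all: the runc binary could not be started.
	return "could not start runc: " + err.Error()
}

func main() {
	for _, code := range []int{0, 1, 255} {
		fmt.Println(code, "->", describeCode(code))
	}
	// Typical use (not run here, it needs a running container):
	//   err := exec.Command("runc", "exec", "mycontainerid", "true").Run()
	//   fmt.Println(classify(err))
}
```

A command that itself exits with 255 cannot be told apart from a runc exec
failure by the code alone, which is why the sketch words that case
cautiously.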

Changed:
 * system: improve performance of `/proc/$pid/stat` parsing. (#2696)
 * cgroup2: when `/sys/fs/cgroup` is configured as a read-write mount, change
   the ownership of certain cgroup control files (as per
   `/sys/kernel/cgroup/delegate`) to allow for proper deferral to the container
   process. (#3057)
 * docs: series of improvements to man pages to make them easier to read and
   use. (#3032)

Libcontainer API:
 * internal api: remove internal error types and handling system, switch to Go
   wrapped errors. (#3033)
 * New configs.Cgroup structure fields (#3177):
   * Systemd (whether to use systemd cgroup manager); and
   * Rootless (whether to use rootless cgroups).
 * New cgroups/manager package aiming to simplify cgroup manager instantiation.
   (#3177)
 * All cgroup managers' instantiation methods now initialize cgroup paths and
   can return errors. This allows any cgroup manager method (e.g. Exists,
   Destroy, Set, GetStats) to be used right after instantiation, which was
   not possible before (as paths were initialized in Apply only). (#3178)
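
For callers, the switch to Go wrapped errors means failures can be inspected
with the standard `errors.Is` and `errors.As` functions. The sketch below
illustrates the general pattern only; `loadState` and `isNotExist` are
hypothetical names and not part of the libcontainer API.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// loadState is a hypothetical stand-in for a library call that fails. It
// wraps the underlying cause with %w so that callers can still detect it.
func loadState(id string) error {
	return fmt.Errorf("load state for container %q: %w", id, fs.ErrNotExist)
}

// isNotExist reports whether err, or any error it wraps, is fs.ErrNotExist.
func isNotExist(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := loadState("mycontainerid")
	fmt.Println("error:", err)
	fmt.Println("missing container:", isNotExist(err))
}
```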

Fixed:
 * nsenter: do not try to close already-closed fds during container setup and
   bail on close(2) failures. (#3058)
 * runc checkpoint/restore: fixed for containers with an external bind mount
   whose destination is a symlink. (#3047)
 * cgroup: improve openat2 handling for cgroup directory handle hardening.
   (#3030)
 * `runc delete -f` now succeeds (rather than timing out) on a paused
   container. (#3134)
 * runc run/start/exec now refuses a frozen cgroup (paused container in case of
   exec). Users can disable this using `--ignore-paused`. (#3132, #3223)
 * config: do not permit null bytes in mount fields. (#3287)
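
The null byte check on mount fields can be pictured as below. This is a
simplified illustration under stated assumptions, not runc's validator: the
`mount` type and `validateMount` function are hypothetical, and only three
string fields are checked.

```go
package main

import (
	"fmt"
	"strings"
)

// mount holds the string fields of a mount entry that this sketch checks.
// It is a hypothetical type, not libcontainer's configs.Mount.
type mount struct {
	Source      string
	Destination string
	Type        string
}

// validateMount returns an error if any checked field contains a null byte.
func validateMount(m mount) error {
	fields := []struct{ name, value string }{
		{"source", m.Source},
		{"destination", m.Destination},
		{"type", m.Type},
	}
	for _, f := range fields {
		if strings.IndexByte(f.value, 0) >= 0 {
			return fmt.Errorf("mount %s contains a null byte: %q", f.name, f.value)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateMount(mount{Source: "proc", Destination: "/proc", Type: "proc"}))
	fmt.Println(validateMount(mount{Source: "/tmp/a", Destination: "/da\x00ta", Type: "bind"}))
}
```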

Thanks to the following people who made this release possible:

 * Adrian Reber <areber@redhat.com>
 * Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
 * Alban Crequy <alban@kinvolk.io>
 * Aleksa Sarai <cyphar@cyphar.com>
 * Dave Chen <dave.chen@arm.com>
 * flouthoc <flouthoc.git@gmail.com>
 * Fraser Tweedale <ftweedal@redhat.com>
 * Itamar Holder <iholder@redhat.com>
 * Kailun Qin <kailun.qin@intel.com>
 * Kang Chen <kongchen28@gmail.com>
 * Kir Kolyshkin <kolyshkin@gmail.com>
 * lifubang <lifubang@acmcoder.com>
 * Liu Hua <weldonliu@tencent.com>
 * Maksim An <maksiman@microsoft.com>
 * Markus Lehtonen <markus.lehtonen@intel.com>
 * Mauricio Vásquez <mauricio@kinvolk.io>
 * Mengjiao Liu <mengjiao.liu@daocloud.io>
 * Mrunal Patel <mrunal@me.com>
 * Neil Johnson <najohnsn@gmail.com>
 * Odin Ugedal <odin@uged.al>
 * Piotr Resztak <piotr.resztak@gmail.com>
 * Qiang Huang <h.huangqiang@huawei.com>
 * Rodrigo Campos <rodrigo@kinvolk.io>
 * Sascha Grunert <sgrunert@redhat.com>
 * Sebastiaan van Stijn <github@gone.nl>
 * Shengjing Zhu <zhsj@debian.org>
 * xiadanni <xiadanni1@huawei.com>

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
VERSION: release v1.1.0-rc.1

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2 files changed
tree: 396b44682ba9ecf3300d2421e726a36c228b14e1
README.md

runc


Introduction

runc is a CLI tool for spawning and running containers on Linux according to the OCI specification.

Releases

You can find official releases of runc on the release page.

Security

The reporting process and disclosure communications are outlined here.

Security Audit

A third-party security audit was performed by Cure53; you can see the full report here.

Building

runc only supports Linux. It must be built with Go version 1.16 or higher.

In order to enable seccomp support you will need to install libseccomp on your platform.

e.g. libseccomp-devel for CentOS, or libseccomp-dev for Ubuntu

# create a 'github.com/opencontainers' directory in your GOPATH/src
cd github.com/opencontainers
git clone https://github.com/opencontainers/runc
cd runc

make
sudo make install

You can also use go get to install to your GOPATH, assuming that you have a github.com parent folder already created under src:

go get github.com/opencontainers/runc
cd $GOPATH/src/github.com/opencontainers/runc
make
sudo make install

runc will be installed to /usr/local/sbin/runc on your system.

Build Tags

runc supports optional build tags for compiling support of various features, with some of them enabled by default (see BUILDTAGS in top-level Makefile).

To change build tags from the default, set the BUILDTAGS variable for make, e.g. to disable seccomp:

make BUILDTAGS=""

Build Tag   Feature             Enabled by default   Dependency
seccomp     Syscall filtering   yes                  libseccomp

The following build tags were used earlier but are now obsolete:

  • nokmem (since runc v1.0.0-rc94 kernel memory settings are ignored)
  • apparmor (since runc v1.0.0-rc93 the feature is always enabled)
  • selinux (since runc v1.0.0-rc93 the feature is always enabled)

Running the test suite

runc currently supports running its test suite via Docker. To run the suite just type make test.

make test

There are additional make targets for running the tests outside of a container, but this is not recommended because the tests are written with the expectation that they can write and remove files anywhere.

You can run a specific test case by setting the TESTFLAGS variable.

# make test TESTFLAGS="-run=SomeTestFunction"

You can run a specific integration test by setting the TESTPATH variable.

# make test TESTPATH="/checkpoint.bats"

You can run a specific rootless integration test by setting the ROOTLESS_TESTPATH variable.

# make test ROOTLESS_TESTPATH="/checkpoint.bats"

You can run a test using your container engine's flags by setting CONTAINER_ENGINE_BUILD_FLAGS and CONTAINER_ENGINE_RUN_FLAGS variables.

# make test CONTAINER_ENGINE_BUILD_FLAGS="--build-arg http_proxy=http://yourproxy/" CONTAINER_ENGINE_RUN_FLAGS="-e http_proxy=http://yourproxy/"

Dependencies Management

runc uses Go Modules for dependency management. Please refer to the Go Modules documentation for how to add or update dependencies.

# Update vendored dependencies
make vendor
# Verify all dependencies
make verify-dependencies

Using runc

Please note that runc is a low level tool not designed with an end user in mind. It is mostly employed by other higher level container software.

Therefore, unless there is some specific use case that prevents the use of tools like Docker or Podman, it is not recommended to use runc directly.

If you still want to use runc, here's how.

Creating an OCI Bundle

In order to use runc you must have your container in the format of an OCI bundle. If you have Docker installed you can use its export method to acquire a root filesystem from an existing Docker container.

# create the top most bundle directory
mkdir /mycontainer
cd /mycontainer

# create the rootfs directory
mkdir rootfs

# export busybox via Docker into the rootfs directory
docker export $(docker create busybox) | tar -C rootfs -xvf -

After a root filesystem is populated you just generate a spec in the format of a config.json file inside your bundle. runc provides a spec command to generate a base template spec that you are then able to edit. To find features and documentation for fields in the spec please refer to the specs repository.

runc spec

Running Containers

Assuming you have an OCI bundle from the previous step you can execute the container in two different ways.

The first way is to use the convenience command run that will handle creating, starting, and deleting the container after it exits.

# run as root
cd /mycontainer
runc run mycontainerid

If you used the unmodified runc spec template this should give you a sh session inside the container.

The second way to start a container is to use the spec's lifecycle operations. This gives you more control over how the container is created and managed while it is running. It also launches the container in the background, so for the simple examples below you will have to edit config.json to remove the terminal setting (see more details about runc terminal handling). The process field in your config.json should look like the following, with "terminal": false and "args": ["sleep", "5"].

        "process": {
                "terminal": false,
                "user": {
                        "uid": 0,
                        "gid": 0
                },
                "args": [
                        "sleep", "5"
                ],
                "env": [
                        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                        "TERM=xterm"
                ],
                "cwd": "/",
                "capabilities": {
                        "bounding": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "effective": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "inheritable": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "permitted": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "ambient": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ]
                },
                "rlimits": [
                        {
                                "type": "RLIMIT_NOFILE",
                                "hard": 1024,
                                "soft": 1024
                        }
                ],
                "noNewPrivileges": true
        },
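
If you would rather script this edit than make it by hand, something along the following lines may work. It is a minimal sketch: `setBackgroundProcess` is a hypothetical helper, it operates on the config as generic JSON rather than through the runtime-spec Go types, and it writes compact JSON with keys in sorted order, so the original formatting of config.json is not preserved.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// setBackgroundProcess returns a copy of config with process.terminal set to
// false and process.args replaced by args. All other fields are kept.
func setBackgroundProcess(config []byte, args []string) ([]byte, error) {
	var doc map[string]interface{}
	dec := json.NewDecoder(bytes.NewReader(config))
	dec.UseNumber() // keep numbers such as rlimits exactly as written
	if err := dec.Decode(&doc); err != nil {
		return nil, err
	}
	process, ok := doc["process"].(map[string]interface{})
	if !ok {
		return nil, errors.New(`config has no "process" object`)
	}
	process["terminal"] = false
	process["args"] = args
	return json.Marshal(doc)
}

func main() {
	// A cut-down stand-in for the config.json generated by runc spec.
	in := []byte(`{"process": {"terminal": true, "args": ["sh"], "cwd": "/"}}`)
	out, err := setBackgroundProcess(in, []string{"sleep", "5"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```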

Now we can go through the lifecycle operations in your shell.

# run as root
cd /mycontainer
runc create mycontainerid

# verify that the container is created and in the "created" state
runc list

# start the process inside the container
runc start mycontainerid

# after 5 seconds view that the container has exited and is now in the stopped state
runc list

# now delete the container
runc delete mycontainerid
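
A higher level tool driving these steps usually inspects the container status between them. The sketch below assumes a JSON state document with "id" and "status" fields, as defined by the OCI runtime specification and printed by runc state; `containerState`, `parseState` and `nextStep` are hypothetical names, and only the statuses mentioned above plus "running" are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerState mirrors only the two state fields this sketch needs.
type containerState struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

// parseState decodes a JSON state document.
func parseState(data []byte) (containerState, error) {
	var s containerState
	err := json.Unmarshal(data, &s)
	return s, err
}

// nextStep suggests the lifecycle command that follows the current status.
func nextStep(s containerState) string {
	switch s.Status {
	case "created":
		return "runc start " + s.ID
	case "running":
		return "wait for " + s.ID + " to exit"
	case "stopped":
		return "runc delete " + s.ID
	default:
		return "no action for status " + s.Status
	}
}

func main() {
	// Sample input; in practice this would be the output of runc state.
	sample := []byte(`{"id": "mycontainerid", "status": "created"}`)
	s, err := parseState(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(nextStep(s))
}
```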

This allows higher level systems to augment the container's creation logic with the setup of various settings after the container is created and/or before it is deleted. For example, the container's network stack is commonly set up after create but before start.

Rootless containers

runc has the ability to run containers without root privileges. This is called rootless. You need to pass some parameters to runc in order to run rootless containers. See below and compare with the previous version.

Note: In order to use this feature, “User Namespaces” must be compiled and enabled in your kernel. There are various ways to do this depending on your distribution:

  • Confirm CONFIG_USER_NS=y is set in your kernel configuration (normally found in /proc/config.gz)
  • Arch/Debian: echo 1 > /proc/sys/kernel/unprivileged_userns_clone
  • RHEL/CentOS 7: echo 28633 > /proc/sys/user/max_user_namespaces
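
One of these settings can also be checked programmatically before attempting a rootless run. The following is a rough sketch that looks only at /proc/sys/user/max_user_namespaces, where a limit of 0 means no user namespaces can be created; `userNamespacesAllowed` is a hypothetical helper, and the kernel configuration and the distribution-specific unprivileged_userns_clone switch are not covered.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// userNamespacesAllowed interprets the contents of
// /proc/sys/user/max_user_namespaces: any limit above zero permits the
// creation of user namespaces.
func userNamespacesAllowed(content string) (bool, error) {
	limit, err := strconv.Atoi(strings.TrimSpace(content))
	if err != nil {
		return false, fmt.Errorf("parse max_user_namespaces: %w", err)
	}
	return limit > 0, nil
}

func main() {
	data, err := os.ReadFile("/proc/sys/user/max_user_namespaces")
	if err != nil {
		fmt.Println("cannot read limit:", err)
		return
	}
	ok, err := userNamespacesAllowed(string(data))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("user namespaces allowed:", ok)
}
```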

Run the following commands as an ordinary user:

# Same as the first example
mkdir ~/mycontainer
cd ~/mycontainer
mkdir rootfs
docker export $(docker create busybox) | tar -C rootfs -xvf -

# The --rootless parameter instructs runc spec to generate a configuration for a rootless container, which will allow you to run the container as a non-root user.
runc spec --rootless

# The --root parameter tells runc where to store the container state. It must be writable by the user.
runc --root /tmp/runc run mycontainerid

Supervisors

runc can be used with process supervisors and init systems to ensure that containers are restarted when they exit. An example systemd unit file looks something like this.

[Unit]
Description=Start My Container

[Service]
Type=forking
ExecStart=/usr/local/sbin/runc run -d --pid-file /run/mycontainerid.pid mycontainerid
ExecStopPost=/usr/local/sbin/runc delete mycontainerid
WorkingDirectory=/mycontainer
PIDFile=/run/mycontainerid.pid

[Install]
WantedBy=multi-user.target

More documentation

License

The code and docs are released under the Apache 2.0 license.