tag | cdf9fb148d075932d8cfc4a06b92dbdca5f3244a | |
---|---|---|
tagger | Aleksa Sarai <asarai@suse.de> | Wed Nov 21 04:35:08 2018 |
object | ccb5efd37fb7c86364786e9137e22948751de7ed | |
v1.0.0~rc6

This is the final feature release of runc before 1.0, rather than 1.0 itself. The reason for this is that, during the preparations for this release (which was originally meant to be 1.0), it was brought up that there were several spec-compliance problems. One of these was related to hook ordering, and upon trying to fix it, it turned out that many users (notably the NVIDIA OCI hooks) make use of our incorrect hook ordering. Many of the proposed solutions to this problem require a lot of time and co-ordination, and thus would stall this release indefinitely.

So, the idea is to have an intermediate release which will mark a freeze-on-everything-except-spec-compliance-bugs. No other changes will be included pre-1.0 (aside from security patches, obviously).

Features:

+ Upgrade to using Go 1.10. #1711
+ Upgrade to CRIU 3.11. #1711 #1864 #1935 #1936
+ Allow for checkpoint-restore into a foreign network namespace. #1849
+ The "type" field for bind-mounts is now ignored. This is important, because many users incorrectly assume that "type" defines a bind-mount and not "options". Previously you had to set both. #1753 #1845
+ "setgroups=allow" is now possible in rootless mode, but requires the use of the privileged newgidmap helper (fully-rootless still requires "setgroups=deny"). #1693
+ Rootless mode can now safely ignore a read-only cgroupfs. #1759 #1806
+ Several aspects of rootless mode are now used inside user namespaces. This is necessary for a bunch of useful things (such as running Docker inside a user namespace), but did cause some breakages. We think they've all been fixed -- but if not please submit an issue! #1688 #1808 #1816 #1862
+ Improve kernel.{domain,host}name sysctl handling, to allow the NIS domainname to be set from Docker or other callers without an OCI spec change. #1827
+ Add documentation for one of the more confusing parts of runc, how terminals are handled (including an explanation of --console-socket). All the gory details and recommendations are available in docs/terminals.md. #1730
+ Allow /proc to be bind-mounted over (useful for rootless containers). #1832
+ Ignore ENOSYS for keyctl(2) operations. This is necessary to get Docker working with LXC under the default seccomp profile (which is what ChromeOS uses). #1893
+ Add support for the Intel RDT/MBA resource control system. #1632 #1913
+ Allow building with completely-disabled kmemcg support, to get around problems with broken kernels (RHEL 7.5 can oops with kmemcg accounting enabled). #1921 #1922 #1930
+ Add support for cgroup namespaces, which in turn fixes a few other issues we encountered with the previous code (which could be moving us to a cgroup during Go execution). #1916

Fixes:

* Namespace creation with user namespaces now plays a bit nicer with SELinux and IPC (which had a bug where the in-kernel mqueue mount would have the wrong tag if using unshare(CLONE_NEWUSER|CLONE_NEWIPC)). This is done to avoid future problems with broken kernel integration. #1562
* Mild refactor of libcontainer/user. #1749
* Fix null-pointer-exception when no cgroups were set. #1752
* Various DBus and systemd related changes for the systemd-cgroup driver. #1754 #1772 #1776 #1781 #1805 #1917
* Apply SELinux label to masked directories. #1756
* Obey the XDG spec and set the sticky bit on runc's root when using XDG_RUNTIME_DIR (in rootless mode). #1760
* Only configure network namespaces if we are creating them. #1777
* Fix race in runc-exec against a currently-exiting pid1. #1812
* Forward GOMAXPROCS to try to reduce the number of threads started by 'runc init'. Unfortunately there's no way to stop Go from spawning new threads so this is more of a recommendation. #1830
* Fix tmpcopyup in cases where /tmp is not a private mount. #1873
* Whitelist /proc/loadavg for bind-mounting. #1882
* Protect against deletion of runc state directory with a containerid of "..", as well as the addition of other path hardening code. #1883
* Handle duplicated cgroupfs mountpoint entries more sanely, to make runc work on distributions that use-and-abuse shared subtrees. #1817
* Fix console hanging in several cases. #1895 #1897
* Lock-to-a-thread during 'runc init' to ensure that we don't switch threads and run within a different SELinux label. #1814
* Respect cgroupPath when trying to find the cgroupfs mountpoint (which can happen in cases where containers are given different cgroupfs mounts). #1872
* And many other minor changes, many from first-time contributors! #1746 #1748 #1749 #1784 #1779 #1785 #1796 #1819 #1825 #1836 #1824 #1820 #1838 #1840 #1841 #1867 #1871 #1855 #1854 #1874 #1868 #1886 #1892 #1858 #1894 #1908 #1880 #1910 #1915 #1903 #1922 #1926 #1928 #1925 #1911

Fixes (for spec violations):

* Don't set a container to "running" when exec-ing into it (because it might be in the "created" state). #1771
* oom_score_adj is now no longer modified if it was unspecified in config.json (this was a spec violation). #1759
* Set "status" in hook stdin, as well as switch to using *spec.State to avoid JSON-representation drift. #1741

Thanks to all of the contributors that made this release possible:

* Ace-Tang <aceapril@126.com>
* Adrian Reber <areber@redhat.com>
* Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
* Alban Crequy <alban@kinvolk.io>
* Aleksa Sarai <asarai@suse.de>
* Alex Glikson <alex.glikson@gmail.com>
* Andrei Vagin <avagin@virtuozzo.com>
* Antonio Murdaca <runcom@redhat.com>
* Bin Chen <nk@devicu.com>
* ChangFeng <changfeng@pinduoduo.com>
* Chris Aniszczyk <caniszczyk@gmail.com>
* Danail Branekov <danailster@gmail.com>
* Daniel, Dao Quang Minh <dqminh89@gmail.com>
* Daniel J Walsh <dwalsh@redhat.com>
* Denys Smirnov <denys@sourced.tech>
* Derek Carr <decarr@redhat.com>
* dlorenc <lorenc.d@gmail.com>
* Dmitry Smirnov <onlyjob@member.fsf.org>
* Dominik Süß <dominik@suess.wtf>
* Filipe Brandenburger <filbranden@google.com>
* Giuseppe Scrivano <gscrivan@redhat.com>
* Harald Nordgren <haraldnordgren@gmail.com>
* Jay Kamat <jaygkamat@gmail.com>
* Jonathan Marler <johnnymarler@gmail.com>
* Kenta Tada <Kenta.Tada@sony.com>
* Kir Kolyshkin <kolyshkin@gmail.com>
* Lifubang <lifubang@acmcoder.com>
* Lin Yang <lin.a.yang@intel.com>
* Marco Vedovati <mvedovati@suse.com>
* Michael Crosby <crosbymichael@gmail.com>
* Mike Brown <brownwm@us.ibm.com>
* Mrunal Patel <mrunalp@gmail.com>
* Nalin Dahyabhai <nalin@redhat.com>
* Qiang Huang <h.huangqiang@huawei.com>
* Sebastien Boeuf <sebastien.boeuf@intel.com>
* Sergio Lopez <slp@redhat.com>
* Tamal Saha <tamal@appscode.com>
* Tibor Vass <tibor@docker.com>
* vikaschoudhary16 <choudharyvikas16@gmail.com>
* Vincent Batts <vbatts@hashbangbash.com>
* W. Trevor King <wking@tremily.us>
* Xiaochen Shen <xiaochen.shen@intel.com>
* Yan Zhu <yanzhu@alauda.io>
* Yuanhong Peng <pengyuanhong@huawei.com>

Signed-off-by: Aleksa Sarai <asarai@suse.de>
commit | ccb5efd37fb7c86364786e9137e22948751de7ed | |
---|---|---|
author | Aleksa Sarai <asarai@suse.de> | Wed Nov 21 02:54:59 2018 |
committer | Aleksa Sarai <asarai@suse.de> | Wed Nov 21 02:54:59 2018 |
tree | c42d7ba1ef3a67b876ee4fe6c2f1c22bd0ec62b3 | |
parent | 73856f6d6fc8e03be1794c2146c25f2b545f95cf | |
VERSION: release v1.0.0~rc6

Signed-off-by: Aleksa Sarai <asarai@suse.de>
runc is a CLI tool for spawning and running containers according to the OCI specification.
runc depends on and tracks the runtime-spec repository. We will try to make sure that runc and the OCI specification major versions stay in lockstep. This means that runc 1.0.0 should implement the 1.0 version of the specification.
You can find official releases of runc on the release page.
If you wish to report a security issue, please disclose the issue responsibly to security@opencontainers.org.
runc currently supports the Linux platform on various architectures. It must be built with Go version 1.6 or higher in order for some features to function properly.
In order to enable seccomp support you will need to install libseccomp on your platform, e.g. libseccomp-devel for CentOS, or libseccomp-dev for Ubuntu.
Otherwise, if you do not want to build runc with seccomp support you can add BUILDTAGS="" when running make.
# create a 'github.com/opencontainers' in your GOPATH/src
cd github.com/opencontainers
git clone https://github.com/opencontainers/runc
cd runc
make
sudo make install
You can also use go get to install to your GOPATH, assuming that you have a github.com parent folder already created under src:
go get github.com/opencontainers/runc
cd $GOPATH/src/github.com/opencontainers/runc
make
sudo make install
runc will be installed to /usr/local/sbin/runc on your system.
runc supports optional build tags for compiling support of various features. To pass build tags to make, set the BUILDTAGS variable.
make BUILDTAGS='seccomp apparmor'
Build Tag | Feature | Dependency |
---|---|---|
seccomp | Syscall filtering | libseccomp |
selinux | SELinux process and mount labeling | |
apparmor | AppArmor profile support | |
ambient | ambient capability support | kernel 4.3 |
nokmem | disable kernel memory accounting | |
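Because BUILDTAGS is a free-form string, a misspelled tag is easy to miss. The sketch below checks a tag list against the names in the table before invoking make. `check_buildtags` and `KNOWN_TAGS` are hypothetical names introduced here for illustration; they are not part of runc's Makefile.

```shell
# Tags listed in the table above (an assumption: the table is complete).
KNOWN_TAGS="seccomp selinux apparmor ambient nokmem"

# check_buildtags is a hypothetical helper, not something runc ships.
# $1: space-separated list of tags, e.g. "seccomp apparmor"
check_buildtags() {
    for tag in $1; do
        case " $KNOWN_TAGS " in
            *" $tag "*) ;;    # tag appears in the table
            *)
                echo "unknown build tag: $tag" >&2
                return 1
                ;;
        esac
    done
    return 0
}

# Usage: only run make when every tag is recognised.
# check_buildtags "seccomp apparmor" && make BUILDTAGS='seccomp apparmor'
```

An empty list passes the check, which matches the BUILDTAGS="" case used to build without seccomp.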
runc currently supports running its test suite via Docker. To run the suite just type make test.
make test
There are additional make targets for running the tests outside of a container but this is not recommended as the tests are written with the expectation that they can write and remove anywhere.
You can run a specific test case by setting the TESTFLAGS variable.
# make test TESTFLAGS="-run=SomeTestFunction"
You can run a specific integration test by setting the TESTPATH variable.
# make test TESTPATH="/checkpoint.bats"
You can run a test in your proxy environment by setting the DOCKER_BUILD_PROXY and DOCKER_RUN_PROXY variables.
# make test DOCKER_BUILD_PROXY="--build-arg HTTP_PROXY=http://yourproxy/" DOCKER_RUN_PROXY="-e HTTP_PROXY=http://yourproxy/"
runc uses vndr for dependency management. Please refer to vndr for how to add or update dependencies.
In order to use runc you must have your container in the format of an OCI bundle. If you have Docker installed you can use its export method to acquire a root filesystem from an existing Docker container.
# create the top most bundle directory
mkdir /mycontainer
cd /mycontainer
# create the rootfs directory
mkdir rootfs
# export busybox via Docker into the rootfs directory
docker export $(docker create busybox) | tar -C rootfs -xvf -
After a root filesystem is populated you just generate a spec in the format of a config.json file inside your bundle. runc provides a spec command to generate a base template spec that you are then able to edit. To find features and documentation for fields in the spec please refer to the specs repository.
runc spec
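At this point the bundle directory should hold both a rootfs directory and a config.json file. A quick sanity check before handing the bundle to runc might look like the sketch below. `check_bundle` is a hypothetical helper, not a runc subcommand, and it assumes the layout used in this walkthrough (root filesystem in `rootfs`, spec in `config.json`).

```shell
# check_bundle is a hypothetical helper for illustration only.
# $1: path to the bundle directory, e.g. /mycontainer
check_bundle() {
    bundle=$1
    if [ ! -d "$bundle/rootfs" ]; then
        echo "missing root filesystem: $bundle/rootfs" >&2
        return 1
    fi
    if [ ! -f "$bundle/config.json" ]; then
        echo "missing spec: $bundle/config.json (run 'runc spec' in $bundle)" >&2
        return 1
    fi
    echo "bundle looks complete: $bundle"
}

# Usage:
# check_bundle /mycontainer
```

The check only looks at the two paths; it does not validate the contents of config.json against the runtime spec.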
Assuming you have an OCI bundle from the previous step you can execute the container in two different ways.
The first way is to use the convenience command run that will handle creating, starting, and deleting the container after it exits.
# run as root
cd /mycontainer
runc run mycontainerid
If you used the unmodified runc spec template this should give you a sh session inside the container.
The second way to start a container is using the spec's lifecycle operations. This gives you more power over how the container is created and managed while it is running. This will also launch the container in the background, so you will have to edit the config.json to remove the terminal setting for the simple examples here. Your process field in the config.json should look like the following, with "terminal": false and "args": ["sleep", "5"].
"process": {
    "terminal": false,
    "user": {
        "uid": 0,
        "gid": 0
    },
    "args": [
        "sleep",
        "5"
    ],
    "env": [
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "TERM=xterm"
    ],
    "cwd": "/",
    "capabilities": {
        "bounding": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ],
        "effective": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ],
        "inheritable": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ],
        "permitted": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ],
        "ambient": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ]
    },
    "rlimits": [
        {
            "type": "RLIMIT_NOFILE",
            "hard": 1024,
            "soft": 1024
        }
    ],
    "noNewPrivileges": true
},
Now we can go through the lifecycle operations in your shell.
# run as root
cd /mycontainer
runc create mycontainerid
# view the container is created and in the "created" state
runc list
# start the process inside the container
runc start mycontainerid
# after 5 seconds view that the container has exited and is now in the stopped state
runc list
# now delete the container
runc delete mycontainerid
This allows higher level systems to augment the container's creation logic with setup of various settings after the container is created and/or before it is deleted. For example, the container's network stack is commonly set up after create but before start.
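The two steps of this section, editing the template and running a setup step between create and start, can be sketched as shell functions. Both `detach_process` and `start_with_setup` are hypothetical helpers written for this example. `detach_process` assumes config.json is the unmodified `runc spec` template, with `"terminal": true` and `"sh"` as the only argument; the setup command is whatever the caller supplies.

```shell
# detach_process: hypothetical helper that rewrites an unmodified
# `runc spec` template so the container runs "sleep 5" without a terminal.
# $1: path to config.json
detach_process() {
    config=$1
    tmp="$config.tmp"
    # First expression flips the terminal flag; the second replaces "sh"
    # only between the opening and closing bracket of the args array.
    sed -e 's/"terminal": *true/"terminal": false/' \
        -e '/"args": *\[/,/]/ s/"sh"/"sleep", "5"/' \
        "$config" > "$tmp" && mv "$tmp" "$config"
}

# start_with_setup: hypothetical wrapper around the lifecycle commands.
# $1: container ID; remaining arguments: setup command to run after
# `runc create` and before `runc start` (e.g. a network setup script).
start_with_setup() {
    id=$1
    shift
    runc create "$id" || return 1
    # The container exists but its process has not started yet, so this
    # is the window in which a higher-level system can prepare it.
    if ! "$@"; then
        echo "setup failed, deleting $id" >&2
        runc delete "$id"
        return 1
    fi
    runc start "$id"
}

# Usage (as root, inside the bundle directory):
# detach_process config.json
# start_with_setup mycontainerid ./setup-network.sh mycontainerid
```

Here `./setup-network.sh` is a placeholder for the caller's own script. Deleting the container when setup fails is a design choice of this sketch, so that a half-configured container is not left in the "created" state.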
runc has the ability to run containers without root privileges. This is called rootless. You need to pass some parameters to runc in order to run rootless containers. See below and compare with the previous version. Run the following commands as an ordinary user:
# Same as the first example
mkdir ~/mycontainer
cd ~/mycontainer
mkdir rootfs
docker export $(docker create busybox) | tar -C rootfs -xvf -
# The --rootless parameter instructs runc spec to generate a configuration for a rootless container, which will allow you to run the container as a non-root user.
runc spec --rootless
# The --root parameter tells runc where to store the container state. It must be writable by the user.
runc --root /tmp/runc run mycontainerid
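Since the directory given to --root must be writable by the calling user, a script can check that up front. The sketch below is one possible approach: `pick_state_root` is a hypothetical helper, and preferring XDG_RUNTIME_DIR over /tmp is an assumption of this example, not documented runc behaviour.

```shell
# pick_state_root: hypothetical helper that prints a state directory
# under a base directory the current user can write to.
pick_state_root() {
    # Assumption: use the per-user runtime dir when set, else /tmp.
    base=${XDG_RUNTIME_DIR:-/tmp}
    if [ ! -w "$base" ]; then
        echo "$base is not writable by $(id -un)" >&2
        return 1
    fi
    echo "$base/runc"
}

# Usage (as an ordinary user, inside the bundle directory):
# runc --root "$(pick_state_root)" run mycontainerid
```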
runc can be used with process supervisors and init systems to ensure that containers are restarted when they exit. An example systemd unit file looks something like this.
[Unit]
Description=Start My Container

[Service]
Type=forking
ExecStart=/usr/local/sbin/runc run -d --pid-file /run/mycontainerid.pid mycontainerid
ExecStopPost=/usr/local/sbin/runc delete mycontainerid
WorkingDirectory=/mycontainer
PIDFile=/run/mycontainerid.pid

[Install]
WantedBy=multi-user.target
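The container ID appears four times in that unit and the bundle path once, so generating the file from two parameters avoids mismatches. The following is a minimal sketch; `write_unit` is a hypothetical helper, and the settings are copied from the example unit rather than being recommendations.

```shell
# write_unit: hypothetical helper that prints the example unit for a
# given container ID and bundle directory.
# $1: container ID, $2: bundle directory
write_unit() {
    id=$1
    bundle=$2
    cat <<EOF
[Unit]
Description=Start My Container

[Service]
Type=forking
ExecStart=/usr/local/sbin/runc run -d --pid-file /run/${id}.pid ${id}
ExecStopPost=/usr/local/sbin/runc delete ${id}
WorkingDirectory=${bundle}
PIDFile=/run/${id}.pid

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (the output file name is an example):
# write_unit mycontainerid /mycontainer > mycontainer.service
```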
The code and docs are released under the Apache 2.0 license.