blob: 786ba0c6961aa3c87c47cbaf2ec572f6733cb5aa [file] [log] [blame]
# Safe filesystem access
## Background
Native Client is a useful library for restricted code execution that is useful
beyond the browser. There’s all sorts of cases where one might want to run
untrusted code in a system disconnected from Chromium or PPAPI.
In these cases, it is additionally useful to allow restricted filesystem
access, but to still run untrusted code.
## Constraints
These constraints are NOT GUARANTEED TO BE ENFORCED BY SEL_LDR, and must
be guaranteed by the caller:
* The mounted directory is assumed to not include any symlinks.
These constraints will be enforced by sel_ldr:
* Access to filesystem within sel_ldr will behave as if as if the mounted
directory is root.
## Requirements
* Very low overhead File I/O.
* Give users the ability to have selective read-only or read/write access.
* Give users control over the entire filesystem layout presented to the
untrusted application.
## Feature user experience
This feature adds one additional commandline argument `-m` to `sel_ldr`. `-m`
takes a path to the directory to be used as the root by the untrusted
application.
All of the given requirements can be satisfied by this `chroot`-style
interface:
* The only overhead for file I/O is in adding (and sanitizing) a path prefix
to absolute paths passed through to the host.
* Read-only or read/write access can be controlled using normal filesystem
permissions for the user running the `sel_ldr` process.
* Using host-side filesystem primitives such as Linux bind mounts, users can
map disparate paths from the host into the untrusted process's root.
## Implementation
For the most part, this feature is simple and relatively straightforward to
add. When `-m` is given (even if `-a` is not), file operations will be enabled
and the provided chroot path will be added as a path prefix to all absolute
paths passed through to the host. Paths will be sanitized to make sure no
escapes are made with `..` path elements.
Given that strategy, the following syscall changes were straightforward:
* `open`
* `stat`
* `mkdir`
* `rmdir`
* `unlink`
* `truncate`
* `link`
* `rename`
* `chmod`
* `access`
* `utimes`
### Path sanitization
Path sanitization does the following:
* It checks that the cwd is within the mounted directory (set at
initialization).
* If the user's path is relative, it prefixes the cwd (within the mounted
directory) to the relative path, making the path absolute.
* It canonicalizes the absolute path ("..", ".", and "//" parts are handled and
removed).
* It prefixes the canonicalized path with the path to the mounted directory.
As a consequence of relying on the cwd, if the cwd changes during a filename
call on a relative path, the filename access can fail because the getcwd()-based
path could become invalid.
### Symlinks
Ideally, symlinks should behave as if we had just chrooted into the path given
to `-m` (which means they would resolve relative to the untrusted root).
However, it's not straightforward to have `sel_ldr` call `chroot` itself to
make the kernel do appropriate symlink resolution in a cross-platform way. So
our other option to support symlinks is to do an Lstat on every path load and
try to resolve symlinks ourselves. This would be a pretty big performance hit
and a significant amount of complication needing auditing. We wouldn't be able
to just pass through sanitized paths to the host's open call without making
sure the untrusted app hadn't created a symlink pointing elsewhere to jump out
of jail.
So instead, for v1 we're just not supporting symlinks.
The following syscalls relate to symlinks.
* `lstat` - identical to stat, since there should be no symlinks inside mounted
directory.
* `symlink` - always fails.
* `readlink` - always fails.
### Current working directory
Being able to read the current working directory (aside from `readlink`,
previously discussed) is the only syscall that has a path going from the host
back to the untrusted application. All other syscalls are one way - paths go
from the untrusted application out to the host.
This affects the following two syscalls:
* `chdir` - always adds the path prefix.
* `getcwd` - checks if the path is sanitized and starts with the path
prefix. If it doesn't start with the path prefix this syscall fails.
### File descriptor syscalls
These syscalls require no changes and are safe to just enable like they would
be with `-a` when `-m` is passed:
* `getdents` - `..` might reference an inode outside of the untrusted
root, but that's okay because the untrusted application can't do anything
with it.
* `fstat` - no different than `stat`, which is already allowed.
For both of these calls, we might need to make sure there's no file descriptors
in the untrusted application that reference files outside of the untrusted
root, but even if we don't there's very little the untrusted application could
do with the data these calls return.