Back in my 2019 article "The Desktop Security Nightmare"[1], I noted that on most of our desktops, we don't have good control of what data a program can access and when.
=> 1: https://changelog.complete.org/archives/10006-the-desktop-security-nightmare
I noted that we have things like AppArmor, which is something, but not the entire picture. SELinux is so extremely complicated that even Ted Ts'o had a comment about never getting some of his life back.
I don't like complexity, especially when it comes to security.
One of my goals is what I'm going to call context-sensitive security. For instance, I would like the PDF of my taxes to be unavailable to all software... except when I'm working on my taxes. So, the okular PDF viewer shouldn't be able to access my tax files, except when I explicitly say it's OK.
One way to accomplish this, of course, would be to just mount the filesystem containing my taxes when I'm working on taxes, and leave it unmounted otherwise. However, besides the obvious convenience drawback, this has another one: either the files are inaccessible entirely, or they're accessible to the 5000 programs I have in /usr/bin, the untold number of npm packages a person may have installed, and so forth.
What I really want is to be able to say: "make this directory tree available only to this process and its children." And that's what I'm going to lay out in this article.
You are probably already familiar with containers in the sense that they're behind Docker and LXC. A container uses a bunch of Linux[2] namespaces[3] to give the illusion of a separate machine. The namespace types include cgroup, IPC, network, mount, PID, time, user, and UTS. So if, for instance, a process has a separate PID namespace, then the process IDs within it may not show the entire system's PID table, may map to other "real" PIDs, etc. Likewise, with a distinct mount namespace, it may have different filesystems mounted.
=> 2: /linux/ | 3: https://en.wikipedia.org/wiki/Linux_namespaces
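As a small illustration of the idea that every process belongs to one namespace of each type, the kernel exposes them as symbolic links under /proc/PID/ns. The following is a minimal sketch, assuming /proc is mounted; the function name show_namespaces is made up for this example.

```shell
#!/bin/sh
# List the namespaces a process belongs to (default: the current process).
# Each entry under /proc/PID/ns is a symlink whose target has the form
# "mnt:[NUMBER]"; two processes share a namespace of a given type exactly
# when their link targets for that type are identical.
show_namespaces() {
    pid="${1:-self}"
    for ns in /proc/"$pid"/ns/*; do
        # Print the entry name and the namespace identifier, tab-separated.
        printf '%s\t%s\n' "${ns##*/}" "$(readlink "$ns")"
    done
}

show_namespaces
```

Running it in two shells and comparing the mnt and user lines shows whether those shells share a mount and user namespace.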
The trick I'm going to use here is this: you don't have to use all of these as separate namespaces. You can just use a couple, and achieve some nice separation without having a fully-isolated container! And it can be done entirely without root permissions.
For this demonstration, I'm going to use gocryptfs[4]. It is an encrypted filesystem in FUSE[5], which means no root is necessary. You could use anything, though, from a traditional filesystem to other FUSEs, or even bind mounts.
=> 4: https://nuetzlich.net/gocryptfs/ | 5: https://en.wikipedia.org/wiki/Filesystem_in_Userspace
I should note, however, that the in-kernel keyring (used by fscrypt and e4crypt) is not separated out by namespaces, so you can't just unlock a certain tree with e4crypt and expect it to be only unlocked in one namespace.
First, we're going to enter a different namespace. The unshare command will create a separate user namespace (-U, necessary for the mount namespace), a separate mount namespace with -m, and populate the user namespace with our current user with -c. Since I don't give it an explicit command to run, it will run a shell. Here we go:
$ echo $$
873411
$ unshare --keep-caps -Umc
$ echo $$
887896
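The two PIDs differ, but any new shell gets a new PID. A more direct check is to compare the mount namespace identifiers seen outside and inside unshare. This is a minimal sketch, assuming /proc is mounted and unprivileged user namespaces are enabled; the variable names outside and inside are illustrative.

```shell
#!/bin/sh
# Mount namespace identifier of the current (outer) shell.
outside="$(readlink /proc/self/ns/mnt)"

# Same unshare flags as above, but running readlink instead of a shell.
# This fails where unprivileged user namespaces are disabled or where
# unshare is too old to know these flags.
if inside="$(unshare --keep-caps -Umc readlink /proc/self/ns/mnt 2>/dev/null)"; then
    echo "outside: $outside"
    echo "inside:  $inside"
else
    echo "unshare failed; user namespaces may be unavailable here" >&2
fi
```

When unshare works, the two printed identifiers should differ, which is the property the rest of the walkthrough relies on.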
So you can see we're in a different PID, at least. Now let's set up gocryptfs:
$ mkdir crypt plain
$ gocryptfs -init crypt
Choose a password for protecting your files.
Password:
Repeat:
...
The gocryptfs filesystem has been created successfully.
You can now mount it using: gocryptfs crypt MOUNTPOINT
OK. We've made two directories: crypt, which holds the encrypted data, and plain, which holds the plaintext (decrypted) view. We also initialized crypt. Now let's mount it -- remember, we're still in the new namespace:
$ gocryptfs crypt plain
Password:
Decrypting master key
Filesystem mounted and ready.
OK! Now how about creating a file in plain:
$ echo Testing > plain/test
Now, we can see that there's an encrypted file representing it in crypt:
$ ls -l crypt
total 6
-rw-r--r-- 1 jgoerzen jgoerzen  58 Dec 10 06:24 C1kX7S1Lq423tp7QVwdNfA
-r-------- 1 jgoerzen jgoerzen 385 Dec 10 06:24 gocryptfs.conf
-r--r----- 1 jgoerzen jgoerzen  16 Dec 10 06:24 gocryptfs.diriv
And in plain, we have the file:
$ ls -l plain
total 1
-rw-r--r-- 1 jgoerzen jgoerzen 8 Dec 10 06:24 test
$ cat plain/test
Testing
Now, keep this terminal open. Open another one (but not by starting it from this shell). From the other terminal, you can see:
$ ls -l plain
total 0
Yes! The plain directory was completely empty here, because it was mounted only in the other namespace!
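The same observation can be made without listing files. The sketch below assumes the mountpoint utility (part of util-linux on most distributions) is installed; check_view is a made-up helper name.

```shell
#!/bin/sh
# Report whether a directory is a mount point as seen from the mount
# namespace of the calling shell.
check_view() {
    if mountpoint -q "$1"; then
        echo "$1: mounted in this namespace"
    else
        echo "$1: not mounted in this namespace"
    fi
}
```

Calling check_view plain in the namespaced terminal and again in the other terminal should give different answers while the gocryptfs mount is active.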
Now, back in the namespace, let's clean up:
$ fusermount -u plain
$ exit
It's important to unmount plain before exiting the namespace. If you don't, you can't directly umount it from the parent namespace. You would have to kill the gocryptfs process.
Let's create a script called nsrun to make this easier.
#!/bin/bash

# Pass the command to run in the namespace,
# and any parameters, on the command-line.

if [ -z "$1" ]; then
    echo "Syntax: $0 command [args]"
    exit 5
fi

gocryptfs crypt plain || exit "$?"
"$@"
RETVAL="$?"
fusermount -u plain
exit "$RETVAL"
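One possible variation, not part of the script above, is to run the unmount from an EXIT trap so that it also happens when the shell exits early (bash, for example, runs EXIT traps when it is interrupted with Ctrl-C). This is only a sketch: the names mount_view, unmount_view, run_mounted, and nsrun-trap are hypothetical, and the two wrapped commands are the same gocryptfs and fusermount calls used in nsrun.

```shell
#!/bin/sh
# Hypothetical nsrun variant. The two helpers wrap the same commands
# nsrun uses; they are split out only so they are easy to swap.
mount_view()   { gocryptfs crypt plain; }
unmount_view() { fusermount -u plain; }

run_mounted() {
    (
        # Give up with the mount command's exit status if mounting fails.
        mount_view || exit "$?"
        # Once mounted, unmount whenever this subshell exits.
        trap unmount_view EXIT
        # Run the requested command; its status becomes the subshell's.
        "$@"
    )
}

# Run only when a command was given, for example:
#   unshare --keep-caps -Umc ./nsrun-trap ls -l plain
[ "$#" -eq 0 ] || run_mounted "$@"
```

The subshell keeps the trap local to one run, and the exit status of the wrapped command is still what the caller sees.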
Now, run it:
$ chmod a+x nsrun
$ unshare --keep-caps -Umc ./nsrun ls -l plain
Password:
Decrypting master key
Filesystem mounted and ready.
total 1
-rw-r--r-- 1 jgoerzen jgoerzen 8 Dec 10 06:24 test
Excellent! And our script makes sure to unmount the plaintext view before exiting. So now, I could type unshare --keep-caps -Umc ./nsrun okular plain/taxes.pdf or something to view a file that's otherwise unavailable - and it will be only available to the okular process started this way (and any of its child processes)! No other process on the system can see it.
What if we want multiple programs to have access to the data? Note that most filesystems, including gocryptfs, don't really like to have the same data mounted multiple times. There are a couple of options.
We could run something like unshare --keep-caps -Umc ./nsrun bash and launch them all from that shell.
Or, we can simultaneously enter the same namespace multiple times, using a second script that I'll call nsrunenter:
#!/bin/bash

# Pass the command to run in the namespace,
# and any parameters, on the command-line.

if [ -z "$1" ]; then
    echo "Syntax: $0 command [args]"
    exit 5
fi

IDENTIFIER="BLOGDEMO"

until TARGETPID=`pgrep -u "$(id -u)" -n -f "^/usr/bin/gocryptfs.* -fsname $IDENTIFIER "`; do
    echo "$IDENTIFIER not mounted; mounting."
    unshare --keep-caps -Umc /usr/bin/gocryptfs -fsname "$IDENTIFIER" crypt plain
done

echo "Entering namespace at PID $TARGETPID"

# gocryptfs likes to see at least one read before it permits writes, so do that here.
nsenter --preserve-credentials -U -m -t "$TARGETPID" ls "$(pwd)/plain" > /dev/null

exec nsenter --preserve-credentials -U -m -t "$TARGETPID" /usr/bin/env "--chdir=$(pwd)" "$@"
So this is working a bit differently. It's going to first mount the filesystem in its own namespace, then just let it hang there.
Then, we figure out the PID of the gocryptfs command, using a (presumably-unique) identifier to differentiate it from other potential gocryptfs instances. Now, by using nsenter, we can launch a new command in the namespace we created earlier, which is the only way we can access the files.
In this case, we keep reusing the existing mount until we're done with it. Note that it will be necessary to kill the gocryptfs process in the end when we're done, since nothing here is going to unmount it.
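That final kill can reuse the same pgrep lookup. The following is a minimal sketch; find_view_pid and stop_view are made-up helper names, and it assumes that gocryptfs unmounts and exits when it receives the default SIGTERM, which is worth verifying on your own system.

```shell
#!/bin/sh
IDENTIFIER="BLOGDEMO"

# Same pgrep pattern as the script above: the newest gocryptfs process of
# this user that was started with "-fsname IDENTIFIER".
find_view_pid() {
    pgrep -u "$(id -u)" -n -f "^/usr/bin/gocryptfs.* -fsname $1 "
}

# Send SIGTERM to that process, or report that there is nothing to stop.
stop_view() {
    if pid="$(find_view_pid "$1")"; then
        echo "Stopping gocryptfs at PID $pid"
        kill "$pid"
    else
        echo "$1 is not mounted" >&2
        return 1
    fi
}
```

When finished with the data, stop_view "$IDENTIFIER" would then end the mount that the script left running.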
Watch how it works:
$ ./nsrunenter bash
BLOGDEMO not mounted; mounting.
Password:
Decrypting master key
Filesystem mounted and ready.
Entering namespace at PID 919929
$ cat plain/test
Testing
$ exit
exit
$ cat plain/test
cat: plain/test: No such file or directory
$ ./nsrunenter bash
Entering namespace at PID 919929
$ cat plain/test
Testing
$ exit
exit
$ cat plain/test
cat: plain/test: No such file or directory
So here, the first time we called our new script, it mounted the gocryptfs filesystem, and then ran bash inside the namespace we created for it. After exiting from that namespace, of course we couldn't see our test file.
The second time we called the script, it detected the existing namespace and joined it. Again, the command worked.
You might be thinking, "well, if I can just nsenter the namespace, what good is this?" One of the principles of Computer Security[6] is defense in depth[7]; that is, multiple lines of defenses.
=> 6: /computer-security/ | 7: https://en.wikipedia.org/wiki/Defense_in_depth_(computing)
The premise of this whole post is to add protections in case malicious code is executed in your account. That is, one of your lines of defense has already failed. Here's what we're adding:
You could bolster this further by running the unshare and nsenter under sudo, so that the local user wouldn't be able to enter the namespace without authenticating. This has some tradeoffs (greater complexity for sure), but it raises the bar: an attacker would have to fool the user into authenticating to sudo.
So, while this approach isn't absolutely perfect, it is another line of defense alongside the ones you should already have.
(sudo unshare will expose all of root's files - probably not what you want! Do this carefully!)
gocryptfs -extpass ssh-askpass
(c) 2022-2024 John Goerzen