Ancestors

Written by Lennart Poettering on 2024-12-13 at 10:10

3️⃣3️⃣ Here's the 33rd post highlighting key new features of the current v257 release of systemd. #systemd257

systemd's service logic provides the RuntimeDirectory=, StateDirectory=, CacheDirectory= and LogsDirectory= settings to declare well-defined directories that services can place their data into.

This is nice because it means there's a clear association between a service and its mutable resources, so that they can be reasonably life-cycle bound together. "systemctl clean" for example can be used…

=> More informations about this toot | More toots from pid_eins@mastodon.social

Written by Lennart Poettering on 2024-12-13 at 10:14

…to empty out a specific directory of these for a unit.

But more importantly: there's a security angle to it. Because the service manager runs privileged, it can set up these directories on service activation and chown() them appropriately so that an unprivileged service can then make use of them.

This works great. But because we wanted life to be exciting we complicated the whole thing: back in v232 we added the DynamicUser=1 concept to service management.

Written by Lennart Poettering on 2024-12-13 at 10:17

If set, a UID is dynamically allocated for a service when it starts and released again when it stops. This is fantastic for many payloads, as it means UID-based security isolation is available cheaply without having to pre-allocate everything statically. You can just fire off a quick service with its own UID here and there, and this does not result in "sticky" UID allocations in /etc/passwd.

RuntimeDirectory=/StateDirectory=/CacheDirectory=/LogsDirectory= are really useful in the…

Written by Lennart Poettering on 2024-12-13 at 10:19

…context of DynamicUser=1: they allow such services to have persistent directories on disk, that are properly owned by the short-lived UID. Implementing this comes with some ugliness however: what happens if I reference some state directory from a DynamicUser=1 service today where it will get UID X assigned, and tomorrow when I start it again it will get UID Y assigned?

Our solution was pragmatic: when this happens we recursively re-chown() the referenced directories…

Toot

Written by Lennart Poettering on 2024-12-13 at 10:26

…on service activation if needed. It has worked like that ever since.

Re-chown()-ing is not ideal though. Effectively, in most cases it's sufficiently fast to not be annoying, but for services with very complex directory trees with millions of inodes this might come at a prohibitive time penalty.
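To make the cost concrete, here is a minimal Python sketch of a recursive re-chown. It is not systemd's actual code: the function name `rechown_tree`, the injectable `chown` parameter and the "skip if the top-level directory already has the right owner" shortcut are assumptions made for illustration. The point is only that the work grows with the number of inodes in the tree.

```python
import os
from typing import Callable

def rechown_tree(
    root: str,
    uid: int,
    gid: int,
    chown: Callable[[str, int, int], None] = os.lchown,
) -> int:
    """Recursively give `root` and everything below it to uid:gid.

    Returns the number of inodes touched. Hypothetical helper, for
    illustration only.
    """
    st = os.lstat(root)
    if st.st_uid == uid and st.st_gid == gid:
        # Assumed shortcut: a correctly owned top-level directory means
        # the tree was already handled for this UID/GID.
        return 0

    touched = 0
    # Bottom-up walk: every file and subdirectory is visited exactly once,
    # as an entry of its parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            chown(os.path.join(dirpath, name), uid, gid)
            touched += 1

    # The top-level directory goes last, so that an interrupted run is
    # detected (and repeated) by the ownership check above.
    chown(root, uid, gid)
    touched += 1
    return touched
```

With the default `os.lchown` this needs privileges to hand files to another UID; passing a recording function as `chown` allows a dry run that only counts how many inodes would be touched.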

On modern kernels there's new functionality to make this situation better: idmapped mounts. They allow the UIDs/GIDs stored on disk to be remapped before being made visible to applications.

Descendants

Written by Lennart Poettering on 2024-12-13 at 10:28

And that's awesome in this context: it means we never have to chown() anything: we can leave the inodes as is, but dynamically mount them to the right ownership in a trivial operation. Yay!

With v257 this is now hooked up. This not only brings efficiency, but also security: we made it so that the files on disk are now owned by the "nobody" user/group, i.e. the special UID/GID that the kernel uses for "unmapped" users/groups. Only during the lifetime of the DynamicUser=1 service they…

Written by Lennart Poettering on 2024-12-13 at 10:31

…will be mapped transiently to the right dynamic UID/GID.
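The remapping idea can be modeled in a few lines of Python. This is a conceptual sketch only, not the kernel interface and not systemd's implementation: `IdMapRange`, `visible_uid`, `stored_uid` and the dynamic UIDs 61234 and 63001 are hypothetical names and values chosen for illustration. The only thing taken from the text above is that files stay owned by "nobody" (conventionally 65534) on disk and are shown under the service's dynamic UID while it runs.

```python
from dataclasses import dataclass
from typing import List, Optional

NOBODY = 65534  # conventional "nobody" / unmapped UID on Linux

@dataclass(frozen=True)
class IdMapRange:
    """One contiguous mapping range (hypothetical model type)."""
    on_disk_start: int   # first UID as stored in the inode
    visible_start: int   # first UID as seen through the mount
    length: int          # number of consecutive UIDs covered

def visible_uid(on_disk: int, idmap: List[IdMapRange]) -> int:
    """UID an application sees for a file stored with UID `on_disk`."""
    for r in idmap:
        if r.on_disk_start <= on_disk < r.on_disk_start + r.length:
            return r.visible_start + (on_disk - r.on_disk_start)
    return NOBODY  # anything outside the map shows up as unmapped

def stored_uid(visible: int, idmap: List[IdMapRange]) -> Optional[int]:
    """UID written to disk when a process running as `visible` creates a file.

    Returns None if the UID has no mapping; in this model such an
    operation would be refused.
    """
    for r in idmap:
        if r.visible_start <= visible < r.visible_start + r.length:
            return r.on_disk_start + (visible - r.visible_start)
    return None

# The same on-disk state, seen through two different service runs
# (hypothetical dynamic UIDs):
today = [IdMapRange(NOBODY, 61234, 1)]
tomorrow = [IdMapRange(NOBODY, 63001, 1)]
print(visible_uid(NOBODY, today), visible_uid(NOBODY, tomorrow))
```

Nothing on disk changes between the two runs; only the single-entry map attached to the mount differs, which is why no per-inode work is needed.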

This also opens another door for us: we can eventually allow sharing of such directories between two DynamicUser=1 services that run with distinct UIDs: on disk all their files will be owned by "nobody", but each service they are associated with will see them as if it owned them personally, even though each of these services runs under a different UID.

For compatibility with old kernels we retain the chown() logic for now.

Written by Lennart Poettering on 2024-12-13 at 10:33

But hopefully sooner or later we never have to rely on it again, and can instead use idmapped mounts in all cases.

Features like this are just wonderful: they simplify the logic, speed it up, increase security and enable new functionality. How often do you get the chance to tick all those four boxes at once?

Written by cesarb on 2024-12-13 at 22:46

@pid_eins I thought we were supposed to avoid having any files owned by nobody/nogroup for some NFS reason (something about it making the files unexpectedly accessible to unknown NFS users)?

=> More informations about this toot | More toots from cesarb@fosstodon.org

Written by Lennart Poettering on 2024-12-14 at 07:11

@cesarb don't run code as "nobody", that'd be a terrible idea.

But I think for this case "nobody" makes sense, because the files on disk shall not be accessible to anyone without the idmapping applied. Only once they are remapped to some reasonable UID shall they be accessible. And that is kinda what "nobody" is typically about: files that aren't mappable locally are owned by that.

Written by Lennart Poettering on 2024-12-14 at 07:12

@cesarb "nobody" has at least two uses right now: unmapped ownership in NFS id mapping context, and unmapped ownership in userns mapping context. This here would be a 3rd use, and really close to the userns one in fact.

Written by Elias Probst on 2024-12-13 at 10:46

@pid_eins TIL about idmapped mounts 🎊🥳

=> More informations about this toot | More toots from eliasp@mastodon.social

Written by Ihor Kalnytskyi on 2024-12-13 at 11:05

@pid_eins This is nice. What about BindPaths= ? Is there going to be a way to remap owners when binding directories from the host?

=> More informations about this toot | More toots from ihor@fosstodon.org

Written by Lennart Poettering on 2024-12-14 at 07:19

@ihor you mean having a knob to apply arbitrary remappings? do you have a specific usecase for that?

I am not a fan of overly fine-grained controls for things like this, I must say...

Written by Ihor Kalnytskyi on 2024-12-14 at 12:32

@pid_eins Yeah, that's what I meant. My use case is a simple one: I want to sandbox my backup solution, and only grant access to the files and/or directories it needs to back up.

Today, I either have to run this solution as 'root' (so it has access to everything) or set it up multiple times for different users on my system (which grows linearly with the number of backup targets).

It'd be nice if I could use DynamicUser instead, but grant access to backup targets mounted via BindPaths.

Written by Elias Probst on 2024-12-16 at 08:48

@ihor exactly my use-case as well... Tried hacking around this limitation but couldn't find a proper solution so far.

@pid_eins

Written by bluca on 2024-12-16 at 10:21

@eliasp @ihor @pid_eins we really want to restrict this automagic stuff to well-modeled and strict options. It gets too weird with completely arbitrary settings like BindPaths=. For your specific use case you can use capabilities instead: you'll be able to drop root permissions and use CAP_DAC_READ_SEARCH and so on instead.

=> More informations about this toot | More toots from bluca@fosstodon.org

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113645031509917542