Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootc: Keep /usr read-only after transient transactions ("re-locking") #2195

Open
wants to merge 3 commits into
base: bootc
Choose a base branch
from

Conversation

evan-goode
Copy link
Member

To keep /usr read-only after DNF is finished with a transient transaction, we can call ostree admin unlock --transient to mount the /usr overlay as read-only by default. Then, we create a private mount namespace for DNF and its child processes and remount the /usr overlayfs as read/write in the private mountns. See also this upstream OSTree test: https://github.com/ostreedev/ostree/blob/main/tests/kolainst/destructive/unlock-transient.sh.

Some ugly things/remaining points of uncertainty in this PR:

  • os.unshare is unfortunately only available in Python >= 3.12, so we have to call libc.unshare via Python ctypes here and hardcode the CLONE_NEWNS flag that we need to pass.
  • Using libostree adds a new runtime dependency on python3-gobject and ostree-libs. I added these to dnf.spec behind a %bcond_with ostree conditional.
  • Lazy import of python3-gobject (gi) and OSTree libraries in _BootcSystem.__init__
  • We still call ostree admin unlock --transient via subprocess instead of doing the unlock with libostree. This keeps some complexity and implementation details out of DNF. On the other hand, the OSTree CLI may be less stable than libostree, and we can't customize or translate its output.

Using libostree gives us more detail about the current state of the
deployment than only checking whether /usr is writable.
@pep8speaks
Copy link

pep8speaks commented Jan 16, 2025

Hello @evan-goode! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2025-01-21 22:41:24 UTC

@evan-goode evan-goode requested a review from dcantrell January 16, 2025 20:22
Copy link
Contributor

@dcantrell dcantrell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. My only real issue is with the libc_so constant, but I'm not going to die on that hill. I'm just in the mood for some good yak shaving.

dnf/const.py.in Outdated Show resolved Hide resolved
dnf/util.py Outdated Show resolved Hide resolved
@cgwalters
Copy link
Contributor

On the other hand, the OSTree CLI may be less stable than libostree

The ostree CLI is stable, we will never break it.

and we can't customize or translate its output.

I'd just redirect its output to /dev/null and keep the text here instead.

@cgwalters
Copy link
Contributor

Thanks for working on this! We need to expose this feature in bootc too actually.

That said see also ostreedev/ostree#3177 - it's temping to enable that by default in the base image.

@cgwalters
Copy link
Contributor

Unless we get to the point of deeper integration, I'd vote for just forking the CLI over linking to libostree.

An advantage of forking is that we get "loose coupling" - I think we probably shouldn't pull in libostree into every system using dnf in rhel9 and 10 today.

@evan-goode
Copy link
Member Author

Thanks for taking a look!

The ostree CLI is stable, we will never break it.
I'd just redirect its output to /dev/null and keep the text here instead.

Roger.

Unless we get to the point of deeper integration, I'd vote for just forking the CLI over linking to libostree.

For reading the unlocked state, I suppose we could grep ostree admin status for Unlocked: (unlocked state nick) instead of using libostree.

An advantage of forking is that we get "loose coupling" - I think we probably shouldn't pull in libostree into every system using dnf in rhel9 and 10 today.

Since this is Python, we can achieve sort of the same loose coupling by lazily importing the libostree GObject API like this. The spec file changes here/splitting the DNF builds isn't strictly necessary since we can check at runtime whether libostree is available. But thinking ahead to DNF 5, there we'll probably want to stick to forking the CLI, so we might as well here too.

ostreedev/ostree#3177

I think this would still work fine if the transient RO usroverlay were there by default, as long as ostree_deployment_get_unlocked still correctly reports transient.

@cgwalters
Copy link
Contributor

cgwalters commented Jan 17, 2025

For reading the unlocked state, I suppose we could grep ostree admin status for Unlocked: (unlocked state nick) instead of using libostree.

Oh sorry yes...I hadn't thought about the state reading part. Yeah, ostree really needs json output, but this leads to the next thing that we likely want this state in the bootc status.

EDIT: filed containers/bootc#1044

@cgwalters
Copy link
Contributor

Only very tangential to this but not quite sure where to put this (another issue here? JIRA?) at some point we are going to need to fix the problem that dnf upgrade should probably error out if an attempt is made to change nontrivial things outside of /usr and probably /etc. The key ones here are kernel (needs deeper integration, ref coreos/rpm-ostree#5135 ) and selinux-policy (ref https://bugzilla.redhat.com/show_bug.cgi?id=1290659 ) and shim/grub (ref https://github.com/coreos/bootupd/ )

Ideally we error before we even start an install; if we download filelists this would be pretty easy; without filelists maybe we should have a Provides that flags these special cases? Or we could inject into the base image something that denylists them in the dnf config?

Copy link
Contributor

@dcantrell dcantrell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. I would like to see an elimination of the LIBC_SONAME constant.

dnf/util.py Outdated Show resolved Hide resolved
@evan-goode evan-goode force-pushed the evan-goode/bootc-relock branch from f7da712 to 6324c65 Compare January 21, 2025 22:35
To keep /usr read-only after DNF is finished with a transient
transaction, we call `ostree admin unlock --transient` to mount the /usr
overlay as read-only by default. Then, we create a private mount
namespace for DNF and its child processes and remount the /usr overlayfs
as read/write in the private mountns.

os.unshare is unfortunately only available in Python >= 3.12, so we have
to call libc.unshare via Python ctypes here and hardcode the CLONE_NEWNS
flag that we need to pass.
Add a disabled-by-default build conditional for ostree dependencies
@evan-goode evan-goode force-pushed the evan-goode/bootc-relock branch from 6324c65 to 297fa23 Compare January 21, 2025 22:41
@evan-goode
Copy link
Member Author

Overall looks good. I would like to see an elimination of the LIBC_SONAME constant.

Done.

I think we can keep the libostree usage in for now. Maybe I'll take a stab at implementing containers/bootc#1044 so we can call that instead.

The main problem I see with using libostree is making sure python3-gobject is available (pretty sure we can count on ostree-libs being available on bootc systems, right?). subscription-manager already requires python3-gobject-base, so it might not cost any extra space on RHEL systems to have python3-dnf require it too. And we don't really need to worry about Fedora going forward because of DNF 5.

@evan-goode
Copy link
Member Author

Only very tangential to this but not quite sure where to put this (another issue here? JIRA?) at some point we are going to need to fix the problem that dnf upgrade should probably error out if an attempt is made to change nontrivial things outside of /usr and probably /etc. The key ones here are kernel (needs deeper integration, ref coreos/rpm-ostree#5135 ) and selinux-policy (ref https://bugzilla.redhat.com/show_bug.cgi?id=1290659 ) and shim/grub (ref https://github.com/coreos/bootupd/ )

Ideally we error before we even start an install; if we download filelists this would be pretty easy; without filelists maybe we should have a Provides that flags these special cases? Or we could inject into the base image something that denylists them in the dnf config?

Yes, good ideas. I created #2199.

@evan-goode
Copy link
Member Author

Some Image Mode tests that would be good to have:

  • Changes under /usr made by dnf --transient should NOT persist across reboots (see https://github.com/ostreedev/ostree/blob/main/tests/kolainst/destructive/unlock-transient.sh), changes under /etc, /var, other directories should persist.
  • /usr should remain read-only to other processes
  • Setting persistence=transient should be equivalent to passing --transient. Setting persistence=persist on a bootc system should fail regardless of the DeploymentUnlockedState.
  • persistence=transient should error when using an installroot other than / or using --downloadonly
  • Transient transactions should work when the deployment is already in a DEVELOPMENT, TRANSIENT, or HOTFIX state. (Technically, changes during HOTFIX are not really "transient" since they'll persist across reboots, but I don't think we should error if the user passes --transient. Maybe a warning would be good?)
  • The "A transient overlay will be created on /usr..." message should only be printed if the system is in DeploymentUnlockedState.NONE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants