Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't revalidate Python interpreter cache entry with --upgrade #10361

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 22 additions & 25 deletions crates/uv-python/src/interpreter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ use serde::{Deserialize, Serialize};
use thiserror::Error;
use tracing::{trace, warn};

use uv_cache::{Cache, CacheBucket, CachedByTimestamp, Freshness};
use uv_cache::{Cache, CacheBucket, CachedByTimestamp};
use uv_cache_info::Timestamp;
use uv_cache_key::cache_digest;
use uv_fs::{write_atomic_sync, PythonExt, Simplified};
Expand Down Expand Up @@ -762,34 +762,31 @@ impl InterpreterInfo {
})?;

// Read from the cache.
if cache
.freshness(&cache_entry, None)
.is_ok_and(Freshness::is_fresh)
{
if let Ok(data) = fs::read(cache_entry.path()) {
match rmp_serde::from_slice::<CachedByTimestamp<Self>>(&data) {
Ok(cached) => {
if cached.timestamp == modified {
trace!(
"Cached interpreter info for Python {}, skipping probing: {}",
cached.data.markers.python_full_version(),
executable.user_display()
);
return Ok(cached.data);
}

// We ignore the freshness of the cache entry and use only our own logic, otherwise
// `--upgrade` always implies querying Python.
Copy link
Member

@charliermarsh charliermarsh Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to respect --refresh though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is --refresh intended to also revalidate the python interpreter querying? Querying Python is kinda costly, so I'd like to find a way to not do it unless we have, e.g. not query python with just --upgrade.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think it's the only way to revalidate it, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the internal timestamp check, so when the file changes we revalidate. It's different from packages because python is local, so we're always performing the "revalidation request" by checking the timestamp on the file. Are there scenarios where the python interpreter metadata changes while the timestamp would indicate it's fresh?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I... guess not. I mean, for symlinks yes, but we already don't cache for symlinks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not avoid revalidation with --refresh, since we're also revalidating the python interpreter by checking the timestamp. It does however not update if the python interpreter changes without the timestamp changing (i don't know how that would be possible except for the user manually changing timestamps, breaking our caching assumptions). This is mirroring the behavior for packages, where we're also trusting the remote with If-Modified-Since and only reload the full information with --refresh when either the revalidation request fails or the user clears the cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. On main, --refresh will force us to re-query here always. On this branch, --refresh will no longer re-query here unless the timestamp has changed. So it does avoid refetching the interpreter data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't the interpreter depend on something else that changes?

This is mirroring the behavior for packages, where we're also trusting the remote with If-Modified-Since and only reload the full information with --refresh when either the revalidation request fails or the user clears the cache.

I think this is a reasonable argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't the interpreter depend on something else that changes?

If something like that exists, it would break this PR and i'd change strategies to more aggressively invalidating the python interpreter cache instead. I'm not aware of any such cases, if you have any let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a great example #10832

if let Ok(data) = fs::read(cache_entry.path()) {
match rmp_serde::from_slice::<CachedByTimestamp<Self>>(&data) {
Ok(cached) => {
if cached.timestamp == modified {
trace!(
"Ignoring stale interpreter markers for: {}",
"Cached interpreter info for Python {}, skipping probing: {}",
cached.data.markers.python_full_version(),
executable.user_display()
);
return Ok(cached.data);
}
Err(err) => {
warn!(
"Broken interpreter cache entry at {}, removing: {err}",
cache_entry.path().user_display()
);
let _ = fs_err::remove_file(cache_entry.path());
}

trace!(
"Ignoring stale interpreter markers for: {}",
executable.user_display()
);
}
Err(err) => {
warn!(
"Broken interpreter cache entry at {}, removing: {err}",
cache_entry.path().user_display()
);
let _ = fs_err::remove_file(cache_entry.path());
}
}
}
Expand Down
Loading