Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gc-based snapshots #2125

Merged
merged 1 commit into from
Nov 15, 2022
Merged

Gc-based snapshots #2125

merged 1 commit into from
Nov 15, 2022

Conversation

icristescu
Copy link
Contributor

@icristescu icristescu commented Oct 21, 2022

After discussions with @vect0r-vicall, I'm working on a new snapshot export that:

  • calls gc on the commit for which we want to create the snapshot;
  • copies the resulting prefix, mapping (and dict is needed too) to a new directory;
  • that directory is the new snapshot format

The import:

  • opens a new store from a prefix, mapping and a dict, by creating a new control file;
  • add the commit to a fresh index.

This is WIP, but I would appreciate any feedback, in case I missed something. Things that I see to improve are:

  • instead of manipulating the type snap in the code, we could write to a file;
  • the control file after a snapshot import looks a lot like after a gc - we could introduce an extra status to differentiate the two
  • a better management for dict?

@icristescu icristescu added the no-changelog-needed No changelog is needed here label Oct 21, 2022
Copy link
Contributor

@Ngoguey42 Ngoguey42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deadly simple!

prefix_commit_offset : int63;
}

val export : repo -> commit_key -> snap Lwt.t
Copy link
Contributor

@Ngoguey42 Ngoguey42 Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following an offline discussion with @icristescu, a good story for that export function could be that export takes the path where a directory will be created containing a new fully functional irmin-pack store which contains only the commit (and it's tree).

This way, no import function is needed.

Copy link
Contributor

@art-w art-w Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to separate the export from running the GC? (= copy the existing prefix/mapping with first_commit as commit_key, without running a (destructive) GC to build the snapshot)
Nevermind I missread the finalize!

Copy link
Contributor

@Ngoguey42 Ngoguey42 Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this code runs the worker part of the normal GC but doesn't run the finalise part of the normal GC.

The worker part only creates new files, it doesn't modify the existing files in any way.

Does this answer your question?

Sure

src/irmin-pack/unix/io.ml Outdated Show resolved Hide resolved
@icristescu icristescu mentioned this pull request Oct 24, 2022
34 tasks
@vect0r-vicall
Copy link

If I understand correctly, the start_exn function checks whether or not the GC is already running when called. This may trigger two corner cases, as a snapshot export can be called while the context is opened in RW:

  • snapshot export is cancelled as a "real context GC" is already running -> not that bad,
  • a "real context GC" is cancelled as a snapshot export is running in background -> this is quite bad.

How could we avoid that?
For the snapshot export cancellation, I guess that we could easily check the status of the GC is_finished to prevent the snapshot export or delay it until the GC completion.
Regarding the context GC cancellation, if a "GC export" is already running in background, maybe we could delay it as well and resume it as soon as the preceding snapshot export finishes. However, it means that we may need to distinguish "GC for export" and "real GC", as the delayed behaviour is not desired when "real GC" calls are overlapping.
What do you think?

@icristescu
Copy link
Contributor Author

as a snapshot export can be called while the context is opened in RW

Indeed, I have not though about this case. Isn't it the case that currently we don't run a node and do a snapshot export at the same time?

To allow a concurrent snapshot export I think it could work as follow (I'm rephrasing a bit your suggestions):

  • snapshot export is not launched if a gc is running - already the case with this PR.
  • before launching a "real gc", check if one is running, cancel it and then run the "real gc"
    • right now, a "real gc" is skipped if a "real gc" is running. If we want to keep this behaviour, we need to distinguish between export and gc.
    • if we are ok with always cancelling previous gc (if the "real gcs" are separated enough that they don't interleave for example) then the current API allows us to do this from lib_context.

So either way, cancelling or skipping is maybe not too complicated. What is less trivial I think is the "postponing" part.

@vect0r-vicall
Copy link

Indeed, I have not though about this case. Isn't it the case that currently we don't run a node and do a snapshot export at the same time?

No. Snapshots can be exported while the octez node is running. We want to keep that feature and it should not interfere too much with the node's life.

if we are ok with always cancelling previous gc (if the "real gcs" are separated enough that they don't interleave for example) then the current API allows us to do this from lib_context.

There should be no interliving, indeed. However, it seems a good property not to cancel a running GC. Thus you will always perform a GC as some point. That is not guaranteed if the GC may be cancelled.

To me, skipping the "real GC" if a "snapshot GC" is running is not fine. That is the main issue.
Can we consider a "?(force = false)" for the gc function such as we chose whether or not the call must:

  • true - cancel the running gc and start a new one : "real GC" use case
  • false - fails/skip if a gc is running (the current behaviour) : "snapshot GC" use case

@icristescu icristescu force-pushed the snap branch 2 times, most recently from 67cf48a to 78ee6a8 Compare November 4, 2022 19:01
@icristescu icristescu changed the title (RFC)(WIP) Gc-based snapshots Gc-based snapshots Nov 4, 2022
@icristescu
Copy link
Contributor Author

this is ready for another round of reviews

Copy link
Member

@metanivek metanivek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this solution to updating snapshot as a simple export of a "one commit store". I left some thoughts on naming/code organization.

src/irmin-pack/unix/ext.ml Outdated Show resolved Hide resolved
src/irmin-pack/unix/io_intf.ml Outdated Show resolved Hide resolved
src/irmin-pack/unix/ext.ml Show resolved Hide resolved
src/irmin-pack/unix/file_manager.ml Outdated Show resolved Hide resolved
src/irmin-pack/unix/ext.ml Show resolved Hide resolved
src/irmin-pack/unix/file_manager_intf.ml Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

Merging #2125 (ba7ff96) into main (f40ce19) will increase coverage by 0.04%.
The diff coverage is 80.85%.

@@            Coverage Diff             @@
##             main    #2125      +/-   ##
==========================================
+ Coverage   68.17%   68.21%   +0.04%     
==========================================
  Files         134      134              
  Lines       16042    16085      +43     
==========================================
+ Hits        10937    10973      +36     
- Misses       5105     5112       +7     
Impacted Files Coverage Δ
src/irmin-pack/unix/errors.ml 28.88% <ø> (ø)
src/irmin-pack/unix/gc_worker.ml 6.06% <0.00%> (ø)
src/irmin-pack/unix/io_errors.ml 58.33% <ø> (ø)
src/irmin-pack/unix/gc.ml 77.88% <62.50%> (-1.29%) ⬇️
src/irmin-pack/unix/io.ml 65.21% <75.00%> (+0.35%) ⬆️
src/irmin-pack/unix/ext.ml 63.97% <81.25%> (+1.56%) ⬆️
src/irmin-pack/unix/file_manager.ml 84.75% <100.00%> (+0.83%) ⬆️
src/irmin-fs/irmin_fs.ml 81.21% <0.00%> (-1.22%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@icristescu icristescu removed the no-changelog-needed No changelog is needed here label Nov 15, 2022
@icristescu icristescu merged commit c11bd50 into mirage:main Nov 15, 2022
@icristescu icristescu deleted the snap branch November 15, 2022 10:48
metanivek added a commit to metanivek/opam-repository that referenced this pull request Dec 14, 2022
…ils, irmin-test, irmin-pack, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.5.0)

CHANGES:

### Added

- **irmin-pack**
  - Add `Irmin_pack_unix.Stats.Latest_gc` which is now the parameter of GC's
    `finished` callback (mirage/irmin#2089, @Ngoguey42)
  - Add `Gc.oldest_live_commit` which returns the key of the commit on which the
    latest gc was called on. (mirage/irmin#2110, @icristescu)
  - Add `split` to create a new suffix chunk. Subsequent writes will append to
    this chunk until `split` is called again. (mirage/irmin#2118, @icristescu)
  - Add `create_one_commit_store` to create a new store from the existing one,
    containing only one commit. (mirage/irmin#2125, @icristescu)

### Changed

- **irmin-pack**
  - Upgraded on-disk format to version 4. (mirage/irmin#2110, @icristescu)
  - Detecting control file corruption with a checksum (mirage/irmin#2119, @art-w)
  - Change on-disk layout of the suffix from a single file to a multiple,
    chunked file design (mirage/irmin#2115, @metanivek)
  - Modify GC to work with new chunked suffix. See `examples/gc.ml` for a
    demonstration of how it works with the new `split` function. (mirage/irmin#2126,
    @metanivek)

### Fixed
metanivek added a commit to metanivek/opam-repository that referenced this pull request Dec 14, 2022
…ils, irmin-test, irmin-pack, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.5.0)

CHANGES:

### Added

- **irmin-pack**
  - Add `Irmin_pack_unix.Stats.Latest_gc` which is now the parameter of GC's
    `finished` callback (mirage/irmin#2089, @Ngoguey42)
  - Add `Gc.oldest_live_commit` which returns the key of the commit on which the
    latest gc was called on. (mirage/irmin#2110, @icristescu)
  - Add `split` to create a new suffix chunk. Subsequent writes will append to
    this chunk until `split` is called again. (mirage/irmin#2118, @icristescu)
  - Add `create_one_commit_store` to create a new store from the existing one,
    containing only one commit. (mirage/irmin#2125, @icristescu)

### Changed

- **irmin-pack**
  - Upgraded on-disk format to version 4. (mirage/irmin#2110, @icristescu)
  - Detecting control file corruption with a checksum (mirage/irmin#2119, @art-w)
  - Change on-disk layout of the suffix from a single file to a multiple,
    chunked file design (mirage/irmin#2115, @metanivek)
  - Modify GC to work with new chunked suffix. See `examples/gc.ml` for a
    demonstration of how it works with the new `split` function. (mirage/irmin#2126,
    @metanivek)

### Fixed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants