Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compression for leaf Mixed arrays of integers and collection in mixed. #7501

Merged

Conversation

nicola-cab
Copy link
Member

@nicola-cab nicola-cab commented Mar 20, 2024

What, How & Why?

Enable support for compressing Mixed and Collection in Mixed.

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • C-API, if public C++ API changed
  • bindgen/spec.yml, if public C++ API changed

@nicola-cab nicola-cab force-pushed the nc/array_compression_with_nested_collections branch from 0ec1a67 to 51a8480 Compare March 21, 2024 19:18
@nicola-cab nicola-cab marked this pull request as ready for review March 22, 2024 12:23
Copy link

coveralls-official bot commented Mar 22, 2024

Pull Request Test Coverage Report for Build nicola.cabiddu_1546

Details

  • 109 of 109 (100.0%) changed or added relevant lines in 3 files are covered.
  • 1390 unchanged lines in 48 files lost coverage.
  • Overall coverage decreased (-0.5%) to 91.601%

Files with Coverage Reduction New Missed Lines %
test/fuzz_tester.hpp 1 59.58%
test/test_dictionary.cpp 1 99.85%
src/realm/array_blobs_small.cpp 2 92.61%
src/realm/array_blobs_small.hpp 2 80.77%
src/realm/array_integer_tpl.hpp 2 74.36%
src/realm/impl/output_stream.cpp 2 86.67%
src/realm/object-store/sync/sync_session.cpp 2 93.43%
src/realm/sync/changeset.hpp 2 73.49%
src/realm/sync/noinst/client_impl_base.cpp 2 86.4%
src/realm/util/file.cpp 2 80.98%
Totals Coverage Status
Change from base Build nicola.cabiddu_1541: -0.5%
Covered Lines: 245078
Relevant Lines: 267550

💛 - Coveralls

@nicola-cab nicola-cab force-pushed the nc/array_compression_with_nested_collections branch from b98ccfb to 45402aa Compare March 22, 2024 15:00
Copy link
Contributor

@finnschiermer finnschiermer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes needed....

src/realm/bplustree.cpp Outdated Show resolved Hide resolved
// This flag is used to differentiate this while descending the
// cluster.
const bool collection_in_mixed = true;
const bool compress = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't simply set this to true. We must pass in the surrounding value instead.

// temporary disable mixed. in order to re-enable them in a separate PR

/*
In order to know how Mixed stores things, we need to take in consideration this enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to say we need to take into account the use of the following sub-arrays (and not the enum)

};
Note:
1. entry at index 0 is the composite array (the main information about the data is stored, it can
either be a small int <32 bit or a ref to the array where the actual data is stored).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the word 'ref' here is misleading. It is the offset into the relevant other array.

}
else if (i == 5) { // unique keys associated to collections in mixed
written_leaf.set_as_ref(
i, Array::write(rot.get_as_ref(), m_alloc, out, only_modified, true));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know if these keys are signed integers we can compress? Or is it an unsigned integer array (which we don't compress at the moment)..

also: we should never compress if the compress argument is false, so we can't just pass in true here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are int64_t … we can compress them, but obviously only if we got the compress flag set to true… nice catch!!!.

if (i < 3) { // composite, int, and pair_int
// integer arrays
written_leaf.set_as_ref(
i, Array::write(rot.get_as_ref(), m_alloc, out, only_modified, true));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't simply compress here. Only if "compress" is true.

Copy link
Contributor

@finnschiermer finnschiermer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@nicola-cab nicola-cab merged commit 4f311b7 into nc/merge_all_together Mar 25, 2024
34 of 38 checks passed
@nicola-cab nicola-cab deleted the nc/array_compression_with_nested_collections branch March 25, 2024 10:33
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants