-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DRAFT] Restructure Repo for Table Format specific modules #618
base: main
Are you sure you want to change the base?
[DRAFT] Restructure Repo for Table Format specific modules #618
Conversation
…resource files for tests
<groupId>org.apache.hudi</groupId> | ||
<artifactId>hudi-spark${spark.version.prefix}-bundle_${scala.binary.version}</artifactId> | ||
<scope>test</scope> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this be a problem for running tests around HFile since the Hudi Spark bundle shades dependencies, particularly HBase ones, in a different way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not if the dependencies are properly configured within the bundle, that is the main reason behind the shading. We can have different versions on the same path because they will be completely self contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true except for org.apache.hudi.io.storage.HoodieHBaseKVComparator
, which is not relocated. In the Hudi Spark bundle, org.apache.hudi.io.storage.HoodieHBaseKVComparator
extends from org.apache.hudi.org.apache.hadoop.hbase.CellComparatorImpl
after Hudi bundle relocation. I'm wondering if that causes a clash here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding a new framework for testing where I essentially execute java -cp ...
style command with the bundles on the classpath to validate that the jars are not relying on any dependencies brought onto the test path by our test setup to do the conversion. This test also asserts that the results can be read back properly.
xtable-hudi/pom.xml
Outdated
<excludes> | ||
<exclude>org.apache.hudi.io.storage.HoodieHBaseKVComparator</exclude> | ||
</excludes> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to validate the conjecture above is to remove this exclusion to see if the test passes or a different error is thrown (by removing the exclusion, there should be no clash any more, and the org.apache.xtable.shade.org.apache.hudi.io.storage.HoodieHBaseKVComparator
is written as the comparator in the HFile if the test table is prepared using the XTable bundle). Still, we need to figure out how to properly handle org.apache.hudi.io.storage.HoodieHBaseKVComparator
as the comparator class since the fully-qualified class name needs to be preserved for reading HFile format in Hudi.
Important Read
What is the purpose of the pull request
Brief change log
Verify this pull request
TODO