YARN-11745: Fix TimSort contract violation in PriorityQueueComparator Class #7309

Open
wants to merge 1 commit into
base: branch-3.3

Hean-Chhinling
Contributor

Description of PR

This PR fixes a TimSort contract violation in the PriorityQueueComparator class. The violation shows up when sorting a mix of queues whose resource is (0, 0) and queues whose resource is non-zero (any number, any number). The comparator did not satisfy the transitivity that TimSort requires, so sorting could fail with java.lang.IllegalArgumentException: Comparison method violates its general contract!

Root Cause

The issue occurred due to inconsistent comparison logic that violated the transitivity rules required by TimSort.

Specifically, the AND condition in the code below compares the resources only if both queues' resources are not none. When one queue's resource is none, i.e. (0, 0), and the other queue's resource is not none, the condition is skipped and the comparison falls through to absoluteCapacity. If both queues then have the same absoluteCapacity, they compare as equal even though their resources differ.

For a detailed example of how this behaviour breaks the TimSort algorithm, see the attachment ExampleZeroQueueResourceProblem.

    if (!minEffRes1.equals(Resources.none())
        && !minEffRes2.equals(Resources.none())) {
      return minEffRes2.compareTo(minEffRes1);
    }

    float abs1 = q1Sort.absoluteCapacity;
    float abs2 = q2Sort.absoluteCapacity;
    return Float.compare(abs2, abs1);
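
The inconsistency can be reproduced with three queues in a simplified model. The sketch below is not the Hadoop code: the class `Q`, its lexicographic `compareResource`, and the name `OldComparatorDemo` are hypothetical stand-ins, and only the AND-then-fallback shape is taken from the snippet above.

```java
import java.util.Comparator;

class OldComparatorDemo {
    /** Hypothetical stand-in for a queue: a (memory, vcores) resource plus absoluteCapacity. */
    static final class Q {
        final long memory;
        final int vcores;
        final float absoluteCapacity;

        Q(long memory, int vcores, float absoluteCapacity) {
            this.memory = memory;
            this.vcores = vcores;
            this.absoluteCapacity = absoluteCapacity;
        }

        /** Models Resources.none(), i.e. a (0, 0) resource. */
        boolean isNone() {
            return memory == 0 && vcores == 0;
        }

        /** Assumption: resources are ordered lexicographically by memory, then vcores. */
        int compareResource(Q other) {
            int c = Long.compare(memory, other.memory);
            return c != 0 ? c : Integer.compare(vcores, other.vcores);
        }
    }

    /** Same shape as the original logic: resources are compared only if BOTH are non-none. */
    static final Comparator<Q> OLD = (q1, q2) -> {
        if (!q1.isNone() && !q2.isNone()) {
            return q2.compareResource(q1);
        }
        return Float.compare(q2.absoluteCapacity, q1.absoluteCapacity);
    };

    public static void main(String[] args) {
        Q a = new Q(0, 0, 0.5f);    // resource is none
        Q b = new Q(10, 10, 0.5f);
        Q c = new Q(20, 20, 0.5f);

        // a ties with both b and c, yet b and c do not tie with each other.
        System.out.println(Integer.signum(OLD.compare(a, b))); // 0
        System.out.println(Integer.signum(OLD.compare(a, c))); // 0
        System.out.println(Integer.signum(OLD.compare(b, c))); // 1
    }
}
```

The Comparator contract requires that compare(x, y) == 0 implies compare(x, z) and compare(y, z) have the same sign for every z. Here x = a, y = b, z = c breaks that rule, which is the kind of inconsistency TimSort can detect and report.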

Solution

Instead of checking that both queues' resources are not none, we should check that at least one queue's resource is not none. This avoids skipping the resource comparison when one queue's resource is not none and the other's is none. Specifically, change the AND condition to an OR condition in the following code:

    if (!minEffRes1.equals(Resources.none())
        || !minEffRes2.equals(Resources.none())) {
      return minEffRes2.compareTo(minEffRes1);
    }
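
One way to gain confidence in the OR condition is to brute-force the Comparator contract over a small set of queues. The following is a sketch under assumptions, not the Hadoop implementation: `Q`, `compareResource` (lexicographic by memory, then vcores), `FixedComparatorCheck` and `obeysContract` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FixedComparatorCheck {
    /** Hypothetical stand-in for a queue: a (memory, vcores) resource plus absoluteCapacity. */
    static final class Q {
        final long memory;
        final int vcores;
        final float absoluteCapacity;

        Q(long memory, int vcores, float absoluteCapacity) {
            this.memory = memory;
            this.vcores = vcores;
            this.absoluteCapacity = absoluteCapacity;
        }

        boolean isNone() {
            return memory == 0 && vcores == 0;
        }

        /** Assumption: resources are ordered lexicographically by memory, then vcores. */
        int compareResource(Q other) {
            int c = Long.compare(memory, other.memory);
            return c != 0 ? c : Integer.compare(vcores, other.vcores);
        }
    }

    /** Shape of the proposed fix: compare resources if AT LEAST ONE side is non-none. */
    static final Comparator<Q> FIXED = (q1, q2) -> {
        if (!q1.isNone() || !q2.isNone()) {
            return q2.compareResource(q1);
        }
        return Float.compare(q2.absoluteCapacity, q1.absoluteCapacity);
    };

    /** Checks antisymmetry, transitivity and tie consistency on every pair and triple. */
    static <T> boolean obeysContract(Comparator<T> cmp, List<T> items) {
        for (T x : items) {
            for (T y : items) {
                int xy = Integer.signum(cmp.compare(x, y));
                if (xy != -Integer.signum(cmp.compare(y, x))) {
                    return false; // sgn(compare(x, y)) must equal -sgn(compare(y, x))
                }
                for (T z : items) {
                    int yz = Integer.signum(cmp.compare(y, z));
                    int xz = Integer.signum(cmp.compare(x, z));
                    if (xy > 0 && yz > 0 && xz <= 0) {
                        return false; // x > y and y > z must imply x > z
                    }
                    if (xy == 0 && xz != yz) {
                        return false; // tied elements must compare alike against any z
                    }
                }
            }
        }
        return true;
    }

    static List<Q> sampleQueues() {
        List<Q> qs = new ArrayList<>();
        qs.add(new Q(0, 0, 0.5f));
        qs.add(new Q(0, 0, 0.3f));
        qs.add(new Q(10, 10, 0.5f));
        qs.add(new Q(10, 10, 0.2f));
        qs.add(new Q(20, 20, 0.5f));
        return qs;
    }

    public static void main(String[] args) {
        System.out.println(obeysContract(FIXED, sampleQueues())); // true
    }
}
```

In this model the OR condition only reaches the absoluteCapacity fallback when both resources are (0, 0), so the fallback never mixes with the resource ordering.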

Testing

  • Added a unit test, testPriorityQueueComparatorClassDoesNotViolateTimSortContract, to verify that sorting no longer throws java.lang.IllegalArgumentException: Comparison method violates its general contract!

  • The test creates repeated queues, some with resource (0, 0) and some with resource (any number, any number), shuffles them, and then sorts them using the PriorityQueueComparator class. It also mocks the other elements the comparator needs, for example, priority label, absoluteUsedCapacity, usedCapacity and absoluteCapacity. These elements can be any number.
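
The shuffle-then-sort idea can be sketched without the Hadoop test harness. This is an illustration only, not the unit test added in the PR: it uses a hypothetical `Q` class instead of mocked queues, omits priority and used capacity, and assumes a lexicographic resource ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class ShuffleSortSketch {
    /** Hypothetical stand-in for a mocked queue. */
    static final class Q {
        final long memory;
        final int vcores;
        final float absoluteCapacity;

        Q(long memory, int vcores, float absoluteCapacity) {
            this.memory = memory;
            this.vcores = vcores;
            this.absoluteCapacity = absoluteCapacity;
        }

        boolean isNone() {
            return memory == 0 && vcores == 0;
        }

        /** Assumption: resources are ordered lexicographically by memory, then vcores. */
        int compareResource(Q other) {
            int c = Long.compare(memory, other.memory);
            return c != 0 ? c : Integer.compare(vcores, other.vcores);
        }
    }

    /** Shape of the fixed comparison (OR condition), simplified. */
    static final Comparator<Q> FIXED = (q1, q2) -> {
        if (!q1.isNone() || !q2.isNone()) {
            return q2.compareResource(q1);
        }
        return Float.compare(q2.absoluteCapacity, q1.absoluteCapacity);
    };

    /** Builds a mix of (0, 0) and non-zero queues, shuffles them, then sorts them. */
    static List<Q> shuffleAndSort(int count, long seed) {
        Random rnd = new Random(seed);
        List<Q> queues = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            boolean zero = i % 2 == 0;
            long memory = zero ? 0 : 1 + rnd.nextInt(100);
            int vcores = zero ? 0 : 1 + rnd.nextInt(100);
            queues.add(new Q(memory, vcores, 0.5f)); // same absoluteCapacity everywhere
        }
        Collections.shuffle(queues, rnd);
        // An inconsistent comparator may make this call throw IllegalArgumentException.
        queues.sort(FIXED);
        return queues;
    }

    /** True if no adjacent pair is out of order under FIXED. */
    static boolean isSorted(List<Q> queues) {
        for (int i = 1; i < queues.size(); i++) {
            if (FIXED.compare(queues.get(i - 1), queues.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Q> sorted = shuffleAndSort(500, 42L);
        System.out.println(isSorted(sorted)); // true
    }
}
```

A test of this kind cannot prove the contract holds, because the sort only reports a violation when it happens to run into one. Using many queues and a shuffle raises the chance that a broken comparator is caught.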

Impact

  • Resolves the TimSort contract violation, so sorting with the PriorityQueueComparator class no longer fails and produces a consistent order.
  • The order produced by PriorityQueueComparator may change for queues whose resource is (0, 0).

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 0m 44s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ branch-3.3 Compile Tests _ |
| +1 💚 | mvninstall | 32m 34s | | branch-3.3 passed |
| +1 💚 | compile | 0m 34s | | branch-3.3 passed |
| +1 💚 | checkstyle | 0m 27s | | branch-3.3 passed |
| +1 💚 | mvnsite | 0m 39s | | branch-3.3 passed |
| +1 💚 | javadoc | 0m 31s | | branch-3.3 passed |
| +1 💚 | spotbugs | 1m 15s | | branch-3.3 passed |
| +1 💚 | shadedclient | 20m 14s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 31s | | the patch passed |
| +1 💚 | compile | 0m 28s | | the patch passed |
| +1 💚 | javac | 0m 28s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 18s | | the patch passed |
| +1 💚 | mvnsite | 0m 31s | | the patch passed |
| +1 💚 | javadoc | 0m 21s | | the patch passed |
| +1 💚 | spotbugs | 1m 10s | | the patch passed |
| +1 💚 | shadedclient | 19m 58s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 76m 47s | /patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 💚 | asflicense | 0m 25s | | The patch does not generate ASF License warnings. |
| | | 157m 17s | | |

| Reason | Tests |
|---|---|
| Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesSchedulerActivities |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7309/1/artifact/out/Dockerfile |
| GITHUB PR | #7309 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 7ac5ba255649 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 6d5350f |
| Default Java | Private Build-1.8.0_362-8u372-gaus1-0ubuntu118.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7309/1/testReport/ |
| Max. process+thread count | 960 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7309/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
