Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify DataScienceCluster conditions #1501

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lburgazzoli
Copy link
Contributor

@lburgazzoli lburgazzoli commented Jan 15, 2025

Description

This commit aims to simpplify the DataScienceCluster status conditions
by removing transient conditions (Reconciling, Deploying, etc) and
replacing them with two conditsion:

  • Available that is used to reflects the work done by the controller, so
    an Available condition with status False would signal that the
    controller encountered some troubles while reconciling the status of
    the cluster
  • Ready that is used to reflect the status of the component managed by
    the controller.

It is perfectly possible to be in a state where Available=False and
Ready=True which can happen in case the controller has troubles to
perform some reconciliation logic because of i.e. a transient kubernetes
error, but at the same time all the components are running.

The object would then result in soimething like:

apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
    name: default-dsc
spec: {}
status:
    conditions:
    - lastHeartbeatTime: "2025-01-15T12:54:20Z"
        lastTransitionTime: "2025-01-15T12:54:20Z"
        message: DataScienceCluster resource reconciled successfully
        reason: Available
        status: "True"
        type: Available
    - lastHeartbeatTime: "2025-01-15T12:54:20Z"
        lastTransitionTime: "2025-01-15T12:54:20Z"
        message: Ready
        reason: Ready
        status: "True"
        type: Ready

It is possible to wait for the DataScienceCluster to be ready by using
the kubectl wait command:

kubectl wait --for=condition=ready datascienceclusters.datasciencecluster.opendatahub.io default-dsc

How Has This Been Tested?

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Copy link

openshift-ci bot commented Jan 15, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from lburgazzoli. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lburgazzoli lburgazzoli force-pushed the cleanup-dsc-conditions branch from b40b2d8 to dfae715 Compare January 15, 2025 13:08
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is not easy to inject fault behaviors in existing components, I've refactored some DataScienceCluster methods to pure functions so they can be tested in isolation.

@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

@lburgazzoli
Copy link
Contributor Author

/retest

Copy link

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 73.60000% with 33 lines in your changes missing coverage. Please review.

Project coverage is 20.74%. Comparing base (8622753) to head (fd42704).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...cecluster/datasciencecluster_controller_support.go 86.86% 9 Missing and 4 partials ⚠️
...atasciencecluster/datasciencecluster_controller.go 0.00% 9 Missing ⚠️
pkg/utils/test/matchers/jq/jq_support.go 0.00% 6 Missing ⚠️
pkg/utils/test/matchers/jq/jq_transform.go 57.14% 2 Missing and 1 partial ⚠️
...lers/components/modelcontroller/modelcontroller.go 0.00% 1 Missing ⚠️
...rs/components/modelmeshserving/modelmeshserving.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1501      +/-   ##
==========================================
+ Coverage   19.89%   20.74%   +0.85%     
==========================================
  Files         160      161       +1     
  Lines       10824    10846      +22     
==========================================
+ Hits         2153     2250      +97     
+ Misses       8444     8364      -80     
- Partials      227      232       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lburgazzoli lburgazzoli force-pushed the cleanup-dsc-conditions branch from 852babe to ff4ee0c Compare January 16, 2025 07:12
This commit aims to simpplify the DataScienceCluster status conditions
by removing transient conditions (Reconciling, Deploying, etc) and
replacing them with two conditsion:

- Available that is used to reflects the work done by the controller, so
  an Available condition with status False would signal that the
  controller encountered some troubles while reconciling the status of
  the cluster
- Ready that is used to reflect the status of the component managed by
  the controller.

It is perfectly possible to be in a state where Available=False and
Ready=True which can happen in case the controller has troubles to
perform some reconciliation logic because of i.e. a transient kubernetes
error, but at the same time all the components are running.

The object would then result in soimething like:

    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
        name: default-dsc
    spec: {}
    status:
        conditions:
        - lastHeartbeatTime: "2025-01-15T12:54:20Z"
            lastTransitionTime: "2025-01-15T12:54:20Z"
            message: DataScienceCluster resource reconciled successfully
            reason: Available
            status: "True"
            type: Available
        - lastHeartbeatTime: "2025-01-15T12:54:20Z"
            lastTransitionTime: "2025-01-15T12:54:20Z"
            message: Ready
            reason: Ready
            status: "True"
            type: Ready

It is possible to wait for the DataScienceCluster to be ready by using
the kubectl wait command:

    kubectl wait --for=condition=ready datascienceclusters.datasciencecluster.opendatahub.io default-dsc
@lburgazzoli lburgazzoli force-pushed the cleanup-dsc-conditions branch from ff4ee0c to fd42704 Compare January 16, 2025 07:46
@lburgazzoli lburgazzoli requested a review from ykaliuta January 16, 2025 09:17
@ykaliuta
Copy link
Contributor

Fine for me

@lburgazzoli lburgazzoli requested review from asanzgom and CFSNM January 16, 2025 13:36
@lburgazzoli
Copy link
Contributor Author

lburgazzoli commented Jan 16, 2025

@CFSNM @asanzgom can you have a look ? I don't know if this has any impact in testing (moved to draft to avoid any issue)

@asanzgom
Copy link
Contributor

@CFSNM @asanzgom can you have a look ? I don't know if this has any impact in testing (moved to draft to avoid any issue)

@lburgazzoli let me trigger tests against a build from your branch to check whether t breaks any test

@CFSNM
Copy link
Member

CFSNM commented Jan 16, 2025

@asanzgom I think the disruptive tests can be affected due to this change. We'll see

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

4 participants