K8s/Deployment: Feature optimization #24

Closed
caoxianfei1 opened this issue Apr 11, 2023 · 5 comments
Labels: activities, enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@caoxianfei1
Collaborator

Is your feature request related to a problem?

Currently the operator deploys etcd, MDS, chunkserver, and the snapshotclone server, each managed by a Deployment, and these Deployments must be created in a fixed order. In some cases the previous service has to start successfully before the next step can begin.

At the moment we either skip this step or simply wait a few seconds with a sleep call.

Describe the solution you'd like

Therefore, we need a general waitStartUp function. Its main job is to wait until the pods of a Deployment enter the Running state. The function can then be called wherever it is needed.

@zhanghuidinah
Member

/assign @anurnomeru

@anurnomeru
Contributor

👍

@anurnomeru
Contributor

There are two common ways to implement this:

The first is to wait on a specific label set on the pod template. This is relatively simple and takes key-value pairs as its parameter: waitStartUp(labels Labels)

  • Simple implementation
  • Need to add specific labels to the resources that can be waited for
  • May need to consider forward compatibility with previously deployed resources
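
To make the label-based option more concrete, here is a minimal sketch in Go. It is an assumption-laden illustration, not the operator's code: `Labels`, `Pod`, `matches`, and `allRunning` are hypothetical names, `Pod` keeps only the fields the check needs (a real implementation would use `corev1.Pod`), and the `app` label key in the usage note is made up.

```go
package main

// Labels is a hypothetical key-value type standing in for the
// parameter of waitStartUp described above.
type Labels map[string]string

// Pod keeps only the fields this sketch needs; it is a stand-in,
// not the real Kubernetes pod type.
type Pod struct {
	Name   string
	Labels Labels
	Phase  string // for example "Pending" or "Running"
}

// matches reports whether pod carries every key-value pair in want.
func matches(want Labels, pod Pod) bool {
	for k, v := range want {
		if got, ok := pod.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// allRunning reports whether at least one pod matches want and
// every matching pod is in the Running phase.
func allRunning(want Labels, pods []Pod) bool {
	found := false
	for _, p := range pods {
		if !matches(want, p) {
			continue
		}
		if p.Phase != "Running" {
			return false
		}
		found = true
	}
	return found
}
```

A waiting function built on this would call something like `allRunning(Labels{"app": "etcd"}, pods)` repeatedly until it returns true. Pods deployed before the label was introduced would never match, which is the compatibility concern noted in the list above.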

The other method is to use ownerRef for recursive lookup. This approach is more flexible and does not intrude on the existing logic. However, it uses more memory, because it requires a full listAndWatch of ReplicaSets (RS) and pods, which are then grouped by ownerRef. It takes a specific resource as its parameter, obtains the GVK mapping from its GVR, and then builds an index keyed by GVK: waitStartUp(un unstructured.Unstructured)

  • Non-intrusive
  • Supports any resource that controls Pods, either through a ReplicaSet (RS) or directly.
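
The ownerRef lookup could be sketched as below. This is a simplified illustration under assumptions: `Object`, `indexByOwner`, and `podsUnder` are hypothetical names, the objects are plain structs instead of `unstructured.Unstructured`, and the index is keyed by owner UID only, so the GVK mapping and the listAndWatch machinery are left out.

```go
package main

// Object is a hypothetical, trimmed-down view of a Kubernetes object.
type Object struct {
	UID       string
	Kind      string
	Name      string
	OwnerUIDs []string // UIDs taken from metadata.ownerReferences
}

// indexByOwner groups objects under the UID of each of their owners.
func indexByOwner(objs []Object) map[string][]Object {
	idx := make(map[string][]Object)
	for _, o := range objs {
		for _, owner := range o.OwnerUIDs {
			idx[owner] = append(idx[owner], o)
		}
	}
	return idx
}

// podsUnder walks the owner index from rootUID and collects every Pod
// below it, whether the Pod is owned directly or through intermediate
// objects such as a ReplicaSet.
func podsUnder(rootUID string, idx map[string][]Object) []Object {
	var pods []Object
	seen := map[string]bool{rootUID: true} // guards against reference cycles
	var walk func(uid string)
	walk = func(uid string) {
		for _, child := range idx[uid] {
			if seen[child.UID] {
				continue
			}
			seen[child.UID] = true
			if child.Kind == "Pod" {
				pods = append(pods, child)
				continue
			}
			walk(child.UID)
		}
	}
	walk(rootUID)
	return pods
}
```

Because the index has to cover every ReplicaSet and pod that was listed, its size grows with the cluster contents, which is where the extra memory cost mentioned above comes from.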

@caoxianfei1
Collaborator Author

caoxianfei1 commented Apr 18, 2023

That is thoughtful, but it may not need to be as complicated as you think. We currently manage each application's pods with its own Deployment, and every Deployment has a unique name, so you don't need to list and watch all pods and tell them apart.

A single function may be enough, for example WaitForDeploymentToStart(deployment *appsv1.Deployment). It can fetch the status of the Deployment at a fixed interval (such as 3s) with a maximum number of retries (such as 100). If the Deployment is still not Ready within that period, the startup is considered to have failed. The condition for judging that it has started might be as follows:

if d.Status.ObservedGeneration != deployment.Status.ObservedGeneration && d.Status.UpdatedReplicas > 0 && d.Status.ReadyReplicas > 0

Call this function when a deployment is created.
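
The polling approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `DeploymentStatus` is a local stand-in that copies the three status fields used in the condition, and `StatusGetter`, `started`, and `ErrStartTimeout` are hypothetical names. A real operator would fetch the `*appsv1.Deployment` through its Kubernetes client instead.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// DeploymentStatus mirrors only the status fields used in the
// condition quoted above; it is a stand-in for the real API type.
type DeploymentStatus struct {
	ObservedGeneration int64
	UpdatedReplicas    int32
	ReadyReplicas      int32
}

// StatusGetter is a hypothetical hook. In an operator it would wrap
// a client call that reads the current Deployment by name.
type StatusGetter func() (DeploymentStatus, error)

// ErrStartTimeout is returned when all retries are used up.
var ErrStartTimeout = errors.New("deployment did not become ready in time")

// started applies the condition from the comment above: the controller
// has observed a new generation and at least one updated replica is ready.
func started(orig, cur DeploymentStatus) bool {
	return cur.ObservedGeneration != orig.ObservedGeneration &&
		cur.UpdatedReplicas > 0 &&
		cur.ReadyReplicas > 0
}

// WaitForDeploymentToStart polls get up to maxRetries times, sleeping
// for interval after each unsuccessful check.
func WaitForDeploymentToStart(orig DeploymentStatus, get StatusGetter, interval time.Duration, maxRetries int) error {
	for i := 0; i < maxRetries; i++ {
		cur, err := get()
		if err != nil {
			return fmt.Errorf("get deployment status: %w", err)
		}
		if started(orig, cur) {
			return nil
		}
		time.Sleep(interval)
	}
	return ErrStartTimeout
}
```

With the values suggested above, a caller would pass `3*time.Second` and `100`, so the function gives up after roughly five minutes.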

anurnomeru added a commit to anurnomeru/curve-operator that referenced this issue Apr 19, 2023
anurnomeru added a commit to anurnomeru/curve-operator that referenced this issue Apr 21, 2023
anurnomeru added a commit to anurnomeru/curve-operator that referenced this issue Apr 27, 2023
anurnomeru added a commit to anurnomeru/curve-operator that referenced this issue May 5, 2023
caoxianfei1 added a commit that referenced this issue May 9, 2023
feat(#24): supports a common way for deployment waiting
caoxianfei1 added a commit that referenced this issue May 9, 2023
…ent-waiting

Revert "feat(#24): supports a common way for deployment waiting"