
KEP-4471: kubeadm: make a control-plane's kubelet talk to the local API Server on kubeadm join. #4496

Merged: 1 commit, May 7, 2024

Conversation

chrischdi (Member)

  • One-line PR description: adding new KEP
  • Other comments:

@k8s-ci-robot added the cncf-cla: yes label on Feb 8, 2024
@k8s-ci-robot added the kind/kep, sig/cluster-lifecycle, and size/XL labels on Feb 8, 2024
@chrischdi (Member Author)

cc @neolit123 @fabriziopandini @sbueringer :-) PTAL

@neolit123 (Member) left a comment

👍
I will have a look tomorrow.

@chrischdi (Member Author)

FYI: I'm on PTO for the next few days and will be back on the 15th. I will start addressing feedback then :-)

@sbueringer (Member)

Sounds reasonable overall, but Fabrizio and Lubomir are way more familiar with the kubeadm details.

@neolit123 (Member) left a comment

Thanks @chrischdi.
I did a first pass and the KEP looks good, minus some typos and a few missing minor clarifications.

@neolit123 (Member) left a comment

/cc @pacoxu @SataQiu
PTAL when you have time. It might be a bit too late for us to apply this in 1.30, but if there are no objections we can try.

@SataQiu (Member)

SataQiu commented Feb 25, 2024

Pointing the control-plane's kubelet to the local kube-apiserver may cause some stability issues. When the kube-apiserver fails, the local kubelet will crash, the node will become NotReady, and the entire control-plane node cannot handle new Pods. However, using the LB will not cause the entire control-plane node to become NotReady; only some components will not work well.
The second point is that upgrading the kubelet binary version seems to be under user control; kubeadm only modifies the kubelet configuration and does not immediately update the kubelet binary version. https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#upgrade-kubelet-and-kubectl

When kubeadm join is executed, version skew does not occur; it only occurs during the upgrade phase.

@chrischdi (Member Author)

> Pointing the control-plane's kubelet to the local kube-apiserver may cause some stability issues. When the kube-apiserver fails, the local kubelet will crash, the node will become NotReady, and the entire control-plane node cannot handle new Pods.

I agree, I will add that information.
One thought: when using kubeadm, the kube-apiserver runs as a static pod anyway, so shouldn't the kubelet be able to bring it up again?

An improvement may be to only use the local endpoint during the TLS bootstrap and have kubeadm change the config back to the load balancer afterwards 🤔
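
For illustration, a minimal sketch of what the two variants amount to in the kubelet's kubeconfig (the file path is kubeadm's default; the addresses are placeholders, not values from the KEP):

```bash
# Inspect which API endpoint this node's kubelet currently talks to.
grep 'server:' /etc/kubernetes/kubelet.conf
# Today (control-plane endpoint, typically an LB/VIP):
#   server: https://<control-plane-endpoint>:6443
# With the proposed change on a joining control-plane node (local kube-apiserver):
#   server: https://<local-advertise-address>:6443
```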

> However, using the LB will not cause the entire control-plane node to become NotReady; only some components will not work well. The second point is that upgrading the kubelet binary version seems to be under user control; kubeadm only modifies the kubelet configuration and does not immediately update the kubelet binary version. kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#upgrade-kubelet-and-kubectl

> When kubeadm join is executed, version skew does not occur; it only occurs during the upgrade phase.

In the case of immutable upgrades (e.g. Cluster API), where you update the whole image and the image contains the binary versions, correctly following the version skew policy means you would have to (example: v1.x -> v1.y):

  • have one image A which has v1.y kubeadm + v1.x kubelet
  • have one image B which has v1.y kubeadm + v1.y kubelet

And upgrade via:

  1. rolling update control-plane nodes to image A
  2. rolling update control-plane nodes to image B

Everyone nowadays skips step 1 with CAPI (because normally it works).

@neolit123 (Member)

> Pointing the control-plane's kubelet to the local kube-apiserver may cause some stability issues. When the kube-apiserver fails, the local kubelet will crash, the node will become NotReady, and the entire control-plane node cannot handle new Pods.

> I agree, I will add that information.
> One thought: when using kubeadm, the kube-apiserver runs as a static pod anyway, so shouldn't the kubelet be able to bring it up again?

This is something that I mentioned/asked on the kubeadm office hours Zoom call we had.
I think the new way is less HA and less resilient to apiserver component failures.

Currently, with the kubelet pointing at the LB, if the local apiserver fails (e.g. due to an internal apiserver bug) the kubelet will still reach a healthy kube-apiserver and the Node will remain healthy, while in parallel it will try to restart the local failing pod. With the new way, if the apiserver fails the node will become NotReady.

> An improvement may be to only use the local endpoint during the TLS bootstrap and have kubeadm change the config back to the load balancer afterwards 🤔

Bootstrapping against the local apiserver makes sense.
But if the kubelet then points to the LB it can still violate the skew policy, which according to the prior discussions with api-machinery can cause problems. We haven't seen such problems yet, mainly because of k8s kubelet / apiserver stability.

I was thinking that maybe we can let the user control this behavior with an option in the config API. I personally don't like this idea, but it can be added in v1beta4 if people vote +1 for it.

Alternatively, the default behavior can be to point to the local endpoint, and users can manually edit the kubeconfig to point to the LB if they want more HA.

@neolit123 requested a review from neolit123 and removed the review request for CecileRobertMichon on April 29, 2024
@neolit123 (Member)

neolit123 commented Apr 29, 2024

@SataQiu @pacoxu any more comments here?

> I was thinking that maybe we can let the user control this behavior with an option in the config API. I personally don't like this idea, but it can be added in v1beta4 if people vote +1 for it.

Still not a fan of making it configurable, as pointing to the LB can violate the kubelet / apiserver skew, but to make it configurable by kubeadm we can do one of these things:

  1. Add a new patch target kubeletkubeconfig. The problem with this is that the user will likely only use this patch target for changing server, and it will be a rare use case.
  2. Add a clusterconfiguration.pointKubeletAtControlPlaneEndpoint field, false by default (sketched after this comment); still a rare use case.
  3. Document somewhere (maybe in the kubeadm HA docs at k/website) that users can edit a node's kubelet.conf kubeconfig file to point to the LB, and kubeadm or kubelet will not modify the contents of this file for e.g. cert rotation, i.e. the server value will persist.

My vote goes to 3.
WDYT?

EDIT: and yes, we should still have a feature gate while the feature graduates.
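
A rough sketch of what option 2 might look like, purely for illustration; the field name comes from the comment above and is hypothetical, not an implemented kubeadm API field:

```bash
# Hypothetical kubeadm config sketch for option 2 (NOT implemented).
cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
# hypothetical knob: false (default) = the kubelet talks to the local kube-apiserver;
# true = the kubelet keeps pointing at the controlPlaneEndpoint (LB)
pointKubeletAtControlPlaneEndpoint: false
EOF
```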

@SataQiu (Member)

SataQiu commented Apr 29, 2024

Also vote for 3.
When designing the kubeadm upgrade apply phases, we should support skipping the kubelet upgrade phase.
Then users can upgrade the kubelet version after ensuring that all control-plane apiserver instances have been upgraded.

Just like kubeadm upgrade node --skip-phases kubelet-config

FYI: It seems that kubeadm just configures the kubelet but does not restart the kubelet service. We can state in the documentation that users should restart the kubelet service, in turn, after all the control-plane instances have been upgraded.
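
A minimal sketch of that flow; the skip-phases command is the one quoted above, and the binary upgrade itself is distro-specific and left out:

```bash
# On each control-plane node: upgrade the node but skip the kubelet-config phase.
kubeadm upgrade node --skip-phases kubelet-config

# Later, once ALL control-plane kube-apiservers run the new version,
# upgrade the kubelet binary (distro-specific) and restart it, one node at a time.
systemctl restart kubelet
```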

@neolit123 (Member)

> Also vote for 3.

👍

> When designing the kubeadm upgrade apply phases, we should support skipping the kubelet upgrade phase.
> Then users can upgrade the kubelet version after ensuring that all control-plane apiserver instances have been upgraded.

Upgrade does not touch "/etc/kubernetes/kubelet.conf" today, so if the user modifies the "server" value in there it will persist.

> FYI: It seems that kubeadm just configures the kubelet but does not restart the kubelet service. We can state in the documentation that users should restart the kubelet service, in turn, after all the control-plane instances have been upgraded.

Yes, a kubelet restart will be required.
They have to do it one time after changing the value of "server".
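
A minimal sketch of that one-time manual opt-in to the LB, assuming kubeadm's default file path and port; the endpoint is a placeholder:

```bash
# Point this node's kubelet back at the load-balanced control-plane endpoint,
# then restart the kubelet once so it picks up the new server.
sed -i 's#server: https://.*#server: https://<control-plane-endpoint>:6443#' /etc/kubernetes/kubelet.conf
systemctl restart kubelet
```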

@pacoxu (Member)

pacoxu commented Apr 30, 2024

> 3. Document somewhere (maybe in the kubeadm HA docs at k/website) that users can edit a node's kubelet.conf kubeconfig file to point to the LB, and kubeadm or kubelet will not modify the contents of this file for e.g. cert rotation, i.e. the server value will persist.

IIUC, the kubelet.conf change is a behavior change or a bugfix.

  • Do we need to notify cluster-api or other kubeadm adapters about this?

I vote for 2 and 3.
3 may be the right direction, but 2 is also a good choice for me.

@neolit123 (Member)

> 3. Document somewhere (maybe in the kubeadm HA docs at k/website) that users can edit a node's kubelet.conf kubeconfig file to point to the LB, and kubeadm or kubelet will not modify the contents of this file for e.g. cert rotation, i.e. the server value will persist.

> IIUC, the kubelet.conf change is a behavior change or a bugfix.

We can say it's a bug fix for potential skew problems that introduces a change in behavior.

> • Do we need to notify cluster-api or other kubeadm adapters about this?

> I vote for 2 and 3. 3 may be the right direction, but 2 is also a good choice for me.

We could just document it as our first choice (3), but if users really need more flexibility, we can think about alternatives like 2 or something else.

@pacoxu (Member)

pacoxu commented Apr 30, 2024

> Do we need to notify cluster-api or other kubeadm adapters about this?
> I vote for 2 and 3. 3 may be the right direction, but 2 is also a good choice for me.
> We could just document it as our first choice (3), but if users really need more flexibility, we can think about alternatives like 2 or something else.

3 is OK to go. +1
2 can be a later choice if many users really want it.

@neolit123 (Member) left a comment

Please squash all commits, then lgtm.
/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chrischdi, neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on May 7, 2024
@neolit123 (Member)

/lgtm

@k8s-ci-robot added the lgtm label on May 7, 2024
@k8s-ci-robot merged commit 060cedf into kubernetes:master on May 7, 2024
4 checks passed
@k8s-ci-robot added this to the v1.31 milestone on May 7, 2024
@chrischdi deleted the pr-kubeadm-4471 branch on May 10, 2024
Labels
  • approved - Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
  • kind/kep - Categorizes KEP tracking issues and PRs modifying the KEP directory.
  • lgtm - "Looks good to me", indicates that a PR is ready to be merged.
  • sig/cluster-lifecycle - Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
  • size/XL - Denotes a PR that changes 500-999 lines, ignoring generated files.