Commit
Merge pull request #145 from clobrano/update-readme-out-of-service-taint-0

Update docs with the new remediationStrategy spec
clobrano authored May 22, 2024
2 parents 83e25ea + 4f9eb8d commit fb0a3ed
Showing 5 changed files with 47 additions and 0 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -192,6 +192,9 @@ The CR includes the following parameters:
* `retrycount` - number of times to retry the fence agent in case of failure. The default is 5.
* `retryinterval` - interval between retries in seconds. The default is "5s".
* `timeout` - timeout for the fence agent in seconds. The default is "60s".
* `remediationStrategy` - either `OutOfServiceTaint` or `ResourceDeletion`:
  * `OutOfServiceTaint`: This remediation strategy implicitly causes the deletion of the pods and the detachment of the associated volumes on the node. It achieves this by placing the [`OutOfServiceTaint` taint](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-out-of-service) on the node.
  * `ResourceDeletion`: This remediation strategy deletes the pods on the node.

The FenceAgentsRemediation CR is created by the administrator and is used to trigger the fence agent on a specific node. The CR includes an *agent* field for the fence agent name, *sharedparameters* field with all the shared, not specific to a node, parameters, and a *nodeparameters* field to specify the parameters for the fenced node.
For better understanding please see the below example of FenceAgentsRemediation CR for node `worker-1` (see it also as the [sample FAR](https://github.com/medik8s/fence-agents-remediation/blob/main/config/samples/fence-agents-remediation_v1alpha1_fenceagentsremediation.yaml)):
@@ -220,6 +223,7 @@ spec:
worker-0: "6233"
worker-1: "6234"
worker-2: "6235"
remediationStrategy: ResourceDeletion
```

## Tests
1 change: 1 addition & 0 deletions api/v1alpha1/fenceagentsremediation_types.go
@@ -96,6 +96,7 @@ type FenceAgentsRemediationSpec struct {
// that enables automatic deletion of pv-attached pods on failed nodes, "out-of-service" taint is only supported on clusters with k8s version 1.26+ or OCP/OKD version 4.13+.
// +kubebuilder:default:="ResourceDeletion"
// +kubebuilder:validation:Enum=ResourceDeletion;OutOfServiceTaint
// +operator-sdk:csv:customresourcedefinitions:type=spec
RemediationStrategy RemediationStrategyType `json:"remediationStrategy,omitempty"`
}
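
The kubebuilder markers in this hunk default the field to `ResourceDeletion` and restrict it to two values. A minimal Go sketch of the same rule, applied outside the API server, might look like the following. Only `RemediationStrategyType` and the two string values come from the diff; the constant names and the `normalizeStrategy` helper are hypothetical, and the underlying `string` type is an assumption.

```go
package main

import "fmt"

// RemediationStrategyType mirrors the type name used in the diff.
// Its underlying type is assumed to be string.
type RemediationStrategyType string

// Hypothetical constant names; only the string values appear in the diff.
const (
	StrategyResourceDeletion  RemediationStrategyType = "ResourceDeletion"
	StrategyOutOfServiceTaint RemediationStrategyType = "OutOfServiceTaint"
)

// normalizeStrategy (hypothetical helper) applies the documented default
// when the field is empty and rejects values outside the enum.
func normalizeStrategy(s RemediationStrategyType) (RemediationStrategyType, error) {
	switch s {
	case "":
		return StrategyResourceDeletion, nil
	case StrategyResourceDeletion, StrategyOutOfServiceTaint:
		return s, nil
	default:
		return "", fmt.Errorf("unsupported remediationStrategy %q", s)
	}
}

func main() {
	for _, in := range []RemediationStrategyType{"", "OutOfServiceTaint", "Reboot"} {
		got, err := normalizeStrategy(in)
		fmt.Printf("%q -> %q, err=%v\n", in, got, err)
	}
}
```

In a real cluster the API server enforces both the default and the enum from the generated CRD schema, so a helper like this would only matter for code that builds the spec before submitting it.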

@@ -22,6 +22,7 @@ metadata:
"worker-2": "6235"
}
},
"remediationStrategy": "ResourceDeletion",
"retrycount": 5,
"retryinterval": "5s",
"sharedparameters": {
@@ -83,6 +84,16 @@ spec:
node that is fenced, since they are node specific
displayName: Node Parameters
path: nodeparameters
- description: RemediationStrategy is the remediation method for unhealthy nodes.
Currently, it could be either "OutOfServiceTaint" or "ResourceDeletion".
ResourceDeletion will iterate over all pods related to the unhealthy node
and delete them. OutOfServiceTaint will add the out-of-service taint which
is a new well-known taint "node.kubernetes.io/out-of-service" that enables
automatic deletion of pv-attached pods on failed nodes, "out-of-service"
taint is only supported on clusters with k8s version 1.26+ or OCP/OKD version
4.13+.
displayName: Remediation Strategy
path: remediationStrategy
- description: RetryCount is the number of times the fencing agent will be executed
displayName: Retry Count
path: retrycount
@@ -129,6 +140,16 @@ spec:
node that is fenced, since they are node specific
displayName: Node Parameters
path: template.spec.nodeparameters
- description: RemediationStrategy is the remediation method for unhealthy nodes.
Currently, it could be either "OutOfServiceTaint" or "ResourceDeletion".
ResourceDeletion will iterate over all pods related to the unhealthy node
and delete them. OutOfServiceTaint will add the out-of-service taint which
is a new well-known taint "node.kubernetes.io/out-of-service" that enables
automatic deletion of pv-attached pods on failed nodes, "out-of-service"
taint is only supported on clusters with k8s version 1.26+ or OCP/OKD version
4.13+.
displayName: Remediation Strategy
path: template.spec.remediationStrategy
- description: RetryCount is the number of times the fencing agent will be executed
displayName: Retry Count
path: template.spec.retrycount
@@ -39,6 +39,16 @@ spec:
node that is fenced, since they are node specific
displayName: Node Parameters
path: nodeparameters
- description: RemediationStrategy is the remediation method for unhealthy nodes.
Currently, it could be either "OutOfServiceTaint" or "ResourceDeletion".
ResourceDeletion will iterate over all pods related to the unhealthy node
and delete them. OutOfServiceTaint will add the out-of-service taint which
is a new well-known taint "node.kubernetes.io/out-of-service" that enables
automatic deletion of pv-attached pods on failed nodes, "out-of-service"
taint is only supported on clusters with k8s version 1.26+ or OCP/OKD version
4.13+.
displayName: Remediation Strategy
path: remediationStrategy
- description: RetryCount is the number of times the fencing agent will be executed
displayName: Retry Count
path: retrycount
@@ -85,6 +95,16 @@ spec:
node that is fenced, since they are node specific
displayName: Node Parameters
path: template.spec.nodeparameters
- description: RemediationStrategy is the remediation method for unhealthy nodes.
Currently, it could be either "OutOfServiceTaint" or "ResourceDeletion".
ResourceDeletion will iterate over all pods related to the unhealthy node
and delete them. OutOfServiceTaint will add the out-of-service taint which
is a new well-known taint "node.kubernetes.io/out-of-service" that enables
automatic deletion of pv-attached pods on failed nodes, "out-of-service"
taint is only supported on clusters with k8s version 1.26+ or OCP/OKD version
4.13+.
displayName: Remediation Strategy
path: template.spec.remediationStrategy
- description: RetryCount is the number of times the fencing agent will be executed
displayName: Retry Count
path: template.spec.retrycount
@@ -21,3 +21,4 @@ spec:
worker-0: "6233"
worker-1: "6234"
worker-2: "6235"
remediationStrategy: ResourceDeletion
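
The difference between the two strategies described in this commit can be sketched in Go with simplified stand-in types. This is an illustration under stated assumptions, not the operator's code: `Node`, `Taint`, and `remediate` are hypothetical, and the taint value `nodeshutdown` and effect `NoExecute` follow the example in the Kubernetes documentation rather than anything shown in the diff.

```go
package main

import "fmt"

// Key of the well-known taint named in the CRD description.
const outOfServiceKey = "node.kubernetes.io/out-of-service"

// Taint and Node are simplified stand-ins for the Kubernetes API types.
type Taint struct {
	Key, Value, Effect string
}

type Node struct {
	Name   string
	Taints []Taint
	Pods   []string // names of pods bound to this node
}

// remediate (hypothetical) applies one of the two strategies to a node.
func remediate(n *Node, strategy string) error {
	switch strategy {
	case "ResourceDeletion":
		// The remediator itself removes every pod bound to the node.
		n.Pods = nil
	case "OutOfServiceTaint":
		// Only the taint is added here; pod deletion and volume
		// detachment are then left to Kubernetes.
		for _, t := range n.Taints {
			if t.Key == outOfServiceKey {
				return nil // already tainted, nothing to do
			}
		}
		n.Taints = append(n.Taints, Taint{
			Key:    outOfServiceKey,
			Value:  "nodeshutdown", // assumed value
			Effect: "NoExecute",    // assumed effect
		})
	default:
		return fmt.Errorf("unknown strategy %q", strategy)
	}
	return nil
}

func main() {
	n := &Node{Name: "worker-1", Pods: []string{"db-0", "web-1"}}
	if err := remediate(n, "OutOfServiceTaint"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("taints=%v pods=%v\n", n.Taints, n.Pods)
}
```

The taint branch is written to be idempotent so that repeated reconciliations of the same node would not add the taint twice.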
