Recover from a Leaf Node Failure
Use the following steps to reintroduce a leaf node to a cluster that has redundancyLevel
set to 2 and redundant leaf nodes.
- Scale down the Operator.
  kubectl scale deployment [operator deployment name] --replicas=0
- Scale down the STS with the pod you need to replace.
  For a leaf node in either of the availability groups, scale down the STS to 0
  and then scale it back up to the number of pods in the group.
  kubectl scale statefulsets [StatefulSet Name] --replicas=0
- Delete the PersistentVolumeClaim (PVC) of the problematic pod.
  kubectl delete pvc pv-storage-node-ccd487dc-3b15-4c3b-88a2-a984dc0245ca-leaf-ag1-0
- Delete the pod.
  kubectl delete pod node-ccd487dc-3b15-4c3b-88a2-a984dc0245ca-leaf-ag1-0
- Scale the STS and Operator back up.
  kubectl scale statefulsets [StatefulSet Name] --replicas=[num of pods in this StatefulSet]
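The steps above can be collected into a small script. The following is a minimal sketch, not an official tool. The names my-operator and node-example-leaf-ag1 are hypothetical placeholders. The PVC name pattern pv-storage-[pod name] is inferred from the example above. Scaling the Operator back to 1 replica is an assumption, since the steps do not state the Operator's original replica count. The script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch of the leaf node recovery steps. All default values are placeholders.
set -eu

# Hypothetical names; override them from the environment for a real cluster.
OPERATOR_DEPLOYMENT="${OPERATOR_DEPLOYMENT:-my-operator}"
STATEFULSET="${STATEFULSET:-node-example-leaf-ag1}"
POD="${POD:-${STATEFULSET}-0}"
STS_REPLICAS="${STS_REPLICAS:-1}"
# Assumption: the Operator deployment normally runs a single replica.
OPERATOR_REPLICAS="${OPERATOR_REPLICAS:-1}"
# Print commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

recover_leaf() {
  # Scale down the Operator.
  run kubectl scale deployment "$OPERATOR_DEPLOYMENT" --replicas=0
  # Scale down the STS that owns the pod to replace.
  run kubectl scale statefulsets "$STATEFULSET" --replicas=0
  # Delete the PVC of the problematic pod (name pattern inferred from the example).
  run kubectl delete pvc "pv-storage-${POD}" --ignore-not-found
  # Delete the pod; --ignore-not-found avoids an error if the scale-down already removed it.
  run kubectl delete pod "$POD" --ignore-not-found
  # Scale the STS and then the Operator back up.
  run kubectl scale statefulsets "$STATEFULSET" --replicas="$STS_REPLICAS"
  run kubectl scale deployment "$OPERATOR_DEPLOYMENT" --replicas="$OPERATOR_REPLICAS"
}

recover_leaf
```

Run the script once with the defaults to review the printed commands, then set the variables to the real names and run it again with DRY_RUN=0.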
Last modified: August 2, 2024