Info
In distributed systems running on Kubernetes, operators play a vital role in orchestrating safe and efficient rollouts.
Step 1: Prepare node
- Drain the node of connections
- If the node is the current leader, move leadership off it, provided leader election was healthy beforehand
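The preparation step can be sketched roughly as follows. This is a minimal in-memory illustration, not an operator implementation: the `Node` class, its fields, and `prepare_node` are hypothetical names, and a real system would call its own drain and leadership-transfer APIs instead of flipping fields.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical in-memory stand-in for one member of the cluster."""
    name: str
    is_leader: bool = False
    accepting_connections: bool = True
    open_connections: int = 0


def prepare_node(node: Node, peers: list[Node], election_healthy: bool) -> None:
    """Drain a node and, if it is the leader, hand leadership to a peer."""
    # Stop taking new connections, then release the existing ones.
    node.accepting_connections = False
    node.open_connections = 0

    # Assumption: leadership is only moved when election was healthy beforehand;
    # otherwise the rollout stops so the election problem can be investigated.
    if node.is_leader:
        if not election_healthy or not peers:
            raise RuntimeError(f"cannot safely move leadership off {node.name}")
        node.is_leader = False
        peers[0].is_leader = True
```

Raising an error when election was not healthy keeps the rollout from forcing a new election on a cluster that may already be struggling.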
Step 2: Partition
If working with StatefulSets, use a partitioned rolling update. Update one pod at a time and wait for stability before continuing. Even a pause of a minute or so between updates is important, because it gives the nodes time to replicate any data they missed.
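With the StatefulSet `RollingUpdate` strategy, only pods whose ordinal is greater than or equal to `partition` receive the new template, so lowering the partition by one per step updates one pod at a time, highest ordinal first. The sketch below only computes the partition values and the corresponding patch bodies; `partition_schedule` and `partition_patch` are hypothetical helper names, and applying the patch to the cluster is deliberately left out.

```python
def partition_schedule(replicas: int) -> list[int]:
    """Partition values that roll out one pod per step, highest ordinal first."""
    return list(range(replicas - 1, -1, -1))


def partition_patch(partition: int) -> dict:
    """Patch body that sets the StatefulSet rolling-update partition."""
    return {
        "spec": {
            "updateStrategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"partition": partition},
            }
        }
    }


# Example: the sequence of partition settings for a three-replica StatefulSet.
for p in partition_schedule(3):
    print(partition_patch(p)["spec"]["updateStrategy"]["rollingUpdate"])
```

Each patch would be applied only after the previous pod has been confirmed stable, which is where the pause between updates fits in.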
Step 3: Ensure system stability by performing checks
Monitor and wait for the distributed system to report healthy and caught up before continuing updates.
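One way to express this wait is a polling loop with a deadline. The sketch below assumes a caller-supplied `is_healthy` callable that returns True when the system reports healthy and caught up; the function name, the default timings, and the requirement of several consecutive passes are illustrative assumptions rather than recommended values.

```python
import time
from typing import Callable


def wait_until_healthy(
    is_healthy: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    required_passes: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the check passes `required_passes` times in a row."""
    deadline = clock() + timeout_s
    passes = 0
    while True:
        # A failed check resets the streak, so a brief recovery does not count.
        passes = passes + 1 if is_healthy() else 0
        if passes >= required_passes:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Requiring consecutive passes guards against a node that reports healthy for a moment and then falls behind again. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.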
Step 4: Continue Partitioning
Repeat steps 1-3 for the next node, and continue until every node has been updated.
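Put together, the procedure is a loop that halts as soon as a node fails its stability check. In this sketch `rolling_update` and its three callables are hypothetical placeholders for whatever the operator actually does in steps 1 to 3.

```python
from typing import Callable, Iterable


def rolling_update(
    ordinals: Iterable[int],
    prepare: Callable[[int], None],
    update: Callable[[int], None],
    is_stable: Callable[[int], bool],
) -> list[int]:
    """Run steps 1-3 per node and stop at the first node that is not stable."""
    done = []
    for ordinal in ordinals:
        prepare(ordinal)            # step 1: drain, move leadership
        update(ordinal)             # step 2: move the partition to this ordinal
        if not is_stable(ordinal):  # step 3: health and catch-up checks
            break                   # leave the remaining nodes untouched
        done.append(ordinal)
    return done
```

Returning the list of completed ordinals makes it easy to see where a rollout stopped and to resume from there.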