Cloud Recovery
DR Drills for Kubernetes Stateful Workloads
Sequence etcd backups, CSI snapshots, and application-level checkpoints without pretending clusters are stateless.
920,000 KRW · 24 hours · Instructor-led labs
Program description
Teams practice namespace-scoped restores, work through storage class mismatches, and run conservative traffic cutovers through a service mesh. Labs emphasize labeling discipline and honest rollback timers.
What is included
- Namespace restore checklist with storage class traps
- etcd backup verification steps
- Traffic cutover script with progressive weight shifts
- Labeling scheme for backup ownership and cost centers
- Runbook for CSI driver version skew
- Observability pack for post-restore saturation tests
- Retrospective template for drill facilitators
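To illustrate what "progressive weight shifts" means in the cutover script, here is a minimal Python sketch. It is not the course's script: the function names, the default increments (5, 25, 50, 100), and the `healthy` callback are all hypothetical, and the sketch only computes weights; it does not talk to a mesh or a load balancer.

```python
from typing import Callable, Iterator, Sequence, Tuple

Weights = Tuple[int, int]  # (primary_weight, restored_weight), always summing to 100


def weight_steps(increments: Sequence[int] = (5, 25, 50, 100)) -> Iterator[Weights]:
    """Yield weight pairs that move traffic toward the restored workload."""
    previous = 0
    for restored in increments:
        # Hypothetical rule: each step must send strictly more traffic than the last.
        if not previous < restored <= 100:
            raise ValueError("increments must be strictly increasing and at most 100")
        yield 100 - restored, restored
        previous = restored


def cutover(increments: Sequence[int], healthy: Callable[[Weights], bool]) -> Weights:
    """Walk the steps; return to (100, 0) as soon as a health check fails."""
    current: Weights = (100, 0)
    for step in weight_steps(increments):
        # In a real drill the weights would be applied here, then observed
        # for a fixed window before the health check is evaluated.
        if not healthy(step):
            return (100, 0)  # roll back fully rather than hold a partial shift
        current = step
    return current


if __name__ == "__main__":
    print(list(weight_steps()))
    print(cutover((5, 25, 50, 100), lambda weights: weights[1] < 50))
```

Rolling back to the full primary weight on the first failed check, instead of stepping down gradually, keeps the rollback timer easy to measure during a drill.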
Outcomes you can evidence
- Execute a namespace restore and document the storage class validation.
- Capture etcd verification evidence suitable for change records.
- List two stateful-workload pitfalls your platform team will track per release.
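As a rough idea of what storage class validation can look like before a namespace restore, the Python sketch below compares the classes requested by backed-up PersistentVolumeClaims against the classes present on the target cluster. The function name, the `<cluster default>` bucket, and the sample class names are hypothetical; the only Kubernetes detail relied on is the PVC field `spec.storageClassName`. Input is plain dictionaries, so the sketch assumes the manifests were already exported.

```python
from typing import Dict, Iterable, List


def storage_class_findings(pvcs: Iterable[dict], available: Iterable[str]) -> Dict[str, List[str]]:
    """Map each problematic storage class to the PVC names that request it."""
    known = set(available)
    findings: Dict[str, List[str]] = {}
    for pvc in pvcs:
        name = pvc["metadata"]["name"]
        requested = pvc.get("spec", {}).get("storageClassName")
        if requested is None:
            # No class set: the target cluster's default class applies,
            # and that default may differ from the source cluster's.
            findings.setdefault("<cluster default>", []).append(name)
        elif requested not in known:
            # Class named in the backup does not exist on the target.
            # An explicit empty string also lands here, under the "" key.
            findings.setdefault(requested, []).append(name)
    return findings


if __name__ == "__main__":
    backed_up = [
        {"metadata": {"name": "db-data"}, "spec": {"storageClassName": "fast-ssd"}},
        {"metadata": {"name": "queue-data"}, "spec": {"storageClassName": "standard"}},
        {"metadata": {"name": "cache-data"}, "spec": {}},
    ]
    print(storage_class_findings(backed_up, available=["standard"]))
```

An empty result means every claim names a class the target cluster has; anything else is worth recording in the change record before the restore starts.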
Mentor
Evelyn Park
Program director aligning platform releases with rehearsal calendars for regulated teams.
FAQ
Which Kubernetes distros?
Labs target upstream-compatible clusters; OpenShift specifics are callouts only.
Service mesh required?
Optional. We provide a non-mesh path for traffic validation.
Exclusions?
Windows containers and bare-metal orchestrators are not exercised.
Participant notes
Storage class trap list caught a vendor default that would have wrecked our drill.
etcd verification felt tedious until it wasn’t—worth the slog.