Disaster Recovery (DR)

Everything fails eventually. A meteor hits the datacenter. A hacker deletes the database. DR is the plan for how you come back online: how fast, and with how much of your data intact.

âąī¸ The Two "R"s of DR

RTO (Time)

Recovery Time Objective.
How long can you be offline?

Analogy: If you fall down, how many seconds does it take you to stand back up?

RPO (Point)

Recovery Point Objective.
How much data can you lose, measured as time since your last saved copy?

Analogy: If your essay editor crashes, did you last save 1 minute ago or 1 hour ago?
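To make the two objectives concrete, here is a minimal sketch that measures one incident against them. The function name `check_objectives`, the timestamps, and the targets are all made up for illustration; they are not from any real system.

```python
from datetime import datetime, timedelta

def check_objectives(last_backup, outage_start, service_restored, rto, rpo):
    """Compare one incident against the RTO and RPO targets."""
    downtime = service_restored - outage_start     # how long you were offline
    data_loss_window = outage_start - last_backup  # work done since the last saved copy
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }

# Hypothetical incident: last backup 09:00, outage 09:45, back online 10:15.
result = check_objectives(
    last_backup=datetime(2024, 1, 1, 9, 0),
    outage_start=datetime(2024, 1, 1, 9, 45),
    service_restored=datetime(2024, 1, 1, 10, 15),
    rto=timedelta(hours=1),      # assumed target: back within 1 hour
    rpo=timedelta(minutes=15),   # assumed target: lose at most 15 minutes of data
)
print(result["rto_met"], result["rpo_met"])  # True False
```

In this example you stood back up in time (30 minutes offline), but you lost 45 minutes of work, so the RPO was missed. The two objectives are independent: fast recovery does not fix an old backup.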

đŸ•šī¸ Doomsday Simulator

Mission: Choose a recovery strategy, then destroy the main datacenter.

SELECT A STRATEGY
📦 Backup & Restore: Cheap & Slow
🕯️ Pilot Light: Database Ready, App Off
🔥 Warm Standby: Running but Small
⚡ Active / Active: Instant & Expensive

🏙️ Primary (NY): ONLINE
🏜️ DR Site (LA): OFFLINE
Cost to Build: $
Est. Recovery Time: 24 Hours
Status: Ready
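If you cannot run the simulator, its idea can be sketched in a few lines of Python. This is a rough model, not the page's actual code: the names `STRATEGIES`, `destroy_primary`, and `cheapest_within` are hypothetical, the 24-hour and 30-minute figures come from this page, and the 5 minutes for Warm Standby and 0 minutes for Active / Active are assumed stand-ins for "Minutes" and "Near Zero".

```python
# Estimated recovery time per strategy, listed cheapest first.
# 5 and 0 minutes are assumptions standing in for "Minutes" and "Near Zero".
STRATEGIES = {
    "Backup & Restore": {"cost": "$", "recovery_minutes": 24 * 60},
    "Pilot Light": {"cost": "$$", "recovery_minutes": 30},
    "Warm Standby": {"cost": "$$$", "recovery_minutes": 5},
    "Active / Active": {"cost": "$$$$", "recovery_minutes": 0},
}

def destroy_primary(strategy_name):
    """Simulate losing the primary site and failing over to the DR site."""
    strategy = STRATEGIES[strategy_name]
    return {
        "primary": "OFFLINE",
        "dr_site": "ONLINE",
        "cost": strategy["cost"],
        "downtime_minutes": strategy["recovery_minutes"],
    }

def cheapest_within(rto_minutes):
    """Return the cheapest strategy whose estimated recovery fits the RTO."""
    for name, strategy in STRATEGIES.items():  # dicts keep insertion order
        if strategy["recovery_minutes"] <= rto_minutes:
            return name
    return None

print(destroy_primary("Pilot Light")["downtime_minutes"])  # 30
print(cheapest_within(60))                                 # Pilot Light
```

The second function is the real lesson of the simulator: you do not pick the fastest strategy, you pick the cheapest one that still meets your RTO.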

📋 The Strategies Explained

1. Backup & Restore

Data is backed up to S3. To recover, you must manually build new servers and then download the data onto them.
Cost: $
RTO: Hours/Days
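Why hours or days? A back-of-the-envelope estimate shows where the time goes. The sketch below assumes the two steps happen one after the other (build servers, then download), and every number in it is hypothetical; `estimate_restore_hours` is an illustrative name, not a real tool.

```python
def estimate_restore_hours(data_gb, download_mbps, provisioning_hours):
    """Rough recovery time: build the servers, then download the backup."""
    data_megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    download_hours = data_megabits / download_mbps / 3600
    return provisioning_hours + download_hours

# Hypothetical: 2 TB of backups, a 1 Gbit/s link, 3 hours to build servers.
print(round(estimate_restore_hours(2000, 1000, 3.0), 2))  # 7.44
```

Even with a fast link and a well-practiced team, this lands in the "hours" range, and a slower link or a larger dataset pushes it toward "days".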

2. Pilot Light

Data is live in Region B (continuously syncing). Servers are "turned off" (only the pilot light is on). To recover, you switch the servers on and send traffic to Region B.
Cost: $$
RTO: ~10-30 Mins

3. Warm Standby

Everything is running in Region B, but at miniature scale. To recover, you just scale it up (add more servers).
Cost: $$$
RTO: Minutes
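"Scale it up" is simple arithmetic: work out how many servers full traffic needs, then subtract the few already running. A minimal sketch, with a hypothetical function name and made-up load figures:

```python
import math

def servers_to_add(peak_requests_per_sec, per_server_capacity, standby_servers):
    """How many servers the standby region must add to carry full load."""
    needed = math.ceil(peak_requests_per_sec / per_server_capacity)
    return max(0, needed - standby_servers)

# Hypothetical: 9000 req/s at peak, 500 req/s per server, 2 servers already running.
print(servers_to_add(9000, 500, 2))  # 16
```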

4. Active / Active

Both regions take traffic 24/7. If one fails, the other just takes the extra load, so downtime is close to zero.
Cost: $$$$
RTO: Near Zero
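"The other just takes the extra load" only works if the surviving region has room for it, which is a big part of why this option costs $$$$. The sketch below assumes equally sized regions with traffic split evenly between them; both function names are hypothetical.

```python
def max_safe_utilization(region_count, failed_regions=1):
    """Highest per-region utilization that still lets survivors absorb the load."""
    if not 0 <= failed_regions < region_count:
        raise ValueError("failed_regions must be >= 0 and less than region_count")
    # Total load is region_count * utilization; survivors offer
    # (region_count - failed_regions) units of capacity.
    return (region_count - failed_regions) / region_count

def survivors_can_absorb(utilization, region_count, failed_regions=1):
    """True if the remaining regions can carry all traffic after a failure."""
    return utilization <= max_safe_utilization(region_count, failed_regions)

print(max_safe_utilization(2))          # 0.5
print(survivors_can_absorb(0.6, 2))     # False
print(survivors_can_absorb(0.6, 3))     # True
```

With two regions, each one has to idle at 50% or less so it can take over everything. Running both at 60% feels efficient until the day one of them disappears.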