The OODA Loop
The site is down. Users can't log in. Your phone is buzzing with alerts.
OODA (Observe, Orient, Decide, Act) was developed by U.S. Air Force Colonel John Boyd, a fighter pilot, to describe how to make life-or-death decisions faster than the enemy can respond. It maps cleanly onto a production incident.
You start by observing: raw data only. Logs, error rates, recent deploys. What's actually happening, not what you suspect. Then you orient — the step most people skip. Orienting means updating your mental model with what you just observed. Is this a DDoS or a bad migration? A database connection issue or an application error? You're not just reading the data; you're deciding what it means. Then you decide on a hypothesis — "rolling back the last deploy should fix this" — and act on it.
Then you go back to Observe. The loop isn't four steps and done — it's a cycle, and the cycling is the point. Each pass eliminates hypotheses. The rollback didn't help, so the deploy probably wasn't the cause. The error rate spiked before the deploy, so look upstream. You're not trying to be right on the first try. You're trying to narrow the space of possibilities faster than the system degrades.
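The cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hypothesis` class, the `ooda_loop` function, the metric names, and the thresholds are all hypothetical, and the "system" is a simulated dictionary rather than real monitoring. What it tries to show is the shape of the loop: each pass observes, drops hypotheses that no longer fit, picks one, acts, and observes again.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Observation = Dict[str, float]


@dataclass
class Hypothesis:
    name: str
    fits: Callable[[Observation], bool]  # Orient: is this consistent with the data?
    act: Callable[[], None]              # Act: a cheap step that tests the idea


def ooda_loop(
    observe: Callable[[], Observation],
    hypotheses: List[Hypothesis],
    is_healthy: Callable[[Observation], bool],
    max_passes: int = 10,
) -> Tuple[Optional[Hypothesis], List[str]]:
    """Return (hypothesis whose action restored health, names ruled out)."""
    candidates = list(hypotheses)
    ruled_out: List[str] = []
    tried: Optional[Hypothesis] = None
    for _ in range(max_passes):
        obs = observe()                       # Observe: raw data only
        if is_healthy(obs):
            return tried, ruled_out
        if tried is not None:                 # last action did not help
            ruled_out.append(tried.name)
            tried = None
        remaining = []                        # Orient: keep what still fits
        for h in candidates:
            if h.fits(obs):
                remaining.append(h)
            else:
                ruled_out.append(h.name)
        candidates = remaining
        if not candidates:
            break                             # out of ideas: escalate
        tried = candidates.pop(0)             # Decide: take the next candidate
        tried.act()                           # Act, then loop back to Observe
    return None, ruled_out


# Simulated incident (made-up numbers): the real cause is an exhausted DB pool.
state: Observation = {"error_rate": 0.40, "requests_per_s": 900.0, "db_pool_in_use": 1.0}


def observe() -> Observation:
    return dict(state)


def roll_back_deploy() -> None:
    pass  # in this simulation the deploy was not the cause, so nothing changes


def enable_rate_limiting() -> None:
    pass  # never reached here: the traffic hypothesis is dropped at Orient


def restart_db_pool() -> None:
    state["error_rate"] = 0.01
    state["db_pool_in_use"] = 0.30


hypotheses = [
    Hypothesis("bad deploy", lambda o: o["error_rate"] > 0.05, roll_back_deploy),
    Hypothesis("traffic flood", lambda o: o["requests_per_s"] > 10_000, enable_rate_limiting),
    Hypothesis("db pool exhausted", lambda o: o["db_pool_in_use"] >= 0.95, restart_db_pool),
]

cause, ruled_out = ooda_loop(observe, hypotheses, lambda o: o["error_rate"] < 0.02)
print(cause.name if cause else "unresolved", ruled_out)
```

In this simulated run the first pass drops "traffic flood" at the orient step and tries the rollback. The second pass sees no improvement, rules out "bad deploy", and restarts the pool. The third pass observes a healthy system. The wrong first guess cost one pass and still shrank the list, which is the behavior the loop is meant to produce.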
What makes the loop powerful isn't any single decision — it's the feedback each action generates. A wrong hypothesis, tested quickly, still tells you something. The picture gets clearer with every pass. You converge on the actual problem not by thinking harder, but by moving through the loop faster.