How-To Playbook

Interactive troubleshooting and step-by-step guides for common engineering disasters.

🧙‍♂️ Troubleshoot-o-Matic

What is the symptom?
Diagnosis: Latency or Resource Starvation

1. Check CPU/RAM Usage:

top

2. Check disk I/O (is the disk slow?):

sudo iotop -o

3. Check Network Traffic:

sudo iftop -n
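
The three tools above are interactive, which is awkward when you want to paste evidence into a ticket. As a rough sketch, assuming a Linux host with /proc mounted and the procps version of top, the helpers below print a one-shot snapshot instead. The names mem_available_pct and snapshot are made up for this example.

```shell
# Print the percentage of RAM still available, read from a meminfo-style file.
# Defaults to /proc/meminfo (Linux only); the optional file argument is for testing.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# One-shot, non-interactive summary of load, memory, and the busiest processes.
snapshot() {
    echo "load average: $(cut -d' ' -f1-3 /proc/loadavg)"
    echo "memory available: $(mem_available_pct)%"
    # -b = batch mode, -n 1 = a single iteration (procps top)
    top -b -n 1 | head -n 15
}
```

Run snapshot from a shell where the functions are defined, and redirect the output to a file if you need to attach it somewhere.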

Diagnosis: Application Error

1. Check System Logs:

journalctl -xe

2. Check application logs (nginx shown as an example):

tail -f /var/log/nginx/error.log

3. Did you run out of memory? (OOM killer):

sudo dmesg -T | grep -i "killed process"
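
A plain grep for one word can miss OOM events that the kernel words differently. The filter below is a small sketch that matches a few common kernel phrasings. The function name oom_events is hypothetical, and the pattern list is an assumption rather than an exhaustive catalogue.

```shell
# Filter kernel log text on stdin down to lines that look like OOM-killer activity.
# The patterns are a best-effort assumption, not a complete list.
oom_events() {
    grep -iE 'out of memory|oom-kill|killed process'
}

# Typical use (dmesg may need sudo on hardened hosts):
#   sudo dmesg -T | oom_events
#   journalctl -k --since "1 hour ago" | oom_events
```

If this prints nothing, the OOM killer probably did not fire in the window you searched, and the application logs are the next place to look.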

Diagnosis: Network or Firewall

1. Is the service running?

systemctl status nginx

2. Is the port listening?

sudo ss -tulpn | grep ':80 '

3. Is the Firewall blocking it?

sudo ufw status
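
Grepping a socket listing for a bare port number also matches 8080, 8000, and unrelated PIDs. As a sketch, the helper below checks for an exact port match instead. The name has_listener is invented here, and it assumes the local address sits in the fourth column, as it does for `ss -tln` and `netstat -tln`.

```shell
# Succeed if the socket listing on stdin has a local address bound to exactly port $1.
# Assumes column 4 is "address:port" (true for `ss -tln` and `netstat -tln`).
has_listener() {
    awk -v port="$1" '
        { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END { exit !found }'
}

# Typical use:
#   ss -tln | has_listener 80 && echo "something is listening on port 80"
```

If the port is listening but clients still cannot connect, the firewall step above is the likely culprit.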

Diagnosis: No Space Left on Device

1. Find which partition is full:

df -h

2. Find the biggest directories (this can take a while):

sudo du -xh --max-depth=1 / 2>/dev/null | sort -hr

3. Clean package cache (Ubuntu/Debian):

sudo apt clean
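
To catch a filling disk before it becomes an outage, the first step can be turned into a threshold check. This is a minimal sketch: full_mounts is a made-up name, and it assumes mount points contain no spaces.

```shell
# Print "mountpoint usage%" for every filesystem at or above the threshold in $1.
# Reads `df -P` output on stdin; assumes mount points contain no spaces.
full_mounts() {
    awk -v limit="$1" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)            # "95%" -> "95"
            if (use + 0 >= limit) print $6, $5
        }'
}

# Typical use:
#   df -P | full_mounts 90
```

The -P flag asks df for the POSIX output format, which keeps each filesystem on one line so the column positions stay stable.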

📚 Standard Procedures

GIT

Undo the last commit

Removes the commit from history but keeps your changes staged, ready to recommit.

git reset --soft HEAD~1
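
To see what a soft reset actually leaves behind, you can try it in a throwaway repository first. The sketch below does that. The function name soft_reset_demo, the file name, and the identity values are all placeholders.

```shell
# Build a scratch repo with two commits, undo the second with --soft,
# and print the resulting status. Runs in a subshell so your cwd is untouched.
soft_reset_demo() (
    repo=$(mktemp -d) && cd "$repo" || exit 1
    git init -q .
    git config user.name "Demo"                 # placeholder identity, local to this repo
    git config user.email "demo@example.com"
    echo "hello" > note.txt
    git add note.txt && git commit -q -m "first"
    echo "world" >> note.txt
    git add note.txt && git commit -q -m "second"
    git reset -q --soft HEAD~1                  # drop "second", keep its changes staged
    git status --short                          # prints: M  note.txt
)
```

The M in the first column means the change is still staged. By contrast, `git reset --hard HEAD~1` would discard it entirely.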

SSH

Generate SSH Key

For GitHub or server access. The email is only a comment that labels the key.

ssh-keygen -t ed25519 -C "me@email.com"
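
When keys are created from a script (for example a deploy key on a CI box), the interactive prompts get in the way. The sketch below generates a key non-interactively. The name make_key is hypothetical, and the empty passphrase is an assumption that only suits automation. For a personal key, prefer the interactive command above and set a passphrase.

```shell
# Generate an ed25519 key pair without prompts.
#   $1 = path for the private key (the public key lands at "$1.pub")
#   $2 = comment to embed in the public key
# NOTE: -N "" sets an EMPTY passphrase; acceptable for throwaway or CI keys only.
make_key() {
    ssh-keygen -q -t ed25519 -C "$2" -f "$1" -N ""
}

# Typical follow-up steps (user@host is a placeholder):
#   ssh-add ~/.ssh/id_ed25519            # load the key into the running agent
#   cat ~/.ssh/id_ed25519.pub            # copy this line into GitHub
#   ssh-copy-id user@host                # or install it on a server
```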

DOCKER

Kill all containers

Emergency cleanup. Force-removes every container, running or stopped.

docker ps -aq | xargs -r docker rm -f
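
Outside a real emergency, a gentler variant is usually preferable: `docker rm` with no arguments exits with an error, and forcing removal skips graceful shutdown. The function below is a sketch with a made-up name (remove_all_containers). It stops containers first and handles the empty case.

```shell
# Stop and remove every container, doing nothing if there are none.
remove_all_containers() {
    ids=$(docker ps -aq) || return 1
    if [ -z "$ids" ]; then
        echo "no containers to remove"
        return 0
    fi
    # Unquoted $ids is intentional: each container ID becomes its own argument.
    docker stop $ids >/dev/null    # SIGTERM first, SIGKILL after the stop timeout
    docker rm $ids
}
```

Volumes and images are left alone; this touches containers only.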