Backup & Recovery
Back up PostgreSQL with pg_dump and physical methods, then configure replication and point-in-time recovery (PITR).
Logical Backups
pg_dump exports a single database as plain SQL or in custom format. pg_dumpall dumps every database in the cluster plus globals (roles, tablespaces). Custom format (-Fc) is compressed and lets pg_restore restore selectively and in parallel (-j).
Logical backups are portable across major versions, but dumping and restoring very large databases is slow compared to physical methods.
pg_dump -Fc myapp > myapp.dump
createdb myapp_restored
pg_restore -d myapp_restored myapp.dump
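The pieces above (globals from pg_dumpall, a custom-format dump, a parallel restore) can be combined roughly as follows. This is a minimal sketch, not a finished backup script: the output directory, the job count, and the run/DRY_RUN helper are hypothetical, and by default the commands are only printed so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: logical backup of one database plus cluster globals, then a parallel restore.
set -eu

DB="${DB:-myapp}"                 # database name from the example above
OUT_DIR="${OUT_DIR:-./backups}"   # hypothetical output directory
JOBS="${JOBS:-4}"                 # hypothetical number of parallel restore jobs

# Print each command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

logical_backup() {
  run mkdir -p "$OUT_DIR"
  # Roles and tablespaces are not part of a pg_dump archive, so dump them separately.
  run pg_dumpall --globals-only -f "$OUT_DIR/globals.sql"
  run pg_dump -Fc -f "$OUT_DIR/$DB.dump" "$DB"
}

parallel_restore() {
  local target="$1"
  run psql -d postgres -f "$OUT_DIR/globals.sql"
  run createdb "$target"
  # -j restores several tables at once; it needs a custom or directory format archive.
  run pg_restore -j "$JOBS" -d "$target" "$OUT_DIR/$DB.dump"
}
```

Calling logical_backup and then parallel_restore myapp_restored prints the command sequence; setting DRY_RUN=0 executes it. Replaying globals.sql on a cluster that already has the same roles reports "already exists" errors, which are usually harmless.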
Physical Backups and WAL
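A base backup copies the data directory while PostgreSQL keeps running, and archive_mode captures completed WAL segments. Continuous archiving enables point-in-time recovery to any moment after the end of a base backup, as long as the WAL archive from that backup onward is unbroken.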
Tools: pg_basebackup, Barman, pgBackRest, WAL-G. Managed clouds automate this.
- Set wal_level to replica or higher for replication and PITR
- Test restores to a scratch instance regularly
- Encrypt backups at rest and in transit
pg_basebackup -D /backup/base -Fp -Xs -P
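The archiving settings described above might look like the output of this sketch. The function name and the archive directory are hypothetical, and the archive_command uses the simple cp-based pattern shown in the PostgreSQL documentation. A production setup would more likely delegate archiving to pgBackRest, Barman, or WAL-G.

```shell
#!/usr/bin/env bash
# Sketch: print postgresql.conf settings for continuous WAL archiving.
# The archive directory passed in is a hypothetical local or mounted path.
set -eu

archive_settings() {
  local archive_dir="$1"
  # "test ! -f" refuses to overwrite a segment that is already archived.
  cat <<EOF
wal_level = replica
archive_mode = on
archive_command = 'test ! -f ${archive_dir}/%f && cp %p ${archive_dir}/%f'
EOF
}
```

For example, archive_settings /mnt/wal_archive prints three lines that can be appended to postgresql.conf. Changing wal_level or archive_mode requires a server restart.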
Point-in-Time Recovery
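Restore the base backup, make the archived WAL available through restore_command, create recovery.signal in the data directory, set recovery_target_time in postgresql.conf, and start PostgreSQL (version 12 and later; older versions use recovery.conf). The server replays WAL up to the target. Set recovery_target_action = 'promote' to have it promote automatically, because the default is to pause at the target.
PITR recovers from operator errors such as accidentally deleted data. The RPO is bounded by how often WAL is archived: a segment is archived when it fills or when archive_timeout forces a switch.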
- Document promotion steps to avoid split-brain with old primary
- Retain WAL long enough to bridge from last base backup
- Simulate disaster recovery in staging quarterly
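The recovery steps in this section can be sketched as a small helper that prepares an already restored data directory. It assumes PostgreSQL 12 or later and a WAL archive that is a plain directory readable with cp. The function name, paths, and timestamp are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: prepare a restored data directory for point-in-time recovery (PostgreSQL 12+).
# Assumes the base backup is already restored into data_dir and that archived WAL
# sits in a plain directory; both paths are hypothetical.
set -eu

prepare_pitr() {
  local data_dir="$1" archive_dir="$2" target_time="$3"

  # Append the recovery settings to postgresql.conf.
  cat >> "${data_dir}/postgresql.conf" <<EOF
restore_command = 'cp ${archive_dir}/%f %p'
recovery_target_time = '${target_time}'
recovery_target_action = 'promote'
EOF

  # An empty recovery.signal file makes the server start in targeted recovery mode.
  touch "${data_dir}/recovery.signal"
}

# Example (not executed here):
#   prepare_pitr /var/lib/postgresql/restore /mnt/wal_archive '2024-05-01 12:00:00+00'
#   pg_ctl -D /var/lib/postgresql/restore start
```

Choosing recovery_target_action = 'promote' is a design decision: it ends recovery without manual intervention. Leaving the default pause instead allows inspecting the data before committing to the chosen target.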
Replication
Streaming replication ships WAL to standby servers, either synchronously or asynchronously. Hot standbys accept read-only queries. Logical replication replicates selected tables and works across major versions, which makes it useful for upgrades and migrations.
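On the primary, monitor pg_stat_replication: the write_lag, flush_lag, and replay_lag columns report time lag, and pg_wal_lsn_diff() gives the byte lag between two LSNs. On a standby, compare now() with pg_last_xact_replay_timestamp().
SELECT client_addr, state, sent_lsn, replay_lsn, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes, replay_lag FROM pg_stat_replication;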
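Byte lag is the difference between two LSNs, which the server can compute with the built-in pg_wal_lsn_diff() function. The sketch below does the same arithmetic in the shell, which may help when LSNs come from logs or monitoring output rather than a live query. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compute replication lag in bytes from two LSNs, mirroring what
# pg_wal_lsn_diff(a, b) returns on the server.
set -eu

# An LSN prints as two hex numbers "HI/LO"; the byte position is HI * 2^32 + LO.
lsn_to_bytes() {
  local hi="${1%%/*}" lo="${1##*/}"
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes the standby still has to replay: sent position minus replayed position.
lag_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lag_bytes 1/0 0/FFFFFF00   # prints 256
```

The first argument corresponds to sent_lsn and the second to replay_lsn from the query above.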
High Availability
Patroni, repmgr, and managed cloud HA automate failover. Connection poolers and virtual IPs route traffic to the current primary. Synchronous replication trades commit latency for zero data loss when a single node fails.
In self-managed HA, preventing split-brain requires fencing or a consensus store such as etcd.
- Use your HA tool's switchover command (for example patronictl switchover or repmgr standby switchover) for planned maintenance
- Cascading replicas offload backup and reporting load
- Document RTO/RPO and validate against drill results
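The synchronous option mentioned above is configured on the primary. A minimal sketch of the relevant settings follows. The function name and standby names are hypothetical, and each name has to match the application_name a standby uses in its primary_conninfo.

```shell
#!/usr/bin/env bash
# Sketch: print primary-side settings for synchronous replication.
# Standby names are hypothetical; they must match each standby's application_name.
set -eu

sync_settings() {
  local required="$1"; shift
  local names="" n
  # Join the remaining arguments into a comma-separated list.
  for n in "$@"; do
    names="${names:+$names, }$n"
  done
  cat <<EOF
synchronous_commit = on
synchronous_standby_names = 'FIRST ${required} (${names})'
EOF
}

sync_settings 1 standby_a standby_b
```

Listing more than one candidate is a deliberate choice: with a single synchronous standby, its outage blocks commits on the primary, whereas with two listed the second can take over the synchronous role.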