Backups and Recovery¶
This document describes how to back up and recover Papyra persistence safely in production environments.
Papyra persistence is append-only, deterministic, and replayable. This makes backups simpler than traditional databases, but only if you respect the correct operational rules described here.
What Needs to Be Backed Up¶
Papyra persistence backends store system history, not live state.
Depending on your backend, this includes:
JSON / Rotating File Backends¶
- Event logs (
*.ndjson) - Audit logs
- Dead-letter logs
- Rotation segments (if enabled)
Redis Backend¶
- Redis Streams keys
- Consumer group metadata
- Pending entries (PEL)
Warning
⚠️ You do not back up in-memory state. Actors rebuild state by replaying persisted history.
Backup Principles¶
1. Backups Are Snapshot-Based¶
Backups should capture a point-in-time snapshot of persistence files or Redis data.
Do not:
- Copy files while compaction is running
- Snapshot mid-recovery
- Truncate logs manually
2. Persistence Is Append-Only¶
This guarantees:
- Partial backups are still readable
- Recovery can skip corrupted tails
- Older backups remain valid indefinitely
File-Based Backups (JSON / Rotation)¶
Recommended Strategy¶
-
Pause Writes (Optional but Best)
- Stop the ActorSystem
- Or route traffic away
-
Copy Persistence Files
cp -r persistence/ backups/papyra-$(date +%F)/ -
Resume System
Live Backups (Advanced)¶
If stopping is not possible:
- Copy files anyway
- Rely on recovery tools to clean partial writes
Papyra's scanner is designed for this.
Redis Backups¶
Option 1: Redis RDB Snapshots (Recommended)¶
Use Redis-native snapshotting:
SAVEBGSAVE- Cloud-managed Redis snapshots
This preserves:
- Streams
- Consumer groups
- Pending entries
Option 2: AOF (Advanced)¶
If using AOF:
- Ensure
appendfsync everysec - Verify replay time during recovery
Verifying Backups¶
Always validate backups before trusting them.
Use the CLI¶
papyra persistence scan --path backups/papyra-2026-01-01/events.ndjson
Expected results:
- Clean scan → backup is valid
- Anomalies → backup still usable after recovery
Recovery Scenarios¶
Scenario 1: Clean Restore¶
- Stop system
- Restore files / Redis snapshot
- Start system normally
papyra persistence startup-check
Scenario 2: Corrupted Backup¶
Run recovery manually:
papyra persistence recover --path events.ndjson
Or quarantine:
papyra persistence recover \
--mode quarantine \
--quarantine-dir ./quarantine \
--path events.ndjson
Scenario 3: Redis Partial Failure¶
- Restore Redis snapshot
- Recreate consumer groups if needed
- Pending messages may reappear — this is expected
Disaster Recovery Guarantees¶
Papyra guarantees:
- ✔ Deterministic replay
- ✔ Safe truncation of corrupted tails
- ✔ Idempotent recovery
- ✔ No hidden state
What it does not guarantee:
- Zero message duplication (at-least-once semantics apply)
- Automatic recovery without operator intent
Operational Checklist¶
Before production:
- [ ] Automated daily backups
- [ ] Weekly restore test
- [ ]
persistence scanin CI or cron - [ ] Documented restore procedure
- [ ] Quarantine directory configured
Key Takeaways¶
- Persistence is history, not state
- Backups are cheap and robust
- Recovery is explicit and observable
- Operator intent is always required
Papyra favors correctness and transparency over magic.