Advanced retention strategies¶
Retention is deceptively hard.
Most systems treat “retention” as “delete old data.” In Papyra, retention is intentionally designed as a read-time filter first, with an explicit, opt-in physical compaction step for storage that can shrink.
This page explains how to build safe, production-grade retention strategies that match how Papyra persistence actually behaves.
What retention means in Papyra¶
Papyra persistence stores system facts (events, audits, dead letters) as an append-only log. Retention defines what should be considered in scope when reading these facts.
Logical retention¶
Logical retention is enforced when you read.
list_events(...),list_audits(...), andlist_dead_letters(...)return a view of the log.- If a
RetentionPolicyis configured, older/out-of-scope records are filtered out. - The underlying storage may still contain the dropped records.
This default is deliberate:
- It is safe (no automatic destructive behavior).
- It is predictable (reads always apply the same policy).
- It keeps the write path fast.
Physical compaction¶
Physical compaction is the explicit operation that makes retention real on disk / in storage.
- On file backends, compaction rewrites files and atomically replaces the original.
- On Redis Streams, compaction can use
XTRIMwhen a record-count limit exists. - On in-memory backends, compaction is a no-op.
You must call compaction explicitly:
ActorSystem.compact()- CLI:
papyra persistence compact
Papyra will never run compaction automatically at startup.
RetentionPolicy is immutable¶
RetentionPolicy is a frozen dataclass.
That means you cannot do this:
rp = RetentionPolicy()
# ❌ raises FrozenInstanceError
rp.max_records = 1000
Instead, you create a new policy:
rp = RetentionPolicy(max_records=1000)
This matters in production because:
- your runtime configuration is stable and hashable
- you avoid accidental mutation that changes retention semantics mid-flight
The three retention dimensions¶
Papyra retention can be expressed in three independent dimensions:
- Count-based:
max_records - Time-based:
max_age_seconds - Size-based:
max_total_bytes
You can use any single constraint or combine them.
Count-based retention¶
Count-based retention is the most predictable and the most commonly used.
Use it when you want:
- bounded storage growth
- stable read performance
- an easy mental model (“keep the last N records”)
Example:
from papyra.persistence.json import JsonFilePersistence
from papyra.persistence.backends.retention import RetentionPolicy
persistence = JsonFilePersistence(
path="./events.ndjson",
retention_policy=RetentionPolicy(max_records=50_000),
)
Operational note:
- On file backends,
compact()is what reclaims disk. - On Redis Streams, compaction can leverage
XTRIM MAXLEN.
Time-based retention¶
Time-based retention is useful when your operational questions are time-scoped:
- "show me the last 7 days"
- "keep audits for 30 days"
Example:
from papyra.persistence.backends.rotating import RotatingFilePersistence
from papyra.persistence.backends.retention import RetentionPolicy
# Keep ~7 days of data
persistence = RotatingFilePersistence(
path="./rot.log",
max_bytes=10_000_000,
max_files=10,
retention_policy=RetentionPolicy(max_age_seconds=7 * 24 * 60 * 60),
)
Caveat:
- Time-based retention relies on timestamps in the stored records.
- If you ingest historical data (older timestamps), those records may be dropped immediately at read time.
Size-based retention¶
Size-based retention is the hardest to get perfectly right, because the “true” size depends on backend representation.
Use it when you have:
- strict disk budgets
- very high throughput systems
- extremely large payloads
In Papyra, size-based retention is best treated as:
- a safety net
- a secondary constraint
You should still cap by count or age to keep the system predictable.
Strategy patterns¶
This section gives strategies you can implement today, with tradeoffs.
Strategy 1: Safety-first (read retention only)¶
Recommended for early production.
- Configure retention policy.
- Do not schedule compaction initially.
- Use
doctorand startup checks to ensure integrity.
Pros:
- zero risk of accidental data loss
- simplest operations
Cons:
- disk/stream size may keep growing until you compact
Strategy 2: Count retention + periodic compaction¶
Recommended for most production deployments.
max_recordsin policy- scheduled compaction (daily / weekly depending on volume)
Why it works:
- predictable volume
- predictable read cost
- bounded storage
Example operational schedule:
- small / medium systems: weekly compaction
- high throughput: daily compaction
Strategy 3: Hybrid (age + count)¶
Use this when you want both:
- a time window (e.g. last 30 days)
- a hard upper bound if load spikes (e.g. 2 million records)
RetentionPolicy(
max_age_seconds=30 * 24 * 60 * 60,
max_records=2_000_000,
)
This is often the best “production default.”
Strategy 4: Cold storage + short hot retention¶
If you want long-term history without keeping it in Papyra's primary persistence:
- keep a short retention window in Papyra (hot data)
- ship the stream elsewhere (cold data)
Typical pattern:
- Redis Streams persistence
- external consumer group reads events
- consumer exports into your analytics / data lake
Papyra supports this by exposing consumer group primitives on the Redis backend.
Backend-specific guidance¶
Retention policy is universal, but compaction is backend-specific.
JSON file backend¶
- Writes: append-only NDJSON
- Logical retention: applied on reads
- Physical compaction: rewrites the file and uses atomic replace
Recommended:
- use count-based retention
- compact periodically
- run
scanat startup in strict environments
Rotating file backend¶
- Writes: append-only with rotation
- Logical retention: applied on reads across rotated files
- Physical compaction: can rewrite rotation set to remove expired records
Recommended:
- pick rotation parameters (
max_bytes,max_files) for worst-case throughput - use retention + compaction only if disk pressure matters
Redis Streams backend¶
- Writes:
XADDinto per-kind streams - Logical retention: applied on reads
- Physical compaction: can leverage
XTRIM MAXLENwhenmax_recordsis configured
Recommended:
max_recordsas the primary limiter- consider approximate trimming for performance (
~)
Startup guarantees vs retention¶
Startup scanning/recovery is about integrity, not retention.
scan()detects structural anomaliesrecover()repairs or quarantines- startup modes decide whether to ignore, fail, or recover
Retention does not fix corruption.
If you want a robust production posture:
- run
doctorin CI / deployment pipelines - use
persistence startup-checkfor pre-flight checks - configure ActorSystem startup mode to FAIL_ON_ANOMALY or RECOVER
Operational playbooks¶
When should you compact?¶
Compact when:
- disk usage keeps growing and must be reclaimed
- you have predictable low-traffic windows
- you want to physically enforce retention for compliance
Don't compact when:
- you don't need disk reclamation yet
- your storage backend already enforces a size bound (e.g., Redis trimming)
- you can't tolerate any rewrite pressure during peak time
How to verify retention is doing what you think¶
- Confirm policy in your runtime configuration.
- Use CLI:
papyra persistence inspect
- Observe read counts (sampled): events / audits / dead letters.
- If using metrics:
papyra metrics persistence
- Run compaction explicitly and compare before/after.
How to debug “missing data”¶
Common causes:
- retention policy filtering records at read time
- timestamps older than your
max_age_seconds max_recordstoo small for your query horizon
Checklist:
- print retention settings (
persistence inspect) - temporarily run with
RetentionPolicy()(no limits) - re-run the same read and compare
Real-world examples¶
Example: local development¶
Goal: keep things simple.
- Use
InMemoryPersistence()(default) - No retention needed
Example: a single-node production service¶
Goal: stable disk usage and readable logs.
JsonFilePersistenceRetentionPolicy(max_records=200_000, max_age_seconds=14 days)- schedule
compact()weekly
Example: high throughput service with external analytics¶
Goal: durable write path + external consumption.
RedisStreamsPersistenceRetentionPolicy(max_records=5_000_000)- consumer group reads
eventsand ships them to analytics - periodic
XTRIMviacompact()
Summary¶
- Retention is logical by default.
- Compaction is explicit and makes retention physical.
- Prefer count-based retention for predictability.
- Combine age + count for a strong production default.
- Use Redis consumer groups when you need external tooling.