fix: reduce steady-state memory footprint on fully synced validators #653

Draft

skylar-simoncelli wants to merge 2 commits into main from skylar/fix-memory-growth-steady-state

Conversation

skylar-simoncelli (Contributor) commented Feb 11, 2026

Overview

This PR tunes the node for a maximum memory allocation of 8 GiB, sizing all caching parameters around that budget (see the config sketch after the list below):

  • trie_cache_size: 1 GiB → 128 MiB — The trie cache fills gradually as state is accessed and never shrinks. At 1 GiB it was the single largest memory consumer. 128 MiB is sufficient for validators with 6-second block times and frees ~900 MiB of the 8 GiB budget.
  • storage_cache_size: 1M nodes (~80 MiB) → 512K nodes (~40 MiB) — Halves the midnight-ledger arena LRU cache. Synced validators rarely need more than 512K nodes hot; the tradeoff is slightly more disk reads during sync. Saves ~40 MiB, which matters when every MiB counts toward the 8 GiB budget.
  • max_runtime_instances: 8 → 2 — Each pooled WASM instance reserves 128 MiB of heap that is allocated on demand and never released, so the default of 8 can hold up to 1 GiB. Validators only need one instance for block import and one for authoring. Saves ~768 MiB, which is critical for fitting within 8 GiB.
  • TX_VALIDATION_CACHE_MAX_CAPACITY: 1000 (no TTL) → 200 entries + 5-minute time-to-idle — Each VerifiedTransaction entry holds ZK proof data (50–200 KiB). At 1000 entries with no TTL, stale entries for old state hashes accumulated up to ~200 MiB. Reducing to 200 entries (plenty for low-traffic validator networks) and adding idle eviction lets quiet periods actively reclaim memory. Keeps the worst case under ~40 MiB.
  • pool-limit: 8192 → 1024 — The default Substrate tx pool of 8192 is sized for public networks. Our 6-validator testnets generate minimal transactions; 1024 is more than sufficient and saves ~30 MiB of pooled transaction overhead.
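
The config side of these changes could look roughly like the sketch below. The key names and units are assumptions based on the descriptions above and the existing res/cfg/default.toml; treat this as an illustration rather than a verbatim diff:

```toml
# Illustrative sketch, not a verbatim diff of res/cfg/default.toml;
# key names and units are assumptions based on the PR description.
trie_cache_size = 134217728   # 128 MiB, assuming the key is expressed in bytes (was 1 GiB)
storage_cache_size = 524288   # 512K nodes, ~40 MiB (was 1M nodes / ~80 MiB)
max_runtime_instances = 2     # was the Substrate default of 8
pool_limit = 1024             # tx pool size, matching --pool-limit 1024 (was 8192)
```

The 200-entry cap and 5-minute time-to-idle for the TX validation cache appear to live in code (ledger/src/versions/common/mod.rs) rather than config; see the moka sketch further down.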

These values target an estimated ~1.2 GiB baseline with ~6.8 GiB headroom for allocator fragmentation and ParityDB mmap growth — enough for multiple days of uptime before memory pressure.

Background

Fully synced validators exhibit continuous memory growth, approaching pod limits (12 Gi) within days. Heaptrack profiling confirmed the midnight_storage arena as the dominant allocator (94% of heap never freed), which was previously addressed by bounding storage_cache_size. However, three additional memory sources continue growing on synced nodes:

| Change | Before | After | Savings |
| --- | --- | --- | --- |
| trie_cache_size | 1 GiB | 256 MiB | ~768 MiB |
| max_runtime_instances | 8 (default) | 2 | ~768 MiB |
| TX validation cache TTI | none (count-only) | 5 min idle eviction | variable |

Combined: ~1.5 GiB reduction in steady-state memory per validator.

1. Substrate trie cache: 1 GiB → 256 MiB (res/cfg/default.toml)

The trie cache fills gradually as state is accessed and never shrinks, so at 1 GiB it eventually pinned ~1 GiB on every node. 256 MiB is sufficient for validator workloads with 6-second block times.

2. WASM runtime instances: 8 → 2 (res/cfg/default.toml)

Each pooled WASM instance reserves 128 MiB of heap that is allocated on demand and never released, so the Substrate default of 8 instances can hold up to 1 GiB. Validators only need one concurrent instance for block import and one for authoring.

3. Moka cache TTI: add 5-minute time_to_idle (ledger/src/versions/common/mod.rs)

The STRICT_TX_VALIDATION_CACHE stores VerifiedTransaction objects (50–200 KiB each containing ZK proof data) keyed by (state_hash, tx_hash). Since state_hash changes every block, the same transaction gets a new cache key each block — old entries for stale state hashes accumulated until the 1000-entry cap forced LRU eviction. On low-traffic networks, entries persisted indefinitely. Adding time_to_idle(5 min) evicts stale entries during quiet periods.
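
A minimal sketch of this eviction policy, assuming the caches are built with moka's sync Cache builder; the key and value types below are placeholders for (state_hash, tx_hash) and VerifiedTransaction, not the project's actual types:

```rust
use std::time::Duration;

use moka::sync::Cache;

// Capacity mirrors the reduced TX_VALIDATION_CACHE_MAX_CAPACITY described above.
const TX_VALIDATION_CACHE_MAX_CAPACITY: u64 = 200;

/// Placeholder types: ([u8; 32], [u8; 32]) stands in for (state_hash, tx_hash)
/// and Vec<u8> for the cached VerifiedTransaction data.
fn build_tx_validation_cache() -> Cache<([u8; 32], [u8; 32]), Vec<u8>> {
    Cache::builder()
        .max_capacity(TX_VALIDATION_CACHE_MAX_CAPACITY)
        // Evict entries that have been neither read nor written for 5 minutes,
        // so stale state_hash keys are reclaimed during quiet periods instead of
        // lingering until the count-based cap forces LRU eviction.
        .time_to_idle(Duration::from_secs(5 * 60))
        .build()
}
```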

Current production data (qanet, 12 Gi limit)

| Validator | Memory | Uptime | Growth rate |
| --- | --- | --- | --- |
| V1 | 7,399 Mi (60%) | 2d4h | ~3.5 Gi/day |
| V2 | 7,378 Mi (60%) | 2d4h | ~3.5 Gi/day |
| V4 | 7,418 Mi (61%) | 2d4h | ~3.5 Gi/day |

At this rate, pods would OOM within ~4 days of uptime: with ~7.4 Gi used after 2d4h, the remaining ~4.6 Gi of headroom against the 12 Gi limit is consumed in roughly 1.3 more days.

TODO before merging

  • CI build + test pass
  • Verify --max-runtime-instances 2 doesn't cause runtime instance contention under load

Submission Checklist

  • This is backward-compatible (config/CLI changes only; no runtime migration needed)
  • I have self-reviewed the diff
  • A change file has been added (changes/changed/reduce-steady-state-memory-footprint.md)
  • No version bump needed (config-only change)
  • AGENTS.md does not need updating

Testing Evidence

Deploy to qanet first and monitor memory over 48h. Expected: memory stabilizes around 3-4 Gi instead of growing past 7 Gi.

  • Additional tests are not needed (no logic changes, only cache configuration)

Fork Strategy

  • N/A

Links

Related commit: 69fe806 (fix-unbounded-ledger-storage-cache)

First commit message:

Three changes to cut ~1.5 GiB of steady-state memory per node:

- Reduce trie_cache_size from 1 GiB to 256 MiB. The trie cache fills
  gradually and never shrinks; 256 MiB is sufficient for 6s block times.

- Set --max-runtime-instances 2 (default 8). Each pooled WASM instance
  reserves 128 MiB of heap that is never released. Validators only need
  1 for import and 1 for authoring.

- Add 5-minute time_to_idle to moka transaction validation caches.
  VerifiedTransaction objects (50-200 KiB each, containing ZK proof data)
  were only evicted by count. On low-traffic networks stale entries for
  old state hashes persisted indefinitely, contributing to memory growth.
github-actions bot commented Feb 11, 2026

KICS version: v2.1.16

| Category | Results |
| --- | --- |
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 96 |
| LOW | 12 |
| INFO | 83 |
| TRACE | 0 |
| TOTAL | 191 |

| Metric | Values |
| --- | --- |
| Files scanned | 31 |
| Files parsed | 31 |
| Files failed to scan | 0 |
| Total executed queries | 73 |
| Queries failed to execute | 0 |
| Execution time | 9 |

skylar-simoncelli marked this pull request as draft February 11, 2026 17:07
Second commit message:

- trie_cache_size: 256 MiB → 128 MiB
- storage_cache_size: 1M → 512K nodes (~80 MiB → ~40 MiB)
- TX validation cache: 1000 → 200 entries
- Add --pool-limit 1024 (default was 8192)

Brings estimated baseline to ~1.2 GiB with ~6.8 GiB headroom for
allocator fragmentation and ParityDB mmap growth.