Merge pull request #427 from input-output-hk/shd/comments-on-epoch-timing

sandtreader · web-flow · commit 30d1e2087f5d · 2025-12-02T16:35:49.000Z
Added comments about epoch timing
diff --git a/docs/epoch-timing.md b/docs/epoch-timing.md
@@ -0,0 +1,81 @@
+# Rewards and pool timing
+
+The following notes are the result of discussions with the ledger team
+and of Haskell code analysis.
+
+## Anikett Deshpande's comment on the epoch boundary transition:
+
+In summary, at the epoch boundary:
+
+* `TICK` is called, which calls
+* `NEWEPOCH`, which forces the existing rewards pulser to complete, distributes the rewards, and then calls
+* `EPOCH`, which calls
+* `SNAP` to rotate the snapshots: now new -> mark, mark -> set, and set -> go
+* `SNAP` returns to
+* `EPOCH`, which returns to
+* `NEWEPOCH`, which returns to
+* `TICK`, which calls
+* `RUPD`, which in turn sets off the new rewards pulser using the newly rotated go snapshot (after stability window ~1.5 days), and returns to
+* `TICK`
+
+In short:
+
+* `TICK` calls `NEWEPOCH`
+* `NEWEPOCH` forces the pulser and distributes rewards from the go snapshot (which we are about to deallocate); that snapshot was rotated and marked as go at the previous boundary and was originally snapshotted as mark two epoch boundaries before that.
+* `SNAP` rotates the snapshots and takes a new one for mark
+* `RUPD` sets off the new pulser with the newly rotated go snapshot, which was marked as set in the previous epoch and was used for leader schedule processing.
+
+I hope this answers the question much better than before. :blush:
+
+## Rewards distribution timing
+
+As we can conclude from that and from the Haskell node, the sequence of events is the following:
+
+* Rewards calculated during epoch E are based on Go (E-3) and are applied to the ledger state:
+
+   ```
+   -- "NEWEPOCH" rule:
+   es' <- ... updateRewards es eNo ru'
+   -- updateRewards:
+   let !(!es', filtered) = applyRUpdFiltered ru' es
+   -- applyRUpdFiltered:
+   ls' = ls & lsUTxOStateL . utxosFeesL %~ (`addDeltaCoin` deltaF ru)
+            & lsCertStateL . certDStateL . dsUnifiedL .~ (rewards dState UM.∪+ registeredAggregated)
+   ```
+
+* The current ledger state is then converted into the new Mark.
+
+   ```
+      es' <- case ru of
+        SJust (Complete ru') -> updateRewards es eNo ru'
+      es'' <- trans @(EraRule "MIR" era) $ TRC ((), es', ())
+      es''' <- trans @(EraRule "EPOCH" era) $ TRC ((), es'', eNo)
+      let adaPots = totalAdaPotsES es'''
+      ...
+      let pd' = ssStakeMarkPoolDistr (esSnapshots es)
+    ```
+
+* However, the new Mark pool distribution field does not include these rewards.
+* Rewards for epoch E first appear in a snapshot in epoch E+3 (as mark in EpochState).
+* Rewards for epoch E are first used for leader scheduling in epoch E+4 (when that snapshot becomes set).
+
+Conclusion: rewards earned by block validation during epoch 209 (TODO: double-check,
+add code) and evaluated in epoch 210 (based on the epoch 207 stake distribution: 'go' for 210)
+appear in a snapshot in epoch 211 (as mark) and are first used in epoch 212 for scheduling.
+
+## Pool retirement timing
+
+Each epoch boundary runs a set of rules concerning pool retirement. For the transition from epoch E-1 to E:
+
+* Rule "EPOCH" is called for the (E-1) => E transition; it calls "SNAP" and then "POOLREAP".
+* Rule "SNAP" rotates the snapshots (so we have the Set snapshot for E-1).
+* Rule "POOLREAP" removes all pools retiring in epoch E (so all pools
+retiring in E are gone from the start of the epoch).
+
+So the next iteration of the "EPOCH" rule (E=>E+1 transition) makes
+the Set snapshot of epoch E (the first snapshot without those pools).
+
+One more iteration (E+1=>E+2) makes the Mark snapshot without those pools.
+
+Conclusion: if a pool retires in epoch E, it disappears from the current Mark
+(VRF active stake) at the beginning of epoch E+2.
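
Note on docs/epoch-timing.md: the `SNAP` rotation described there (new -> mark, mark -> set, set -> go) can be sketched as a small Rust model. This is only an illustration under stated assumptions: `Snapshot`, `Snapshots`, and `rotate` are hypothetical names, not types from this repository or from the Haskell node, and each snapshot is labelled by the epoch that was ending when it was taken.

```rust
/// Hypothetical stand-in for a stake snapshot. It only records which epoch
/// was ending when the snapshot was taken (an assumption made for labelling).
#[derive(Clone, Debug, Default, PartialEq)]
struct Snapshot {
    taken_at_end_of: Option<u64>,
}

/// Hypothetical container for the three snapshots named in the notes.
#[derive(Clone, Debug, Default)]
struct Snapshots {
    mark: Snapshot,
    set: Snapshot,
    go: Snapshot,
}

impl Snapshots {
    /// Models one `SNAP` step at the boundary that closes `ending_epoch`:
    /// set -> go, mark -> set, and a freshly taken snapshot becomes mark.
    fn rotate(&mut self, ending_epoch: u64) {
        self.go = std::mem::take(&mut self.set);
        self.set = std::mem::replace(
            &mut self.mark,
            Snapshot { taken_at_end_of: Some(ending_epoch) },
        );
    }
}

fn main() {
    let mut snapshots = Snapshots::default();
    // Boundaries 207 => 208, 208 => 209, 209 => 210.
    for ending_epoch in 207..=209 {
        snapshots.rotate(ending_epoch);
    }
    // State as seen during epoch 210 in this simplified model.
    println!("mark: {:?}", snapshots.mark.taken_at_end_of);
    println!("set:  {:?}", snapshots.set.taken_at_end_of);
    println!("go:   {:?}", snapshots.go.taken_at_end_of);
}
```

In this model the go snapshot seen during epoch 210 is the one taken at the end of epoch 207, which matches the "'go' for 210" remark in the notes. The model does not cover rewards, the pulser, or `POOLREAP`.
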
diff --git a/docs/ledger-verification.md b/docs/ledger-verification.md
@@ -14,7 +14,7 @@ But the reaction to the errors can be different.
 We have basically two variants of behaviour if the error in application/verification occurred:
 
 1. Print an error message. This happens when an incorrect block apply happened, which probably
-   broke the state. We can only hope that further blockchain rollbacks will correct it.
+   broke the state. There is no automatic response in this case.
 
 2. Send a ValidationStatus::NoGo message via corresponding channel (each module has a validation
    outcome channel). These channels are listened by Consensus module, and if at least one of them
@@ -28,8 +28,8 @@ We have basically two variants of behaviour if the error in application/verifica
    * Mithril/other trusted source --- blocks, already accepted by the blockchain.
      If everything is ok, then the block is applied, internal structures updated, and next block
      is processed.
-     If something is not correct, then the whole blockchain is broken, and outside intervention
-     is required.
+     If something is not correct, then either the whole blockchain is broken, or our code is
+     incorrect (inconsistent with bugs in the Haskell node); outside intervention is required.
 
    * Mempool or consensus blocks --- proposals for the blockchain. If the block/transaction is not 
      successfully verified, then it could be refused.
@@ -60,17 +60,20 @@ We have basically two variants of behaviour if the error in application/verifica
 
 ## Block number and rollbacks
 
-1. `BlockInfo` keeps track of the current block number. Blocks are numbered sequentially. So if the
+A rollback happens in one of the following situations:
+
+* `BlockInfo` keeps track of the current block number. Blocks are numbered sequentially. So if the
    number equal to the previous one (or smaller than it), then Rollback takes place (all info from
    blocks with this or greater number should be deleted, and new attempt to apply block is done).
    In another words, applying of block N may be possible only if module state is actual for block N-1.
 
-2. So, if the block applied unsuccessfully (and internal structures are broken), the situation 
-   can be corrected by rolling back to last correct block and applying different (correct) block 
-   after it.
+* An explicit `BlockStatus::RolledBack` message, which removes the tip of the blockchain and specifies
+   the last good state.
 
-   However, after unsucessful application and before successful rollback the state of the node is
-   incorrect.
+If the block(s) are applied unsuccessfully (and internal structures are broken), the situation
+can be corrected by rolling back to the last correct block and applying a different (correct) block
+after it. However, after an unsuccessful application and before a successful rollback the state of the
+node is incorrect.
 
 ## Mulit-module ledger specifics
 
@@ -84,4 +87,4 @@ Ledger is split into several modules, so it gives additional challenges to the v
    but also block data (hash, etc), and skip all replies that do not correspond to current verification.
 
    Instead, it should wait either for all 'Go' (from all modules), or for at least one 'NoGo',
-   and do not wait for further messages.
+   and should not wait for further messages.
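
Note on docs/ledger-verification.md: the waiting rule in the last hunk (stop at the first 'NoGo', or once every module has answered 'Go', and skip replies for other blocks) can be sketched as follows. This is a minimal sketch under assumptions: only the `ValidationStatus::NoGo` variant name comes from the document; `Reply`, `Verdict`, `collect_verdict`, the module names, and the use of a `u64` as block hash are hypothetical.

```rust
use std::collections::HashSet;

/// `NoGo` is named in the document; `Go` is assumed to be its counterpart.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ValidationStatus {
    Go,
    NoGo,
}

/// Hypothetical reply shape: which module answered, and for which block.
struct Reply {
    module: &'static str,
    block_hash: u64,
    status: ValidationStatus,
}

/// Hypothetical outcome of collecting replies for one block.
#[derive(Debug, PartialEq)]
enum Verdict {
    Go,
    NoGo,
    Pending,
}

/// Consumes replies until the verdict for `current_block` is known.
fn collect_verdict(
    current_block: u64,
    modules: &[&'static str],
    replies: impl IntoIterator<Item = Reply>,
) -> Verdict {
    let mut approved: HashSet<&str> = HashSet::new();
    for reply in replies {
        // Skip replies that belong to another verification round.
        if reply.block_hash != current_block || !modules.contains(&reply.module) {
            continue;
        }
        match reply.status {
            // One 'NoGo' is enough: do not wait for further messages.
            ValidationStatus::NoGo => return Verdict::NoGo,
            ValidationStatus::Go => {
                approved.insert(reply.module);
                if approved.len() == modules.len() {
                    return Verdict::Go;
                }
            }
        }
    }
    Verdict::Pending
}

fn main() {
    let modules = ["utxo", "stake"];
    let replies = vec![
        Reply { module: "utxo", block_hash: 1, status: ValidationStatus::NoGo }, // stale block
        Reply { module: "utxo", block_hash: 2, status: ValidationStatus::Go },
        Reply { module: "stake", block_hash: 2, status: ValidationStatus::Go },
    ];
    println!("{:?}", collect_verdict(2, &modules, replies));
}
```

Matching on the block hash as well as the module is what lets a late 'NoGo' for an earlier block be ignored instead of failing the current one.
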