Fix handling shards dbs purge checkpoints in mem3_rep #5832

nickva · 2025-12-23T06:51:24Z

The previous PR [1] failed to account for the shards db itself. The shards db (`_dbs`) is managed differently than regular shard copies. Its `mem3:shards(Dbs)` result is a single-element shard list with a `#shard{}` having `node = node()` and `range = [0, ff..]`. Copies of `_dbs` are replicated in a ring across all nodes, so we expect to find a purge checkpoint only for pushing changes to the "next" node in the ring.

[1] #5827
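To make the ring idea concrete, here is a minimal Python sketch (not the Erlang code in `mem3_rep`) of how a node might pick the single "next" node it expects a push purge checkpoint for. The function names `next_node` and `expected_push_targets` are hypothetical, and ordering the ring by sorted node name is an assumption made for illustration.

```python
def next_node(nodes, current):
    """Return the node after `current` in a ring built from the node list.

    Assumption: the ring order is the sorted order of node names.
    """
    ring = sorted(set(nodes))
    i = ring.index(current)
    # Wrap around so the last node's "next" is the first node.
    return ring[(i + 1) % len(ring)]


def expected_push_targets(nodes, current):
    """Nodes for which `current` should hold a push purge checkpoint."""
    if len(set(nodes)) < 2:
        # A single-node cluster has nobody to push to.
        return []
    return [next_node(nodes, current)]
```

With nodes `a`, `b`, `c`, node `c` wraps around and expects a checkpoint for `a` only, rather than one per shard copy as with regular databases.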

jiahuili430

+1

nickva · 2025-12-24T07:01:32Z

There is a continuation PR after this one which adds handling for "previous" purge checkpoints, for when the previous node in the ring replicates to us. When it pulls purges from "us", it also creates its own checkpoint going from the current node to the previous node.

I added the previous node consideration in a new PR #5833

In the previous PR [1] we added special handling for the shards db since it has a custom ring replication topology. In PR [1] we considered only the checkpoint for pushes from the current node to the "next" node in the ring. However, we should also consider the checkpoints created by the "previous" node when it pulls purges from the current node, and that is what we fix in this PR.

As a reminder, a replication job from node A to node B will:

1. pull purges from B to A (checkpoint on B with a B->A purge checkpoint doc)
2. push purges from A to B (checkpoint on A with an A->B purge checkpoint doc)
3. push doc updates from A to B (checkpoint on A with an A->B sync checkpoint doc, and on B also with an A->B sync checkpoint doc)

[1] #5832
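The three steps above can be modelled in a few lines to see why a node ends up holding purge checkpoints for both of its ring neighbours. This is an illustrative Python sketch, not the `mem3_rep` implementation: the `Checkpoint` tuple, the function names, and the sorted-name ring order are all assumptions.

```python
from collections import namedtuple

# Hypothetical record: which node stores the doc, its kind, and its direction.
Checkpoint = namedtuple("Checkpoint", ["stored_on", "kind", "source", "target"])


def job_checkpoints(a, b):
    """Checkpoints left behind by one replication job from node a to node b."""
    return [
        Checkpoint(b, "purge", b, a),  # step 1: pull purges from b to a
        Checkpoint(a, "purge", a, b),  # step 2: push purges from a to b
        Checkpoint(a, "sync", a, b),   # step 3: sync checkpoint on a
        Checkpoint(b, "sync", a, b),   # step 3: sync checkpoint on b
    ]


def purge_checkpoints_on(nodes, current):
    """Purge checkpoints stored on `current` when each node replicates
    to its "next" node (assumption: ring ordered by sorted node name)."""
    ring = sorted(set(nodes))
    found = []
    for i, a in enumerate(ring):
        b = ring[(i + 1) % len(ring)]
        if a == b:
            continue  # single-node ring: no jobs run
        found.extend(
            c for c in job_checkpoints(a, b)
            if c.kind == "purge" and c.stored_on == current
        )
    return found
```

For a three-node ring `a`, `b`, `c`, calling `purge_checkpoints_on(["a", "b", "c"], "b")` yields one checkpoint targeting `c` (the push to the "next" node) and one targeting `a` (created when the "previous" node pulls purges from `b`), which is the second case this PR description says must be considered.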

jiahuili430 approved these changes Dec 23, 2025

nickva merged commit 2c92efd into main Dec 23, 2025
49 checks passed

nickva deleted the handle-dbs-specially-for-purge-checkpoints branch December 23, 2025 19:11

nickva mentioned this pull request Dec 24, 2025

Consider previous node replications for _dbs purge checkpoints #5833

Merged
