
Conversation


@jrray commented Jul 2, 2024

$ cargo test --bin clean-loom

This shows three variations on how `spfs clean` can race with writers to a repo.

The first one tries to simulate the existing behavior, and the test fails as expected.

The other two demonstrate how a write-ahead log might prevent races. The first of these requires the write-ahead log to be locked while reading in all the tags, which in turn would block writes for a significant amount of time.

The second proposes how the write-ahead log could be pruned over time with a Kafka-like tombstone concept, and only requires the log to be locked while actually deleting files.

In either case, the phase where the write-ahead log is locked could exit early so that writers are not blocked for an extended amount of time. The clean could then make incremental progress, deleting some number of objects in each pass.
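As a rough illustration of the kind of race the first variation models (a simplified sketch, not the actual test in this PR; the repo here is just two in-memory sets), a loom model can let a writer and the cleaner interleave and then assert that no tag is left pointing at a deleted object:

```rust
use loom::sync::{Arc, Mutex};
use loom::thread;
use std::collections::HashSet;

struct Repo {
    // Simplified: a tag is represented by the digest it points at.
    objects: Mutex<HashSet<&'static str>>,
    tags: Mutex<HashSet<&'static str>>,
}

#[test]
fn clean_can_race_with_a_writer() {
    loom::model(|| {
        let repo = Arc::new(Repo {
            objects: Mutex::new(HashSet::new()),
            tags: Mutex::new(HashSet::new()),
        });

        // Writer: create an object, then a tag referencing it.
        let writer = {
            let repo = Arc::clone(&repo);
            thread::spawn(move || {
                repo.objects.lock().unwrap().insert("abc123");
                repo.tags.lock().unwrap().insert("abc123");
            })
        };

        // Cleaner: snapshot the tags, then delete anything unreferenced.
        let referenced: HashSet<_> = repo.tags.lock().unwrap().clone();
        repo.objects
            .lock()
            .unwrap()
            .retain(|digest| referenced.contains(digest));

        writer.join().unwrap();

        // In some interleavings the object is written before the tag
        // snapshot but tagged afterwards, so this invariant check fails.
        let objects = repo.objects.lock().unwrap();
        for digest in repo.tags.lock().unwrap().iter() {
            assert!(objects.contains(digest), "tag points at a deleted object");
        }
    });
}
```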

@jrray added the "agenda item" label (Items to be brought up at the next dev meeting) Jul 2, 2024
@jrray self-assigned this Jul 2, 2024
jrray added 2 commits October 22, 2025 11:33
UPDATE: this approach still has a race but the original version wasn't
testing the condition where the race exists. The test has been updated
to show that a race still exists.

This approach doesn't require any "global" lock while performing a clean, but the trade-off is more overhead both for writing objects and for performing a clean.

In summary, when writing an object, a `<digest>.hold` file is written first. The existence of this file prevents `<digest>` from being treated as garbage.

The `<digest>.hold` file is not removed until after a tag exists that creates a strong reference to the object.

The clean process collects what it thinks are garbage objects as usual, but then it checks for `.hold` files for each candidate garbage object. Then, a second pass over all the tags is needed to discover any strong references that didn't exist in the first pass.

Since looking for tags happens after checking for `.hold` files, the new tags are guaranteed to be found: the tag must exist by the time the `.hold` file is deleted.

Signed-off-by: J Robert Ray <jrray@jrray.org>
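A minimal sketch of the hold-file handshake described in that commit, assuming a filesystem-backed store; the paths and helper names here are illustrative, not the real spfs layout:

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

fn object_path(root: &Path, digest: &str) -> PathBuf {
    root.join("objects").join(digest)
}

fn hold_path(root: &Path, digest: &str) -> PathBuf {
    root.join("objects").join(format!("{digest}.hold"))
}

/// Writer side: the hold file must exist before the object does, and it is
/// only removed once a tag holds a strong reference to the object.
fn write_object(root: &Path, digest: &str, data: &[u8]) -> io::Result<()> {
    fs::write(hold_path(root, digest), b"")?; // 1. mark as not-garbage
    fs::write(object_path(root, digest), data) // 2. write the object itself
}

/// Called only after a tag referencing `digest` has been written.
fn release_hold(root: &Path, digest: &str) -> io::Result<()> {
    fs::remove_file(hold_path(root, digest))
}

/// Cleaner side: check holds first, then re-scan the tags, so any strong
/// reference created before a hold was released is guaranteed to be seen.
fn clean(
    root: &Path,
    candidates: &HashSet<String>,
    rescan_tags: impl Fn() -> HashSet<String>,
) -> io::Result<()> {
    let unheld: HashSet<&String> = candidates
        .iter()
        .filter(|d| !hold_path(root, d.as_str()).exists())
        .collect();
    let referenced = rescan_tags(); // second pass over all the tags
    for digest in unheld {
        if !referenced.contains(digest.as_str()) {
            fs::remove_file(object_path(root, digest))?;
        }
    }
    Ok(())
}
```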
The two cases that use locking continue to pass.

Signed-off-by: J Robert Ray <jrray@jrray.org>
jrray added 2 commits October 22, 2025 20:23
This approach avoids the need to hold a lock for the span of a whole
clean operation, but uses a lock per object to prevent newly-written
objects from being cleaned. The key is stopping a writer from writing an
object just before the cleaner deletes it. A staging file is used for
the writer to mark an object as non-garbage. Synchronization is still
needed so either the deleter sees the staging file, or the writer waits
until it is safe to write a new copy of the object.

Signed-off-by: J Robert Ray <jrray@jrray.org>
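A rough sketch of the per-object locking described above; all names and paths are illustrative, and the lock map stands in for whatever per-object synchronization the repository would actually provide:

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct ObjectLocks {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl ObjectLocks {
    fn lock_for(&self, digest: &str) -> Arc<Mutex<()>> {
        let mut locks = self.locks.lock().unwrap();
        locks.entry(digest.to_string()).or_default().clone()
    }
}

fn staging_path(root: &Path, digest: &str) -> PathBuf {
    root.join("staging").join(digest)
}

fn object_path(root: &Path, digest: &str) -> PathBuf {
    root.join("objects").join(digest)
}

/// Writer: under the per-object lock, mark the object as staged and (re)write
/// it. If a concurrent clean already deleted the object, the writer simply
/// writes a fresh copy here.
fn write_object(locks: &ObjectLocks, root: &Path, digest: &str, data: &[u8]) -> io::Result<()> {
    let lock = locks.lock_for(digest);
    let _guard = lock.lock().unwrap();
    fs::write(staging_path(root, digest), b"")?;
    fs::write(object_path(root, digest), data)
}

/// Cleaner: under the same per-object lock, only delete the object if it has
/// not been marked as staged.
fn maybe_delete(locks: &ObjectLocks, root: &Path, digest: &str) -> io::Result<()> {
    let lock = locks.lock_for(digest);
    let _guard = lock.lock().unwrap();
    if !staging_path(root, digest).exists() {
        fs::remove_file(object_path(root, digest))?;
    }
    Ok(())
}
```

The per-object lock guarantees the two orderings the commit message calls out: either the cleaner sees the staging file and skips the object, or the writer waits behind the delete and rewrites the object afterwards.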
This one tries to detect any deletes that produce a dangling reference.
It uses the concept of staging again and a "global" lock, but it has a
nice property that it blocks the cleaner instead of blocking the writer,
at least in the sense that the lock doesn't have to be continuously held
by the cleaner. It would also be safe for concurrent writers, but the
cleaner can only make progress while there are no writers.

However, there are some onerous requirements to make this work in practice. The whole hierarchy of objects intended to be written needs to be marked as staged. It isn't safe to notice that some sub-hierarchy already exists in the destination and skip staging it, because it may already be in the process of being deleted.

The cleaner is required to visit/delete objects in top-down order.

The cases added here are passing but more work is needed to exhaustively
test this approach.

Signed-off-by: J Robert Ray <jrray@jrray.org>
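A toy sketch of the two requirements above: the writer stages an entire hierarchy before writing any of it, and the cleaner deletes parents before children so it never leaves a dangling reference. The graph representation and function names are purely illustrative:

```rust
use std::collections::{HashMap, HashSet};

/// Toy object graph: digest -> child digests.
type Graph = HashMap<String, Vec<String>>;

/// Writer: mark `root` and everything reachable from it as staged. Skipping a
/// sub-hierarchy just because it already exists is not safe -- it may already
/// be queued for deletion.
fn stage_all(graph: &Graph, root: &str, staged: &mut HashSet<String>) {
    if !staged.insert(root.to_string()) {
        return; // already visited on this walk
    }
    for child in graph.get(root).into_iter().flatten() {
        stage_all(graph, child, staged);
    }
}

/// Cleaner: visit candidates in parents-before-children order. An object is
/// only removed once nothing still references it and it is not staged.
fn clean_top_down(graph: &mut Graph, candidates: &[String], staged: &HashSet<String>) {
    for digest in candidates {
        let still_referenced = graph
            .values()
            .any(|children| children.iter().any(|c| c == digest));
        if still_referenced || staged.contains(digest) {
            continue; // deleting now would dangle, or a writer has staged it
        }
        graph.remove(digest); // drops the object and its outgoing references
    }
}
```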

jrray commented Oct 23, 2025

I made some progress with the recent update, and have been thinking about how to optimize writes when a writer now has to mark all the objects that are going to be sent over.

I'm imagining a change to the Syncer where it first sends over a list of top-level object digests. The server side would mark as many of these as possible (including children of these objects) and send back a list of digests that are unknown. Then the client uses this to produce a list of the remaining digests that the server still needs. This list is sent over to complete the marking phase. Then the objects/payloads are transferred.

Doing it this way allows some objects that already exist in the destination to be skipped entirely, while still being able to recover and resend objects that weren't marked in time to prevent a concurrent clean from deleting them.
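A rough sketch of that exchange; the trait and method names here are hypothetical, not the actual Syncer API:

```rust
use std::collections::HashSet;

type Digest = String;

/// What the destination repository would need to expose for this scheme.
trait MarkingDest {
    /// Mark every digest it already knows (and their children) as
    /// non-garbage; return the digests it does not have.
    fn mark_known(&mut self, top_level: &[Digest]) -> HashSet<Digest>;
    /// Mark digests the client is about to send so a concurrent clean
    /// cannot delete them mid-transfer.
    fn mark_incoming(&mut self, digests: &[Digest]);
    fn write_object(&mut self, digest: &Digest, payload: &[u8]);
}

/// Client side of the sync, assuming it can expand an unknown top-level
/// digest into the concrete objects beneath it.
fn sync(
    dest: &mut impl MarkingDest,
    top_level: &[Digest],
    expand: impl Fn(&Digest) -> Vec<(Digest, Vec<u8>)>,
) {
    // 1. Send the top-level digests; the destination marks what it has.
    let unknown = dest.mark_known(top_level);

    // 2. Expand only the unknown digests into the objects that must be sent.
    let to_send: Vec<(Digest, Vec<u8>)> = unknown.iter().flat_map(|d| expand(d)).collect();

    // 3. Complete the marking phase before any payloads move.
    let digests: Vec<Digest> = to_send.iter().map(|(d, _)| d.clone()).collect();
    dest.mark_incoming(&digests);

    // 4. Transfer the objects/payloads themselves.
    for (digest, payload) in &to_send {
        dest.write_object(digest, payload);
    }
}
```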

As the new test case shows, as long as the objects are written and cleaned in topological order, repo consistency can be maintained throughout.


jrray commented Oct 23, 2025

> The server side would mark as many of these as possible (including children of these objects).

Thinking about this more: since the cleaner is not allowed to create dangling pointers, when pushing existing objects that can be marked in time it wouldn't be necessary to mark their children. Keeping the parent alive keeps all the children alive.
