-
Notifications
You must be signed in to change notification settings - Fork 61
fix: infinite loading when reopening workflow in collaborative editor #4175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The bug was in merge_updates/1 in persistence_writer.ex. When multiple Yjs delta updates were batched together, they were applied to an empty Yex.Doc.new() instead of the current persisted state. Since Yjs deltas only contain changes (not full state), applying them to an empty doc produced corrupted 2-byte outputs, losing all workflow data including lock_version. The fix loads the existing persisted state (checkpoint + updates) from the database before applying new deltas on top. Evidence from reproduction: - Before fix: merged updates = 2 bytes (corrupted/empty) - After fix: merged updates = 969-1742 bytes (correct full state) Closes #4164
c53ecda to
cf3886f
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4175 +/- ##
==========================================
+ Coverage 89.03% 89.09% +0.05%
==========================================
Files 425 425
Lines 19646 19658 +12
==========================================
+ Hits 17492 17514 +22
+ Misses 2154 2144 -10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Move duplicated checkpoint and update query logic from Persistence and PersistenceWriter into DocumentState module where data access belongs. - Add get_checkpoint_and_updates/1, get_latest_checkpoint/1, get_updates_since/2, apply_to_doc/3, and load_into_doc/2 functions - Update Persistence to use shared DocumentState functions - Update PersistenceWriter.merge_updates/2 to use DocumentState.load_into_doc/2 - Remove duplicated load_persisted_state_into_doc/2 from PersistenceWriter - Add @type t and proper @SPEC annotations to DocumentState This eliminates code duplication while keeping the fix focused and avoiding conflicts with #4178.
* Fix incorrect field access in create_checkpoint The create_checkpoint/1 function was using .data instead of .state_data when accessing DocumentState records. This would cause a KeyError crash when a document reached 500 updates and triggered checkpoint creation. Closes #4176 * Fix checkpoint creation bugs and add tests Two bugs fixed in create_checkpoint/1: 1. Used .data instead of .state_data when accessing DocumentState records 2. Used string "checkpoint" instead of atom :checkpoint for Ecto.Enum Added tests to verify checkpoint creation works correctly: - Test creating checkpoint from multiple updates - Test merging existing checkpoint with new updates Both bugs would cause crashes when a document reached 500 updates and triggered automatic checkpoint creation. Closes #4176
| end | ||
|
|
||
| defp merge_updates([update]), do: update | ||
| defp merge_updates([update], _document_name), do: update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix for my issue #4154 overlaps with what you're doing here. I want to get rid of this function all together. If there's a single update doesn't mean that
- we don't want to generate a delta between the current state of the document(
_document_name) - also, properly encode the changes using
Yex.encode_state_as_update!
with 2. I don't think that just returning the single update is the same as passing it through a new doc and encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my end, I've consistently reproduced the error whenever there's a single update coming through ydoc which leads to this function. checking whether I can reproduce on this branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description
This PR fixes the infinite loading issue when reopening a workflow in the collaborative editor.
1.
extract_lock_versionnot handlingnilvalues (persistence.ex)When a workflow is opened before first save, the Y.Doc gets
lock_version: nil. Theextract_lock_version/1function didn't handle the{:ok, nil}case fromYex.Map.fetch/2, causing it to fall through incorrectly. Thestale?/2function was also refactored for clarity.2.
merge_updatesapplying deltas to empty doc (persistence_writer.ex)This was the root cause of the infinite loading. When the
PersistenceWriterbatches multiple Yjs updates together,merge_updates/1was applying them to an emptyYex.Doc.new(). Since Yjs deltas only contain changes (not full state), applying them to an empty doc produced corrupted 2-byte outputs, losing all workflow data.Evidence from reproduction:
The fix loads the existing persisted state (checkpoint + updates) from the database before applying new deltas on top.
How this caused the
nillock_version crashlock_version: nillock_version: nil) ~1000 bytes ✓lock_version, Y.Doc updated viamerge_saved_workflow_into_ydoclock_versionchangemerge_updatesmerges pending deltaslock_version: nil) + corrupted update (2 bytes)lock_version: nil, DB has real value → stale check fails → crashCloses #4164 #4154
Validation steps
Additional notes for the reviewer
AI Usage
Please disclose how you've used AI in this work (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy
Pre-submission checklist
:owner,:admin,:editor,:viewer)