Skip to content

Conversation

@danolivo
Copy link
Contributor

This commit made it clear that the recently added (c436818) spock_shmem_attach is unnecessary, complicates the code, and is likely to cause future bugs. I removed that old stuff and manually tested how it works during recovery, during checkpoints, and after a hard postmaster SIGKILL.

@danolivo danolivo requested a review from mason-sharp December 24, 2025 12:51
@danolivo danolivo self-assigned this Dec 24, 2025
@danolivo danolivo added the bug Something isn't working label Dec 24, 2025
@mason-sharp mason-sharp reopened this Dec 27, 2025
The SpockApplyProgress structure contained a redundant 'key' field that
duplicated information already present in the containing SpockGroupEntry.
This redundancy violated the DRY principle and could lead to inconsistency
if the key and entry's key ever diverged.

This commit aligns Spock with PostgreSQL core RMGR conventions by:

1. Removing the 'key' field from SpockApplyProgress
2. Introducing a separate xl_progress structure for WAL records that
   explicitly includes both the key (dbid, node_id, remote_node_id) and
   the progress data
3. Updating all callers to pass OIDs explicitly rather than relying on
   the embedded key field

Benefits:
- Eliminates data duplication and potential inconsistency
- Makes the separation between in-memory and on-disk representations clearer
- Follows PostgreSQL conventions for RMGR WAL record structures
- Removes unnecessary assertions that checked key field consistency

The xl_progress structure now serves as the canonical WAL format, while
SpockApplyProgress is purely the in-memory progress data structure.
Being loaded on startup, Spock always initializes its shared memory segment
and shared HTABs in the postmaster process as well as all local pointers to
shared memory.

There is no case when the postmaster does not call shmem_startup_hook() after
cleaning up its shared memory. Also, there is no case when the postmaster
does any sensitive operations before initializing shared memory.

In case of recovery, evidence that shared memory is already initialized and
operable is the fact that redo operations can acquire locks.

Remove the 'attach' routine and its calls in checkpoint hook and bgworker
initialization to make the code more straightforward, reduce the code base,
and reduce potential errors.

Additional code quality improvements:
- Clarify TODO comment about undefined behavior with padding bytes
- Improve function documentation for return values (spock_group_progress_update)
- Remove redundant Assert statement that duplicated runtime check
- Clarify comments about progress structure initialization

This is a partial revert of commit c436818.
@danolivo danolivo marked this pull request as draft December 29, 2025 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants