Skip to content

Conversation

@skipi
Copy link
Collaborator

@skipi skipi commented Nov 18, 2025

📝 Description

This PR implements retention policy support in plumber. When the usage service applies a retention policy, plumber marks pipeline records for expiration with a configurable grace period, allowing corrections before data is deleted.

  • Added expires_at column to pipeline_requests table to track when records should be deleted
  • Created PolicyConsumer to receive retention policy events from usage service via RabbitMQ
  • Created PolicyApplier to mark/unmark pipeline records based on the cutoff date
  • Records older than cutoff are marked with expires_at = now + grace_period (default 15 days)
  • Records newer than the cutoff have expires_at cleared, allowing policy corrections to "save" mistakenly marked records
  • Grace period is configurable via RETENTION_GRACE_PERIOD_DAYS env var (min: 7 days)
  • Record Deleter Worker - Added a GenServer that periodically deletes expired records in batches

✅ Checklist

  • I have tested this change
  • This change requires documentation update

@github-project-automation github-project-automation bot moved this to Backlog in Roadmap Nov 18, 2025
@skipi skipi force-pushed the mk/plumber/retention-policies branch 4 times, most recently from 7e03b42 to 3b83c30 Compare December 4, 2025 08:19
@skipi skipi marked this pull request as ready for review December 4, 2025 08:19
@skipi skipi force-pushed the mk/plumber/retention-policies branch from dd35670 to 18191c5 Compare December 4, 2025 09:27
@skipi
Copy link
Collaborator Author

skipi commented Dec 4, 2025

@codex review

@skipi skipi self-assigned this Dec 4, 2025
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@skipi
Copy link
Collaborator Author

skipi commented Dec 4, 2025

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@skipi skipi force-pushed the mk/plumber/retention-policies branch from aaac102 to af9fec5 Compare December 4, 2025 11:23
Comment on lines +20 to +26
expired_records =
from(pr in PplRequests,
where: pr.expires_at < ^now,
select: %{id: pr.id, ppl_artefact_id: pr.ppl_artefact_id},
limit: ^limit
)
|> EctoRepo.all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment, but try to run this with the explain to validate that the planner picks the index that you created to run this query. (It should infer expires_at is not null but it may not)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


if days < @min_grace_period_days do
Logger.warning(
"[Retention] grace_period_days=#{days} is below minimum, using #{@min_grace_period_days}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add org_id in this log line, so it is easier to debug.

)

{marked_count, _} = EctoRepo.update_all(mark_query, set: [expires_at: expires_at])
{unmarked_count, _} = EctoRepo.update_all(unmark_query, set: [expires_at: nil])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These update_all calls can cause issues since if the cutoff date is recent enough, we might end up updating a lot of rows at once and have issues with locks, heavy IO, connection timeout, etc.

The more secure way here would be to do batching with a limit on the number of rows that are updated at once.

@skipi skipi force-pushed the mk/plumber/retention-policies branch from af9fec5 to 0d49ae0 Compare December 9, 2025 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

4 participants