Description
Hi,
I’ve encountered a race condition related to ClockedSchedule when multiple Celery workers attempt to schedule tasks concurrently.
According to the documentation, when multiple tasks share the same schedule (e.g., tasks running every 10 seconds), they should all reference the same schedule object. I assume the same principle applies to ClockedSchedule when tasks must be executed at exactly the same time.
In my setup, several Celery workers may schedule a follow-up task dynamically. The next execution time is computed at runtime, and then the worker typically calls something like:
    ClockedSchedule.objects.get_or_create(clocked_time=calculated_time)
However, if multiple workers reach this point at the same time and no matching ClockedSchedule exists yet, they all fail the initial lookup and attempt to create a new entry simultaneously. This results in multiple identical ClockedSchedule rows being created.
This breaks the assumption that schedules are unique and shared, and it causes redundant schedule entries in the database.
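To make the interleaving concrete, here is a minimal, framework-free sketch of the check-then-insert race. It does not use Django or a real database: the list `rows` stands in for a table with no unique constraint, and the `threading.Barrier` is only there to force every worker to finish its lookup before any of them inserts, which is the unlucky timing described above. The names `simulate_race` and `naive_get_or_create` are hypothetical.

```python
import threading


def simulate_race(workers):
    """Run `workers` threads through a lookup-then-insert sequence."""
    rows = []                                   # stand-in for a table without a unique constraint
    rows_lock = threading.Lock()                # protects the list, not the lookup+insert pair
    all_looked_up = threading.Barrier(workers)  # forces the worst-case interleaving

    def naive_get_or_create(clocked_time):
        # Step 1: the initial lookup.
        with rows_lock:
            found = clocked_time in rows
        # Every worker finishes its lookup before anyone inserts.
        all_looked_up.wait()
        # Step 2: each worker believes the row is missing and inserts it.
        if not found:
            with rows_lock:
                rows.append(clocked_time)

    threads = [
        threading.Thread(target=naive_get_or_create, args=("2030-01-01T00:00:00Z",))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rows


if __name__ == "__main__":
    duplicates = simulate_race(4)
    print(len(duplicates), "rows for one clocked_time")
```

In a real deployment the timing is not forced, so the duplicates appear only occasionally, which is what makes the problem hard to notice.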
⸻
Expected Behavior
Only one ClockedSchedule instance should be created for a given clocked_time, even when many workers call get_or_create concurrently.
⸻
Actual Behavior
Under concurrent access:
• Multiple workers evaluate get_or_create at the same time.
• The initial lookup finds no existing row.
• Each worker inserts a new identical row.
• The database ends up with duplicate schedules.
⸻
Proposed Solution
The most reliable fix is to enforce database-level uniqueness (e.g., unique=True on the field or a UniqueConstraint) on the fields that uniquely identify a ClockedSchedule, namely clocked_time.
This guarantees that even under high concurrency:
• Only one row can be created.
• Other workers attempting to create the same schedule receive an IntegrityError.
• That error can be caught and handled by retrying the get operation.
Django-Celery-Beat already uses uniqueness on some schedule models (e.g., CrontabSchedule), so it would be consistent to apply the same principle here.
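The insert-then-fall-back pattern can be sketched with the standard-library `sqlite3` module. This is an illustration under stated assumptions, not django-celery-beat code: the table name `clocked_schedule` and the helper `create_or_fetch` are hypothetical, and the table carries a `UNIQUE` constraint on `clocked_time`. Django's own `get_or_create` follows the same idea (it catches `IntegrityError` on the insert and retries the lookup), which is why its documentation states that it is only atomic when the database enforces uniqueness on the lookup fields.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a unique constraint on clocked_time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clocked_schedule ("
        " id INTEGER PRIMARY KEY,"
        " clocked_time TEXT NOT NULL UNIQUE)"
    )
    return conn


def create_or_fetch(conn, clocked_time):
    """Try to insert; if another writer got there first, fetch its row instead.

    Returns (row_id, created).
    """
    try:
        with conn:  # commits on success, rolls back if the insert fails
            cur = conn.execute(
                "INSERT INTO clocked_schedule (clocked_time) VALUES (?)",
                (clocked_time,),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The unique constraint rejected the duplicate: reuse the existing row.
        row = conn.execute(
            "SELECT id FROM clocked_schedule WHERE clocked_time = ?",
            (clocked_time,),
        ).fetchone()
        return row[0], False


if __name__ == "__main__":
    conn = make_db()
    first = create_or_fetch(conn, "2030-01-01T00:00:00Z")
    second = create_or_fetch(conn, "2030-01-01T00:00:00Z")
    print(first, second)
```

The second call stands in for a worker that lost the race: its insert is rejected by the database and it ends up sharing the row the first caller created.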
⸻
Suggested Implementation
Add a database-level uniqueness constraint to ClockedSchedule at
    class ClockedSchedule(models.Model):
        ...
        clocked_time = models.DateTimeField(
            verbose_name=_('Clock Time'),
            help_text=_('Run the task at clocked time'),
            unique=True,
        )
⸻
Additional Notes
• Application-level get_or_create alone is not safe under concurrency.
• Only the database can guarantee uniqueness in such scenarios.
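• This change should be backward-compatible for databases that contain no duplicate rows, and it aligns with the behavior documented for other schedule types. Where duplicates already exist, they would need to be merged before the constraint can be applied.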
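One practical wrinkle: a database refuses to add a unique constraint while duplicate values are still present, so installations already affected by this race would need a cleanup step first. The sketch below shows one possible approach using `sqlite3`. The simplified tables `clocked_schedule` and `periodic_task` (with a `clocked_id` reference) and the helper names are assumptions for illustration only, not the project's actual schema or migration.

```python
import sqlite3


def build_demo_db():
    """Hypothetical, simplified schema containing one duplicated schedule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clocked_schedule (id INTEGER PRIMARY KEY, clocked_time TEXT NOT NULL);
        CREATE TABLE periodic_task (id INTEGER PRIMARY KEY, name TEXT, clocked_id INTEGER);
        INSERT INTO clocked_schedule VALUES (1, 'T1'), (2, 'T1'), (3, 'T2');
        INSERT INTO periodic_task VALUES (1, 'a', 1), (2, 'b', 2), (3, 'c', 3);
        """
    )
    return conn


def dedupe_and_constrain(conn):
    """Merge duplicate schedules, then add the unique index."""
    with conn:
        # Point every task at the lowest-id schedule sharing its clocked_time.
        conn.execute(
            """
            UPDATE periodic_task
            SET clocked_id = (
                SELECT MIN(k.id) FROM clocked_schedule AS k
                WHERE k.clocked_time = (
                    SELECT d.clocked_time FROM clocked_schedule AS d
                    WHERE d.id = periodic_task.clocked_id
                )
            )
            WHERE clocked_id IS NOT NULL
            """
        )
        # Delete the schedules that are no longer referenced survivors.
        conn.execute(
            """
            DELETE FROM clocked_schedule
            WHERE id NOT IN (
                SELECT MIN(id) FROM clocked_schedule GROUP BY clocked_time
            )
            """
        )
        # Only now can uniqueness be enforced.
        conn.execute(
            "CREATE UNIQUE INDEX uniq_clocked_time ON clocked_schedule (clocked_time)"
        )


if __name__ == "__main__":
    conn = build_demo_db()
    dedupe_and_constrain(conn)
    print(conn.execute("SELECT id, clocked_time FROM clocked_schedule").fetchall())
    print(conn.execute("SELECT name, clocked_id FROM periodic_task").fetchall())
```

Keeping the lowest id is an arbitrary choice; any deterministic rule works as long as every task is repointed before the extra rows are deleted.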