
server: add pending jobs to job_status_reports #153

Merged
hsjoberg merged 10 commits into master from pending-entry-jobs on Feb 14, 2026

Conversation


hsjoberg (Member) commented on Feb 13, 2026

This PR changes the behavior of job_status_reports and drops the notification_tracking table.
We create a pending entry (status Pending) once we send out the silent push notification, and once the client responds we update the entry.

We also set up a cron job that periodically sweeps stale pending jobs (Pending for more than an hour) and sets their status to Timeout, i.e. the device never woke up to report back.
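As an illustration, a minimal sketch of the dispatch side, assuming sqlx and the job_status_reports columns added by migration 0009 (the actual server code goes through JobStatusRepository::create_with_k1_and_prune, which also prunes old rows per pubkey):

use sqlx::PgPool;

// Hypothetical helper: record a Pending report when the silent push goes out,
// keyed by the notification's unique k1 so the client's later report (and the
// timeout sweep) can find this exact row.
async fn record_pending_report(
    pool: &PgPool,
    pubkey: &str,
    notification_k1: &str,
    report_type: &str, // e.g. "Backup" or "Maintenance"
) -> sqlx::Result<()> {
    sqlx::query(
        "INSERT INTO job_status_reports (pubkey, notification_k1, report_type, status)
         VALUES ($1, $2, $3, 'Pending')",
    )
    .bind(pubkey)
    .bind(notification_k1)
    .bind(report_type)
    .execute(pool)
    .await?;
    Ok(())
}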

Closes #149.

hsjoberg changed the title from "server: mark status for stale pending jobs as Timeout" to "server: add pending jobs to job_status_reports" on Feb 13, 2026
gemini-code-assist (Contributor) commented

Summary of Changes

Hello @hsjoberg, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors how background job statuses and notification tracking are managed. It introduces new 'Pending' and 'Timeout' states for jobs, coupled with a server-side cronjob to automatically mark unresponsive pending jobs as timed out. The system for tracking notification spacing has been streamlined by removing a dedicated table and instead leveraging existing job status and heartbeat records. Client-side APIs have been updated to align with these changes, ensuring that all job status updates are correlated via a unique k1 identifier.

Highlights

  • Job Status Tracking Enhancement: Introduced 'Pending' and 'Timeout' statuses to the ReportStatus enum, allowing for more granular tracking of background job lifecycles. A new notification_k1 column and updated_at timestamp were added to job_status_reports to support this.
  • Automated Stale Job Timeout: Implemented a cron sweep that periodically marks Pending job status reports older than one hour as Timeout, ensuring that jobs that never report back from the client are properly accounted for.
  • Refactored Notification Spacing Logic: The notification_tracking table has been deprecated and removed. Notification spacing logic now directly queries job_status_reports (for backup/maintenance) and heartbeat_notifications to determine the last sent notification time, simplifying the database schema and data flow.
  • Client-Side k1 Integration: Client-side API calls for backup uploads, job status reporting, invoice submission, and heartbeat responses now consistently include a k1 (correlation key) parameter. This k1 is used by the server to uniquely identify and update specific pending job reports.
  • Strict Job Status Reporting: The /report_job_status endpoint on the server now strictly accepts only 'Success' or 'Failure' statuses from clients. It updates an existing 'Pending' entry associated with the provided k1 rather than creating a new one, and returns a 'Not Found' error if no matching pending entry exists.
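A rough, framework-agnostic sketch of the strict reporting flow from the last highlight; the real endpoint lives in server/src/routes/gated_api_v0.rs and delegates to JobStatusRepository::update_by_k1, and the error-type names below are placeholders:

// Clients may only report Success or Failure; Pending is assigned by the server
// on dispatch and Timeout by the cron sweep.
enum ReportStatus {
    Pending,
    Success,
    Failure,
    Timeout,
}

enum ReportError {
    InvalidArgument(&'static str),
    NotFound(&'static str),
}

fn validate_client_status(status: &ReportStatus) -> Result<(), ReportError> {
    match status {
        ReportStatus::Success | ReportStatus::Failure => Ok(()),
        ReportStatus::Pending | ReportStatus::Timeout => Err(ReportError::InvalidArgument(
            "clients may only report Success or Failure",
        )),
    }
}

// update_by_k1 targets the Pending row matching (pubkey, k1); zero affected rows
// means there is no pending entry and the endpoint answers Not Found.
fn map_update_result(rows_affected: u64) -> Result<(), ReportError> {
    if rows_affected == 0 {
        Err(ReportError::NotFound("no pending job report for this k1"))
    } else {
        Ok(())
    }
}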
Changelog
  • bacon.toml
    • Removed --config-path config.toml from the server's run command.
  • client/src/lib/api.ts
    • Updated getUploadUrl, completeUpload, reportJobStatus, submitInvoice, and heartbeatResponse to require a k1 in their payloads.
    • Refactored reportJobStatus to conditionally post based on k1 presence and return an error if k1 is missing.
  • client/src/lib/backupService.ts
    • Integrated getK1 calls to include k1 in getUploadUrl and completeUpload requests for authentication.
  • client/src/lib/pushNotifications.ts
    • Enforced the presence of k1 for handleTaskCompletion, submitInvoiceTask, and heartbeatResponse functions, logging warnings or returning early if missing.
  • client/src/types/serverTypes.ts
    • Expanded the ReportStatus enum to include Pending and Timeout states.
  • server/migrations/0009_job_status_pending_and_drop_notification_tracking.sql
    • Added notification_k1 (TEXT) and updated_at (TIMESTAMPTZ) columns to the job_status_reports table.
    • Created unique index idx_job_status_reports_pubkey_notification_k1 and regular index idx_job_status_reports_pubkey_status on job_status_reports.
    • Dropped the notification_tracking table.
  • server/src/cron.rs
    • Added JobStatusRepository to imports.
    • Defined constants STALE_PENDING_JOB_TIMEOUT_MINUTES, STALE_PENDING_JOB_SWEEP_SCHEDULE, and STALE_PENDING_JOB_ERROR_MESSAGE.
    • Implemented timeout_stale_pending_job_reports function to mark stale pending jobs as timeout.
    • Added timeout_stale_pending_job_reports to the cron scheduler, running every 10 minutes.
  • server/src/db/job_status_repo.rs
    • Modified create_and_prune to create_with_k1_and_prune, now requiring notification_k1 and pruning to keep the last 50 reports.
    • Added update_by_k1 function to update existing job status reports based on pubkey and notification_k1.
    • Added mark_stale_pending_as_timeout function to update stale pending job reports (sketched after this changelog).
    • Introduced test-only functions create_with_k1_and_created_at and find_status_and_error_by_k1 for easier testing of job status reports.
  • server/src/db/notification_tracking_repo.rs
    • Updated comments to reflect that notification timing is now derived from job_status_reports.created_at and heartbeat_notifications.sent_at.
    • Refactored get_last_notification_time and get_eligible_users to query job_status_reports and heartbeat_notifications instead of notification_tracking.
    • Removed record_notification_sent function as it is no longer needed.
  • server/src/db/push_token_repo.rs
    • Added find_all_with_pubkeys to retrieve all push tokens along with their associated public keys.
  • server/src/notification_coordinator.rs
    • Imported JobStatusRepository and PushDispatchReceipt.
    • Modified send_notification to record Pending job status reports after successfully dispatching notifications.
    • Added helper functions report_type_for_notification and record_pending_job_reports to manage pending job status creation.
  • server/src/push.rs
    • Introduced PushDispatchReceipt struct to capture pubkey and notification_k1 for dispatched notifications.
    • Modified send_push_notification_with_unique_k1 to return a vector of PushDispatchReceipt for each successful dispatch.
    • Refactored push token retrieval to use PushTarget struct, including pubkey and push_token.
  • server/src/routes/gated_api_v0.rs
    • Imported ReportStatus.
    • Modified report_job_status endpoint to only accept Success or Failure statuses from clients, returning InvalidArgument for Pending or Timeout.
    • Updated report_job_status to use JobStatusRepository::update_by_k1 to modify an existing pending entry, returning NotFound if no matching entry is found.
  • server/src/tests/common.rs
    • Removed notification_tracking from the list of tables to truncate in reset_database.
  • server/src/tests/coordinator_tests.rs
    • Updated test_normal_priority_respects_spacing and test_critical_priority_bypasses_spacing to insert job status reports into job_status_reports instead of notification_tracking.
    • Removed test_notification_tracking_records_sent and test_broadcast_filters_ineligible_users due to notification_tracking removal and refactored logic.
    • Added test_last_notification_time_includes_heartbeat_records to verify spacing logic with heartbeat notifications.
    • Updated test_eligible_users_query and test_spacing_configuration_from_config to use job_status_reports for notification tracking.
  • server/src/tests/gated_user_tests.rs
    • Updated test_report_job_status_pruning to use create_with_k1_and_prune directly and adjusted expected pruned count to 50.
    • Added test_report_job_status_updates_existing_pending_entry to verify that job status reports update existing pending entries.
    • Added test_report_job_status_missing_pending_entry_returns_not_found to ensure proper error handling for missing pending entries.
    • Added test_report_job_status_rejects_pending_status and test_report_job_status_rejects_timeout_status to confirm client-side status validation.
    • Added test_stale_pending_job_reports_are_marked_timeout_after_one_hour to test the cronjob's functionality.
    • Added test_stale_pending_timeout_cleanup_does_not_override_existing_error_message to verify error message preservation during timeout.
  • server/src/types.rs
    • Added Pending and Timeout variants to the ReportStatus enum.
    • Updated comments in NotificationData to reflect changes in notification tracking.
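As a rough illustration of the sweep referenced under server/src/cron.rs and job_status_repo.rs above (see the note on mark_stale_pending_as_timeout), a minimal sketch assuming sqlx; the one-hour interval and the error-message string stand in for the STALE_PENDING_JOB_TIMEOUT_MINUTES and STALE_PENDING_JOB_ERROR_MESSAGE constants:

use sqlx::PgPool;

// Hypothetical sweep run by the cron job: mark Pending reports older than the
// staleness threshold as Timeout, keeping any error message that was already set
// (in line with test_stale_pending_timeout_cleanup_does_not_override_existing_error_message).
async fn sweep_stale_pending(pool: &PgPool) -> sqlx::Result<u64> {
    let result = sqlx::query(
        "UPDATE job_status_reports
         SET status = 'Timeout',
             error_message = COALESCE(error_message, 'Device did not respond to the notification'),
             updated_at = NOW()
         WHERE status = 'Pending'
           AND created_at < NOW() - INTERVAL '1 hour'",
    )
    .execute(pool)
    .await?;
    Ok(result.rows_affected())
}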

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request refactors job status reporting by introducing a 'Pending' state and a timeout mechanism, simplifying the design by removing the notification_tracking table and consolidating logic into job_status_reports. Critically, the implementation of UnifiedPush support introduces a significant security vulnerability, allowing authenticated users to leak the server's ntfy_auth_token and perform SSRF by registering arbitrary URLs as push tokens. This vulnerability must be addressed before merging. Additionally, consider a performance improvement for database transactions when recording pending jobs.

hsjoberg (Member Author) commented

/gemini review

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request refactors job status and notification tracking by consolidating logic into the job_status_reports table and introducing a Pending status with a cron job for timeouts. However, a critical security flaw was identified in the UnifiedPush implementation where untrusted user input is used as a network destination, leading to SSRF and the leakage of the ntfy_auth_token secret. This must be addressed by validating push token URLs and ensuring sensitive tokens are only sent to trusted endpoints.

Comment on lines +150 to +155
send_unified_notification(
    &http_client_clone,
    &target.push_token,
    &data_string,
    &ntfy_auth,
)
gemini-code-assist (Contributor):

security-high

The send_unified_notification function uses the user-provided target.push_token as a destination URL without validation. An authenticated user can register a malicious URL (e.g., an internal service or an attacker-controlled server) to trigger an SSRF. Furthermore, the server's ntfy_auth_token is sent to this URL in the Authorization header, leading to the leakage of server-side credentials.

Member:

This is not true unless something changed. The client registers their push token on sign-up and we do a regex check that it's an ntfy URL, IIRC.
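(For context only, a minimal sketch of the kind of registration-time check described above; the allowed host and the use of the url crate are assumptions, and the actual validation lives in the token-registration path, which this PR does not touch.)

// Hypothetical check: only accept push tokens that look like HTTPS ntfy topic URLs
// on an allowed host, so the server never POSTs its ntfy_auth_token to an
// arbitrary, attacker-chosen URL.
fn is_allowed_ntfy_url(push_token: &str) -> bool {
    match url::Url::parse(push_token) {
        Ok(u) => u.scheme() == "https" && u.host_str() == Some("ntfy.sh"),
        Err(_) => false,
    }
}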

hsjoberg (Member Author):

Yeah I didn't touch such code

hsjoberg (Member Author):

@gemini-code-assist it's being followed up in another PR

gemini-code-assist (Contributor):

Thanks for the update. I'll close this thread then, as the issue is being addressed in another PR.

Comment on lines 226 to 232
export const reportJobStatus = (payload: ReportJobCompletionPayload & { k1: string }) =>
  payload.k1
    ? post<ReportJobCompletionPayload & { k1: string }, DefaultSuccessPayload>(
        "/report_job_status",
        payload,
      )
    : Promise.resolve(err(new Error("Missing notification k1 for report_job_status")));
gemini-code-assist (Contributor):

medium

The payload.k1 check appears to be redundant. The function's type signature ReportJobCompletionPayload & { k1: string } already ensures that k1 is a non-optional string. Furthermore, the main call site in pushNotifications.ts within handleTaskCompletion already includes a guard if (!k1) to prevent this function from being called with a falsy k1.

Relying on the type system and caller-side validation simplifies this function and avoids defensive code. Consider removing the ternary operator.

export const reportJobStatus = (payload: ReportJobCompletionPayload & { k1: string }) =>
  post<ReportJobCompletionPayload & { k1: string }, DefaultSuccessPayload>(
    "/report_job_status",
    payload,
  );

Member:

Yeah I think this review is valid

hsjoberg (Member Author):

Yes

Comment on lines +246 to +256
let mut tx = self.app_state.db_pool.begin().await?;
JobStatusRepository::create_with_k1_and_prune(
    &mut tx,
    &dispatch.pubkey,
    &dispatch.notification_k1,
    &report_type,
    &ReportStatus::Pending,
    None,
)
.await?;
tx.commit().await?;
gemini-code-assist (Contributor):

medium

Initiating and committing a new database transaction for each dispatch within this loop can be inefficient, especially during broadcast notifications with many recipients. This approach incurs significant overhead from repeated, small transactions.

For better performance and scalability, consider processing all dispatches within a single transaction. This would reduce the number of round-trips to the database. If failure isolation for each dispatch is a strict requirement, the current implementation is valid, but it's worth noting the performance trade-off.
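A minimal sketch of the batched alternative suggested here, assuming the same sqlx-based repository pattern as the snippet above; whether it is preferable depends on whether per-dispatch failure isolation is required:

use sqlx::PgPool;

// Hypothetical batched variant: open one transaction, record every Pending report
// inside it, and commit once at the end instead of once per dispatch.
async fn record_pending_batch(
    pool: &PgPool,
    dispatches: &[(String, String)], // (pubkey, notification_k1) pairs
    report_type: &str,
) -> sqlx::Result<()> {
    let mut tx = pool.begin().await?;
    for (pubkey, notification_k1) in dispatches {
        sqlx::query(
            "INSERT INTO job_status_reports (pubkey, notification_k1, report_type, status)
             VALUES ($1, $2, $3, 'Pending')",
        )
        .bind(pubkey.as_str())
        .bind(notification_k1.as_str())
        .bind(report_type)
        .execute(&mut *tx)
        .await?;
    }
    // One failure rolls back the whole batch, which is the trade-off noted above.
    tx.commit().await?;
    Ok(())
}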

)
: Promise.resolve(err(new Error("Missing notification k1 for report_job_status")));

export const submitInvoice = (payload: SubmitInvoicePayload & { k1: string }) =>
hsjoberg (Member Author):

Bad formatting

hsjoberg (Member Author) commented

/gemini review

gemini-code-assist bot (Contributor) left a comment

Code Review

This is a substantial and well-executed refactoring of job status and notification tracking. Dropping the notification_tracking table and consolidating the logic into job_status_reports simplifies the data model and state management. The introduction of a Pending state and a cron job to handle timeouts is a robust way to manage jobs that rely on client responses. The changes are consistent across the backend and client, and the new test suite is comprehensive. I have a few suggestions to improve performance and maintainability.

Comment on lines +95 to +127
let last_sent = match notification_data {
    NotificationData::Maintenance(_) => {
        sqlx::query_scalar::<_, Option<DateTime<Utc>>>(
            "SELECT MAX(created_at)
             FROM job_status_reports
             WHERE pubkey = $1 AND report_type = 'Maintenance'",
        )
        .bind(pubkey)
        .fetch_one(self.pool)
        .await?
    }
    NotificationData::BackupTrigger(_) => {
        sqlx::query_scalar::<_, Option<DateTime<Utc>>>(
            "SELECT MAX(created_at)
             FROM job_status_reports
             WHERE pubkey = $1 AND report_type = 'Backup'",
        )
        .bind(pubkey)
        .fetch_one(self.pool)
        .await?
    }
    NotificationData::Heartbeat(_) => {
        sqlx::query_scalar::<_, Option<DateTime<Utc>>>(
            "SELECT MAX(sent_at)
             FROM heartbeat_notifications
             WHERE pubkey = $1",
        )
        .bind(pubkey)
        .fetch_one(self.pool)
        .await?
    }
    NotificationData::LightningInvoiceRequest(_) => None,
};
gemini-code-assist (Contributor):

medium

The match statement has duplicated code for Maintenance and BackupTrigger notifications. This can be simplified to reduce redundancy and improve maintainability by combining the match arms.

        let last_sent = match notification_data {
            NotificationData::Maintenance(_) | NotificationData::BackupTrigger(_) => {
                let report_type = if matches!(notification_data, NotificationData::Maintenance(_)) {
                    "Maintenance"
                } else {
                    "Backup"
                };
                sqlx::query_scalar::<_, Option<DateTime<Utc>>>(
                    "SELECT MAX(created_at)
                     FROM job_status_reports
                     WHERE pubkey = $1 AND report_type = $2",
                )
                .bind(pubkey)
                .bind(report_type)
                .fetch_one(self.pool)
                .await?
            }
            NotificationData::Heartbeat(_) => {
                sqlx::query_scalar::<_, Option<DateTime<Utc>>>(
                    "SELECT MAX(sent_at)
                     FROM heartbeat_notifications
                     WHERE pubkey = $1",
                )
                .bind(pubkey)
                .fetch_one(self.pool)
                .await?
            }
            NotificationData::LightningInvoiceRequest(_) => None,
        };

hsjoberg merged commit 5fe5e7e into master on Feb 14, 2026
8 checks passed
Linked issue that may be closed by this PR: Job reports: create pending entry on dispatch, update status on client response
