Skip to content

Conversation

Copy link

Copilot AI commented Dec 16, 2025

Why the changes were made

The kubectl oadp backup logs command was hanging for 2+ hours in CI when NonAdminDownloadRequest processing stalled. The hardcoded 120-second timeout wasn't enforced, and error messages provided no diagnostic information.

After analyzing all backup commands in response to feedback requesting timeouts for "all commands that could timeout," only the logs command actually waits for external operations that could hang. Other commands (describe, get, create, delete, bsl) perform immediate Kubernetes API operations and cannot hang indefinitely.

How to test the changes made

Basic usage with default 5-minute timeout:

kubectl oadp nonadmin backup logs my-backup
# Output: Waiting for backup logs to be processed (timeout: 5m0s)...

Custom timeout:

kubectl oadp nonadmin backup logs my-backup --timeout=30s

Timeout error message (when NonAdminDownloadRequest processing hangs):

Error: timed out after 5m0s waiting for NonAdminDownloadRequest 'backup-logs-xyz' to be processed.
NonAdminDownloadRequest conditions: Processed=False, Accepted=True

User cancellation (Ctrl+C):

Error: operation cancelled: context canceled

Changes made

Added --timeout flag with 5-minute default (previously hardcoded 120 seconds):

  • Named constant defaultLogsTimeout = 5 * time.Minute
  • Users can customize per environment needs

Enhanced error reporting:

  • Created shared FormatDownloadRequestTimeoutError() utility
  • Distinguishes timeout (context.DeadlineExceeded) from user cancellation
  • Shows request name and status conditions on timeout
  • Uses fresh context.Background() for final status check (parent context is cancelled)

Fixed context handling:

  • Single context-based timeout mechanism (eliminated duplicate time.After)
  • Uses timeoutCtx for all API calls to preserve full timeout window
  • Proper ticker cleanup with defer ticker.Stop()

Analysis of all commands:

  • Only logs command waits for external operations (NonAdminDownloadRequest)
  • Other commands (describe, get, create, delete, bsl) use immediate API calls and cannot hang

Files modified:

  • cmd/non-admin/backup/logs.go - Added timeout flag and context-based waiting
  • cmd/shared/download.go - Shared error formatting with named constants
  • cmd/non-admin/backup/backup_test.go - Test coverage for timeout flag
Original prompt

This section details on the original issue you should resolve

<issue_title>Implement timeout when log commands hangs</issue_title>
<issue_description>re: openshift/oadp-operator#2046</issue_description>

<agent_instructions>Investigate and implement</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@kaovilai 2025/12/16 01:49:23 Executing CLI command: kubectl oadp backup logs mysql-csi-cli-e2e-230ee60b-da21-11f0-b7c5-0a580a802bca {"component":"entrypoint","file":"sigs.k8s.io/prow/pkg/entrypoint/run.go:169","func":"sigs.k8s.io/prow/pkg/entrypoint.Options.ExecuteProcess","level":"error","msg":"Process did not finish before 2h0m0s timeout","severity":"error","time":"2025-12-16T03:43:51Z"}

https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_oadp-operator/2038/pull-ci-openshift-oadp-operator-oadp-dev-4.20-e2e-test-cli-aws/2000724942827556864#1:build-log.txt%3A942</comment_new>
<comment_new>@kaovilai
The command got killed after about 2h in prow</comment_new>
<comment_new>@kaovilai
User stories: I ran log command, I expect a response not in 2 hours.</comment_new>
<comment_new>@kaovilai
On timeout error it should output what it was waiting on such as downloadrequest</comment_new>


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 7 commits December 16, 2025 16:58
…sages

Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Copilot AI changed the title [WIP] Implement timeout for log commands that hang Add configurable timeout to backup logs command Dec 16, 2025
Copilot AI requested a review from kaovilai December 16, 2025 17:19
Copy link
Contributor

@weshayutin weshayutin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed in scrum, imho

  1. any / all commands need a timeout
  2. if oadp logs needs an additional / different timeout, that is fine by me as well.

@kaovilai
Copy link
Member

@copilot implement timeout for all commands that could timeout not just logs.

Co-authored-by: kaovilai <11228024+kaovilai@users.noreply.github.com>
Copy link
Author

Copilot AI commented Dec 17, 2025

@copilot implement timeout for all commands that could timeout not just logs.

Implemented configurable timeout for all commands that could timeout:

logs command - Already had timeout (commits dc5c4bb through 9f3a3be)
describe command - Added timeout in commit d10b7dc

Both commands now support:

  • --timeout flag with 5-minute default
  • Customizable per environment: --timeout=10m, --timeout=30s, etc.
  • Enhanced error messages showing what was being waited for

Analysis of other commands:

  • get, create, delete, bsl commands use simple Kubernetes API calls that return immediately - no timeout needed

All commands that could potentially hang now have configurable timeouts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement timeout when log commands hangs

3 participants