Skip to content

Conversation

@Panaetius
Copy link
Member

No description provided.

@Panaetius
Copy link
Member Author

Panaetius commented Nov 19, 2025

This PR is part of a stack of 8 bookmarks:

  1. trunk()
  2. check-docker-arch
  3. stop-jobs-cli ← this PR
  4. add-cli-commands
  5. tui-login
  6. tui-switch-system
  7. tui-job-log
  8. add-ci
  9. new-readme

Created with jj-stack

Copy link
Member

@olevski olevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments but otherwise things look good.

image,
command,
} => cli_cscs_job_start(script_file, image, command).await?,
cli::CscsJobCommands::Stop { job_id } => cli_cscs_job_stop(job_id).await?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment(non-blocking): If this results in the irrecoverable loss / removal of data probably stop is not a good name. Stop makes me think that whatever is running will stop but data is not lost.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this cancels the job. i.e. makes it go from Running to Non-Running. Data is not lost.

Actually the Firecrest API does not have a method to delete a job, as far as I can tell, which might be an issue where things get pretty cluttered over time. It is pretty weird that the HTTP verb DELETE on the job endpoint is for cancellation, I want to raise an issue with the firecrest people on this. Calling this on a job that is Finished or Failed gives an exception that the underlying Slurm job does not exist, it only works for running jobs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and the openapi spec summary says Delete Job Cancel for what it does, which is just gibberish 🙈

Ok(Some((job, job_metadata).into()))
}

pub async fn delete_stop(&self, system_name: &str, job_id: i64) -> Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: confusing name... maybe my other comment is more relevant... it may be important to distinguish stoping vs deletion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a typo and I was going for delete_job and renamed the wrong part to stop after I realized that the DELETE endpoint actually cancels... I'll rename it to cancel everywhere, I think that's the most clear. I originally only went for stop because it's less characters to type on the CLI 😆

Base automatically changed from check-docker-arch to main November 25, 2025 07:37
@Panaetius Panaetius requested a review from olevski November 25, 2025 08:15
@Panaetius Panaetius mentioned this pull request Nov 25, 2025
@Panaetius Panaetius merged commit a17dfc6 into main Nov 27, 2025
@Panaetius Panaetius deleted the stop-jobs-cli branch November 27, 2025 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants