Canned response for jobs filling up /tmp #86

@Premas

What would you like to see added?

Users sometimes build or run software directly on Cheaha compute nodes that writes temporary files to /tmp. /tmp is node-local storage and limited in size, and its files are automatically deleted when the job ends or the node reboots. When /tmp fills up, jobs can fail with errors like: No space left on device

Proposal

Create a response template for users whose jobs fill the /tmp directory on compute nodes.

Draft response

We noticed that some of your jobs are filling up space in the /tmp directory on Cheaha compute nodes, creating temporary files with names similar to x. From the jobs you submitted today, it looks like you are running an array job (jobid: xxx). Based on the job name xyz, it appears you are running a xy workflow.

Quick Note: /tmp is local storage on each compute node and is limited in size. If it fills up, jobs can fail with “No space left on device” errors because many programs use it for temporary files during processing. Redirecting temp files to larger storage avoids these issues.

Could you share your job script? We can guide you on redirecting temporary files to your home directory or scratch space to prevent /tmp from filling up. For more details, please see our documentation on temporary file issues (https://docs.rc.uab.edu/data_management/cheaha_storage_gpfs/temporary_files/). You can also join our Zoom office hours (https://docs.rc.uab.edu/#how-to-contact-us), or I can schedule a meeting to go over it with you.
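
To illustrate the kind of redirection the draft response refers to, here is a minimal sketch of a job script that points temporary files at a per-job directory instead of /tmp. It is an illustration under assumptions, not the documented Cheaha procedure: the job name and time limit are hypothetical, `USER_SCRATCH` is assumed to be a site-provided variable pointing at the user's scratch space (the sketch falls back to the current directory when it is unset), and it relies on the program in question honoring `TMPDIR`.

```shell
#!/bin/bash
#SBATCH --job-name=tmpdir-example   # hypothetical job name
#SBATCH --time=00:10:00             # hypothetical time limit

# Base location for temporary files. USER_SCRATCH is an assumed site variable;
# fall back to the current directory if it is not set.
scratch_base="${USER_SCRATCH:-$PWD}"

# One directory per job so concurrent jobs and array tasks do not collide.
# Slurm sets SLURM_JOB_ID inside a job; "manual" is used when run by hand.
job_tmp="$scratch_base/tmp/job_${SLURM_JOB_ID:-manual}"
mkdir -p "$job_tmp"

# Many programs consult TMPDIR when choosing where to write temporary files.
export TMPDIR="$job_tmp"

# Remove the per-job directory when the script exits, even after a failure.
trap 'rm -rf "$job_tmp"' EXIT

# Placeholder for the real workload. The explicit template keeps this line
# portable across mktemp implementations.
scratch_file="$(mktemp "$TMPDIR/example.XXXXXX")"
echo "temporary file created at: $scratch_file"
```

Some tools ignore `TMPDIR` and need their own option or configuration setting, which is one reason seeing the user's actual job script is still useful.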

Where to Include / Update

Can we add a new page, "HPC Best Practices"?

Metadata

Labels

canned response: Standardized template messages for staff to respond to common user issues.
documentation: Improvements or additions to documentation