
Conversation

@luraess (Contributor) commented Dec 17, 2025

Adds info to the docs as per the discussion in #924.

@luraess changed the title from "Improve GPU-aware section" to "Improve GPU-aware section in the docs" on Dec 17, 2025
> ```julia
> comm_loc = MPI.Comm_split_type(comm, MPI.COMM_TYPE_SHARED, rank)
> rank_loc = MPI.Comm_rank(comm_loc)
> ```
> If using (2), one can use the default device but make sure to handle device visibility in the scheduler; for SLURM on Cray systems, this can be mostly achieved using `--gpus-per-task=1`.
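For context, a minimal sketch of the other option, explicitly selecting the device from the node-local rank (assuming CUDA.jl here; AMDGPU.jl would be used analogously, and the modulo wrap is purely illustrative):

```julia
using MPI, CUDA

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# Split the world communicator into node-local groups; the node-local rank
# then selects a distinct GPU for each rank sharing a node.
comm_loc = MPI.Comm_split_type(comm, MPI.COMM_TYPE_SHARED, rank)
rank_loc = MPI.Comm_rank(comm_loc)
CUDA.device!(rank_loc % length(CUDA.devices()))
```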


Doesn't `--gpus-per-task` for SLURM prevent the use of GPU peer-to-peer IPC mechanisms (https://cpe.ext.hpe.com/docs/24.03/mpt/mpich/intro_mpi.html), which would have a negative impact on performance?

Member:

Yeah that's what I also remember, but perhaps Nvidia has finally fixed this?


Nope, not as far as I can tell.

Contributor Author:

Thanks for pointing this out. I can make the text more generic.

Comment on lines 85 to 87
Successfully running the [alltoall\_test\_cuda.jl](https://gist.github.com/luraess/0063e90cb08eb2208b7fe204bbd90ed2)
should confirm that your MPI implementation has CUDA support enabled. Moreover, successfully running the
[alltoall\_test\_cuda\_multigpu.jl](https://gist.github.com/luraess/ed93cc09ba04fe16f63b4219c1811566) should confirm
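As a quick, lighter-weight sanity check (a sketch, not a replacement for the gists above), one can also query whether the MPI build reports CUDA support, assuming MPI.jl is configured against the system MPI:

```julia
using MPI

MPI.Init()
# `true` here means the underlying MPI library advertises CUDA awareness,
# which is a prerequisite for the alltoall device-buffer tests to pass.
@show MPI.has_cuda()
MPI.Finalize()
```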
Member:

Can we move the files into this repository?

Contributor Author:

Yes, we can.

Contributor Author:

Where shall one put them?

Comment on lines 103 to 108
!!! note "Preloads"
On Cray machines, you may need to ensure the following preloads to be set in the preferences:
```
preloads = ["libmpi_gtl_hsa.so"]
preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED"
```
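For illustration, these preferences could be set programmatically roughly as follows (a sketch assuming an MPIPreferences.jl version that exposes the `preloads` and `preloads_env_switch` keywords; editing `LocalPreferences.toml` by hand achieves the same):

```julia
using MPIPreferences

# Use the system (Cray MPICH) binary and request the GTL preload, gated on
# the MPICH_GPU_SUPPORT_ENABLED environment variable.
MPIPreferences.use_system_binary(;
    preloads = ["libmpi_gtl_hsa.so"],
    preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED",
)
```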
Member:

This is also true for CUDA.

> preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED"
> ```
>
> !!! note "Multiple GPUs per node"
Member:

Suggested change:
- !!! note "Multiple GPUs per node"
+ ### "Multiple GPUs per node"

Since the text is not only about ROCm?
