Improve GPU-aware section in the docs #927
Conversation
docs/src/usage.md
> comm_loc = MPI.Comm_split_type(comm, MPI.COMM_TYPE_SHARED, rank)
> rank_loc = MPI.Comm_rank(comm_loc)
> ```
> If using (2), one can use the default device but make sure to handle device visibility in the scheduler; for SLURM on Cray systems, this can be mostly achieved using `--gpus-per-task=1`.
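The node-local rank pattern quoted above can be sketched end-to-end. This is a hedged sketch, not text from the PR: it assumes MPI.jl plus CUDA.jl, that `comm` is `MPI.COMM_WORLD`, and that every rank on a node can see all of that node's GPUs (i.e. scheduler option (2) without per-task binding):

```julia
# Sketch: bind each MPI rank to one GPU using its node-local rank
# (assumes MPI.jl + CUDA.jl and all GPUs visible to every rank).
using MPI
using CUDA

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# Split the world communicator into per-node (shared-memory) communicators.
comm_loc = MPI.Comm_split_type(comm, MPI.COMM_TYPE_SHARED, rank)
rank_loc = MPI.Comm_rank(comm_loc)

# Round-robin the node-local ranks over the node's visible devices.
CUDA.device!(rank_loc % length(CUDA.devices()))

MPI.Finalize()
```

Run under an MPI launcher (e.g. `mpiexec -n 8 julia script.jl`); the modulo keeps the sketch safe when there are more ranks than GPUs per node.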
Doesn't '--gpus-per-task' for SLURM prevent the use of GPU Peer2Peer IPC mechanisms (https://cpe.ext.hpe.com/docs/24.03/mpt/mpich/intro_mpi.html) which would have a negative impact on performance?
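To make the tradeoff being discussed concrete, here is a hypothetical SLURM job header contrasting the two configurations; the flag names are SLURM's, but the node/GPU counts are made up for illustration:

```shell
# Option (a): bind one GPU per task in the scheduler.
# Per the discussion above, this binding can prevent GPU peer-to-peer/IPC
# between ranks on the same node with some MPI stacks.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1

# Option (b): expose all of a node's GPUs to every task and select the
# device inside the application from the node-local rank instead.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
```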
Yeah that's what I also remember, but perhaps Nvidia has finally fixed this?
Nope, not as far as I can tell.
Thanks for pointing this out. I can make the text more generic.
docs/src/usage.md
Successfully running the [alltoall\_test\_cuda.jl](https://gist.github.com/luraess/0063e90cb08eb2208b7fe204bbd90ed2)
should confirm that your MPI implementation has CUDA support enabled. Moreover, successfully running the
[alltoall\_test\_cuda\_multigpu.jl](https://gist.github.com/luraess/ed93cc09ba04fe16f63b4219c1811566) should confirm
Can we move the files into this repository?
Yes, we can
Where shall one put them?
docs/src/usage.md
!!! note "Preloads"
    On Cray machines, you may need to ensure the following preloads are set in the preferences:
    ```
    preloads = ["libmpi_gtl_hsa.so"]
    preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED"
    ```
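Assuming these preferences are managed through MPIPreferences.jl (the keyword names below mirror the quoted preference keys, but treat the whole call as a sketch for a Cray/HPE system MPICH rather than a verified recipe), they could be set from Julia like so:

```julia
# Sketch: point MPI.jl at a system MPICH and request the Cray GTL preloads
# (library name and env switch taken from the quoted snippet; verify for
# your system -- CUDA systems typically use libmpi_gtl_cuda.so instead).
using MPIPreferences

MPIPreferences.use_system_binary(;
    preloads = ["libmpi_gtl_hsa.so"],
    preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED",
)
```

With `preloads_env_switch` set, the libraries are only preloaded when `MPICH_GPU_SUPPORT_ENABLED` is set in the environment.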
This is also true for CUDA.
docs/src/usage.md
    preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED"
    ```

!!! note "Multiple GPUs per node"
Suggested change:
- !!! note "Multiple GPUs per node"
+ ### "Multiple GPUs per node"
Since the text is not just about ROCm?
Adds info to the docs as per the discussion in #924.