Description
Marvin has done a bunch of work implementing a map_coordinates function for dask-image: #237
Currently it supports numpy-backed dask arrays but not cupy (GPU) arrays. I believe it would be possible to extend this work to support cupy input arrays as well.
Suggested approach:
One way to approach this could be to:
- Take the _map_single_coordinates_array_chunk function, and make a second version of it using cupy/cupyx functions, replacing all the lines that call numpy directly.
- Then we could use the same dispatch mechanism as we do with the other modules (see the dask_image/dispatch folder) to register our new numpy and cupy versions.
- Add a test for the new functionality
- Update the coverage.rst table, adding a tick mark to indicate this function now has GPU support
All of the numpy-specific functions I can see in the _map_single_coordinates_array_chunk function seem to have cupy/cupyx equivalents. If we need to pin to a certain version of cupy, or check the version with packaging.version.parse before the code runs, that's OK.
- np.array -> https://docs.cupy.dev/en/latest/reference/generated/cupy.asarray.html
- np.argsort -> https://docs.cupy.dev/en/stable/reference/generated/cupy.argsort.html
- np.bincount -> https://docs.cupy.dev/en/stable/reference/generated/cupy.bincount.html
- np.clip -> https://docs.cupy.dev/en/latest/reference/generated/cupy.clip.html
- np.prod -> https://docs.cupy.dev/en/latest/reference/generated/cupy.prod.html
- np.ndindex -> https://docs.cupy.dev/en/latest/reference/generated/cupy.ndindex.html
- np.floor -> https://docs.cupy.dev/en/stable/reference/generated/cupy.floor.html
- np.ceil -> https://docs.cupy.dev/en/stable/reference/generated/cupy.ceil.html
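- np.min -> https://docs.cupy.dev/en/latest/reference/generated/cupy.minimum.html (note: this page documents the element-wise cupy.minimum; the reduction equivalent of np.min is cupy.min)
- np.max -> https://docs.cupy.dev/en/latest/reference/generated/cupy.maximum.html (note: this page documents the element-wise cupy.maximum; the reduction equivalent of np.max is cupy.max)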
- np.int64 -> I'm reasonably sure you can just use cupy.int64 as the equivalent, but I couldn't find a docs page
- np.where -> https://docs.cupy.dev/en/stable/reference/generated/cupy.where.html
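Because the names line up so closely, one option is to write helpers against a module parameter rather than duplicating them. The sketch below is only an illustration of that idea: summarize_coordinates and its xp argument are hypothetical and not taken from #237, and it has only been reasoned through with numpy. It uses the reductions xp.min/xp.max (via the array methods), not the element-wise minimum/maximum.

```python
import numpy as np


def summarize_coordinates(coords, chunk_size, xp=np):
    """Group 1-D non-negative coordinates by the chunk they fall into.

    `xp` is the array module to use (numpy by default, cupy for GPU arrays).
    """
    # Index of the chunk containing each coordinate.
    chunk_ids = xp.floor(coords / chunk_size).astype(xp.int64)
    # Ordering that groups coordinates belonging to the same chunk.
    order = xp.argsort(chunk_ids)
    # Number of coordinates that land in each chunk.
    counts = xp.bincount(chunk_ids)
    # Integer bounds enclosing all coordinates (reductions over the array).
    lower = int(xp.floor(coords.min()))
    upper = int(xp.ceil(coords.max()))
    return chunk_ids[order], counts, (lower, upper)


coords = np.array([0.5, 7.2, 3.9, 4.0, 1.0])
sorted_ids, counts, bounds = summarize_coordinates(coords, chunk_size=4)
```

With cupy installed, the same helper would presumably be called as summarize_coordinates(cupy.asarray(coords), 4, xp=cupy).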
Profiling results
A possible follow-on complication is how well this performs with larger-than-memory arrays. Since GPUs generally have more limited memory, it is more likely that people will have larger-than-memory arrays in these circumstances.
That's not a blocker for this work; I think it would be valuable to have regardless. But it would be interesting to look into.
In the CPU implementation, Marvin did some profiling of different use cases here: #237 (comment)
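As a back-of-the-envelope illustration of the larger-than-memory concern, the sketch below estimates whether an array fits on a device. The helper names, the 16 GiB device size, and the 0.5 safety factor are all assumptions chosen for illustration, not measurements.

```python
import math


def array_nbytes(shape, itemsize):
    """Size in bytes of a dense array with the given shape and element size."""
    return math.prod(shape) * itemsize


def fits_in_memory(shape, itemsize, free_bytes, safety_factor=0.5):
    """True if the array fits in `free_bytes`, leaving headroom for temporaries."""
    return array_nbytes(shape, itemsize) <= free_bytes * safety_factor


GIB = 2**30
gpu_free = 16 * GIB  # hypothetical device; on a real GPU this would be queried

large = array_nbytes((2048, 2048, 2048), 4)  # float32 volume: 32 GiB
large_fits = fits_in_memory((2048, 2048, 2048), 4, gpu_free)
small_fits = fits_in_memory((1024, 1024, 1024), 4, gpu_free)  # 4 GiB volume
```

Under these assumptions a 2048-cubed float32 volume is already twice the size of the device memory, so it would have to be processed chunk by chunk.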