
@AnshSinghSonkhia

Introduces reduce_bench.py, a Python script to benchmark parallel reductions on NVIDIA GPUs using the cuda.cccl library, and updates the README with usage instructions and example output. This allows users to compare naive CuPy reductions with optimized CUDA JIT reductions from Python.


This solves #9

@AnshSinghSonkhia
Author

Hi @ashvardanian, could you please review this PR?

@ashvardanian
Owner

Hey @AnshSinghSonkhia! Thanks for the PR! It's a good start! I'll find some time to explore the CCCL Python functionality in more depth and integrate the PR over the next few weeks.

