@Andy-Jost Andy-Jost commented Dec 18, 2025

Summary

Adds a version compatibility check that warns users when cuda-bindings was compiled against a newer CUDA major version than the installed driver supports.

Changes

cuda-bindings

  • Added check_cuda_version_compatibility() function in cuda/bindings/utils/_version_check.py
  • Compares compile-time CUDA_VERSION vs runtime cuDriverGetVersion()
  • Exported from cuda.bindings.utils
  • Added comprehensive unit tests in tests/test_version_check.py

cuda-core

  • Device.__new__ calls check_cuda_version_compatibility() after cuInit succeeds
  • Imports the function directly from cuda.bindings.utils

Rationale

When cuda-bindings is built against CUDA 13 headers but the user's driver only supports CUDA 12, many features will silently fail or behave unexpectedly. This check provides early, clear feedback:

UserWarning: cuda-bindings was built against CUDA 13.0, but the installed driver 
only supports CUDA 12.8. Some features may not work correctly. Consider updating 
your NVIDIA driver. Set CUDA_PYTHON_DISABLE_VERSION_CHECK=1 to suppress this warning.

Design

  • Provided by cuda-bindings: The version check implementation lives in cuda.bindings.utils since it checks cuda-bindings' compile-time version
  • Invoked by cuda-core: Called when Device first triggers CUDA initialization
  • Runs once: Uses a module-level flag to ensure the check runs only once per process
  • Non-blocking: Warning only, does not prevent operation
  • Suppressible: Set CUDA_PYTHON_DISABLE_VERSION_CHECK=1 to disable
  • Major version only: No warning for minor version differences (handled by graceful degradation per PR #1409, "Handle unsupported device attributes gracefully")
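
The comparison described in the design above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the helper names `split_cuda_version` and `needs_major_version_warning` are hypothetical, and the sketch assumes the usual CUDA integer encoding (1000 * major + 10 * minor, so 12080 means 12.8) for both the compile-time and the driver version.

```python
def split_cuda_version(version: int) -> tuple[int, int]:
    """Decode 1000 * major + 10 * minor into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def needs_major_version_warning(compiled: int, driver: int) -> bool:
    """Return True only when the bindings target a newer major version.

    Hypothetical helper: minor-version differences never trigger a warning.
    """
    compiled_major, _ = split_cuda_version(compiled)
    driver_major, _ = split_cuda_version(driver)
    return compiled_major > driver_major
```

With these helpers, a build against 13.0 (13000) on a 12.8 driver (12080) would warn, while 12.9 on 12.8 would not.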

Future Work

We could not find a suitable place to invoke the version check automatically within cuda-bindings itself (e.g., hooking into cuInit), so the check is currently triggered by cuda-core. This may be revisited in the future.

Test Coverage

7 tests in cuda-bindings covering:

  • No warning when driver is newer
  • No warning when same major version
  • Warning when compile major > driver major
  • Warning only issued once per process
  • Suppression via environment variable
  • Silent handling of driver errors
  • Silent handling of missing CUDA_VERSION attribute
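
As a rough illustration of the "warning only issued once" case listed above, the sketch below uses the standard `warnings.catch_warnings(record=True)` pattern. The function `warn_once` and the flag `_checked` are hypothetical stand-ins, not the code under test in this PR.

```python
import warnings

_checked = False  # hypothetical stand-in for the module-level flag


def warn_once(compiled_major: int, driver_major: int) -> None:
    """Stand-in check: emit the mismatch warning at most once per process."""
    global _checked
    if _checked:
        return
    _checked = True
    if compiled_major > driver_major:
        warnings.warn(
            "cuda-bindings was built for a newer CUDA major version",
            UserWarning,
        )


def count_warnings(calls: int) -> int:
    """Invoke the check `calls` times and count the UserWarnings recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        for _ in range(calls):
            warn_once(13, 12)
    return sum(issubclass(w.category, UserWarning) for w in caught)
```

Because the flag persists, a test like this only works if the flag is reset (and ideally restored) around each test case.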

Related Work

@Andy-Jost Andy-Jost added this to the cuda.core beta 11 milestone Dec 18, 2025
@Andy-Jost Andy-Jost added labels P0 (High priority - Must do!), feature (New feature or request), and cuda.core (Everything related to the cuda.core module) Dec 18, 2025
@Andy-Jost Andy-Jost self-assigned this Dec 18, 2025
copy-pr-bot bot commented Dec 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Andy-Jost
Contributor Author

/ok to test 7ce325c

@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 7ce325c to 1962e35 Compare December 18, 2025 19:08
@Andy-Jost
Contributor Author

/ok to test 1962e35

@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch 2 times, most recently from 8a0d248 to 73e5a43 Compare December 19, 2025 00:09
@Andy-Jost
Contributor Author

/ok to test 73e5a43

@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 73e5a43 to ddd92bd Compare December 19, 2025 00:47
@Andy-Jost
Contributor Author

/ok to test ddd92bd

@leofang leofang self-requested a review December 19, 2025 01:22
Collaborator

@rparolin rparolin left a comment

We should issue a warning to the user if we can't fetch the driver version.

Add warn_if_cuda_major_version_mismatch() to cuda-bindings that warns
when cuda-bindings was compiled for a newer CUDA major version than
the installed driver supports. Called by cuda.core on first Device access.
@Andy-Jost
Contributor Author

/ok to test 62dfcca

@Andy-Jost Andy-Jost requested a review from rparolin January 8, 2026 17:48
Import warn_if_cuda_major_version_mismatch locally in Device.__new__
after cuInit, using try/except/else pattern instead of module-level
import with lambda fallback.
@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 07bf4a4 to b2083ed Compare January 8, 2026 18:00
@Andy-Jost
Contributor Author

/ok to test b2083ed

Extract Device.__new__ logic into cdef helper functions:
- Device_ensure_cuda_initialized(): cuInit + version check
- Device_resolve_device_id(): resolve None to current device or 0
- Device_ensure_tls_devices(): create thread-local singletons

Reduces Device.__new__ from ~60 lines to ~12 lines.
Helpers placed after Device class following memory module pattern.
@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from a92ed3b to fdb3a7e Compare January 8, 2026 18:18
@Andy-Jost
Contributor Author

/ok to test fdb3a7e


def setup_method(self):
    """Reset the version compatibility check flag before each test."""
    _version_check._major_version_compatibility_checked = False
Collaborator

This resets the flag even if the version was already checked, so after the unit tests run the check could execute again. Shouldn't we cache the current state before setting it to False for the unit tests, and then restore the previous value on teardown?

Contributor Author

Someone with a dodgy build might get multiple warnings during testing. Not sure it's worth fixing.

Member

We should either mock it or restore the original value during teardown. I agree with Rob.

@Andy-Jost
Contributor Author

/ok to test acadb31

Integrate RAII handles architecture from main while preserving the
version compatibility check feature. Resolved conflict in _device.pyx
by keeping the refactored helper functions and updating field names
to match RAII object structure.
@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from acadb31 to 3a5c210 Compare January 13, 2026 19:42
@Andy-Jost
Contributor Author

/ok to test 3a5c210

@Andy-Jost
Contributor Author

/ok to test 424a113

@Andy-Jost
Contributor Author

/ok to test 6071609

Member

@leofang leofang left a comment

Sorry for my late review.

Comment on lines 24 to 27
    global _major_version_compatibility_checked
    if _major_version_compatibility_checked:
        return
    _major_version_compatibility_checked = True
Member

  1. We need a lock plus the double-check pattern.
  2. Move _major_version_compatibility_checked = True later in the code, because we could raise below.

Contributor Author

@Andy-Jost Andy-Jost Jan 14, 2026

  1. Added a lock
  2. This was intentional. If the code attempting to perform the check fails for some reason, we would never want to retry it.


- ``CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM`` : When set to 1, the default stream is the per-thread default stream. When set to 0, the default stream is the legacy default stream. This defaults to 0, for the legacy default stream. See `Stream Synchronization Behavior <https://docs.nvidia.com/cuda/cuda-runtime-api/stream-sync-behavior.html>`_ for an explanation of the legacy and per-thread default streams.

- ``CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`` : When set to 1, suppresses warnings about CUDA major version mismatches between ``cuda-bindings`` and the installed driver.
Member

TBH I am not sure if this env var is really useful, because the added API is entirely opt-in?

Contributor Author

The version check can be triggered indirectly from the end user's POV. For example, by using cuda-core. Without this, each library that calls the version-check function would need its own opt-out, and users might need to set them all. To me it makes sense to place the opt-out alongside the implementation and there is little/no cost.
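
A single opt-out next to the implementation might look like the sketch below. The variable name comes from the documentation change quoted above; the helper `warning_disabled` and the strict comparison with "1" are assumptions, since the thread does not show how the value is parsed.

```python
import os

ENV_VAR = "CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING"


def warning_disabled(environ=None) -> bool:
    """Return True when the opt-out variable is set to "1".

    Hypothetical helper: `environ` defaults to os.environ and exists only
    so the behaviour can be exercised without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR) == "1"
```

Any library that calls the version check would then honor the same variable, so users have one switch to set.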


return GraphBuilder._init(stream=self.create_stream(), is_stream_owner=True)


cdef inline void Device_ensure_cuda_initialized() except *:
Member

We prefer cdef inline int Device_ensure_cuda_initialized() except? -1: to avoid an unconditional exception check (which Cython warns about at build time).

Comment on lines +1364 to +1369
try:
    from cuda.bindings.utils import warn_if_cuda_major_version_mismatch
except ImportError:
    pass
else:
    warn_if_cuda_major_version_mismatch()
Member

We want to move imports like this to the top to avoid the small overhead in a hot loop. Please benchmark the constructor to see the perf difference before and after this PR. According to @mdboom, such imports have a noticeable perf overhead even though they are cached (on the order that we care about).
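
One way to measure the cost of a cached function-level import is a small `timeit` comparison. The sketch below is a stand-in, not a benchmark of the `Device` constructor: it uses `math` instead of `cuda.bindings.utils`, and the function names are hypothetical. Absolute numbers depend on the machine and Python version.

```python
import timeit


def with_local_import() -> None:
    # The module is already in sys.modules after the first call,
    # but the import statement still executes on every call.
    import math  # noqa: F401


def without_import() -> None:
    pass


def measure(number: int = 100_000) -> tuple[float, float]:
    """Return total seconds for (local import, no import) over `number` calls."""
    return (
        timeit.timeit(with_local_import, number=number),
        timeit.timeit(without_import, number=number),
    )
```

Comparing the two timings gives an estimate of the per-call overhead that a hot-path import would add.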

Contributor Author

@Andy-Jost Andy-Jost Jan 14, 2026

I agree with the general advice. Note that in this case, the block is guarded by _is_cuInit so the import is only attempted once. I decided to bury this import because 1) it would otherwise add clutter to the top-of-file import list, which human readers parse relatively often, 2) an import failure is ignored, so it can't hurt anything, and 3) there is no performance impact, since the import is only attempted once.
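
The point about the guard can be illustrated with a small sketch. The flag `_initialized` stands in for `_is_cuInit`, and `guard_passes` is a hypothetical counter added only to show that the guarded block, including the import, is reached once no matter how often the function is called. If `cuda.bindings.utils` is unavailable, the `ImportError` is ignored, as in the PR.

```python
_initialized = False  # hypothetical stand-in for the _is_cuInit guard
guard_passes = 0      # how many times the guarded block was entered


def ensure_initialized() -> None:
    """Run one-time setup; later calls return immediately."""
    global _initialized, guard_passes
    if _initialized:
        return
    _initialized = True
    guard_passes += 1
    try:
        from cuda.bindings.utils import warn_if_cuda_major_version_mismatch
    except ImportError:
        pass  # older or missing cuda-bindings: silently skip the check
    else:
        warn_if_cuda_major_version_mismatch()


for _ in range(3):
    ensure_initialized()
```

After the loop, `guard_passes` is 1, so the import statement was attempted a single time.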

@Andy-Jost
Contributor Author

/ok to test 96532b2

@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 96532b2 to 973af9b Compare January 14, 2026 17:51
@Andy-Jost
Contributor Author

/ok to test 973af9b

@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 973af9b to 27732f2 Compare January 14, 2026 17:56
Replace setup_method/teardown_method with a pytest fixture that uses
monkeypatch to properly save and restore the original value of
_major_version_compatibility_checked after each test.

Minor change to Cython cdef inline helper function signature.
@Andy-Jost Andy-Jost force-pushed the runtime-version-check branch from 27732f2 to 7f23b08 Compare January 14, 2026 18:03
@Andy-Jost
Contributor Author

/ok to test 7f23b08
