Description
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- Consider asking first in the Gitter chat room or in a Discussion.
What version (or hash if on master) of pybind11 are you using?
3.0.1
Problem description
Overview
I am currently updating my C++ extension to support Python subinterpreters (PEP 734), using pybind11 3.x and Python 3.14+.
I have discovered that pybind11::gil_safe_call_once_and_store is fundamentally unsafe in a multi-interpreter environment.
It relies on static (process-global) storage to cache Python objects.
When the interpreter that initialized the static storage is destroyed, the cached pointers become invalid, leading to segmentation faults when subsequent interpreters attempt to access them.
The core issue is a lifetime mismatch between C++ static storage and Python interpreter contexts.
Technical Diagnosis
- Module Imports: Imported modules (e.g., `collections`) are interpreter-dependent. Caching them statically means retaining a reference to a module object that belongs to one specific interpreter.
- Interned/Immutable Objects: Immutable objects (e.g., `float`, `int`, `str`) can be shared between interpreters, but they are still affected. If the cached result, such as an interned string from `PyUnicode_InternFromString`, is created by a subinterpreter and that subinterpreter is destroyed, the static pointer stored by `gil_safe_call_once_and_store` becomes a dangling pointer.
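The lifetime mismatch described above can be modelled without any C++ at all. The following is a toy sketch in plain Python, not pybind11 code: `FakeInterpreter`, `FakeObject`, and `ProcessGlobalOnce` are hypothetical names invented for illustration, and a `RuntimeError` stands in for what would be a segmentation fault in the real extension.

```python
class FakeInterpreter:
    """Toy stand-in for an interpreter that owns the objects it creates."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def make_object(self, value):
        return FakeObject(self, value)

    def destroy(self):
        self.alive = False


class FakeObject:
    def __init__(self, owner, value):
        self.owner = owner
        self._value = value

    def use(self):
        # A real dangling pointer would crash; this model raises instead.
        if not self.owner.alive:
            raise RuntimeError(
                f"use of object from destroyed interpreter {self.owner.name!r}"
            )
        return self._value


class ProcessGlobalOnce:
    """Models a process-wide call-once cache: the first result is kept forever."""

    _UNSET = object()

    def __init__(self):
        self._stored = self._UNSET

    def get(self, factory):
        if self._stored is self._UNSET:
            self._stored = factory()
        return self._stored


cache = ProcessGlobalOnce()

# A subinterpreter is the first to populate the cache, then goes away.
sub = FakeInterpreter("sub")
cache.get(lambda: sub.make_object("defaultdict")).use()
sub.destroy()

# The main interpreter asks for the "same" object; its factory never runs.
main = FakeInterpreter("main")
stale = cache.get(lambda: main.make_object("defaultdict"))

try:
    stale.use()
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)

print(stale.owner.name)  # the cached object still belongs to the dead interpreter
print(outcome)
```

The key point of the model is that the second `factory` is never called, so the main interpreter receives an object whose owner no longer exists.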
Reproduction
The issue triggers a segmentation fault when a subinterpreter initializes the static cache and is then destroyed before another interpreter accesses it.
- CI Failure/Core Dump: https://github.com/metaopt/optree/actions/runs/20019607592/job/57403715607?pr=245#step:18:266
Problematic C++ Pattern
- Module imports are interpreter-dependent. The previous best-practice code is invalid once subinterpreters are involved: the object has to be re-fetched on every call instead of being kept in a per-process static cache.

  ```cpp
  #if defined(MYPACAKGE_HAS_SUBINTERPRETER_SUPPORT)

  // Subinterpreter-safe: look the object up in the current interpreter every time.
  inline py::object get_defaultdict() {
      return py::getattr(py::module_::import("collections"), "defaultdict");
  }

  #else

  // Single-interpreter only: cache the object in a process-global static.
  inline const py::object &get_defaultdict() {
      PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage;
      return storage
          .call_once_and_store_result([]() -> py::object {
              return py::getattr(py::module_::import("collections"), "defaultdict");
          })
          .get_stored();
  }

  #endif
  ```
- Immutable objects such as `float`, `int`, and `str` can be shared between interpreters. However, `PYBIND11_CONSTINIT static pybind11::gil_safe_call_once_and_store` causes a segmentation fault when the C++ static is initialized by a subinterpreter rather than the main interpreter: the result stored by `call_once` is created by that subinterpreter, which may be gone by the time another interpreter accesses it.

  The test case that triggers the issue:

  ```python
  import textwrap

  # check_script_in_subprocess is a test helper that is not shown in this report.


  def test_import_in_subinterpreter_before_main():
      """
      Triggers segfault by initializing the C++ static cache in a subinterpreter,
      destroying that interpreter, and then accessing the cache in the main interpreter.
      """
      script = textwrap.dedent("""
          import contextlib
          import gc
          from concurrent import interpreters

          # 1. Initialize library in a subinterpreter (sets the static C++ pointer)
          subinterpreter = None
          with contextlib.closing(interpreters.create()) as subinterpreter:
              subinterpreter.exec('import optree')

          # 2. Subinterpreter dies here. The cached object in C++ is now invalid.

          # 3. Import in main interpreter tries to read the invalid static pointer -> Segfault
          import optree

          del optree, subinterpreter
          for _ in range(10):
              gc.collect()
      """)
      check_script_in_subprocess(script, rerun=5)


  def test_import_in_subinterpreters_concurrently():
      script = textwrap.dedent("""
          from concurrent.futures import InterpreterPoolExecutor, as_completed

          def check_import():
              import optree

          with InterpreterPoolExecutor(max_workers=32) as executor:
              futures = [executor.submit(check_import) for _ in range(128)]
              for future in as_completed(futures):
                  future.result()
      """)
      check_script_in_subprocess(script, rerun=5)
  ```
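The tests above call `check_script_in_subprocess`, whose definition is not part of this report. For readers who want to run something similar, here is a minimal, hypothetical stand-in. It assumes the helper simply runs the script with the current Python executable `rerun` times and fails on any non-zero exit status; the `timeout` parameter and the error message format are my own additions, and the project's real helper may differ.

```python
import subprocess
import sys


def check_script_in_subprocess(script, *, rerun=1, timeout=120):
    """Hypothetical stand-in: run `script` in a fresh Python process `rerun` times.

    Running in a child process keeps a crash (e.g. a segfault) from taking
    down the test runner itself.
    """
    for attempt in range(1, rerun + 1):
        result = subprocess.run(
            [sys.executable, "-c", script],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # On POSIX a negative return code means the child was killed by a
        # signal, so a segmentation fault shows up as -SIGSEGV here.
        if result.returncode != 0:
            raise AssertionError(
                f"attempt {attempt}/{rerun} exited with code {result.returncode}\n"
                f"stderr:\n{result.stderr}"
            )
```

Repeating the run several times matters for this bug because a read through a dangling pointer does not necessarily crash on every execution.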
I have resolved this in my project by disabling `pybind11::gil_safe_call_once_and_store` entirely when subinterpreter support is detected. Instead, I re-create objects every time they are needed to ensure they belong to the current interpreter context.

```cpp
#if defined(MYPACAKGE_HAS_SUBINTERPRETER_SUPPORT)

#    define Py_Declare_ID(name) \
        namespace { \
        [[nodiscard]] inline PyObject *Py_ID_##name() { \
            PyObject * const ptr = PyUnicode_InternFromString(#name); \
            if (ptr == nullptr) [[unlikely]] { \
                throw py::error_already_set(); \
            } \
            return ptr; \
        } \
        } // namespace

#else

#    define Py_Declare_ID(name) \
        namespace { \
        [[nodiscard]] inline PyObject *Py_ID_##name() { \
            PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<PyObject *> storage; \
            return storage \
                .call_once_and_store_result([]() -> PyObject * { \
                    PyObject * const ptr = PyUnicode_InternFromString(#name); \
                    if (ptr == nullptr) [[unlikely]] { \
                        throw py::error_already_set(); \
                    } \
                    Py_INCREF(ptr); /* leak a reference on purpose */ \
                    return ptr; \
                }) \
                .get_stored(); \
        } \
        } // namespace

#endif

#define Py_Get_ID(name) (::Py_ID_##name())
```
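To make the trade-off concrete, the sketch below models two strategies in plain Python. `get_id_uncached` mirrors the workaround above (ask the current interpreter every time). `PerInterpreterCache` is a hypothetical alternative that is not part of this report or of pybind11: it keeps one cache slot per interpreter instead of one per process. All names here are invented for illustration, and the model assumes interpreter IDs are never reused.

```python
import itertools

_next_id = itertools.count()


class FakeInterpreter:
    """Toy interpreter; objects it creates are tagged with its name."""

    def __init__(self, name):
        self.name = name
        self.id = next(_next_id)  # assumed unique for the life of the process

    def intern(self, text):
        return (self.name, text)


def get_id_uncached(current, text):
    """Models the workaround: re-create the object in the current interpreter."""
    return current.intern(text)


class PerInterpreterCache:
    """Hypothetical alternative: one cached value per interpreter ID."""

    def __init__(self):
        self._slots = {}

    def get(self, current, factory):
        if current.id not in self._slots:
            self._slots[current.id] = factory(current)
        return self._slots[current.id]

    def drop(self, interp):
        # Must run when an interpreter is destroyed, or the slot goes stale.
        self._slots.pop(interp.id, None)


def make_key(interp):
    return interp.intern("__name__")


sub = FakeInterpreter("sub")
main = FakeInterpreter("main")
cache = PerInterpreterCache()

from_sub = cache.get(sub, make_key)
cache.drop(sub)  # subinterpreter is destroyed
from_main = cache.get(main, make_key)

print(from_sub, from_main, get_id_uncached(main, "__name__"))
```

Re-creating the object on every call is the simpler option and needs no teardown hook. A per-interpreter cache avoids the repeated lookups but only stays safe if each slot is released when its interpreter is destroyed.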
Reproducible example code
Is this a regression? Put the last known working version here if it is.
Not a regression