Docker container does not contain libcuda.so #195

@hscarter

Hello!

I built a Docker image from your Dockerfile, then converted it to a Singularity image to run on an HPC system.

When attempting to run

python -m evo2.test.test_evo2_generation --model_name evo2_7b

from within the container, I get the error:

AssertionError: libcuda.so cannot found!

For anyone using the container to avoid installing the dependencies, this library should be added as well. I've pasted the full error traceback below.

Thanks!

Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 112, in _get_module_details
  File "/usr/local/lib/python3.12/dist-packages/evo2/__init__.py", line 1, in <module>
    from .models import Evo2
  File "/usr/local/lib/python3.12/dist-packages/evo2/models.py", line 12, in <module>
    from vortex.model.model import StripedHyena
  File "/usr/local/lib/python3.12/dist-packages/vortex/model/model.py", line 15, in <module>
    from vortex.model.layers import (
  File "/usr/local/lib/python3.12/dist-packages/vortex/model/layers.py", line 10, in <module>
    from transformer_engine.pytorch import Linear
  File "/usr/local/lib/python3.12/dist-packages/transformer_engine/__init__.py", line 13, in <module>
    from . import pytorch
  File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/__init__.py", line 95, in <module>
    from transformer_engine.pytorch.permutation import (
  File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/permutation.py", line 11, in <module>
    import transformer_engine.pytorch.triton.permutation as triton_permutation
  File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/triton/permutation.py", line 158, in <module>
    _permute_kernel = triton.autotune(
                      ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/autotuner.py", line 368, in decorator
    return Autotuner(fn, fn.arg_names, configs, key, reset_to_zero, restore_value, pre_hook=pre_hook,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/autotuner.py", line 130, in __init__
    self.do_bench = driver.active.get_benchmarker()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/driver.py", line 23, in __getattr__
    self._initialize_obj()
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/driver.py", line 20, in _initialize_obj
    self._obj = self._init_fn()
                ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/driver.py", line 9, in _create_driver
    return actives[0]()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/driver.py", line 450, in __init__
    self.utils = CudaUtils() # TODO: make static
                 ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/driver.py", line 80, in __init__
    mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/driver.py", line 57, in compile_module_from_src
    so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
                                        ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/driver.py", line 45, in library_dirs
    return [libdevice_dir, *libcuda_dirs()]
                            ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/backends/nvidia/driver.py", line 39, in libcuda_dirs
    assert any(os.path.exists(os.path.join(path, 'libcuda.so.1')) for path in dirs), msg
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: libcuda.so cannot found!
Possible files are located at ['/usr/local/cuda/compat/lib/libcuda.so.1'].Please create a symlink of libcuda.so to any of the files.
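
The last frame of the traceback shows that the assertion tests `os.path.exists(os.path.join(path, 'libcuda.so.1'))` for each candidate directory. In case it helps anyone debugging the same thing, here is a minimal diagnostic sketch (not Triton's code) that can be run inside the container to see why that test fails for a given directory. The helper name `libcuda_status` and the choice of directories are my own assumptions; the only path taken from the error message is `/usr/local/cuda/compat/lib`. It relies on the fact that `os.path.exists` returns False for a dangling symlink while `os.path.lexists` returns True, so it can tell "file missing" apart from "symlink pointing nowhere".

```python
import os


def libcuda_status(directory, name="libcuda.so.1"):
    """Classify `name` inside `directory` (hypothetical helper, not part of Triton).

    Returns "ok" if the file resolves, "dangling symlink" if a link exists
    but its target does not, and "missing" if nothing is there at all.
    """
    path = os.path.join(directory, name)
    if os.path.exists(path):       # follows symlinks; same test as the assertion
        return "ok"
    if os.path.lexists(path):      # the link itself exists, its target does not
        return "dangling symlink"
    return "missing"


# Directories to inspect: the one named in the error message, plus anything
# on LD_LIBRARY_PATH (an assumption about where the driver library might be).
candidates = ["/usr/local/cuda/compat/lib"]
candidates += [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]

for d in candidates:
    print(f"{d}: {libcuda_status(d)}")
```

If every directory reports "missing" or "dangling symlink", the assertion above will fail regardless of what the error message lists as a possible location.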
