Conversation

@cdreetz (Collaborator) commented Dec 13, 2025

Description

Add [tool.hatch.build] to the math-python pyproject.toml. Without it, running vf-eval math-python results in an error:


christian@math_python$ uv run vf-eval math-python -n 1 -r 1
warning: The `extra-build-dependencies` option is experimental and may change without warning. Pass `--preview-features extra-build-dependencies` to disable this warning.
      Built math-python @ file:///Users/christian/dev/my-prime/my-verifiers/docker-sandbox/verifiers/environments/math_python
Uninstalled 1 package in 2ms
Installed 1 package in 1ms
2025-12-13 14:53:56 - verifiers.utils.eval_utils - WARNING - No local endpoint registry found at ./configs/endpoints.py. Please specify the model name (-m), API host base URL (-b), and API key variable name (-k). Error details: endpoints.py not found at configs/endpoints.py
2025-12-13 14:53:56 - verifiers.utils.env_utils - INFO - Loading environment: math-python
2025-12-13 14:53:56 - verifiers.utils.env_utils - ERROR - Failed to import environment module math_python for env_id math-python: No module named 'math_python'
Traceback (most recent call last):
  File "/Users/christian/dev/verifiers/verifiers/utils/env_utils.py", line 15, in load_environment
    module = importlib.import_module(module_name)
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py", line 88, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'math_python'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/christian/.local/bin/vf-eval", line 10, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Users/christian/dev/verifiers/verifiers/scripts/eval.py", line 239, in main
    asyncio.run(run_evaluation(eval_config))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 719, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/christian/dev/verifiers/verifiers/utils/eval_utils.py", line 110, in run_evaluation
    vf_env = vf.load_environment(env_id=config.env_id, **config.env_args)
  File "/Users/christian/dev/verifiers/verifiers/utils/env_utils.py", line 81, in load_environment
    raise ValueError(
        f"Could not import '{env_id}' environment. Ensure the package for the '{env_id}' environment is installed."
    ) from e
ValueError: Could not import 'math-python' environment. Ensure the package for the 'math-python' environment is installed.
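
The fix, as finalized later in this thread, is to add the following section to environments/math_python/pyproject.toml:

[tool.hatch.build]
include = ["math_python.py", "pyproject.toml"]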

But after adding it:


christian@math_python$ uv run vf-eval math-python -n 1 -r 1
warning: The `extra-build-dependencies` option is experimental and may change without warning. Pass `--preview-features extra-build-dependencies` to disable this warning.
/Users/christian/dev/my-prime/my-verifiers/docker-sandbox/verifiers/environments/math_python/.venv/lib/python3.14/site-packages/multiprocess/connection.py:335: SyntaxWarning: 'return' in a 'finally' block
  return f
/Users/christian/dev/my-prime/my-verifiers/docker-sandbox/verifiers/environments/math_python/.venv/lib/python3.14/site-packages/multiprocess/connection.py:337: SyntaxWarning: 'return' in a 'finally' block
  return self._get_more_data(ov, maxsize)
2025-12-13 14:54:36 - verifiers.utils.eval_utils - WARNING - No local endpoint registry found at ./configs/endpoints.py. Please specify the model name (-m), API host base URL (-b), and API key variable name (-k). Error details: endpoints.py not found at configs/endpoints.py
2025-12-13 14:54:36 - verifiers.utils.env_utils - INFO - Loading environment: math-python
2025-12-13 14:54:36 - verifiers.utils.env_utils - INFO - Using default args: sandbox_memory_gb=2, dataset_split='train', sandbox_cpu_cores=1, num_train_examples=-1, dataset_name='math', max_startup_wait_seconds=30, pip_install_packages='numpy sympy scipy', sandbox_timeout_per_command_seconds=30, sandbox_timeout_minutes=60, max_turns=100, sandbox_gpu_count=0, sandbox_disk_size_gb=5
2025-12-13 14:54:39 - verifiers.rubrics.RubricGroup - INFO - Initialized RubricGroup with 2 rubrics
2025-12-13 14:54:39 - verifiers.utils.env_utils - INFO - Successfully loaded environment 'math-python'
2025-12-13 14:54:39 - verifiers.utils.eval_utils - INFO - Starting evaluation with model: gpt-4.1-mini
2025-12-13 14:54:39 - verifiers.utils.eval_utils - INFO - Configuration: num_examples=1, rollouts_per_example=1, max_concurrent=32
2025-12-13 14:54:39 - verifiers.envs.PythonEnv - INFO - eval_dataset is not set, falling back to train dataset
Processing 1 groups (1 total rollouts):   0%|                                                                     | 0/1 [00:00<?, ?it/s]

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Ensures the math-python package builds and installs correctly and defines default eval settings.

  • Configure Hatch build: Adds [tool.hatch.build] with include = ["math_python.py", "pyproject.toml"] so only required files are packaged
  • Eval defaults: Adds [tool.verifiers.eval] with num_examples = 20 and rollouts_per_example = 1 (sketched below)
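
A sketch of the eval defaults section as described (the build include section is shown earlier in the description):

[tool.verifiers.eval]
num_examples = 20
rollouts_per_example = 1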

Written by Cursor Bugbot for commit db5e0d7.

@mikasenghaas (Member)

hm, it's installable for me with uv pip install -e environments/math_python

@willccbb (Member)

@mikasenghaas we should always have the hatch setup to ensure env hub installs + vf-install work properly

@willccbb (Member) commented Jan 9, 2026

@cursoragent please add pyproject.toml to the hatch build so that the eval config is available from the module

@cursor bot commented Jan 9, 2026

  • I updated environments/math_python/pyproject.toml to include pyproject.toml in the Hatch build, so the [tool.verifiers.eval] config is packaged and available at runtime.
[tool.hatch.build]
include = ["math_python.py", "pyproject.toml"]

This aligns with other environments (e.g., gsm8k) and ensures vf-eval can read the eval config from the installed module.



Co-authored-by: williambrown97 <williambrown97@gmail.com>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ cdreetz
❌ cursoragent
