Contributor

@Nadejde Nadejde commented Jan 15, 2026

This PR aims to improve how a validator executes agent code, mainly to increase speed and flexibility. Speed is most useful for local development and debugging, while flexibility should help on the live validator side. The PR includes 3 parts:

  • The proxy Docker container now runs 128 workers. This supports parallel agent runs and, eventually, agents that make multiple LLM calls concurrently. The idea is that the proxy service should not be a bottleneck for agent execution.

  • The Docker image for the agent sandbox is now built locally before execution. This should help with:
    ** Reducing the workload of, and dependency on, prebuilt images for each project across different platforms and operating systems.
    ** Making the Docker images more robust, so the validator should be able to work on any system.
    ** Quickly adding new projects to the validation pipeline: add the project name to the projects list and restart the validator (no image builds required).
    ** Quickly adding more libraries for agents to use without having to centrally rebuild all the Docker images.
    ** Keeping overhead on the validator side very low: the Docker build is only done the first time, because Docker then caches the images.

  • The validator now runs multiple sandbox containers, one for each project. The agent solves projects in parallel instead of in sequence. This saves a lot of time with only a very small increase in cost (the sandbox containers use very few resources). I have found this very useful for development and testing, and it might also help speed up evaluations on the live system.
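
To make the build-then-run-in-parallel flow above concrete, here is a minimal sketch. It is not the code in this PR. The image tag scheme (`agent-sandbox-<project>`), the `PROJECT` build argument, the container names, and every function name are hypothetical choices made for illustration. Only the standard `docker build` and `docker run` CLI flags are assumed to exist.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# NOTE: tag scheme, build-arg name and container names are illustrative
# assumptions, not taken from the PR.


def build_cmd(project: str, context_dir: str = ".") -> List[str]:
    """Command that builds the sandbox image for one project locally."""
    return [
        "docker", "build",
        "-t", f"agent-sandbox-{project}",
        "--build-arg", f"PROJECT={project}",
        context_dir,
    ]


def run_cmd(project: str) -> List[str]:
    """Command that runs the sandbox container for one project."""
    return [
        "docker", "run", "--rm",
        "--name", f"sandbox-{project}",
        f"agent-sandbox-{project}",
    ]


def run_shell(cmd: Sequence[str]) -> int:
    """Default executor: run the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def evaluate_project(project: str,
                     execute: Callable[[Sequence[str]], int] = run_shell) -> int:
    """Build the image (fast after the first time thanks to Docker's
    layer cache), then run the sandbox. Returns the first non-zero exit code."""
    build_rc = execute(build_cmd(project))
    if build_rc != 0:
        return build_rc
    return execute(run_cmd(project))


def evaluate_all(projects: Sequence[str],
                 execute: Callable[[Sequence[str]], int] = run_shell,
                 max_parallel: int = 8) -> Dict[str, int]:
    """Evaluate all projects concurrently, one sandbox container each."""
    if not projects:
        return {}
    workers = min(max_parallel, len(projects))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(lambda p: evaluate_project(p, execute), projects))
    return dict(zip(projects, codes))
```

Calling `evaluate_all(["project_a", "project_b"])` would build and run both sandboxes concurrently and return a mapping from project name to exit code. Threads are enough here because each worker only waits on a Docker subprocess. The `execute` parameter is injectable so the flow can be exercised without Docker installed.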
