Combating cargo cult programming in Agent Instructions, Skills, and Custom Agents for GitHub Copilot and other coding agents since 2026.
Everyone's copying instruction files from blog posts, pasting "you are a senior engineer" into agent configs, and adding skills they found on Reddit. But does any of it actually work? Are your instructions making your coding agent better — or just longer? Is that skill helping, or is the agent ignoring it entirely?
You don't know, because you're not testing it.
pytest-codingagents is a pytest plugin that runs your actual coding agent configuration against real tasks — then uses AI analysis to tell you why things failed and what to fix.
Currently supports GitHub Copilot via copilot-sdk. More agents (Claude Code, etc.) coming soon.
```python
from pytest_codingagents import CopilotAgent


async def test_create_file(copilot_run, tmp_path):
    agent = CopilotAgent(
        instructions="Create files as requested.",
        working_directory=str(tmp_path),
    )
    result = await copilot_run(agent, "Create hello.py with print('hello')")
    assert result.success
    assert result.tool_was_called("create_file")
```

Install:

```bash
uv add pytest-codingagents
```

Authenticate via the `GITHUB_TOKEN` env var (CI) or `gh auth status` (local).
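Negative assertions work the same way, which is handy for verifying that instructions suppress behavior rather than add it. A minimal sketch, assuming your agent exposes a shell tool; `run_in_terminal` is a placeholder name, not a confirmed tool id:

```python
from pytest_codingagents import CopilotAgent


async def test_instructions_block_shell_use(copilot_run, tmp_path):
    agent = CopilotAgent(
        instructions="Never run shell commands. Only create or edit files.",
        working_directory=str(tmp_path),
    )
    result = await copilot_run(agent, "Create hello.py with print('hello')")
    assert result.success
    # "run_in_terminal" is a hypothetical tool name; check the tool ids
    # your agent actually reports and substitute accordingly.
    assert not result.tool_was_called("run_in_terminal")
```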
| Capability | What it proves | Guide |
|---|---|---|
| Instructions | Your custom instructions actually produce the desired behavior, not just vibes (see the sketch after this table) | Getting Started |
| Skills | That domain knowledge file is helping, not being ignored | Skill Testing |
| Models | Which model works best for your use case and budget | Model Comparison |
| Custom Agents | Your custom agent configurations actually work as intended | Getting Started |
| MCP Servers | The agent discovers and uses your custom tools | MCP Server Testing |
| CLI Tools | The agent operates command-line interfaces correctly | CLI Tool Testing |
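For instance, instruction testing can be as simple as parametrizing one task over several instruction variants and letting the report rank them. A sketch assuming the `copilot_run` fixture composes with `pytest.mark.parametrize` like any other pytest fixture; the two variants below are placeholders for your own:

```python
import pytest

from pytest_codingagents import CopilotAgent

# Placeholder instruction variants; swap in the instructions you actually ship.
INSTRUCTIONS = [
    "Create files as requested.",
    "You are a senior engineer. Create files as requested.",
]


@pytest.mark.parametrize("instructions", INSTRUCTIONS, ids=["plain", "senior-engineer"])
async def test_instruction_variants(copilot_run, tmp_path, instructions):
    agent = CopilotAgent(
        instructions=instructions,
        working_directory=str(tmp_path),
    )
    result = await copilot_run(agent, "Create hello.py with print('hello')")
    # If both variants pass, the report's instruction evaluation shows
    # whether the longer prompt is actually earning its tokens.
    assert result.success
    assert result.tool_was_called("create_file")
```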
See it in action: Basic Report · Model Comparison · Instruction Testing
Every test run produces an HTML report with AI-powered insights:
- Diagnoses failures — root cause analysis with suggested fixes
- Compares models — leaderboards ranked by pass rate and cost
- Evaluates instructions — which instructions produce better results
- Recommends improvements — actionable changes to tools, instructions, and skills
```bash
uv run pytest tests/ --aitest-html=report.html --aitest-summary-model=azure/gpt-5.2-chat
```

Full docs at sbroenne.github.io/pytest-codingagents: API reference, how-to guides, and demo reports.
License: MIT