feat(tools): Add GPT-5.1 ApplyPatch tool #1166
Conversation
…tch format

- Port reference parser/applicator into core module
- Integrate as ToolDefinition with executor and safe workspace I/O
- New example using openai/gpt-5.1-codex-mini to create/modify/delete FACTS.txt via tool

Co-authored-by: openhands <openhands@all-hands.dev>
…1 example; keep reasoning_summary=None to avoid org verification Co-authored-by: openhands <openhands@all-hands.dev>
…example: run gpt-5.1-codex-mini end-to-end
- Override to_responses_tool() to return minimal {type:function, name}
- Add validation_alias/serialization_alias so server-sent 'patch' is accepted
- Update example to register ApplyPatch and run create/modify/delete FACTS.txt
Co-authored-by: openhands <openhands@all-hands.dev>
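The alias handling described in this commit can be sketched with Pydantic v2. The model and field setup below is illustrative, not the SDK's actual definition; it just shows how `validation_alias`/`AliasChoices` lets the server-sent argument name `patch` (or a legacy `patch_text`) validate into one field:

```python
from pydantic import AliasChoices, BaseModel, ConfigDict, Field


class ApplyPatchArgs(BaseModel):
    """Illustrative action args: accept 'patch' or 'patch_text' on input."""

    model_config = ConfigDict(populate_by_name=True)

    patch: str = Field(
        validation_alias=AliasChoices("patch", "patch_text"),
        serialization_alias="patch",
    )


# Server-sent name 'patch_text' is accepted...
args = ApplyPatchArgs.model_validate({"patch_text": "*** Begin Patch"})
# ...but the model always serializes back out as 'patch'.
print(args.model_dump(by_alias=True))  # {'patch': '*** Begin Patch'}
```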
…ix example tool registration

- Add AliasChoices to accept server arg name 'patch'
- Provide minimal parameters in to_responses_tool to steer Responses tools
- Example: remove register_default_tools; import classes to register

Co-authored-by: openhands <openhands@all-hands.dev>
…input; send only tool outputs

- Remove assistant function_call emission in Message.to_responses_dict
- Stop passing back previous turn 'reasoning' items to avoid ordering errors
- This aligns with the OpenAI Responses pattern: client supplies only function_call_output

Co-authored-by: openhands <openhands@all-hands.dev>
…assthrough

- Keep assistant function_call in input so paired tool outputs validate
- Do not echo prior 'reasoning' items to avoid ordering constraints

Co-authored-by: openhands <openhands@all-hands.dev>
- Capture design choices, telemetry observations, and pending items
- Reference PR #1166 and branch name for continuity

Co-authored-by: openhands <openhands@all-hands.dev>
…tant function_call in input

- Reinstate reasoning passthrough to satisfy existing unit tests
- Keep assistant function_call items and pair with tool outputs in same request

Co-authored-by: openhands <openhands@all-hands.dev>
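The pairing rule these commits converge on can be illustrated with a minimal Responses-style input list. Item shapes are simplified and the call id and contents are made up; the point is only that each `function_call_output` must reference a `function_call` that is still present in the same request:

```python
# Illustrative Responses-style input: the assistant's function_call stays in
# the input so the paired function_call_output validates against its call_id.
input_items = [
    {"role": "user", "content": "Create FACTS.txt"},
    {
        "type": "function_call",
        "call_id": "call_123",
        "name": "apply_patch",
        "arguments": '{"patch": "*** Begin Patch"}',
    },
    {
        "type": "function_call_output",
        "call_id": "call_123",
        "output": "Patch applied successfully.",
    },
]

# The invariant the commits describe: every output id has a matching call id.
calls = {i["call_id"] for i in input_items if i.get("type") == "function_call"}
outputs = {i["call_id"] for i in input_items if i.get("type") == "function_call_output"}
assert outputs <= calls
```

Dropping the assistant `function_call` item (as one intermediate commit tried) breaks this invariant, which is why it was reinstated.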
Coverage Report
…_output ids

- Fix imports and formatting for pre-commit

Co-authored-by: openhands <openhands@all-hands.dev>
…esponses notes

- FileEditor example mirrors ApplyPatch for log comparison
- Notes: restore reasoning passthrough; pair assistant calls with tool outputs

Co-authored-by: openhands <openhands@all-hands.dev>
…t; fix pairing error. Telemetry: trim system instructions from logs and compact tool metadata for readability. Co-authored-by: openhands <openhands@all-hands.dev>
… rejection.

Aligned expected newline behavior with upstream apply_patch (no enforced trailing newline).

Co-authored-by: openhands <openhands@all-hands.dev>
…le and status update. Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
…ross action and executor.

- Update tests to pass 'patch' field
- Keep Responses tool schema minimal with required 'patch'

Co-authored-by: openhands <openhands@all-hands.dev>
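A "minimal schema with required 'patch'" could look roughly like the sketch below. This is illustrative, not the SDK's actual `to_responses_tool()` output; it only shows the shape of a Responses function tool whose single required parameter is `patch`:

```python
def to_responses_tool() -> dict:
    """Illustrative minimal Responses function-tool schema for apply_patch."""
    return {
        "type": "function",
        "name": "apply_patch",
        "parameters": {
            "type": "object",
            "properties": {"patch": {"type": "string"}},
            "required": ["patch"],
            "additionalProperties": False,
        },
    }


print(to_responses_tool()["parameters"]["required"])  # ['patch']
```

Keeping the schema this small is what "steer Responses tools" refers to in the commits above: the model sees one string argument rather than a richer action schema.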
… mention of 'patch_text'. Co-authored-by: openhands <openhands@all-hands.dev>
…y diff). Co-authored-by: openhands <openhands@all-hands.dev>
…via Tool(name="apply_patch"). Co-authored-by: openhands <openhands@all-hands.dev>
…d clarify Responses schema. Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
I’m OpenHands-GPT-5.1, running this change end-to-end against the examples. I validated the updated preset:

cd software-agent-sdk
LLM_API_KEY="$LITELLM_API_KEY" LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" LLM_MODEL="openai/gpt-4.1-mini" uv run python examples/01_standalone_sdk/20_stuck_detector.py

Result: this confirms the preset wiring is intact and non-GPT-5 models still default to …
- Create multiple files (FACTS.txt, NOTES.md)
- Apply multi-hunk edits to a single file
- Apply a single patch that touches multiple files
- Mix add / update / delete operations in one patch
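All of the operations above ride on the upstream apply_patch envelope. As a rough illustration, here is a made-up multi-file patch and a tiny helper that lists its file operations; the helper is not the SDK's parser (real hunk handling follows the reference implementation), and the file names are invented:

```python
def split_operations(patch: str) -> list[tuple[str, str]]:
    """Return (kind, path) pairs for each file operation in an envelope."""
    ops = []
    for line in patch.splitlines():
        for kind in ("Add File", "Update File", "Delete File"):
            prefix = f"*** {kind}: "
            if line.startswith(prefix):
                ops.append((kind, line[len(prefix):]))
    return ops


# A single patch mixing add / update / delete across multiple files.
patch = """*** Begin Patch
*** Add File: FACTS.txt
+Paris is the capital of France.
*** Update File: NOTES.md
@@
-old line
+new line
*** Delete File: OLD.txt
*** End Patch
"""

print(split_operations(patch))
# [('Add File', 'FACTS.txt'), ('Update File', 'NOTES.md'), ('Delete File', 'OLD.txt')]
```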
Do we want this script to remain as example?
IMO, if the tool is not yet the default for GPT-5, the example makes sense: it gives people a way to see how they can use it. If it is the default (and according to your earlier comment, that's done), then maybe it's not very useful for client code.
Still useful for testing: it actually exercises functionality like multi-file patches. But maybe testing is another story: we could take in the GPT-5 optimizations and then make sure the integration tests use them. WDYT?
yeah, i don't think we absolutely need this example 🤔
But iirc you already wrote the docs; maybe we can keep it and add this example as a page under LLM features of https://docs.openhands.dev/sdk/guides/
Actually, since it's not a default tool now, I'd tend to keep it 🤔 It's easy even for the agent to directly execute it and see the tool in action
More importantly, it lets client developers and contributors test a GPT-5-specific tool, and maybe it gives people ideas for experimenting with it?
Not a strong opinion though; I'd just love to continue assembling prompts, too, and simply make them available
I removed it, and updated the docs example to be a simple script:
… example in this PR

- Revert model-aware default tools wiring from preset/default.py
- Remove model-specific default tools logic from model_features.py
- Drop preset default tests tied to GPT-5 mapping

Follow-up PR will carry model-aware defaults + wiring.

Co-authored-by: openhands <openhands@all-hands.dev>
def test_function_call_and_output_paired():
    # Assistant emits a function_call; tool returns an output for same id
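The quoted test is truncated here, but its idea can be sketched self-contained. The item shapes and ids below are illustrative, not the SDK's actual message types:

```python
def test_function_call_and_output_paired():
    """Every function_call_output should pair with an emitted function_call."""
    items = [
        {"type": "function_call", "call_id": "call_1", "name": "apply_patch"},
        {"type": "function_call_output", "call_id": "call_1", "output": "ok"},
    ]
    call_ids = [i["call_id"] for i in items if i["type"] == "function_call"]
    output_ids = [i["call_id"] for i in items if i["type"] == "function_call_output"]
    assert call_ids == output_ids


test_function_call_and_output_paired()
```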
This test exists because the agent had some trouble when it was testing apply_patch initially. I think it’s a reasonable test, maybe a little too obvious (practically nothing with GPT-5/Responses would work if it didn’t), but I’d keep it
@xingyaoww Done, it looks to me like the commit is exactly right: it undid the wiring. The rest has been tested before, so unless there's some lost comma messing with us, it seems good to go. 😇
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
xingyaoww left a comment
Thanks!
Summary
Design notes
Usage
uv run python examples/01_standalone_sdk/28_apply_patch_with_gpt5_1.py
Responses API/tool-calling
Testing performed
Next steps
Co-authored-by: openhands <openhands@all-hands.dev>
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.12-nodejs22
- golang:1.21-bookworm

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:56abec9-python

Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. 56abec9-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 56abec9-python-amd64) are also available if needed