@enyst enyst commented Nov 14, 2025

Summary

  • Implement ApplyPatch tool compatible with OpenAI GPT-5.1 apply_patch prompting format
  • Port reference logic from OpenAI cookbook (apply_patch.py) into openhands-tools/openhands/tools/apply_patch/core.py with type hints and safer IO injection
  • Add ToolDefinition and executor, registered as apply_patch
  • New example: examples/01_standalone_sdk/28_apply_patch_with_gpt5_1.py demonstrates creating/modifying/deleting FACTS.txt via tool using openai/gpt-5.1-codex-mini
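
For readers unfamiliar with the format, a patch in this style is a plain-text envelope with one section per file. The sketch below is illustrative only: the envelope markers follow the apply_patch format as I understand it from OpenAI's public material, the file contents are made up, and `list_operations` is a hypothetical helper, not the parser in core.py.

```python
# Illustrative patch text; the file names and contents are invented for this sketch.
PATCH = """\
*** Begin Patch
*** Add File: FACTS.txt
+The sky appears blue because of Rayleigh scattering.
*** Update File: NOTES.md
@@
 # Notes
-old line
+new line
*** Delete File: OLD.txt
*** End Patch
"""

# Section headers mapped to the kind of operation they introduce.
PREFIXES = {
    "*** Add File: ": "add",
    "*** Update File: ": "update",
    "*** Delete File: ": "delete",
}


def list_operations(patch_text: str) -> list[tuple[str, str]]:
    """Return (kind, path) pairs in the order they appear in the patch."""
    lines = patch_text.strip().splitlines()
    if not lines or lines[0] != "*** Begin Patch" or lines[-1] != "*** End Patch":
        raise ValueError("patch must be wrapped in Begin/End Patch markers")
    ops: list[tuple[str, str]] = []
    for line in lines[1:-1]:
        for prefix, kind in PREFIXES.items():
            if line.startswith(prefix):
                ops.append((kind, line[len(prefix):]))
    return ops


print(list_operations(PATCH))
```

A real parser also has to read the `+`, `-`, and context lines of each section; this helper only shows how one patch can carry several file operations.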

Design notes

  • Follows existing Tool patterns (ToolDefinition, ToolAnnotations, executor). Enforces workspace root, rejects absolute/escaping paths
  • Core diff/patch logic is pure and injected with open/read/write/remove callables for testability
  • Observation returns message, fuzz metric, and Commit summary
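
A minimal sketch of the two ideas above, under stated assumptions: `resolve_in_workspace` and `apply_add_and_delete` are hypothetical names invented for illustration, and the actual checks and callable signatures in definition.py and core.py may differ.

```python
from pathlib import Path
from typing import Callable


def resolve_in_workspace(root: Path, rel: str) -> Path:
    """Map a patch path to a location under root, or raise ValueError."""
    if Path(rel).is_absolute():
        raise ValueError(f"absolute paths are not allowed: {rel}")
    root = root.resolve()
    resolved = (root / rel).resolve()
    if not resolved.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"path escapes the workspace: {rel}")
    return resolved


def apply_add_and_delete(
    adds: dict[str, str],
    deletes: list[str],
    write_fn: Callable[[str, str], None],
    remove_fn: Callable[[str], None],
) -> None:
    """Pure logic: every side effect goes through the injected callables."""
    for path, content in adds.items():
        write_fn(path, content)
    for path in deletes:
        remove_fn(path)


# In tests, a dict can stand in for the filesystem.
fake_fs: dict[str, str] = {"OLD.txt": "stale"}
apply_add_and_delete(
    adds={"FACTS.txt": "hello\n"},
    deletes=["OLD.txt"],
    write_fn=fake_fs.__setitem__,
    remove_fn=fake_fs.__delitem__,
)
```

Keeping the patch logic free of direct file access is what makes the dict-backed test possible; the executor would pass in callables that first run each path through a check like `resolve_in_workspace`.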

Usage

  • Register default tools and include Tool(name="apply_patch") or rely on auto-registration
  • Example runs against direct OpenAI API (requires OPENAI_API_KEY):
    uv run python examples/01_standalone_sdk/28_apply_patch_with_gpt5_1.py

Responses API/tool-calling

  • GPT-5.1 models default to the OpenAI Responses API in our SDK; some accounts require a verified org for reasoning summaries and tool_output semantics. To keep the tool loop straightforward, the example sets LLM.native_tool_calling=False, which routes requests through the Chat Completions path, and it leaves reasoning_summary unset

Testing performed

  • Pre-commit hooks (ruff/pydantic/pyright) pass on changed files
  • Ran the example; initial attempts with Responses path hit two known OpenAI errors (missing tool output in same request; org verification for reasoning.summary). Adjusted example + sdk to allow forcing Chat Completions path when native_tool_calling=False
  • Functional validation of parser and executor via local calls; example integrates end-to-end

Next steps

  • Optional: add apply_patch to default preset tools
  • Optional: more unit tests around fuzzy matching and multi-chunk edits
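
To make "fuzzy matching" and the fuzz metric concrete, here is a rough sketch of tiered context matching. It is loosely modeled on my recollection of the reference script; the function name `find_context` and the fuzz values 0, 1, and 100 are assumptions for illustration, not a description of what core.py actually does.

```python
def find_context(
    lines: list[str], context: list[str], start: int = 0
) -> tuple[int, int]:
    """Return (index, fuzz); index is -1 when the context is not found."""
    if not context:
        return start, 0
    # Try progressively looser comparisons; a higher fuzz means a looser match.
    tiers = [
        (0, lambda s: s),   # exact match
        (1, str.rstrip),    # ignore trailing whitespace
        (100, str.strip),   # ignore leading and trailing whitespace
    ]
    n = len(context)
    for fuzz, norm in tiers:
        want = [norm(s) for s in context]
        for i in range(start, len(lines) - n + 1):
            if [norm(s) for s in lines[i:i + n]] == want:
                return i, fuzz
    return -1, 0


source = ["def f():", "    return 1  ", "x = 2"]
print(find_context(source, ["    return 1"]))  # matches once trailing spaces are ignored
```

Unit tests for multi-chunk edits could call a function like this once per chunk, advancing `start` past the previous match so that chunks are applied in file order.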

Co-authored-by: openhands <openhands@all-hands.dev>

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:56abec9-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-56abec9-python \
  ghcr.io/openhands/agent-server:56abec9-python

All tags pushed for this build

ghcr.io/openhands/agent-server:56abec9-golang-amd64
ghcr.io/openhands/agent-server:56abec9-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:56abec9-golang-arm64
ghcr.io/openhands/agent-server:56abec9-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:56abec9-java-amd64
ghcr.io/openhands/agent-server:56abec9-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:56abec9-java-arm64
ghcr.io/openhands/agent-server:56abec9-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:56abec9-python-amd64
ghcr.io/openhands/agent-server:56abec9-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:56abec9-python-arm64
ghcr.io/openhands/agent-server:56abec9-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:56abec9-golang
ghcr.io/openhands/agent-server:56abec9-java
ghcr.io/openhands/agent-server:56abec9-python

About Multi-Architecture Support

  • Each variant tag (e.g., 56abec9-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 56abec9-python-amd64) are also available if needed

enyst and others added 2 commits November 14, 2025 06:44
…tch format

- Port reference parser/applicator into core module
- Integrate as ToolDefinition with executor and safe workspace I/O
- New example using openai/gpt-5.1-codex-mini to create/modify/delete FACTS.txt via tool

Co-authored-by: openhands <openhands@all-hands.dev>
…1 example; keep reasoning_summary=None to avoid org verification

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst marked this pull request as draft November 14, 2025 07:14
@enyst enyst changed the title from "feat(tools): Add ApplyPatch tool (GPT-5.1 apply_patch) + example using openai/gpt-5.1-codex-mini" to "feat(tools): Add ApplyPatch tool (GPT-5.1 apply_patch)" Nov 14, 2025
enyst and others added 6 commits November 14, 2025 07:41
…example: run gpt-5.1-codex-mini end-to-end

- Override to_responses_tool() to return minimal {type:function, name}
- Add validation_alias/serialization_alias so server-sent 'patch' is accepted
- Update example to register ApplyPatch and run create/modify/delete FACTS.txt

Co-authored-by: openhands <openhands@all-hands.dev>
…ix example tool registration

- Add AliasChoices to accept server arg name 'patch'
- Provide minimal parameters in to_responses_tool to steer Responses tools
- Example: remove register_default_tools; import classes to register

Co-authored-by: openhands <openhands@all-hands.dev>
…input; send only tool outputs

- Remove assistant function_call emission in Message.to_responses_dict
- Stop passing back previous turn 'reasoning' items to avoid ordering errors
- This aligns with OpenAI Responses pattern: client supplies only function_call_output

Co-authored-by: openhands <openhands@all-hands.dev>
…assthrough

- Keep assistant function_call in input so paired tool outputs validate
- Do not echo prior 'reasoning' items to avoid ordering constraints

Co-authored-by: openhands <openhands@all-hands.dev>
- Capture design choices, telemetry observations, and pending items
- Reference PR #1166 and branch name for continuity

Co-authored-by: openhands <openhands@all-hands.dev>
…tant function_call in input

- Reinstate reasoning passthrough to satisfy existing unit tests
- Keep assistant function_call items and pair with tool outputs in same request

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions bot commented Nov 14, 2025

Coverage

Coverage Report •
File | Stmts | Miss | Cover | Missing
openhands-tools/openhands/tools/apply_patch
   core.py | 302 | 252 | 16% | 40–46, 51–52, 56–57, 62–63, 91–95, 98–101, 104–110, 113–122, 124–133, 136–146, 149, 152–154, 163–179, 182–188, 191–195, 197–205, 208–209, 212–217, 226–227, 229–233, 236–238, 241–242, 248–254, 260–268, 278–292, 294–298, 305–315, 322–329, 333–334, 339, 341, 346–347, 351–358, 362–369, 373–374, 378–394, 398–401, 404–405, 408–410, 416, 432–435, 440, 448–458, 460, 473–479
   definition.py | 52 | 28 | 46% | 71, 75, 80–82, 91–94, 96–100, 102–104, 106–107, 112–113, 115, 118–120, 140–141, 169
TOTAL | 13203 | 6288 | 52% |

enyst and others added 3 commits November 14, 2025 08:39
…_output ids

- Fix imports and formatting for pre-commit

Co-authored-by: openhands <openhands@all-hands.dev>
…esponses notes

- FileEditor example mirrors ApplyPatch for log comparison
- Notes: restore reasoning passthrough; pair assistant calls with tool outputs

Co-authored-by: openhands <openhands@all-hands.dev>
…t; fix pairing error.

Telemetry: trim system instructions from logs and compact tool metadata for readability.

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst force-pushed the feat/apply-patch-tool-gpt5-1 branch from 9a9234e to e910dbf November 14, 2025 20:19
enyst and others added 2 commits November 14, 2025 20:24
… rejection.

Aligned expected newline behavior with upstream apply_patch (no enforced trailing newline).

Co-authored-by: openhands <openhands@all-hands.dev>
…le and status update.

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst force-pushed the feat/apply-patch-tool-gpt5-1 branch from 17d2380 to e772cbb November 14, 2025 21:31
enyst and others added 7 commits November 14, 2025 21:33
Co-authored-by: openhands <openhands@all-hands.dev>
…ross action and executor.

- Update tests to pass 'patch' field
- Keep Responses tool schema minimal with required 'patch'

Co-authored-by: openhands <openhands@all-hands.dev>
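
As a rough illustration of the "minimal schema with required 'patch'" mentioned in the commit above, a function tool in the Responses style could be described as follows. This is a sketch under assumptions: `minimal_apply_patch_tool` and `extract_patch` are hypothetical names, and the dict that to_responses_tool() really returns may contain other fields.

```python
import json


def minimal_apply_patch_tool() -> dict:
    """Function-tool description with a single required string argument."""
    return {
        "type": "function",
        "name": "apply_patch",
        "parameters": {
            "type": "object",
            "properties": {"patch": {"type": "string"}},
            "required": ["patch"],
        },
    }


def extract_patch(raw_arguments: str) -> str:
    """Pull the patch text out of the JSON-encoded tool-call arguments."""
    args = json.loads(raw_arguments)
    patch = args.get("patch")
    if not isinstance(patch, str):
        raise ValueError("tool call is missing the required 'patch' string")
    return patch


print(json.dumps(minimal_apply_patch_tool(), indent=2))
```

Using one argument name on both the action and the executor side means no translation step is needed between what the server sends and what the tool validates.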
… mention of 'patch_text'.

Co-authored-by: openhands <openhands@all-hands.dev>
…y diff).

Co-authored-by: openhands <openhands@all-hands.dev>
…via Tool(name="apply_patch").

Co-authored-by: openhands <openhands@all-hands.dev>
…d clarify Responses schema.

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst marked this pull request as ready for review November 22, 2025 18:18
@OpenHands OpenHands deleted a comment from openhands-ai bot Nov 22, 2025
enyst and others added 2 commits November 24, 2025 15:49
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
enyst commented Nov 24, 2025

I’m OpenHands-GPT-5.1, running this change end-to-end against the examples.

I validated that the updated preset (get_default_agent → get_default_tools(model_name=llm.model)) still works by running the stuck-detector example against the litellm proxy:

cd software-agent-sdk
LLM_API_KEY="$LITELLM_API_KEY" LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" LLM_MODEL="openai/gpt-4.1-mini" uv run python examples/01_standalone_sdk/20_stuck_detector.py

Result:

  • Example completed successfully using tools: terminal, file_editor, task_tracker, browser_tool_set.
  • Stuck detection triggered as expected (Final stuck status: True).
  • A finite EXAMPLE_COST was printed and the script exited cleanly.

This confirms the preset wiring is intact and non‑GPT‑5 models still default to file_editor as intended.

@enyst enyst requested a review from xingyaoww November 24, 2025 17:48
@OpenHands OpenHands deleted a comment from openhands-ai bot Nov 25, 2025
- Create multiple files (FACTS.txt, NOTES.md)
- Apply multi-hunk edits to a single file
- Apply a single patch that touches multiple files
- Mix add / update / delete operations in one patch
@enyst enyst Nov 28, 2025

Do we want this script to remain as an example?

IMO, if the tool is not default for GPT-5 yet, it makes sense and it would be nice for people to have a way to see how they can use it. If it is default (and according to your comment before, it should be, that's done), then maybe it's not very useful for client code.

Still useful for testing: it actually tests functionalities like multiple files. But maybe testing is another story: we could take in GPT-5 optimizations, and then make sure that the integration tests use them. WDYT?

Collaborator

yeah, i don't think we absolutely need this example 🤔

But IIRC you already wrote the docs, maybe we can keep it and add this example as a page under LLM features of https://docs.openhands.dev/sdk/guides/

Collaborator Author

Actually, since now it’s not a default tool, I’d tend to keep it 🤔 It’s easy for the agent, even, to directly execute and see the tool in action

More importantly, for client developers / contributors to get to test a GPT-5 specific tool, maybe it gives people ideas for experimenting with it?

Collaborator Author

Not a strong opinion though, I just would love to continue assembling prompts too and simply make them available

@enyst enyst Nov 30, 2025

I removed it, and updated the docs example to be a simple script:

… example in this PR

- Revert model-aware default tools wiring from preset/default.py
- Remove model-specific default tools logic from model_features.py
- Drop preset default tests tied to GPT-5 mapping

Follow-up PR will carry model-aware defaults + wiring.

Co-authored-by: openhands <openhands@all-hands.dev>


def test_function_call_and_output_paired():
    # Assistant emits a function_call; tool returns an output for same id
Collaborator Author

This test exists because the agent had some trouble when it was testing apply_patch initially. I think it’s a reasonable test, maybe a little too obvious (practically nothing with GPT-5/Responses would work if it didn’t), but I’d keep it
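
For context, the pairing this test guards can be sketched as a standalone check. The item shapes follow the Responses API convention of matching a function_call to its function_call_output by call_id; `unpaired_call_ids` is a hypothetical helper written for this illustration, not code from the PR.

```python
def unpaired_call_ids(items: list[dict]) -> set[str]:
    """Return call_ids that have a call without an output, or the reverse."""
    calls = {i["call_id"] for i in items if i.get("type") == "function_call"}
    outputs = {
        i["call_id"] for i in items if i.get("type") == "function_call_output"
    }
    return calls ^ outputs  # symmetric difference: anything not matched


# One assistant tool call and the tool's reply, sent together in one request.
request_input = [
    {
        "type": "function_call",
        "call_id": "call_1",
        "name": "apply_patch",
        "arguments": '{"patch": "*** Begin Patch\\n*** End Patch"}',
    },
    {"type": "function_call_output", "call_id": "call_1", "output": "Done!"},
]

print(unpaired_call_ids(request_input))
```

An empty result means every call has its output in the same request, which is the condition the earlier "missing tool output" errors violated.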

@enyst enyst requested a review from xingyaoww November 29, 2025 21:16
enyst commented Nov 29, 2025

@xingyaoww Done, it looks to me like the commit is exactly right, it undid the wiring. The rest has been tested before so unless there’s some lost comma messing with us, it seems good to go. 😇

openhands-ai bot commented Nov 29, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Check Documented Examples

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1166 at branch `feat/apply-patch-tool-gpt5-1`

Feel free to include any additional details that might help me get this PR into a better state.

@xingyaoww xingyaoww left a comment

Thanks!

@enyst enyst enabled auto-merge (squash) November 30, 2025 15:17
@enyst enyst merged commit 2d100fc into main Nov 30, 2025
21 checks passed
@enyst enyst deleted the feat/apply-patch-tool-gpt5-1 branch November 30, 2025 15:22