

@mjschock (Contributor) commented Nov 1, 2025

Resolves #636.

See #636 (comment).

OPENAI_API_KEY="your_api_key_here" uv run python examples/agent_patterns/human_in_the_loop.py

================================================================================
Run interrupted - tool approval required
================================================================================
State saved to result.json
Loading state from result.json

Tool call details:
  Agent: Weather Assistant
  Tool: get_temperature
  Arguments: {"city":"Oakland"}

Do you approve this tool call? (y/n): y
✓ Approved: get_temperature

Resuming agent execution...

================================================================================
Final Output:
================================================================================
The weather in Oakland is sunny, and the temperature is 20°C.
OPENAI_API_KEY="your_api_key_here" uv run python examples/agent_patterns/human_in_the_loop_stream.py 

================================================================================
Human-in-the-loop: approval required for the following tool calls:
================================================================================

Tool call details:
  Agent: Weather Assistant
  Tool: get_temperature
  Arguments: {"city":"Oakland"}

Do you approve this tool call? (y/n): y
✓ Approved: get_temperature

Resuming agent execution...

================================================================================
Final Output:
================================================================================
The current weather in Oakland is sunny, with a temperature of 20°C.

Done!
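For orientation, the approve-and-resume loop those runs exercise boils down to roughly the following (a minimal sketch based on the `to_state()` / `approve()` / `Runner.run()` API described later in this thread; the tool body and prompt are illustrative, not the shipped example):

```python
import asyncio

from agents import Agent, Runner, function_tool

# needs_approval is the field this PR adds to FunctionTool; passing it
# through the decorator like this is assumed from the commit notes below.
@function_tool(needs_approval=True)
def get_temperature(city: str) -> str:
    """Illustrative stub; the real example presumably looks up weather."""
    return f"The weather in {city} is sunny and 20°C."


async def main() -> None:
    agent = Agent(name="Weather Assistant", tools=[get_temperature])
    result = await Runner.run(agent, "What is the weather in Oakland?")

    # While the run is interrupted for approval, ask the human and resume.
    while result.interruptions:
        state = result.to_state()
        for item in result.interruptions:
            # raw_item.name is an assumed attribute name for illustration.
            if input(f"Approve {item.raw_item.name}? (y/n): ").lower() == "y":
                state.approve(item)
            else:
                state.reject(item)
        result = await Runner.run(agent, state)

    print("Final Output:", result.final_output)


asyncio.run(main())
```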

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 1055 to 2180
previous_response_id: str | None,
conversation_id: str | None,
session: Session | None,
run_state: RunState[TContext] | None = None,
):
if streamed_result.trace:
streamed_result.trace.start(mark_as_current=True)

P1: Prime server tracker when resuming streaming runs

When Runner.run_streamed resumes from a RunState with conversation_id or previous_response_id, _start_streaming constructs a _ServerConversationTracker but never seeds it with the prior model_responses that were already sent. Unlike the synchronous path, no call to track_server_items is made, so prepare_input treats every previously generated item as unsent and resubmits them to the server. This duplicates earlier messages and breaks server-side conversation threading when a run is resumed.


@mjschock (Contributor Author):

Fixed in a56ce0a. Added server conversation tracker priming at lines 1076-1079 to match the non-streaming implementation and prevent message duplication when resuming from RunState.

Comment on lines 1055 to 2180
previous_response_id: str | None,
conversation_id: str | None,
session: Session | None,
run_state: RunState[TContext] | None = None,
):
if streamed_result.trace:
streamed_result.trace.start(mark_as_current=True)

P1: Streaming resume ignores existing turn count

The streaming execution path always initializes current_turn = 0 when _start_streaming is called, even if a RunState with an existing _current_turn is supplied. The loop then increments from zero, so any turns completed before the interruption are ignored and the max_turns guard is reset. After each interruption, a resumed streaming run can exceed the user’s turn limit and misreport the current turn number.


@mjschock (Contributor Author):

This was already fixed in 74c50fd at line 914: `current_turn=run_state._current_turn if run_state else 0`. The turn counter is properly restored from the RunState.

mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 1, 2025
…essage duplication

When resuming a streaming run from RunState, the server conversation tracker
was not being primed with previously sent model responses. This caused
`prepare_input` to treat all previously generated items as unsent and
resubmit them to the server, breaking conversation threading.

**Issue**: Missing `track_server_items` call in streaming resumption path

**Fix**: Added server conversation tracker priming logic in `_start_streaming`
method (lines 1076-1079) to match the non-streaming path implementation
(lines 553-556).

The fix iterates through `run_state._model_responses` and calls
`track_server_items(response)` to mark them as already sent to the server.

**Impact**: Resolves message duplication when resuming interrupted streaming
runs, ensuring proper conversation threading with server-side sessions.

Fixes code review feedback from PR openai#2021

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
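Concretely, the priming this commit describes amounts to replaying prior responses into the tracker, something like the sketch below (toy types for illustration; not the actual SDK code):

```python
from typing import Any


class TrackerSketch:
    """Toy stand-in for _ServerConversationTracker, for illustration only."""

    def __init__(self) -> None:
        self.sent: list[Any] = []

    def track_server_items(self, response: Any) -> None:
        # The real tracker records sent item ids so prepare_input skips them.
        self.sent.append(response)


def prime_tracker(tracker: TrackerSketch, run_state: Any) -> None:
    # The fix: mark every ModelResponse produced before the interruption
    # as already sent, so a resumed streaming run does not resubmit it.
    for response in run_state._model_responses:
        tracker.track_server_items(response)
```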
@seratch added the enhancement (New feature or request) and feature:core labels Nov 4, 2025

seratch commented Nov 4, 2025

Thanks for sending this patch!

I currently don't have the bandwidth to check this in depth, but one thing I wanted to mention is that, while implementing the sessions feature in the openai-agents-js project, I found that the internals of the runner need to take various HITL patterns into consideration. It might not be necessary to make those changes in this Python SDK, but sufficient testing of the sessions scenarios is worth doing.


mjschock commented Nov 4, 2025

> Thanks for sending this patch!
>
> I currently don't have the bandwidth to check this in depth, but one thing I wanted to mention is that, while implementing the sessions feature in the openai-agents-js project, I found that the internals of the runner need to take various HITL patterns into consideration. It might not be necessary to make those changes in this Python SDK, but sufficient testing of the sessions scenarios is worth doing.

Happy to contribute! I added a couple of examples using SQLiteSession and OpenAIConversationsSession and made sure they work:

OPENAI_API_KEY="your_api_key_here" uv run python examples/memory/memory_session_hitl_example.py 
=== Memory Session + HITL Example ===
Session id: :memory:
Enter a message to chat with the agent. Submit an empty line to exit.
The agent will ask for approval before using tools.

You: What cities does the Bay Bridge connect?
Assistant: The Bay Bridge connects San Francisco and Oakland in California.

You: What's the weather in those cities?

Agent HITL Assistant wants to call 'get_weather' with {"location":"San Francisco, CA"}. Approve? (y/n): y
Approved tool call.

Agent HITL Assistant wants to call 'get_weather' with {"location":"Oakland, CA"}. Approve? (y/n): y
Approved tool call.
Assistant: San Francisco is currently foggy with a temperature of 58°F. Oakland is sunny with a temperature of 72°F.

You: 
OPENAI_API_KEY="your_api_key_here" uv run python examples/memory/openai_session_hitl_example.py 
=== OpenAI Session + HITL Example ===
Enter a message to chat with the agent. Submit an empty line to exit.
The agent will ask for approval before using tools.

You: What cities does the Bay Bridge connect?
Assistant: The Bay Bridge, officially known as the San Francisco–Oakland Bay Bridge, connects the cities of **San Francisco** and **Oakland** in California.

You: What's the weather in those cities?

Agent HITL Assistant wants to call 'get_weather' with {"location":"San Francisco, CA"}. Approve? (y/n): y
Approved tool call.

Agent HITL Assistant wants to call 'get_weather' with {"location":"Oakland, CA"}. Approve? (y/n): y
Approved tool call.
Assistant: San Francisco is currently foggy and 58°F, while Oakland is sunny and 72°F.

You: 

I'm hoping that just about covers everything, but let me know if there are other areas I should make sure to address.
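For context, the SQLite variant reduces to a chat loop along these lines (a sketch assuming the approval API from this PR plus the SDK's existing `SQLiteSession`; the shipped example may differ in detail):

```python
import asyncio

from agents import Agent, Runner, SQLiteSession, function_tool


@function_tool(needs_approval=True)  # needs_approval is added by this PR
def get_weather(location: str) -> str:
    """Illustrative stub standing in for the example's weather tool."""
    return f"It is sunny and 72°F in {location}."


async def main() -> None:
    agent = Agent(name="HITL Assistant", tools=[get_weather])
    session = SQLiteSession(":memory:")  # matches the session id printed above

    while (user_message := input("You: ").strip()):
        result = await Runner.run(agent, user_message, session=session)
        while result.interruptions:
            state = result.to_state()
            for item in result.interruptions:
                # raw_item.name / .arguments are assumed attribute names.
                prompt = (
                    f"Agent {agent.name} wants to call '{item.raw_item.name}' "
                    f"with {item.raw_item.arguments}. Approve? (y/n): "
                )
                if input(prompt).lower() == "y":
                    state.approve(item)
                else:
                    state.reject(item)
            result = await Runner.run(agent, state, session=session)
        print("Assistant:", result.final_output)


asyncio.run(main())
```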

mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 4, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 4, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 5, 2025
@seratch added this to the 0.6.x milestone Nov 5, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 6, 2025

mjschock commented Nov 6, 2025

Thanks @seratch for adding it to the 0.6.x milestone! I'll make sure to address any feedback.


mjschock commented Nov 8, 2025

Just a heads up: I noticed there are some discrepancies in the serialized RunState when interrupting in Python and resuming in TypeScript, and vice versa. I'm addressing those this weekend and will push the update.

mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 14, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 14, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 16, 2025

seratch commented Nov 16, 2025

@codex Can you review the code changes in this PR with fresh eyes? Please focus on potential issues for existing apps and edge-case scenarios rather than general feedback.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 1286 to 1287

await AgentRunner._save_result_to_session(session, starting_input, [])

P1: Avoid duplicating session history when resuming streaming runs

When Runner.run_streamed is resumed from a RunState, _start_streaming still calls _prepare_input_with_session and _save_result_to_session unconditionally (streamed_result.input is recomputed and the original prompt is saved again). In the non‑streaming path these calls are skipped when a RunState is supplied so that the session history isn’t duplicated. Re‑preparing and saving here means a resumed streaming run writes the same user message back into the session and the next turn will see duplicate user prompts, breaking conversations for apps that rely on Session or session_input_callback. Consider gating these calls with run_state is None as the synchronous path does.
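The suggested gating is straightforward; a hedged sketch of the shape (helper callables stand in for the private `AgentRunner` methods named above):

```python
async def prepare_streaming_input(run_state, session, starting_input, prepare, save):
    """Sketch of the gating suggested above: skip session prep when resuming."""
    if run_state is None:
        # Fresh run: build the model input from session history and persist
        # the new user prompt.
        prepared = await prepare(session, starting_input)
        await save(session, starting_input, [])
        return prepared
    # Resumed run: the prompt was saved before the interruption, so
    # re-preparing and re-saving here would duplicate the user message.
    return run_state._original_input
```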

@mjschock (Contributor Author):

👍 Fixed in 466c306

mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 17, 2025
@seratch removed this from the 0.6.x milestone Nov 18, 2025
mjschock added a commit to mjschock/openai-agents-python that referenced this pull request Nov 26, 2025
mjschock and others added 25 commits December 5, 2025 18:59
This commit adds the foundational components for human-in-the-loop
functionality in the Python OpenAI Agents SDK, matching the TypeScript
implementation.

**Completed Components:**

1. **Tool Approval Field** (tool.py)
   - Added `needs_approval` field to FunctionTool
   - Supports boolean or callable (dynamic approval)
   - Updated function_tool() decorator

2. **ToolApprovalItem Class** (items.py)
   - New item type for tool calls requiring approval
   - Added to RunItem union type

3. **Approval Tracking** (run_context.py)
   - Created ApprovalRecord class
   - Added approval infrastructure to RunContextWrapper
   - Methods: is_tool_approved(), approve_tool(), reject_tool()
   - Supports individual and permanent approvals/rejections

4. **RunState Class** (run_state.py) - NEW FILE
   - Complete serialization/deserialization support
   - approve() and reject() methods
   - get_interruptions() method
   - Agent map building for name resolution
   - 567 lines of serialization logic

5. **Interruptions Support** (result.py)
   - Added interruptions field to RunResultBase
   - Will contain ToolApprovalItem instances when paused

6. **NextStepInterruption** (run_state.py)
   - New step type for representing interruptions

**Remaining Work:**

1. Add NextStepInterruption to NextStep union in _run_impl.py
2. Implement tool approval checking in run execution
3. Update run methods to accept RunState
4. Add comprehensive tests
5. Update documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
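For instance, the boolean-or-callable shape in item 1 would permit both of the following (a sketch; the exact signature of the dynamic-approval callable is an assumption, not confirmed by this thread):

```python
from agents import function_tool


# Static approval: every call to this tool requires human sign-off.
@function_tool(needs_approval=True)
def delete_file(path: str) -> str:
    return f"Deleted {path}"


# Dynamic approval: only gate sensitive paths. The (context, arguments)
# parameters are assumed here for illustration.
async def approval_needed(context, arguments: dict) -> bool:
    return str(arguments.get("path", "")).startswith("/etc")


@function_tool(needs_approval=approval_needed)
def remove_path(path: str) -> str:
    return f"Removed {path}"
```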
This commit integrates the human-in-the-loop infrastructure into the
actual run execution flow, making tool approval functional.

**Changes:**

1. **NextStepInterruption Type** (_run_impl.py:205-210)
   - Added NextStepInterruption dataclass
   - Includes interruptions list (ToolApprovalItems)
   - Added to NextStep union type

2. **ProcessedResponse Enhancement** (_run_impl.py:167-192)
   - Added interruptions field
   - Added has_interruptions() method

3. **Tool Approval Checking** (_run_impl.py:773-848)
   - Check needs_approval before tool execution
   - Support dynamic approval functions
   - If approval needed:
     * Check approval status via context
     * If None: Create ToolApprovalItem, return for interruption
     * If False: Return rejection message
     * If True: Continue with execution

4. **Interruption Handling** (_run_impl.py:311-333)
   - After tool execution, check for ToolApprovalItems
   - If found, create NextStepInterruption and return immediately
   - Prevents execution of remaining tools when approval pending

**Flow:**

Tool Call → Check needs_approval → Check approval status →
  If None: Create interruption, pause run →
  User approves/rejects → Resume run →
  If approved: Execute tool
  If rejected: Return rejection message

**Remaining Work:**

- Update Runner.run() to accept RunState
- Handle interruptions in result creation
- Add tests
- Add documentation/examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
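The approval gate in item 3 reduces to a three-way branch on the recorded decision; roughly (a sketch with assumed signatures for `is_tool_approved`, the import path, and the tool invocation):

```python
from typing import Any, Optional

from agents.items import ToolApprovalItem  # added by this PR; path assumed


async def invoke_with_approval(
    context: Any,
    agent: Any,
    tool: Any,
    tool_call: Any,
    interruptions: list,
) -> Optional[str]:
    """Sketch of the needs_approval check described in the commit above."""
    needs = tool.needs_approval
    if callable(needs):
        # Dynamic approval: defer the decision to the user-supplied function.
        needs = await needs(context, tool_call.arguments)
    if needs:
        approved = context.is_tool_approved(tool.name, tool_call.call_id)
        if approved is None:
            # No decision recorded yet: surface an interruption and pause.
            interruptions.append(ToolApprovalItem(agent=agent, raw_item=tool_call))
            return None
        if approved is False:
            return "Tool call was rejected by the user."
    # Approved (or no approval required): execute the tool.
    return await tool.on_invoke_tool(context, tool_call.arguments)
```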
This commit integrates RunState into the Runner API, allowing runs to be
resumed from a saved state. This is the final piece needed to make
human-in-the-loop (HITL) tool approval fully functional.

**Changes:**

1. **Import NextStepInterruption** (run.py:21-32)
   - Added NextStepInterruption to imports from _run_impl
   - Added RunState import

2. **Updated Method Signatures** (run.py:285-444)
   - Runner.run(): Added `RunState[TContext]` to input union type
   - Runner.run_sync(): Added `RunState[TContext]` to input union type
   - Runner.run_streamed(): Added `RunState[TContext]` to input union type
   - AgentRunner.run(): Added `RunState[TContext]` to input union type
   - AgentRunner.run_sync(): Added `RunState[TContext]` to input union type
   - AgentRunner.run_streamed(): Added `RunState[TContext]` to input union type

3. **RunState Resumption Logic** (run.py:524-584)
   - Check if input is RunState instance
   - Extract state fields when resuming: current_turn, original_input,
     generated_items, model_responses, context_wrapper
   - Prime server conversation tracker from model_responses if resuming
   - Cast context_wrapper to correct type after extraction

4. **Interruption Handling** (run.py:689-726)
   - Added `interruptions=[]` to successful RunResult creation
   - Added elif branch for NextStepInterruption
   - Return RunResult with interruptions when tool approval needed
   - Set final_output to None for interrupted runs

5. **RunResultStreaming Support** (run.py:879-918)
   - Handle RunState input for streaming runs
   - Added `interruptions=[]` field to RunResultStreaming creation
   - Extract original_input from RunState for result

**How It Works:**

When resuming from RunState:
```python
run_state.approve(approval_item)

result = await Runner.run(agent, run_state)
```

When a tool needs approval:
1. Run pauses at tool execution
2. Returns RunResult with interruptions=[ToolApprovalItem(...)]
3. User can inspect interruptions and approve/reject
4. User resumes by passing the RunState back to Runner.run()

**Remaining Work:**
- Add `state` property to RunResult for creating RunState from results
- Add comprehensive tests
- Add documentation/examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit adds a method to convert a RunResult back into a RunState,
enabling the resume workflow for interrupted runs.

**Changes:**

1. **to_state() Method** (result.py:125-165)
   - Added method to RunResult class
   - Creates a new RunState from the result's data
   - Populates generated_items, model_responses, and guardrail results
   - Includes comprehensive docstring with usage example

**How to Use:**

```python
# Run agent until it needs approval
result = await Runner.run(agent, "Use the delete_file tool")

if result.interruptions:
    # Convert result to state
    state = result.to_state()

    # Approve the tool call
    state.approve(result.interruptions[0])

    # Resume the run
    result = await Runner.run(agent, state)
```

**Complete HITL Flow:**

1. Run agent with tool that needs_approval=True
2. Run pauses, returns RunResult with interruptions
3. User calls result.to_state() to get RunState
4. User calls state.approve() or state.reject()
5. User passes state back to Runner.run() to resume
6. Run continues from where it left off

**Remaining Work:**
- Add comprehensive tests
- Create example demonstrating HITL
- Add documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…mentation

This commit completes the human-in-the-loop (HITL) implementation by adding
full streaming support, matching the TypeScript SDK functionality.

**Streaming HITL Support:**

1. **ToolApprovalItem Handling** (_run_impl.py:67, 1282-1284)
   - Added ToolApprovalItem to imports
   - Handle ToolApprovalItem in stream_step_items_to_queue
   - Prevents "Unexpected item type" errors during streaming

2. **NextStepInterruption in Streaming** (run.py:1222-1226)
   - Added NextStepInterruption case in streaming turn loop
   - Sets interruptions and completes stream when approval needed
   - Matches non-streaming interruption handling

3. **RunState Support in run_streamed** (run.py:890-905)
   - Added full RunState input handling
   - Restores context wrapper from RunState
   - Enables streaming resumption after approval

4. **Streaming Tool Execution** (run.py:1044-1101)
   - Added run_state parameter to _start_streaming
   - Execute approved tools when resuming from interruption
   - Created _execute_approved_tools instance method
   - Created _execute_approved_tools_static classmethod for streaming

5. **RunResultStreaming.to_state()** (result.py:401-451)
   - Added to_state() method to RunResultStreaming
   - Enables state serialization from streaming results
   - Includes current_turn for proper state restoration
   - Complete parity with non-streaming RunResult.to_state()

**RunState Enhancements:**

6. **Runtime Imports** (run_state.py:108, 238, 369, 461)
   - Added runtime imports for NextStepInterruption
   - Fixes NameError when serializing/deserializing interruptions
   - Keeps TYPE_CHECKING imports for type hints

7. **from_json() Method** (run_state.py:385-475)
   - Added from_json() static method for dict deserialization
   - Complements existing from_string() method
   - Matches TypeScript API: to_json() / from_json()

**Examples:**

8. **human_in_the_loop.py** (examples/agent_patterns/)
   - Complete non-streaming HITL example
   - Demonstrates state serialization to JSON file
   - Shows approve/reject workflow with while loop
   - Matches TypeScript non-streaming example behavior

9. **human_in_the_loop_stream.py** (examples/agent_patterns/)
   - Complete streaming HITL example
   - Uses Runner.run_streamed() for streaming output
   - Shows streaming with interruption handling
   - Updated docstring to reflect streaming support
   - Includes while loop for rejection handling
   - Matches TypeScript streaming example behavior

**Key Design Decisions:**

- Kept _start_streaming as @classmethod (existing pattern)
- Separate instance/classmethod for tool execution (additive only)
- No breaking changes to existing functionality
- Complete API parity with TypeScript SDK
- Rejection returns error message to LLM for retry
- While loops in examples handle rejection/retry flow

**Testing:**

- ✅ Streaming HITL: interruption, approval, resumption
- ✅ Non-streaming HITL: interruption, approval, resumption
- ✅ State serialization: to_json() / from_json()
- ✅ Tool rejection: message returned, retry possible
- ✅ Examples: both streaming and non-streaming work
- ✅ Code quality: ruff format, ruff check, mypy pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
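Putting the streaming pieces together, the resulting usage pattern looks roughly like this (a sketch mirroring human_in_the_loop_stream.py as described above; event rendering elided):

```python
import asyncio

from agents import Agent, Runner, function_tool


@function_tool(needs_approval=True)
def get_temperature(city: str) -> str:
    return f"The weather in {city} is sunny and 20°C."


async def drain(result) -> None:
    # Consume the stream; a real example would render each event.
    async for _event in result.stream_events():
        pass


async def main() -> None:
    agent = Agent(name="Weather Assistant", tools=[get_temperature])
    result = Runner.run_streamed(agent, "What is the weather in Oakland?")
    await drain(result)

    while result.interruptions:
        state = result.to_state()  # RunResultStreaming.to_state(), added above
        for item in result.interruptions:
            state.approve(item)  # or state.reject(item) after prompting a human
        result = Runner.run_streamed(agent, state)
        await drain(result)

    print(result.final_output)


asyncio.run(main())
```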
…essage duplication
Updated the call_id extraction logic in the _ServerConversationTracker class to consistently use the "call_id" key from output items, removing the fallback to "callId". This change enhances code clarity and ensures uniformity in handling tool call items.
…gent usage and saving tool outputs to session

seratch commented Dec 8, 2025

Thanks for working on this. I ran Codex reviews many times, and it detected these issues:

  • [P1] Resumed HITL flow never executes approved non-function tools — src/agents/_run_impl.py:626-716
    When resuming an interrupted turn after a tool approval, resolve_interrupted_turn only re-runs function tools and computer actions; it never executes queued shell, local
    shell, or apply_patch calls stored in processed_response (execute_shell_calls/execute_apply_patch_calls are not invoked). Any ShellTool/ApplyPatchTool configured with
    needs_approval will therefore return a ToolApprovalItem, but after the user approves and re-runs the agent the tool output is never produced, leaving the workflow stuck
    without the requested side effect or tool output. This affects any HITL approval scenario for these tools.

  • [P1] Resuming pending MCP approvals raises TypeError — src/agents/_run_impl.py:733-784
    When resuming a turn that contains hosted MCP approval requests, resolve_interrupted_turn stores the pending ToolApprovalItem objects in a set (pending_hosted_mcp_approvals).
    ToolApprovalItem is not hashable, so the first add() at this point raises TypeError: unhashable type: 'ToolApprovalItem', aborting the resume flow for any run interrupted on
    an MCP approval. This breaks HITL resumption whenever a hosted MCP approval is still pending (see the sketch after this list).

  • [P1] Route local shell calls to remote shell tool — src/agents/_run_impl.py:1044-1052
    When processing model output, LocalShellCall items are dispatched to shell_tool if one exists (lines 1044–1050) before falling back to local_shell_tool. This means that if
    an agent registers both ShellTool and LocalShellTool, or if the model emits a local_shell_call despite only intending local execution, the call will be run through the remote
    shell tool instead of the local one, bypassing the local tool’s approval/on_approval hooks and executing in the wrong environment. Local shell calls should be handled by
    LocalShellTool; prioritizing shell_tool here misroutes the execution and can break HITL approval or shell behavior for any agent that enables both tools.

  • [P1] Streaming drops prior turn items — src/agents/run.py:3618-3624
    In _run_single_turn_streamed the pre_step_items list was changed to always be empty, so SingleStepResult.generated_items now contains only the current turn’s items instead
    of the accumulated history (streamed_result.new_items). On any streaming run with multiple turns, previous tool calls/outputs are dropped from both the next model request and
    the items persisted/emitted after the first turn, breaking conversation continuity and session persistence. This regression occurs whenever a streamed run progresses beyond
    the first turn.

  • [P1] Preserve max_turns when resuming from RunResult state — src/agents/result.py:213-223
    RunResult.to_state hardcodes max_turns=10 when creating a RunState, assuming the runner will override it. However Runner.run does not pull max_turns from the state—
    it uses the value passed by the caller or defaults to 10. If a run configured with a higher max_turns (e.g., 20) is interrupted for tool approval and resumed via state =
    result.to_state(); Runner.run(agent, state) without re‑passing max_turns, the resume path will treat current_turn from the state against a max_turns of 10 and will raise
    MaxTurnsExceeded on the next turn, even though budget remains. The max_turns used for the run needs to be persisted in the RunState instead of being reset to 10.

  • [P1] Deserialize only function approvals, breaks HITL for other tools — src/agents/run_state.py:983-999
    When restoring a run from JSON, the interruption reconstruction assumes every approval is a ResponseFunctionToolCall and blindly instantiates that type for each rawItem. Any
    HITL approval for shell, apply_patch, MCP, etc. will either fail validation or raise because their rawItem payloads do not match ResponseFunctionToolCall, making resuming
    from saved state impossible for those tool types. The deserializer needs to branch on the tool call type (or keep the original dict) instead of always coercing to a function
    call.

  • [P1] Deserializing interruptions assumes function tool calls — src/agents/run_state.py:1168-1177
    Resuming a saved RunState fails for approvals on non-function tools because from_json blindly instantiates ResponseFunctionToolCall for every interruption. If the pending
    approval was for a shell/apply_patch/hosted tool call (all supported by the new HITL flow), this constructor raises validation errors and the run cannot be resumed. Please
    deserialize rawItem without forcing it to a function call or branch by type so non-function approvals survive round‑tripping.

  • [P2] Avoid error-level logging on tracker creation — src/agents/run.py:161-168
    _ServerConversationTracker.__post_init__ logs an error with a full stack trace every time a tracker is instantiated (lines 161-168). This runs for all server-managed
    conversations, so normal runs now emit error-level messages even when nothing is wrong, polluting logs and alerting systems. Consider removing or downgrading this log, or
    guarding it behind a debug flag.
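On the unhashable-set point, one conventional fix is to key the pending approvals by call id instead of hashing the items; a sketch (the id field name is assumed):

```python
def index_pending_approvals(interruptions: list) -> dict:
    """ToolApprovalItem is unhashable, so track pending hosted MCP approvals
    in a dict keyed by call id rather than in a set of items."""
    pending: dict[str, object] = {}
    for item in interruptions:
        call_id = getattr(item.raw_item, "id", None)  # field name assumed
        if call_id is not None:
            pending[call_id] = item
    return pending
```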

Labels: enhancement (New feature or request), feature:core

Successfully merging this pull request may close: Human-In-The-Loop Architecture should be implemented on top priority!