feat(cli): add --execution-context for automated remote Ray dispatch#593

Draft
michael-johnston wants to merge 10 commits into main from maj_execution_contexts

@michael-johnston
Member

What's Added

A new global --execution-context <file> option for the ado CLI. When provided, any ado command is automatically dispatched to a remote Ray cluster instead of running locally. The entire workflow — copying files, building plugin wheels, generating a Ray runtime environment, setting up a port-forward, submitting the job, and tearing down — is handled by ado with no manual steps required.
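The ordering and teardown guarantee described above can be sketched with `contextlib.ExitStack`. This is only an illustration under stated assumptions: `phase`, `dispatch`, and the phase names are hypothetical and are not taken from the PR's `orchestrator/cli/utils/remote/` code.

```python
import contextlib
from collections.abc import Callable, Iterator


@contextlib.contextmanager
def phase(name: str, log: list[str]) -> Iterator[None]:
    """Hypothetical phase: record when it starts and when its cleanup runs."""
    log.append(f"start:{name}")
    try:
        yield
    finally:
        # Cleanup runs whether the job submission succeeded or raised.
        log.append(f"teardown:{name}")


def dispatch(submit: Callable[[], str], log: list[str]) -> str:
    """Enter the phases in order, submit, then unwind them in reverse order."""
    with contextlib.ExitStack() as stack:
        stack.enter_context(phase("staging-dir", log))
        stack.enter_context(phase("port-forward", log))
        return submit()
```

Registering each phase on one stack means that a failure during submission still closes the port-forward before the staged files are removed.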

New submodules

  • orchestrator/core/executioncontext/
    • Defines the ExecutionContext Pydantic model and its sub-models
  • orchestrator/cli/utils/remote/
    • Implements the remote dispatch pipeline

Main changes to existing code

| File | Change | Reason |
| --- | --- | --- |
| orchestrator/cli/core/cli.py | Added --execution-context option to common_options; added _handle_remote_dispatch() that validates the active context is non-SQLite, loads the ExecutionContext, and calls dispatch() | Entry point for intercepting any ado command for remote execution |
| orchestrator/cli/core/config.py | Added set_execution_context() method and execution_context property to AdoConfiguration | Avoids external code setting private attributes directly |
| orchestrator/cli/utils/output/prints.py | Added spinner message constants ADO_SPINNER_REMOTE_PREPARING_FILES, ADO_SPINNER_REMOTE_PORT_FORWARD | Consistent progress display during long-running dispatch phases |
| website/docs/getting-started/remote_run.md | Added a new recommended --execution-context section with YAML examples | Replaces the existing multi-step manual workflow with the automated one |

Example

Running an optimisation operation (using examples/optimization_test_functions/) with the port-forward execution context.

# From examples/optimization_test_functions/
ado --execution-context execution_context.yaml \
    create operation -f operation_nevergrad.yaml \
    --with space=space.yaml

Where execution_context.yaml contains:

executionType:
  type: cluster
  clusterUrl: "http://localhost:8265"
  portForward:
    namespace: discovery-dev
    serviceName: ray-disorch-head-svc
    localPort: 8265
packages:
  fromPyPI:
    - ray==2.52.1 # Required to match cluster ray
  fromSource:
    - ../../              # ado-core
    - custom_experiments  # in-tree custom experiments
wait: true
envVars:
  PYTHONUNBUFFERED: "x"
  OMP_NUM_THREADS: "1"
  OPENBLAS_NUM_THREADS: "1"
  RAY_AIR_NEW_PERSISTENCE_MODE: "0"
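As a rough sketch of how a file like this could be turned into a Ray runtime environment, the function below maps the `packages` and `envVars` sections onto the `pip`, `py_modules`, and `env_vars` keys that Ray's `runtime_env` accepts. The function name, the `wheel_paths` argument, and the mapping itself are assumptions for illustration; the PR's actual generator may differ.

```python
from typing import Any


def build_runtime_env(
    context: dict[str, Any], wheel_paths: list[str]
) -> dict[str, Any]:
    """Hypothetical mapping from a parsed execution context to a runtime_env dict.

    `wheel_paths` stands in for wheels already built from `packages.fromSource`.
    """
    packages = context.get("packages", {})
    runtime_env: dict[str, Any] = {}

    pypi_requirements = list(packages.get("fromPyPI", []))
    if pypi_requirements:
        runtime_env["pip"] = pypi_requirements

    if wheel_paths:
        # Assumption: locally built wheels are shipped as py_modules entries.
        runtime_env["py_modules"] = list(wheel_paths)

    # Environment variable values must be strings.
    env_vars = {key: str(value) for key, value in context.get("envVars", {}).items()}
    if env_vars:
        runtime_env["env_vars"] = env_vars

    return runtime_env
```

Keeping empty sections out of the result avoids sending keys that the cluster would have to interpret as "nothing to install".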

@michael-johnston
Member Author

@AlessandroPomponio Docs not complete but take a look and test when you have a moment.

@michael-johnston michael-johnston added the enhancement New feature or request label Feb 20, 2026
@michael-johnston michael-johnston linked an issue Feb 20, 2026 that may be closed by this pull request
def _handle_remote_dispatch(
    execution_context_file: pathlib.Path,
    ado_config: AdoConfiguration,
    project_context_file: pathlib.Path | None,
Member

This isn't needed as it's handled by the call to AdoConfiguration.load (via the from_project_context parameter)

Comment on lines +247 to +253
if project_context is None:
    console_print(
        f"{ERROR}Cannot use --execution-context: no project context is active.\n"
        "Activate a context with 'ado context set' or provide one with -c.",
        stderr=True,
    )
    raise typer.Exit(1)
Member

ado_config.project_context cannot be None when AdoConfiguration.load() is used. Also, ado context set doesn't exist; the command is ado context.

Suggested change
if project_context is None:
    console_print(
        f"{ERROR}Cannot use --execution-context: no project context is active.\n"
        "Activate a context with 'ado context set' or provide one with -c.",
        stderr=True,
    )
    raise typer.Exit(1)

Comment on lines +279 to +291
# Resolve the project context file path
if project_context_file is not None:
    resolved_ctx_file = project_context_file.resolve()
else:
    resolved_ctx_file = ado_config.project_context_path_from_active_context()

if resolved_ctx_file is None or not resolved_ctx_file.is_file():
    console_print(
        f"{ERROR}Cannot resolve the project context file for remote dispatch. "
        "Ensure a context is active or provide one with -c.",
        stderr=True,
    )
    raise typer.Exit(1)
Member

The ProjectContext is already checked by AdoConfiguration. The simplest thing is to dump it and avoid these checks

Comment on lines +276 to +277
# Store on ado_config for potential downstream use
ado_config.set_execution_context(execution_context)
Member

The execution context gets removed from the CLI args so I don't think it's worth saving, right?

Comment on lines +181 to +206
def _strip_context_flag(args: list[str]) -> list[str]:
    """Return *args* with any ``-c``/``--context`` flag and its value removed.

    Parameters
    ----------
    args:
        Argument list to strip.

    Returns
    -------
    list[str]
        A copy of *args* without any ``-c``/``--context`` flag.
    """
    result: list[str] = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False
            continue
        if arg in _CONTEXT_FLAGS:
            skip_next = True
            continue
        if any(arg.startswith(f"{flag}=") for flag in _CONTEXT_FLAGS):
            continue
        result.append(arg)
    return result
Member

This seems very similar to remove_execution_context_from_argv

Member

My worry with this code is that it's doing argument parsing again from scratch...

Unfortunately, it seems that Typer/Click validate the options and arguments of each command as they go, and that the top-level context cannot see the full command line in a Typer-native way. By this I mean that from the ado callback I can't see the values that are passed to the create callback and override them. Or at least, I haven't been able to figure out a way to do so.

What I would suggest trying is argparse.parse_known_args (https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser.parse_known_args): instantiate an ArgumentParser and feed it the well-known arguments that we look for (e.g., --override-ado-app-dir, --with) so that we reduce the manual parsing effort.
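A minimal sketch of that suggestion, assuming only the `-c`/`--context` pair needs to be removed: `strip_context_flag` here is a hypothetical replacement for the hand-written loop, not code from the PR. `add_help=False` lets `-h` pass through untouched, and `allow_abbrev=False` stops argparse from treating a prefix such as `--con` as `--context`.

```python
import argparse


def strip_context_flag(args: list[str]) -> list[str]:
    """Return args without -c/--context and its value, using argparse."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # Declare only the flag we want to consume; everything else is "unknown".
    parser.add_argument("-c", "--context")
    _, remaining = parser.parse_known_args(args)
    return remaining
```

Two caveats: a trailing `-c` with no value makes argparse exit with a usage error rather than silently dropping the flag, and, like the original loop, this consumes `-c` wherever it appears, including after a subcommand that might give `-c` another meaning.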

Comment on lines +76 to +101
@pydantic.field_validator("clusterUrl", mode="before")
@classmethod
def validate_cluster_url(cls, value: object) -> object:
    """Validate that clusterUrl is a well-formed URL with a scheme and host.

    Raises
    ------
    ValueError
        If the value is not a string or does not have a recognisable URL
        scheme and host component.
    """
    if not isinstance(value, str):
        return value
    try:
        parsed = pydantic.AnyUrl(value)
    except Exception as exc:
        raise ValueError(
            f"clusterUrl '{value}' is not a valid URL. "
            "Expected a full URL including scheme, e.g. http://localhost:8265"
        ) from exc
    if not parsed.host:
        raise ValueError(
            f"clusterUrl '{value}' must include a host, "
            "e.g. http://localhost:8265"
        )
    return value
Member

Is there a specific reason we can't annotate the field with HttpUrl instead of having this validator?
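For comparison, a sketch of the annotated alternative, assuming Pydantic v2 (which the `field_validator` decorator above implies). `ClusterExecution` is a hypothetical stand-in for the PR's sub-model, reduced to the one field under discussion.

```python
import pydantic


class ClusterExecution(pydantic.BaseModel):
    """Hypothetical stand-in for the sub-model that holds clusterUrl."""

    # HttpUrl requires an http or https scheme and a host.
    clusterUrl: pydantic.HttpUrl


ok = ClusterExecution(clusterUrl="http://localhost:8265")
print(ok.clusterUrl.host, ok.clusterUrl.port)

try:
    ClusterExecution(clusterUrl="localhost:8265")  # no http/https scheme
except pydantic.ValidationError as exc:
    print(f"rejected with {exc.error_count()} error(s)")
```

One difference to weigh: the field then holds a URL object rather than the original string, and Pydantic may normalise it (for example by appending a trailing slash), which matters if the value is later passed on verbatim.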

]


class ExecutionContext(pydantic.BaseModel):
Member

We already have ProjectContext, and the two names are very similar. I think this is likely to confuse users.

Member

Maybe not for this PR?

Member

Also maybe not for this PR?

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

feat: cli support for running remote operations

2 participants