feat(cli): add --execution-context for automated remote Ray dispatch#593
feat(cli): add --execution-context for automated remote Ray dispatch#593michael-johnston wants to merge 10 commits intomainfrom
Conversation
|
@AlessandroPomponio Docs not complete but take a look and test when you have a moment. |
| def _handle_remote_dispatch( | ||
| execution_context_file: pathlib.Path, | ||
| ado_config: AdoConfiguration, | ||
| project_context_file: pathlib.Path | None, |
There was a problem hiding this comment.
This isn't needed as it's handled by the call to AdoConfiguration.load (via the from_project_context parameter)
| if project_context is None: | ||
| console_print( | ||
| f"{ERROR}Cannot use --execution-context: no project context is active.\n" | ||
| "Activate a context with 'ado context set' or provide one with -c.", | ||
| stderr=True, | ||
| ) | ||
| raise typer.Exit(1) |
There was a problem hiding this comment.
ado_config.project_context cannot be None when AdoConfiguration.load() is used. Also, ado context set doesn't exist, it's ado context
| if project_context is None: | |
| console_print( | |
| f"{ERROR}Cannot use --execution-context: no project context is active.\n" | |
| "Activate a context with 'ado context set' or provide one with -c.", | |
| stderr=True, | |
| ) | |
| raise typer.Exit(1) |
| # Resolve the project context file path | ||
| if project_context_file is not None: | ||
| resolved_ctx_file = project_context_file.resolve() | ||
| else: | ||
| resolved_ctx_file = ado_config.project_context_path_from_active_context() | ||
|
|
||
| if resolved_ctx_file is None or not resolved_ctx_file.is_file(): | ||
| console_print( | ||
| f"{ERROR}Cannot resolve the project context file for remote dispatch. " | ||
| "Ensure a context is active or provide one with -c.", | ||
| stderr=True, | ||
| ) | ||
| raise typer.Exit(1) |
There was a problem hiding this comment.
The ProjectContext is already checked by AdoConfiguration. The simplest thing is to dump it and avoid these checks
| # Store on ado_config for potential downstream use | ||
| ado_config.set_execution_context(execution_context) |
There was a problem hiding this comment.
The execution context gets removed from the CLI args so I don't think it's worth saving, right?
| def _strip_context_flag(args: list[str]) -> list[str]: | ||
| """Return *args* with any ``-c``/``--context`` flag and its value removed. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| args: | ||
| Argument list to strip. | ||
|
|
||
| Returns | ||
| ------- | ||
| list[str] | ||
| A copy of *args* without any ``-c``/``--context`` flag. | ||
| """ | ||
| result: list[str] = [] | ||
| skip_next = False | ||
| for arg in args: | ||
| if skip_next: | ||
| skip_next = False | ||
| continue | ||
| if arg in _CONTEXT_FLAGS: | ||
| skip_next = True | ||
| continue | ||
| if any(arg.startswith(f"{flag}=") for flag in _CONTEXT_FLAGS): | ||
| continue | ||
| result.append(arg) | ||
| return result |
There was a problem hiding this comment.
This seems very similar to remove_execution_context_from_argv
There was a problem hiding this comment.
My worry with this code is that it's doing argument parsing again from scratch...
Unfortunately it seems to me that Typer/Click do forward checks with the options/arguments passed to a command and that the top-level context cannot see the totality of the command that has been provided in a Typer-native way. With this, I mean that from the ado callback I can't see the values that are passed to the create callback and override them. Or at least, I haven't been able to figure out a way to do so.
What I would suggest trying is to have a look at argparse.parse_known_args (https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser.parse_known_args) by instantiating an ArgumentParser and feeding it well known arguments that we look for (i.e., --override-ado-app-dir, --with, etc) so that we reduce the manual parsing effort.
| @pydantic.field_validator("clusterUrl", mode="before") | ||
| @classmethod | ||
| def validate_cluster_url(cls, value: object) -> object: | ||
| """Validate that clusterUrl is a well-formed URL with a scheme and host. | ||
|
|
||
| Raises | ||
| ------ | ||
| ValueError | ||
| If the value is not a string or does not have a recognisable URL | ||
| scheme and host component. | ||
| """ | ||
| if not isinstance(value, str): | ||
| return value | ||
| try: | ||
| parsed = pydantic.AnyUrl(value) | ||
| except Exception as exc: | ||
| raise ValueError( | ||
| f"clusterUrl '{value}' is not a valid URL. " | ||
| "Expected a full URL including scheme, e.g. http://localhost:8265" | ||
| ) from exc | ||
| if not parsed.host: | ||
| raise ValueError( | ||
| f"clusterUrl '{value}' must include a host, " | ||
| "e.g. http://localhost:8265" | ||
| ) | ||
| return value |
There was a problem hiding this comment.
Is there a specific reason we can't annotate the field with HttpUrl instead of having this validator?
| ] | ||
|
|
||
|
|
||
| class ExecutionContext(pydantic.BaseModel): |
There was a problem hiding this comment.
We have ProjectContext already and the names are very similar. I think this will likely cause the users to be confused
There was a problem hiding this comment.
Maybe not for this PR?
There was a problem hiding this comment.
Also maybe not for this PR?
What's Added
A new global
--execution-context <file>option for the ado CLI. When provided, any ado command is automatically dispatched to a remote Ray cluster instead of running locally. The entire workflow — copying files, building plugin wheels, generating a Ray runtime environment, setting up a port-forward, submitting the job, and tearing down — is handled by ado with no manual steps required.New submodules
Main changes to existing code
orchestrator/cli/core/cli.py--execution-contextoption tocommon_options; added_handle_remote_dispatch()that validates the active context is non-SQLite, loads theExecutionContext, and callsdispatch()adocommand for remote executionorchestrator/cli/core/config.pyset_execution_context()method andexecution_contextproperty toAdoConfigurationorchestrator/cli/utils/output/prints.pyADO_SPINNER_REMOTE_PREPARING_FILES,ADO_SPINNER_REMOTE_PORT_FORWARDwebsite/docs/getting-started/remote_run.md--execution-contextsection with YAML examplesExample
Running an optimisation operation (using examples/optimization_test_functions/) with the port-forward execution context.
Where
execution_context.yamlcontains: