AI-Dynamo updates. #791

greptile-apps · 2026-02-03T21:04:33Z

Check if tdef.cmd_args.genai_perf.args exists before accessing it
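
A minimal sketch of the guard this comment asks for, assuming `genai_perf` or its `args` may be unset on the test definition. The dataclasses below are hypothetical stand-ins that only mirror the attribute path `tdef.cmd_args.genai_perf.args`; they are not the real CloudAI models.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


# Hypothetical stand-ins; only the attribute names follow the review comment.
@dataclass
class GenAIPerfStub:
    args: Optional[dict[str, Any]] = None


@dataclass
class CmdArgsStub:
    genai_perf: Optional[GenAIPerfStub] = None


@dataclass
class TdefStub:
    cmd_args: CmdArgsStub = field(default_factory=CmdArgsStub)


def genai_perf_args(tdef: TdefStub) -> dict[str, Any]:
    """Return the genai_perf args, or an empty dict when any link is missing."""
    genai_perf = tdef.cmd_args.genai_perf
    if genai_perf is None or genai_perf.args is None:
        return {}
    return genai_perf.args
```

Returning an empty dict keeps callers free of repeated `None` checks; raising an explicit error instead would be equally reasonable if a missing `args` table should fail the run.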

coderabbitai · 2026-02-01T10:42:19Z

⚠️ Potential issue | 🟡 Minor

Wrapper flag mismatch: --report_file isn’t parsed by genai_perf.sh.
The script accepts --report_name, so this flag is ignored. Use --report_name (or support both in the script) to avoid silent defaults.

🔧 Proposed fix

-    "--report_file",
+    "--report_name",

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

     report_file = "genai_perf_report.csv"
     wrapper_cmd = [
         "/bin/bash",
         pod_wrapper_path,
         "--result_dir",
         genai_perf_results_path,
-        "--report_file",
+        "--report_name",
         report_file,
         "--calc_percentile_csv_script",
         pod_calc_csv_path,

🤖 Prompt for AI Agents

In `@src/cloudai/systems/kubernetes/kubernetes_system.py` around lines 347 - 356, The wrapper flag used in the wrapper_cmd array is incorrect: genai_perf.sh expects --report_name but the code passes --report_file (see variables wrapper_cmd, report_file, pod_wrapper_path); update the argument list in wrapper_cmd to use "--report_name" (or add both "--report_name" and "--report_file" if you want backward compatibility) so the script receives the intended report name (ensure the value remains report_file and keep pod_calc_csv_path and genai_perf_results_path unchanged).
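
One way to keep this kind of mismatch from passing silently is to check the assembled command against the flags the wrapper is known to parse. The sketch below is illustrative only: `ACCEPTED_FLAGS` and `build_wrapper_cmd` are hypothetical names, and the flag set lists only the flags named in this comment, so the real `genai_perf.sh` may accept more.

```python
# Assumption: only the flags named in the review comment are listed here;
# the real genai_perf.sh may parse additional ones.
ACCEPTED_FLAGS = {"--result_dir", "--report_name", "--calc_percentile_csv_script"}


def build_wrapper_cmd(
    wrapper_path: str, results_path: str, report_name: str, calc_csv_path: str
) -> list[str]:
    """Assemble the wrapper command and reject flags the script would ignore."""
    cmd = [
        "/bin/bash",
        wrapper_path,
        "--result_dir",
        results_path,
        "--report_name",
        report_name,
        "--calc_percentile_csv_script",
        calc_csv_path,
    ]
    # Any token starting with "--" is treated as a flag, so values that
    # themselves begin with "--" would also be reported here.
    unknown = [tok for tok in cmd if tok.startswith("--") and tok not in ACCEPTED_FLAGS]
    if unknown:
        raise ValueError(f"flags not parsed by the wrapper script: {unknown}")
    return cmd
```

With a check like this, passing `--report_file` would raise at command-build time instead of letting the script fall back to its default report name.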

@@ -1,5 +1,5 @@
     # SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
-    # Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+    # Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     # SPDX-License-Identifier: Apache-2.0
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
@@ Expand Down Expand Up @@
       decode-cmd = 'python3 -m dynamo.vllm'
         [cmd_args.dynamo.decode_worker]
-        pipeline-parallel-size = 1
+        num-nodes = 1
+          [cmd_args.dynamo.decode_worker.args]
+          model = "Qwen/Qwen3-0.6B"
+          gpu-memory-utilization = 0.95
+          tensor-parallel-size = 8
+          pipeline-parallel-size = 1
+          data-parallel-size = 1
       [cmd_args.genai_perf]
       model = "Qwen/Qwen3-0.6B"
@@ Expand All @@
       concurrency = 2
       extra-args = "--streaming -- -v --async"
+      [cmd_args.lmcache]
+      [cmd_args.lmbench]
     [extra_env_vars]
     UCX_LOG_LEVEL = "warn"
     UCX_TLS = "cuda_copy,rc_x"
@@ Expand Down @@

@@ -1,5 +1,5 @@
     # SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
-    # Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+    # Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     # SPDX-License-Identifier: Apache-2.0
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
@@ -24,7 +24,10 @@ test_name = "vLLM-Qwen3-0.6B"
         [Tests.cmd_args.dynamo]
           [Tests.cmd_args.dynamo.prefill_worker]
           num-nodes = 1
-          tensor-parallel-size = 8
+            [Tests.cmd_args.dynamo.prefill_worker.args]
+            tensor-parallel-size = 8
           [Tests.cmd_args.dynamo.decode_worker]
           num-nodes = 1
-          tensor-parallel-size = 8
+            [Tests.cmd_args.dynamo.decode_worker.args]
+            tensor-parallel-size = 8

@@ -1,5 +1,5 @@
     # SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
-    # Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+    # Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     # SPDX-License-Identifier: Apache-2.0
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
@@ -25,13 +25,15 @@ time_limit = "00:10:00"
       [Tests.cmd_args.dynamo.prefill_worker]
       num-nodes = 1
-      tensor-parallel-size = 4
-      pipeline-parallel-size = 1
+        [Tests.cmd_args.dynamo.prefill_worker.args]
+        tensor-parallel-size = 4
+        pipeline-parallel-size = 1
       [Tests.cmd_args.dynamo.decode_worker]
       num-nodes = 1
-      tensor-parallel-size = 4
-      pipeline-parallel-size = 1
+        [Tests.cmd_args.dynamo.decode_worker.args]
+        tensor-parallel-size = 4
+        pipeline-parallel-size = 1
     [[Tests]]
     id = "test.disagg.multinode"
@@ -41,10 +43,12 @@ time_limit = "00:10:00"
       [Tests.cmd_args.dynamo.prefill_worker]
       num-nodes = 2
-      tensor-parallel-size = 4
-      pipeline-parallel-size = 1
+        [Tests.cmd_args.dynamo.prefill_worker.args]
+        tensor-parallel-size = 4
+        pipeline-parallel-size = 1
       [Tests.cmd_args.dynamo.decode_worker]
       num-nodes = 2
-      tensor-parallel-size = 4
-      pipeline-parallel-size = 1
+        [Tests.cmd_args.dynamo.decode_worker.args]
+        tensor-parallel-size = 4
+        pipeline-parallel-size = 1
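
The three config diffs above make the same move: engine options such as `tensor-parallel-size` leave the worker table and go into a nested `args` table, while `num-nodes` stays where it was. A rough sketch of that transformation on already-parsed tables follows. `nest_worker_args` and `WORKER_LEVEL_KEYS` are hypothetical names, and the assumption that only `num-nodes` stays at worker level is read off these diffs rather than taken from the model definitions.

```python
from typing import Any

# Assumption: per the diffs above, only num-nodes remains on the worker table.
WORKER_LEVEL_KEYS = {"num-nodes"}


def nest_worker_args(worker: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a worker table with flat engine options moved under 'args'."""
    nested: dict[str, Any] = {}
    args: dict[str, Any] = dict(worker.get("args", {}))
    for key, value in worker.items():
        if key == "args":
            continue
        if key in WORKER_LEVEL_KEYS:
            nested[key] = value
        else:
            # An explicit entry already under 'args' wins over a legacy flat one.
            args.setdefault(key, value)
    if args:
        nested["args"] = args
    return nested
```

Applied to the old multinode prefill table (`num-nodes = 2`, `tensor-parallel-size = 4`, `pipeline-parallel-size = 1`), this yields `num-nodes` at the top level and both parallelism settings under `args`, matching the new layout.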

@@ Expand Up @@
             super().__init__(system, test_run)
             self.system = cast(SlurmSystem, system)
             self.test_run = test_run
+            self.container_install_path = "/cloudai_install"
+            self.container_results_path = "/cloudai_run_results"
             self._node_spec_cache: dict[str, tuple[int, list[str]]] = {}
@@ -79,8 +81,8 @@ def container_mounts(self) -> list[str]:
                 repo_mounts.append(f"{path}:{repo.container_mount}")
             mounts = [
-                f"{self.test_run.output_path.absolute()}:/cloudai_run_results",
-                f"{self.system.install_path.absolute()}:/cloudai_install",
+                f"{self.test_run.output_path.absolute()}:{self.container_results_path}",
+                f"{self.system.install_path.absolute()}:{self.container_install_path}",
                 f"{self.test_run.output_path.absolute()}",
                 *tdef.extra_container_mounts,
                 *repo_mounts,
@@ -302,7 +304,7 @@ def _ranks_mapping_cmd(self) -> str:
         def _metadata_cmd(self) -> str:
             (self.test_run.output_path.absolute() / "metadata").mkdir(parents=True, exist_ok=True)
             num_nodes, _ = self.get_cached_nodes_spec()
-            metadata_script_path = "/cloudai_install"
+            metadata_script_path = self.container_install_path
             if not self.image_path():
                 metadata_script_path = str(self.system.install_path.absolute())
             return " ".join(
@@ Expand Down @@
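
The change above replaces two hard-coded container paths with instance attributes, so the mount list and the metadata script path read the same values. A trimmed-down sketch of that pattern is below. `MountPaths` and its `has_image` flag are hypothetical simplifications of the strategy class and its `image_path()` check, not the actual implementation.

```python
from pathlib import Path


class MountPaths:
    """Hypothetical, reduced stand-in for the command-gen strategy in the diff."""

    def __init__(self, output_path: Path, install_path: Path, has_image: bool = True) -> None:
        self.output_path = output_path
        self.install_path = install_path
        self.has_image = has_image  # stands in for bool(self.image_path())
        # Container-side paths defined once and reused everywhere below.
        self.container_install_path = "/cloudai_install"
        self.container_results_path = "/cloudai_run_results"

    def container_mounts(self) -> list[str]:
        return [
            f"{self.output_path.absolute()}:{self.container_results_path}",
            f"{self.install_path.absolute()}:{self.container_install_path}",
        ]

    def metadata_script_path(self) -> str:
        # Without a container image the script runs from the host install path.
        if not self.has_image:
            return str(self.install_path.absolute())
        return self.container_install_path
```

Because both methods read the same attributes, a subclass or test can override `container_install_path` in one place without the mount list and the script path drifting apart.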

@@ -1,5 +1,5 @@
     # SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
-    # Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+    # Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     # SPDX-License-Identifier: Apache-2.0
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
@@ -19,8 +19,13 @@
         AIDynamoCmdArgs,
         AIDynamoTestDefinition,
         DecodeWorkerArgs,
-        GenAIPerfArgs,
+        GenAIPerf,
+        LMBench,
+        LMCache,
+        LMCacheArgs,
         PrefillWorkerArgs,
+        WorkerBaseArgs,
+        WorkerConfig,
     )
     from .kubernetes_json_gen_strategy import AIDynamoKubernetesJsonGenStrategy
     from .report_generation_strategy import AIDynamoReportGenerationStrategy
@@ -34,6 +39,11 @@
         "AIDynamoSlurmCommandGenStrategy",
         "AIDynamoTestDefinition",
         "DecodeWorkerArgs",
-        "GenAIPerfArgs",
+        "GenAIPerf",
+        "LMBench",
+        "LMCache",
+        "LMCacheArgs",
         "PrefillWorkerArgs",
+        "WorkerBaseArgs",
+        "WorkerConfig",
     ]
