Enhancement: add -- pass-through for arbitrary vLLM arguments

Implements Unix-style pass-through: any vLLM argument can be passed
after the `--` separator. These arguments are appended verbatim to the
generated vLLM command.

Examples:
  ./run-recipe.py model --solo -- --load-format safetensors
  ./run-recipe.py model --solo -- --served-model-name my-api
  ./run-recipe.py model --solo -- -cc.cudagraph_mode=PIECEWISE

Features:
- Uses parse_known_args() to capture arguments after --
- Warns when extra args duplicate CLI overrides (--port, --tp, etc.)
- Works in both solo and cluster modes

Adds 10 integration tests covering:
- --load-format, --served-model-name, equals syntax
- Multiple arguments, empty --, cluster mode
- Duplicate detection warnings for port/tp/gpu-mem

Closes #30
Raphael Amorim
2026-02-08 02:36:49 -05:00
parent 8cb956b972
commit b7c3cdcfcb
3 changed files with 327 additions and 3 deletions

@@ -191,11 +191,36 @@ Launch options:
  -t, --container IMAGE   Override container from recipe
  --nccl-debug LEVEL      NCCL debug level (VERSION, WARN, INFO, TRACE)

Extra vLLM arguments:
  -- ARGS...              Pass additional arguments directly to vLLM

Other:
  --dry-run               Show what would be executed
  --list, -l              List available recipes
```
## Extra vLLM Arguments
Use the Unix-style `--` separator to pass additional arguments directly to vLLM. Any arguments after `--` are appended verbatim to the vLLM command.
```bash
# Override load format
./run-recipe.sh my-recipe --solo -- --load-format safetensors
# Set a custom served model name
./run-recipe.sh my-recipe --solo -- --served-model-name my-api-name
# Configure CUDA graph mode
./run-recipe.sh my-recipe --solo -- -cc.cudagraph_mode=PIECEWISE
# Multiple extra arguments
./run-recipe.sh my-recipe --solo -- --load-format auto --enforce-eager --seed 42
```
These arguments are appended to the end of the generated vLLM command after all template substitutions.
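
The pass-through behavior can be pictured with a minimal sketch. This is not the launcher's actual code: the commit message says the implementation relies on `parse_known_args()`, whereas this sketch splits the argument list by hand to keep the behavior easy to follow. The function names and the `vllm serve my-model` command string are hypothetical stand-ins.

```python
import shlex


def split_passthrough(argv):
    """Split argv at the first bare `--` into (launcher_args, extra_vllm_args)."""
    if "--" in argv:
        idx = argv.index("--")
        return argv[:idx], argv[idx + 1:]
    return list(argv), []


def append_extra_args(vllm_command, extra_args):
    """Append pass-through arguments to an already-rendered command string."""
    if not extra_args:
        return vllm_command  # an empty `--` leaves the command unchanged
    # shlex.join quotes only where needed, so each argument stays one token
    return f"{vllm_command} {shlex.join(extra_args)}"


launcher_args, extra = split_passthrough(
    ["my-recipe", "--solo", "--", "--load-format", "safetensors"]
)
# "vllm serve my-model" stands in for the recipe's rendered template (assumption).
print(append_extra_args("vllm serve my-model", extra))
```

Because the extra arguments are joined on after the command string is built, they never go through the recipe's template substitution.
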
**Duplicate Detection**: If an argument passed after `--` conflicts with a CLI override you also set (e.g., `--port` after `--` when you also used the launcher's `--port` option), a warning is shown, because the extra argument may replace your CLI override value.
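
One way such a check could work is sketched below. The mapping from override names to vLLM flags is an assumption made for illustration (the launcher's real table and function names are not shown in this document), and the sketch also handles the `--flag=value` form.

```python
# Hypothetical mapping: launcher override name -> vLLM flags it controls.
OVERRIDE_TO_VLLM_FLAGS = {
    "port": {"--port"},
    "tp": {"--tensor-parallel-size", "-tp"},
    "gpu_mem": {"--gpu-memory-utilization"},
}


def find_conflicts(active_overrides, extra_args):
    """Return the extra-arg flags that collide with overrides the user set."""
    conflicts = []
    for arg in extra_args:
        flag = arg.split("=", 1)[0]  # treat "--port=9000" like "--port"
        for name in sorted(active_overrides):
            if flag in OVERRIDE_TO_VLLM_FLAGS.get(name, set()):
                conflicts.append(flag)
    return conflicts


for flag in find_conflicts({"port"}, ["--port=9000", "--seed", "42"]):
    print(f"Warning: extra arg {flag} duplicates a CLI override")
```

The check only warns; it does not drop either value, which matches the "appended verbatim" behavior described above.
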
## Creating a Recipe
1. Create a new `.yaml` file in `recipes/`