Implements Unix-style pass-through: any vLLM argument can be passed
after the `--` separator. These arguments are appended verbatim to the
generated vLLM command.
Examples:
  ./run-recipe.py model --solo -- --load-format safetensors
  ./run-recipe.py model --solo -- --served-model-name my-api
  ./run-recipe.py model --solo -- -cc.cudagraph_mode=PIECEWISE
Features:
- Uses parse_known_args() to capture arguments after --
- Warns when extra args duplicate CLI overrides (--port, --tp, etc.)
- Works in both solo and cluster modes
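
The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this change: the parser definition, the `OVERRIDE_FLAGS` set, the helper names (`parse_with_passthrough`, `find_duplicates`, `build_command`), and the `vllm serve <recipe>` command shape are all assumptions made for the example.

```python
import argparse

# Hypothetical set of vLLM flags that the script's own CLI overrides might
# already emit; the real list in run-recipe.py may differ.
OVERRIDE_FLAGS = {
    "--port",
    "--tensor-parallel-size",
    "-tp",
    "--gpu-memory-utilization",
}


def build_parser():
    # Illustrative subset of the CLI; only what the example needs.
    parser = argparse.ArgumentParser(prog="run-recipe.py")
    parser.add_argument("recipe")
    parser.add_argument("--solo", action="store_true")
    parser.add_argument("--port")
    return parser


def parse_with_passthrough(argv):
    """Return (known_args, extra_args); extra_args are the leftovers."""
    args, extra = build_parser().parse_known_args(argv)
    # argparse may leave the separator itself at the front of the
    # leftovers, so drop it if it is there.
    if extra and extra[0] == "--":
        extra = extra[1:]
    return args, extra


def find_duplicates(extra):
    """List pass-through flags that collide with a CLI override."""
    # Split on '=' so that both '--port 9000' and '--port=9000' match.
    names = (token.split("=", 1)[0] for token in extra)
    return [name for name in names if name in OVERRIDE_FLAGS]


def build_command(args, extra):
    # Assumed command shape; extra args are appended verbatim at the end.
    cmd = ["vllm", "serve", args.recipe]
    if args.port:
        cmd += ["--port", args.port]
    return cmd + extra
```

Note that `parse_known_args()` also returns unrecognized options that appear before `--`, so a real implementation may want to treat those separately from the intended pass-through arguments.
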
Adds 10 integration tests covering:
- --load-format, --served-model-name, equals syntax
- Multiple arguments, empty --, cluster mode
- Duplicate detection warnings for port/tp/gpu-mem
Closes #30