spark-vllm-docker

Author	SHA1	Message	Date
Eugene Rakhmatulin	00c16746e5	Handle new copy hosts setup in run-recipe.py	2026-03-26 16:45:35 -07:00
Eugene Rakhmatulin	a78e221de3	Autodiscovery refactoring with mesh support	2026-03-26 15:47:41 -07:00
Eugene Rakhmatulin	c08b34a218	add --config passthrough to run-recipe	2026-03-25 23:35:52 -07:00
Eugene Rakhmatulin	ad2cd3373f	.env configuration support for launch-cluster.sh	2026-03-25 14:18:00 -07:00
Eugene Rakhmatulin	7e4150feed	Added master-port argument	2026-03-18 16:57:55 -07:00
Eugene Rakhmatulin	2755b62d12	Fixes #108	2026-03-18 13:26:39 -07:00
Eugene Rakhmatulin	ccea2ba861	Bugfixes	2026-03-17 13:54:42 -07:00
Eugene Rakhmatulin	957605498c	Added extra passthrough variables to run-recipe	2026-03-17 13:41:40 -07:00
Eugene Rakhmatulin	b1eeefc0eb	Changed Nemotron-3-Nano-NVFP4 to Marlin backend	2026-03-17 13:10:48 -07:00
Eugene Rakhmatulin	03b055d7f0	Major cluster orchestration refactoring to support running without Ray	2026-03-13 11:55:18 -07:00
eugr	0019bdf5ed	Merge pull request #85 from saladinomario/feat/recipe-env-passthrough: Add -e/--env passthrough to run-recipe.py	2026-03-10 10:28:29 -07:00
mariosaladino	f95beba566	Add -e/--env passthrough to run-recipe.py. Fixes #81. Allows passing environment variables (e.g. HF_TOKEN) through to the container when launching via recipes, mirroring the existing -e flag in launch-cluster.sh. Usage: `./run-recipe.sh glm-4.7-flash-awq --solo -e HF_TOKEN=$HF_TOKEN`	2026-03-06 21:50:29 +01:00
L.B.R.	50b3ca60f3	Fix shell quoting for exec command arguments. Arguments with special characters (e.g. JSON strings) were passed unquoted, causing breakage for commands like: `--speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'`. Use printf %q in launch-cluster.sh and shlex.quote() in run-recipe.py to properly escape arguments.	2026-03-04 15:22:42 +00:00
Raphael Amorim	d07ad5450f	Adding solo_only option to the recipe	2026-02-09 17:03:57 -05:00
Raphael Amorim	b7c3cdcfcb	Enhancement: add -- pass-through for arbitrary vLLM arguments. Implements Unix-style pass-through allowing any vLLM argument to be passed after the `--` separator. Arguments are appended verbatim to the generated vLLM command. Examples: `./run-recipe.py model --solo -- --load-format safetensors`; `./run-recipe.py model --solo -- --served-model-name my-api`; `./run-recipe.py model --solo -- -cc.cudagraph_mode=PIECEWISE`. Features: uses parse_known_args() to capture arguments after `--`; warns when extra args duplicate CLI overrides (--port, --tp, etc.); works in both solo and cluster modes. Adds 10 integration tests covering: --load-format, --served-model-name, equals syntax; multiple arguments, empty `--`, cluster mode; duplicate detection warnings for port/tp/gpu-mem. Closes #30.	2026-02-08 02:36:49 -05:00
Raphael Amorim	b1516f688a	fix: Allow PR tests from any branch and add manual trigger	2026-02-03 17:42:09 -05:00
Raphael Amorim	28ba6090fc	Adding suggestions from Eugr and unit tests	2026-02-03 17:32:59 -05:00
Raphael Amorim	30f16f1d4e	feat: Add recipe-based one-click model deployment system. Introduces a YAML recipe system for simplified model deployment: run-recipe.py (main script handling build, download, and launch); run-recipe.sh (Bash wrapper for dependency management); recipes/ (pre-configured recipes for common models): glm-4.7-flash-awq.yaml (GLM-4.7-Flash with AWQ quantization), glm-4.7-nvfp4.yaml (GLM-4.7 with NVFP4, cluster-only), minimax-m2-awq.yaml (MiniMax M2 with AWQ), openai-gpt-oss-120b.yaml (OpenAI GPT-OSS 120B with MXFP4). Key features: auto-discover cluster nodes with --discover, saves to .env; load nodes from .env automatically on subsequent runs; cluster_only flag for models requiring multi-node setup; build_args field for Dockerfile selection (--pre-tf, --exp-mxfp4); solo mode auto-strips --distributed-executor-backend ray; --setup flag for full build + download + run workflow; --dry-run to preview execution without running. Usage: `./run-recipe.sh --discover` (find and save cluster nodes); `./run-recipe.sh glm-4.7-flash-awq --solo --setup`; `./run-recipe.sh glm-4.7-nvfp4 --setup` (uses nodes from .env).	2026-02-03 16:09:12 -05:00
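
Commit 50b3ca60f3 above describes escaping exec arguments with shlex.quote() in run-recipe.py. The following is a minimal sketch of that idea, not the repository's actual code: the helper name build_exec_command and the sample argument list are hypothetical, and only the standard-library shlex module is assumed.

```python
import shlex


def build_exec_command(args):
    """Join arguments into one shell-safe command string.

    Hypothetical helper for illustration; each argument is quoted
    individually so spaces, quotes, and braces survive a shell.
    """
    return " ".join(shlex.quote(arg) for arg in args)


# Sample arguments (illustrative): a JSON value like the one in the commit message.
args = [
    "vllm",
    "serve",
    "--speculative-config",
    '{"method":"qwen3_next_mtp","num_speculative_tokens":2}',
]

cmd = build_exec_command(args)

# Splitting the quoted string with POSIX shell rules recovers the original list.
assert shlex.split(cmd) == args
```

Quoting each argument separately, rather than the joined string, keeps argument boundaries intact when the command is later re-parsed by a shell.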

18 Commits
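
Commit b7c3cdcfcb describes a `--` pass-through whose trailing arguments are appended verbatim, with a warning when they duplicate a CLI override. A rough sketch of that behavior follows. It is an assumption-laden illustration: the commit says run-recipe.py uses parse_known_args(), whereas this sketch splits on `--` by hand to keep the example independent of how argparse treats the separator. The function names, the reduced option set (recipe, --solo, --port), and the duplicate check are all hypothetical.

```python
import argparse


def split_passthrough(argv):
    """Split argv at the first `--`; everything after it is passed through."""
    if "--" in argv:
        idx = argv.index("--")
        return argv[:idx], argv[idx + 1:]
    return list(argv), []


def parse_cli(argv):
    """Parse the script's own options and return (namespace, extra_args)."""
    own, extra = split_passthrough(argv)
    parser = argparse.ArgumentParser(prog="run-recipe-sketch")
    parser.add_argument("recipe")
    parser.add_argument("--solo", action="store_true")
    parser.add_argument("--port", type=int, default=None)
    return parser.parse_args(own), extra


def duplicated_overrides(args, extra):
    """List pass-through flags that repeat an override already given on the CLI."""
    dupes = []
    if args.port is not None and any(
        e == "--port" or e.startswith("--port=") for e in extra
    ):
        dupes.append("--port")
    return dupes


args, extra = parse_cli(["model", "--solo", "--", "--load-format", "safetensors"])
# extra now holds the arguments to append verbatim to the generated command.
```

A caller could print a warning for each entry returned by duplicated_overrides before appending extra to the generated command.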