Tim Messerschmidt
b9fc32ec34
fix: skip empty lines in wheel download read loop
Add a guard to skip empty lines (e.g. trailing newlines) in the
while-read loop to prevent try_download_wheels from breaking on
unexpected blank input.
2026-03-07 05:06:12 +01:00
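The guard described in the entry above lives in a shell while-read loop. As an illustration only, the same idea can be sketched in Python. The function name `iter_wheel_entries` and the sample input are hypothetical and are not taken from the repository.

```python
def iter_wheel_entries(lines):
    """Yield non-empty entries, skipping blank or whitespace-only lines.

    Hypothetical helper: mirrors the idea of guarding a read loop
    against trailing newlines or other unexpected blank input.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line (e.g. a trailing newline): skip it instead of
            # handing an empty name to the download step.
            continue
        yield line


# Hypothetical input with a blank line in the middle and at the end.
sample = ["flashinfer.whl\n", "\n", "vllm.whl\n", "\n"]
entries = list(iter_wheel_entries(sample))
print(entries)
```

Skipping the line rather than aborting keeps the loop going, so one stray newline does not stop the remaining entries from being processed.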
Eugene Rakhmatulin
9dc09bd04b
Renamed recipe for qwen3.5-35b-a3b-fp8 to match others
2026-03-06 13:56:06 -08:00
eugr
e88426646b
Merge pull request #76 from mmonad/fix-exec-arg-quoting
Fix shell quoting for exec command arguments
2026-03-06 13:45:53 -08:00
Olivier Paroz
eb8abcca7f
Prevent 169.254.x.x fallback when setting a static IP address (#84)
* Prevent 169.254.x.x fallback when setting a static IP address
To force the use of the static IP address we have chosen for the interface, it is safer to disable the link-local fallback and avoid problems down the line.
2026-03-06 11:47:47 -08:00
eugr
d148d95a19
Merge pull request #80 from oliverjohnwilson/recipe-add_minimax-m2.5_qwen3.5-397b-a17B-fp8
added minimax-m2.5 and qwen3.5-397b-a17B-fp8 recipes to a recipes/4x-spark-cluster/ subdirectory
2026-03-06 11:46:37 -08:00
Eugene Rakhmatulin
5346372f14
More robust wheels check before download
2026-03-05 17:06:57 -08:00
Eugene Rakhmatulin
5f8f988d91
Merge branch 'main' of github.com:eugr/spark-vllm-docker
2026-03-05 16:29:00 -08:00
eugr
3fabd3fb1c
Merge pull request #72 from erikvullings/main
Add Qwen35-35B-A3B recipe in FP8 format
2026-03-05 16:27:50 -08:00
Eugene Rakhmatulin
2d03bc138d
saving flashinfer and vllm commits in wheels directories
2026-03-05 14:41:25 -08:00
Eugene Rakhmatulin
a749fcce87
Added a recipe for qwen3.5-122B-FP8
staging-current-1772696417
staging-current-1772696532
2026-03-04 16:49:39 -08:00
Eugene Rakhmatulin
505a060a7d
vLLM prebuilt wheels support
2026-03-04 16:01:50 -08:00
Eugene Rakhmatulin
ca34ebcffc
Merge branch 'main' into vllm-wheels
2026-03-04 15:59:16 -08:00
oliverjohnwilson
4303f8b6d0
added minimax-m2.5 and qwen3.5-397b-a17B-fp8 recipes to a recipes/4x-spark-cluster/ subdirectory
2026-03-04 16:01:37 -06:00
Eugene Rakhmatulin
2152ef127d
Prebuilt vLLM wheels can now be used
2026-03-04 13:33:32 -08:00
Eugene Rakhmatulin
19f06a0d16
Fixed a bug with checking whether we need to download remote wheels
staging-current-1772668424
staging-current-1772668553
2026-03-04 13:00:40 -08:00
Eugene Rakhmatulin
bbd7db2813
revert bumping up base image
staging-current-1772642670
staging-current-1772642791
2026-03-04 07:29:53 -08:00
L.B.R.
50b3ca60f3
Fix shell quoting for exec command arguments
Arguments with special characters (e.g. JSON strings) were passed
unquoted, causing breakage for commands like:
--speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'
Use printf %q in launch-cluster.sh and shlex.quote() in run-recipe.py
to properly escape arguments.
2026-03-04 15:22:42 +00:00
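The Python half of the fix above relies on `shlex.quote()` from the standard library. The following is a minimal sketch of that technique, not the code from run-recipe.py: the helper name `build_exec_command` and the leading `vllm serve` tokens are assumptions made for illustration, while the JSON value is the one quoted in the commit message.

```python
import shlex


def build_exec_command(args):
    """Join arguments into one shell-safe command string.

    Hypothetical helper: each argument is escaped with shlex.quote(),
    so special characters such as quotes and braces survive the shell.
    """
    return " ".join(shlex.quote(arg) for arg in args)


# Illustrative argument list; only the JSON string comes from the commit.
args = [
    "vllm",
    "serve",
    "--speculative-config",
    '{"method":"qwen3_next_mtp","num_speculative_tokens":2}',
]

command = build_exec_command(args)
print(command)

# Round trip: splitting with POSIX shell rules recovers the original list.
recovered = shlex.split(command)
```

Quoting every argument individually, rather than the joined string, is what keeps the JSON value a single argument after the shell parses the command.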
Eugene Rakhmatulin
fff1a24982
Rolling back base image
2026-03-04 07:19:43 -08:00
Eugene Rakhmatulin
ae19b66fdd
Bumped base image version
2026-03-03 23:31:51 -08:00
Erik Vullings
163f23d85b
Update qwen35-35b-a3b-fp8.yaml
--max_num_batched_tokens is now a default variable that can be overridden via the CLI
2026-03-03 12:46:12 +01:00
Eugene Rakhmatulin
7d8465fd9c
Added recipe for qwen3.5-122b-int4-autoround, updated README
staging-current-1772608818
staging-current-1772608894
staging-current-1772609005
2026-03-02 12:18:16 -08:00
Eugene Rakhmatulin
8f11e7e5ed
Intel/Qwen3.5-122B-A10B-int4-AutoRound support via mods/fix-qwen3.5-autoround
2026-02-27 10:55:42 -08:00
Erik Vullings
e8f94d6b8b
Add Qwen35-35B-A3B recipe in FP8 format
2026-02-27 17:46:06 +01:00
Eugene Rakhmatulin
df88997449
piping exec command to docker logs when running in daemon mode.
2026-02-26 18:19:38 -08:00
Eugene Rakhmatulin
15888c407a
Merge pull request #62
2026-02-26 15:24:42 -08:00
Eugene Rakhmatulin
c1c3b9d66a
support for daemon mode with exec command
2026-02-26 15:23:08 -08:00
Eugene Rakhmatulin
e9aa411e6c
Merge branch 'main' into pr-62
2026-02-26 14:57:32 -08:00
eugr
4593931421
Merge pull request #70 from hoesing/fix-rsync-path
Fix rsync failure if destination dir doesn't exist
2026-02-26 08:59:05 -08:00
J.J. Hoesing
358b4795b6
Add --mkpath to rsync args to handle the case where .cache/huggingface/hub doesn't already exist on the destination.
2026-02-26 03:12:34 -08:00
Eugene Rakhmatulin
dbd3d21fb8
allows $HF_HOME in hf-download.sh
2026-02-25 16:39:12 -08:00
Eugene Rakhmatulin
1c853b725e
allows using $HF_HOME as the Hugging Face cache directory, closes #68
2026-02-25 16:38:04 -08:00
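One way to picture the `$HF_HOME` handling in the entries above is a lookup with a fallback. This is a sketch under assumptions, not the repository's script: the function name `hf_hub_cache_dir` is hypothetical, and the fallback of `~/.cache/huggingface` with a `hub` subdirectory follows the usual Hugging Face layout and the `.cache/huggingface/hub` path mentioned elsewhere in this log.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None):
    """Return the hub cache directory, honouring HF_HOME when it is set.

    Hypothetical helper: falls back to ~/.cache/huggingface when HF_HOME
    is unset or empty, then appends the "hub" subdirectory.
    """
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    base = Path(hf_home) if hf_home else Path.home() / ".cache" / "huggingface"
    return base / "hub"


# Example with an explicit environment mapping (the path is illustrative).
custom = hf_hub_cache_dir({"HF_HOME": "/data/hf"})
default = hf_hub_cache_dir({})
print(custom)
print(default)
```

Passing the environment as a parameter keeps the lookup easy to exercise without changing the real process environment.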
Eugene Rakhmatulin
5a3536b38e
Fixed a bug where updated tags would cause git fetch to fail
2026-02-24 20:59:54 -08:00
Eugene Rakhmatulin
5ed2c23d0d
Mod for Intel/Qwen3-Coder-Next-INT4-Autoround model
2026-02-24 18:24:42 -08:00
Drew Botwinick
a276a76be2
support daemon mode for ACTION == exec
2026-02-23 23:12:52 -06:00
Eugene Rakhmatulin
3c27d521bb
Reverting another breaking vLLM PR, fixes #60
2026-02-23 09:51:45 -08:00
Eugene Rakhmatulin
4c8f90395b
Changed reasoning parser in MiniMax for better compatibility with modern clients (like coding tools).
2026-02-21 11:53:13 -08:00
Eugene Rakhmatulin
349a270c1e
More robust handling of wheels downloads
2026-02-19 13:47:59 -08:00
Eugene Rakhmatulin
ad662f9bab
Changed MXFP4 CUTLASS SHA
2026-02-18 18:20:15 -08:00
Eugene Rakhmatulin
b959818536
MXFP4 fix cache bug
2026-02-18 16:53:57 -08:00
Eugene Rakhmatulin
c60c16e867
Temporary patch to reverse PR that fails builds
2026-02-18 16:20:20 -08:00
Eugene Rakhmatulin
f09c2c3ac8
Refactoring, updated README
2026-02-18 15:58:53 -08:00
Eugene Rakhmatulin
8873a0d959
Handle failed downloads properly
2026-02-18 14:55:43 -08:00
Eugene Rakhmatulin
12fd8a4503
Merge branch 'flashinfer-gen' of gitlab.home.eugr.net:ai/spark-vllm into flashinfer-gen
2026-02-18 14:47:20 -08:00
Eugene Rakhmatulin
34fff7b3fb
Download flashinfer wheels from releases
2026-02-18 14:46:01 -08:00
Eugene Rakhmatulin
a6fdf58a82
Merge branch 'main' into flashinfer-gen
2026-02-18 13:35:41 -08:00
Eugene Rakhmatulin
bd3f45f920
Updated MXFP4 build to use fresh repo references
2026-02-18 13:35:09 -08:00
Eugene Rakhmatulin
b06531f70b
Backup old wheels before rebuilding and restore on failure
2026-02-17 23:13:25 -08:00
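The backup-and-restore behaviour named in the entry above can be sketched as follows. This is an illustration of the general pattern only: the function `rebuild_with_backup`, the `.bak` suffix, and the file names are all hypothetical, and the real build scripts may work differently.

```python
import shutil
import tempfile
from pathlib import Path


def rebuild_with_backup(wheels_dir, build):
    """Run build(wheels_dir); restore the previous wheels if it raises."""
    wheels_dir = Path(wheels_dir)
    backup_dir = wheels_dir.with_name(wheels_dir.name + ".bak")
    if backup_dir.exists():
        shutil.rmtree(backup_dir)  # drop a stale backup from an earlier run
    had_wheels = wheels_dir.exists()
    if had_wheels:
        shutil.move(str(wheels_dir), str(backup_dir))  # set old wheels aside
    wheels_dir.mkdir(parents=True)
    try:
        build(wheels_dir)
    except Exception:
        # Build failed: discard partial output and put the old wheels back.
        shutil.rmtree(wheels_dir, ignore_errors=True)
        if had_wheels:
            shutil.move(str(backup_dir), str(wheels_dir))
        raise
    if had_wheels:
        shutil.rmtree(backup_dir)  # build succeeded, backup no longer needed


def failing_build(target):
    (target / "partial.whl").write_text("incomplete")
    raise RuntimeError("build failed")


def working_build(target):
    (target / "new.whl").write_text("new")


# Demo in a temporary directory with one pre-existing wheel.
root = Path(tempfile.mkdtemp())
wheels = root / "wheels"
wheels.mkdir()
(wheels / "old.whl").write_text("old")

try:
    rebuild_with_backup(wheels, failing_build)
except RuntimeError:
    pass
after_failure = sorted(p.name for p in wheels.iterdir())

rebuild_with_backup(wheels, working_build)
after_success = sorted(p.name for p in wheels.iterdir())
```

Moving the old directory aside, instead of deleting it first, is what makes the failure path recoverable.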
Eugene Rakhmatulin
a49b89a0e5
Remove old wheels before rebuilding
2026-02-17 23:04:58 -08:00
Eugene Rakhmatulin
ec0f189256
Initial refactoring to enable separate wheel builds
2026-02-17 19:15:32 -08:00
Eugene Rakhmatulin
5b2313dddb
Changed KV type to fp8 in qwen3-coder-next recipe and reduced default context size to 131072 to ensure it all fits in a single Spark.
2026-02-17 13:07:54 -08:00