| Eugene Rakhmatulin | 15d295887c | Updated README to reflect --master-port parameter | 2026-03-18 21:23:28 -07:00 |
| Eugene Rakhmatulin | 57b458570e | Added experimental Qwen3.5-397B support for dual Spark configuration | 2026-03-17 19:05:36 -07:00 |
| Eugene Rakhmatulin | 57ed099465 | Updated README file to reflect new launch-cluster options. | 2026-03-17 16:16:04 -07:00 |
| Eugene Rakhmatulin | fb0687cd1b | Updated README to describe no-ray mode | 2026-03-17 15:27:22 -07:00 |
| Eugene Rakhmatulin | 8fec9bed06 | Updated Nemotron to support dual sparks | 2026-03-12 13:30:15 -07:00 |
| Eugene Rakhmatulin | 45066e2b16 | Updated README | 2026-03-11 09:57:34 -07:00 |
| Eugene Rakhmatulin | 505a060a7d | vLLM prebuilt wheels support | 2026-03-04 16:01:50 -08:00 |
| Eugene Rakhmatulin | 7d8465fd9c | Added recipe for qwen3.5-122b-int4-autoround, updated README | 2026-03-02 12:18:16 -08:00 |
| Eugene Rakhmatulin | f09c2c3ac8 | Refactoring, updated README | 2026-02-18 15:58:53 -08:00 |
| Eugene Rakhmatulin | bd3f45f920 | Updated MXFP4 build to use fresh repo references | 2026-02-18 13:35:09 -08:00 |
| Eugene Rakhmatulin | 5b2313dddb | Changed KV type to fp8 in qwen3-coder-next recipe and reduced default context size to 131072 to ensure it all fits in a single Spark. | 2026-02-17 13:07:54 -08:00 |
| Eugene Rakhmatulin | f886505436 | Added --non-privileged flag to launch-cluster.sh | 2026-02-15 00:12:06 -08:00 |
| Eugene Rakhmatulin | 701147b1eb | Qwen3-Coder-Next fixes and updated recipe | 2026-02-12 15:56:32 -08:00 |
| Eugene Rakhmatulin | 3b1e49dcb0 | Supporting other CUDA archs via --gpu-arch flag | 2026-02-11 13:10:41 -08:00 |
| Eugene Rakhmatulin | 6d3f5dfd5c | map flashinfer/torch/triton cache directories by default | 2026-02-10 16:36:02 -08:00 |
| Eugene Rakhmatulin | ace16f3a8f | Applied new fastsafetensors fix to mxfp4 build; disabled wheel builds by default | 2026-02-09 23:47:06 -08:00 |
| Eugene Rakhmatulin | 2923fe6ea5 | Removed temp fastsafetensors patch | 2026-02-09 10:21:14 -08:00 |
| Eugene Rakhmatulin | ec987259a0 | Recipes and Launch Script support | 2026-02-04 12:01:53 -08:00 |
| Eugene Rakhmatulin | ef6a5eca29 | Merge branch 'main' into pr-19 | 2026-02-04 11:36:59 -08:00 |
| Eugene Rakhmatulin | 1e5aa060b8 | Updated README to include networking guide | 2026-02-03 14:14:05 -08:00 |
| Eugene Rakhmatulin | f8eb294c58 | Updated README.md and added Networking Guide. | 2026-02-03 12:54:38 -08:00 |
| Eugene Rakhmatulin | 4b9ab0de7c | Added ability to launch NGC container in the cluster | 2026-02-02 16:57:04 -08:00 |
| Eugene Rakhmatulin | 4634ee92a2 | Added a mod for Nemotron Nano | 2026-02-02 11:58:07 -08:00 |
| Raphael Amorim | 751bc5a47a | Adding sample profile and profile loader | 2026-02-02 10:25:53 -05:00 |
| Eugene Rakhmatulin | ace61c2d55 | added new mod for glm4.7-flash-awq, solo model support. | 2026-01-29 18:18:00 -08:00 |
| Eugene Rakhmatulin | 7a81e90cd2 | added -e parameter | 2026-01-29 13:06:22 -08:00 |
| Eugene Rakhmatulin | a3afb6f313 | Merge branch 'main' into mxfp4 | 2026-01-28 13:25:26 -08:00 |
| Eugene Rakhmatulin | 74c02c37c2 | warning message about wheel builds | 2026-01-28 13:25:02 -08:00 |
| Eugene Rakhmatulin | 6b11902cc8 | Updated README | 2026-01-26 23:18:27 -08:00 |
| Eugene Rakhmatulin | 18a25c8382 | Updated README | 2026-01-08 14:38:12 -08:00 |
| Eugene Rakhmatulin | 4ee090f632 | Updated README re: hf-download option | 2025-12-24 08:37:33 -08:00 |
| Eugene Rakhmatulin | 04e6d27b84 | Updated README re: mods functionality | 2025-12-23 18:09:59 -08:00 |
| Eugene Rakhmatulin | 786a50c5c7 | Updated README | 2025-12-21 22:41:48 -08:00 |
| Eugene Rakhmatulin | 1139a37324 | Added transformers v5 support | 2025-12-21 22:41:03 -08:00 |
| Eugene Rakhmatulin | c37053adf6 | Updated README | 2025-12-21 14:57:35 -08:00 |
| Eugene Rakhmatulin | 82802f0cad | Added Quickstart section to README | 2025-12-21 14:53:05 -08:00 |
| Eugene Rakhmatulin | bbd3469549 | Support vLLM release wheels | 2025-12-21 11:15:52 -08:00 |
| Eugene Rakhmatulin | 2aa545a810 | Added PSA about build cache | 2025-12-21 00:49:59 -08:00 |
| Eugene Rakhmatulin | 63a1a6a97c | Update README to reflect reduced build time and container size for vLLM | 2025-12-20 23:16:12 -08:00 |
| Eugene Rakhmatulin | dfe426e912 | Add support for pre-release FlashInfer packages in Docker builds | 2025-12-20 23:13:26 -08:00 |
| Eugene Rakhmatulin | 76988e0c75 | Added --use-wheels to use precompiled vLLM wheels instead of compiling from the source | 2025-12-20 20:25:07 -08:00 |
| Eugene Rakhmatulin | 0cac77c286 | Fixed contributor username | 2025-12-19 10:41:03 -08:00 |
| Eugene Rakhmatulin | 3eb57a6d49 | Updated README - autodiscovery in copy ops | 2025-12-19 10:39:28 -08:00 |
| Eugene Rakhmatulin | 244ad758d2 | Updated README | 2025-12-19 09:56:24 -08:00 |
| Eugene Rakhmatulin | 23858a3c7f | Merge branch 'main' into pr-2 | 2025-12-19 08:51:52 -08:00 |
| Eugene Rakhmatulin | de055928b8 | Update CHANGELOG: Document --nccl-debug option for NCCL debug level control | 2025-12-18 23:29:03 -08:00 |
| Eugene Rakhmatulin | 294d155532 | Add NCCL debug level option to launch-cluster.sh | 2025-12-18 23:28:12 -08:00 |
| Eugene Rakhmatulin | 8c53179cc2 | changed extra docker args variable to VLLM_SPARK_EXTRA_DOCKER_ARGS for consistency | 2025-12-18 22:27:27 -08:00 |
| Eugene Rakhmatulin | cf9da89545 | Updated README | 2025-12-18 22:03:46 -08:00 |
| Eugene Rakhmatulin | e6efd668cd | Added Table of Contents to README | 2025-12-18 15:43:09 -08:00 |