Eugene Rakhmatulin | 7f0be29fcc | Handle edge case when two sparks have both cables plugged and assigned IPs | 2026-03-31 11:59:03 -07:00
Eugene Rakhmatulin | 41c0ce2c9a | Fixed FI PR | 2026-03-30 14:25:42 -07:00
Eugene Rakhmatulin | 45494688d1 | Updated README, added NVFP4 fix | 2026-03-30 11:45:40 -07:00
Eugene Rakhmatulin | a3201f8873 | --flashinfer-ref / --apply-flashinfer-pr | 2026-03-29 22:40:35 -07:00
Eugene Rakhmatulin | e471ca2436 | Don't copy if -c is not specified | 2026-03-28 18:12:32 -07:00
Eugene Rakhmatulin | 32674c2619 | removed temporary patch as it causes more issues. | 2026-03-28 17:49:17 -07:00
Eugene Rakhmatulin | 47f5f931b5 | Allow to specify config file when doing setup | 2026-03-28 14:55:31 -07:00
Eugene Rakhmatulin | d37217bad0 | moved PR patch before the requirements patching | 2026-03-28 09:22:19 -07:00
Eugene Rakhmatulin | e70c87b4f6 | Added PR38423 (temp) | 2026-03-28 08:50:54 -07:00
Eugene Rakhmatulin | c1a6cec074 | Updated documentation; default image tags in build script | 2026-03-27 16:41:09 -07:00
Eugene Rakhmatulin | 51d69c5c17 | commenting out non-applicable PRs | 2026-03-27 16:15:54 -07:00
Eugene Rakhmatulin | e7f2ee692f | Added temporary patch to apply PR38126 that fixes broken NVFP4 quants | 2026-03-27 09:30:26 -07:00
Eugene Rakhmatulin | 101ae6fd56 | Merge branch 'main' into 3-node-autodiscover | 2026-03-27 09:02:10 -07:00
Eugene Rakhmatulin | f4ca15ce18 | Made autoround mod optional to support latest version of vLLM. Fixes #144. | 2026-03-27 09:00:50 -07:00
Eugene Rakhmatulin | 3d918e0b82 | Merge branch '3-node' into 3-node-autodiscover | 2026-03-27 07:51:08 -07:00
eugr | 47a896d722 | Removed expert-parallel from 3x-node Qwen | 2026-03-26 22:44:48 -07:00
Eugene Rakhmatulin | 0fa585f909 | Fix typo in pipeline_parallel setting in Qwen3.5-397B-INT4-Autoround recipe | 2026-03-26 18:43:17 -07:00
Eugene Rakhmatulin | cecec74828 | Add recipe for Qwen3.5-397B-INT4-Autoround in pipeline-parallel mode | 2026-03-26 18:41:57 -07:00
Eugene Rakhmatulin | c8ee2a2511 | Perform node count check in any mode | 2026-03-26 18:15:09 -07:00
Eugene Rakhmatulin | ce293b5f05 | Additional checks for parallelism and cluster size | 2026-03-26 17:52:47 -07:00
Eugene Rakhmatulin | f872cc17a8 | Fix for --setup behavior | 2026-03-26 16:49:09 -07:00
Eugene Rakhmatulin | 00c16746e5 | Handle new copy hosts setup in run-recipe.py | 2026-03-26 16:45:35 -07:00
Eugene Rakhmatulin | f163ca69de | Autodiscover tweaks | 2026-03-26 16:30:05 -07:00
Eugene Rakhmatulin | a78e221de3 | Autodiscovery refactoring with mesh support | 2026-03-26 15:47:41 -07:00
Eugene Rakhmatulin | e6ee108cdf | Temporary patch for NVFP4 | 2026-03-26 11:43:44 -07:00
Eugene Rakhmatulin | 174de6f0a8 | temporary patch for PR38126 | 2026-03-26 08:58:04 -07:00
Eugene Rakhmatulin | 83a74bccec | Removed extra solo mode check | 2026-03-26 07:45:23 -07:00
Eugene Rakhmatulin | ff18a9ad5b | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 23:38:44 -07:00
Eugene Rakhmatulin | c08b34a218 | add --config passthrough to run-recipe | 2026-03-25 23:35:52 -07:00
Eugene Rakhmatulin | 23cca2a11a | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 23:17:25 -07:00
Eugene Rakhmatulin | c2fe579ccc | Enhance .env file handling and validation in scripts | 2026-03-25 23:16:56 -07:00
Eugene Rakhmatulin | 8b7c02aa25 | add .env support to build-and-copy.sh | 2026-03-25 22:47:02 -07:00
Eugene Rakhmatulin | 73fec1bdf8 | bugfix | 2026-03-25 15:40:09 -07:00
Eugene Rakhmatulin | 2f5ff0211e | Cleanup in build script | 2026-03-25 15:39:23 -07:00
Eugene Rakhmatulin | 63ee72e729 | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 15:36:31 -07:00
Eugene Rakhmatulin | 4a0feea6c3 | Added --cleanup option to build script | 2026-03-25 15:35:32 -07:00
Eugene Rakhmatulin | 429042b7dc | Revert "Added --cleanup option". This reverts commit b8930b05a1. | 2026-03-25 15:35:15 -07:00
Eugene Rakhmatulin | ef95336937 | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 15:25:19 -07:00
Eugene Rakhmatulin | b8930b05a1 | Added --cleanup option | 2026-03-25 15:24:59 -07:00
Eugene Rakhmatulin | 49d505ad14 | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 15:16:47 -07:00
Eugene Rakhmatulin | 1755dfd114 | Added LOCAL_IP support | 2026-03-25 15:16:06 -07:00
Eugene Rakhmatulin | 3d4dc4c82e | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 14:42:37 -07:00
Eugene Rakhmatulin | 07fac71dac | Fixed bug with CONTAINER_NAME variable | 2026-03-25 14:42:01 -07:00
Eugene Rakhmatulin | 1702f47df6 | Merge branch '3-node' of gitlab.home.eugr.net:ai/spark-vllm into 3-node | 2026-03-25 14:18:32 -07:00
Eugene Rakhmatulin | ad2cd3373f | .env configuration support for launch-cluster.sh | 2026-03-25 14:18:00 -07:00
Eugene Rakhmatulin | 1fd8c7afc3 | Merge branch 'main' into 3-node | 2026-03-25 12:45:40 -07:00
Eugene Rakhmatulin | 3dcd2a90c1 | Updated Nemotron-3-Super recipe | 2026-03-25 12:44:44 -07:00
Eugene Rakhmatulin | efacbd69f2 | Updated Nemotron3-Super recipe | 2026-03-25 12:43:12 -07:00
Eugene Rakhmatulin | c4b078b868 | Merge branch 'main' into 3-node | 2026-03-24 22:21:25 -07:00
Eugene Rakhmatulin | 3be2fb24a8 | Merge pull request #122 | 2026-03-24 22:18:52 -07:00