Commit Graph

  • e9aa411e6c Merge branch 'main' into pr-62 Eugene Rakhmatulin 2026-02-26 14:57:32 -08:00
  • 4593931421 Merge pull request #70 from hoesing/fix-rsync-path eugr 2026-02-26 08:59:05 -08:00
  • 358b4795b6 Add --mkpath to rsync args to handle the case where .cache/huggingface/hub doesn't already exist on the destination. J.J. Hoesing 2026-02-26 03:12:34 -08:00
  • dbd3d21fb8 allows $HF_HOME in hf-download.sh Eugene Rakhmatulin 2026-02-25 16:39:12 -08:00
  • 1c853b725e allows using $HF_HOME as huggingface cache directory, closes #68 Eugene Rakhmatulin 2026-02-25 16:38:04 -08:00
  • 5a3536b38e Fixed a bug where updated tags would cause git fetch to fail Eugene Rakhmatulin 2026-02-24 20:59:54 -08:00
  • 5ed2c23d0d Mod for Intel/Qwen3-Coder-Next-INT4-Autoround model Eugene Rakhmatulin 2026-02-24 18:24:42 -08:00
  • a276a76be2 support daemon mode for ACTION == exec Drew Botwinick 2026-02-23 23:12:52 -06:00
  • 3c27d521bb Reverting another breaking vLLM PR, fixes #60 Eugene Rakhmatulin 2026-02-23 09:51:45 -08:00
  • 4c8f90395b Changed reasoning parser in MiniMax for better compatibility with modern clients (like coding tools). Eugene Rakhmatulin 2026-02-21 11:53:13 -08:00
  • 349a270c1e More robust handling of wheels downloads Eugene Rakhmatulin 2026-02-19 13:47:59 -08:00
  • ad662f9bab Changed MXFP4 CUTLASS SHA Eugene Rakhmatulin 2026-02-18 18:20:15 -08:00
  • b959818536 MXFP4 fix cache bug Eugene Rakhmatulin 2026-02-18 16:53:57 -08:00
  • c60c16e867 Temporary patch to reverse PR that fails builds Eugene Rakhmatulin 2026-02-18 16:20:20 -08:00
  • f09c2c3ac8 Refactoring, updated README Eugene Rakhmatulin 2026-02-18 15:58:53 -08:00
  • 8873a0d959 Handle failed downloads properly Eugene Rakhmatulin 2026-02-18 14:55:43 -08:00
  • 12fd8a4503 Merge branch 'flashinfer-gen' of gitlab.home.eugr.net:ai/spark-vllm into flashinfer-gen Eugene Rakhmatulin 2026-02-18 14:47:20 -08:00
  • 34fff7b3fb Download flashinfer wheels from releases Eugene Rakhmatulin 2026-02-18 14:46:01 -08:00
  • a6fdf58a82 Merge branch 'main' into flashinfer-gen Eugene Rakhmatulin 2026-02-18 13:35:41 -08:00
  • bd3f45f920 Updated MXFP4 build to use fresh repo references Eugene Rakhmatulin 2026-02-18 13:35:09 -08:00
  • b06531f70b Backup old wheels before rebuilding and restore on failure Eugene Rakhmatulin 2026-02-17 23:13:25 -08:00
  • a49b89a0e5 Remove old wheels before rebuilding Eugene Rakhmatulin 2026-02-17 23:04:58 -08:00
  • ec0f189256 Initial refactoring to enable separate wheel builds Eugene Rakhmatulin 2026-02-17 19:15:32 -08:00
  • 5b2313dddb Changed KV type to fp8 in qwen3-coder-next recipe and reduced default context size to 131072 to ensure it all fits in a single Spark. Eugene Rakhmatulin 2026-02-17 13:07:54 -08:00
  • 0249f1fdde Merge branch 'main' into privileged Eugene Rakhmatulin 2026-02-17 13:01:31 -08:00
  • ef07046d51 Now using an opened PR for glm-4.7-flash crash fix in the mod Eugene Rakhmatulin 2026-02-17 12:45:17 -08:00
  • 6aafc9c7d3 Merge branch 'main' into privileged Eugene Rakhmatulin 2026-02-16 11:38:41 -08:00
  • 1e7f2d5640 Small fix for M2.5 recipe Eugene Rakhmatulin 2026-02-16 11:38:34 -08:00
  • bd2085d783 Merge branch 'main' into privileged Eugene Rakhmatulin 2026-02-16 11:36:06 -08:00
  • 24f42be5cc Added a recipe for MiniMax M2.5 AWQ Eugene Rakhmatulin 2026-02-16 11:35:53 -08:00
  • 88a5d09748 Merge branch 'main' into privileged Eugene Rakhmatulin 2026-02-16 09:29:09 -08:00
  • c23aff91d3 Temporary fix for #38 Eugene Rakhmatulin 2026-02-16 09:23:10 -08:00
  • f886505436 Added --non-privileged flag to launch-cluster.sh Eugene Rakhmatulin 2026-02-15 00:12:06 -08:00
  • 4214d4fefe Caching cubins during build for reuse Eugene Rakhmatulin 2026-02-13 19:30:28 -08:00
  • 3470345624 Another fix for the Qwen mod as the slow PR was reversed in main Eugene Rakhmatulin 2026-02-13 13:46:00 -08:00
  • c0524608c2 Qwen3-coder-next mod - use a new PR instead of reverting previous one Eugene Rakhmatulin 2026-02-13 12:03:44 -08:00
  • 701147b1eb Qwen3-Coder-Next fixes and updated recipe Eugene Rakhmatulin 2026-02-12 15:56:32 -08:00
  • da4185cb12 Fixed an issue with fetching latest vLLM code Eugene Rakhmatulin 2026-02-11 22:35:49 -08:00
  • 3b1e49dcb0 Supporting other CUDA archs via --gpu-arch flag Eugene Rakhmatulin 2026-02-11 13:10:41 -08:00
  • c6b245cfe8 Added prefix caching to nemotron recipe Eugene Rakhmatulin 2026-02-10 18:25:01 -08:00
  • 6d3f5dfd5c map flashinfer/torch/triton cache directories by default Eugene Rakhmatulin 2026-02-10 16:36:02 -08:00
  • b990a1b8ac Fixed #37 Eugene Rakhmatulin 2026-02-10 14:31:43 -08:00
  • ace16f3a8f Applied new fastsafetensors fix to mxfp4 build; disabled wheel builds by default Eugene Rakhmatulin 2026-02-09 23:47:06 -08:00
  • 74876dd442 Added recipes for nemotron-nano-3 and qwen3-coder-next Eugene Rakhmatulin 2026-02-09 14:33:35 -08:00
  • 3aa5e5dce4 Merge pull request #34 Eugene Rakhmatulin 2026-02-09 14:28:30 -08:00
  • 6943a51ced Adding tests and refactoring repeated methods Raphael Amorim 2026-02-09 17:21:32 -05:00
  • d07ad5450f Adding solo_only option to the recipe Raphael Amorim 2026-02-09 17:03:57 -05:00
  • 2923fe6ea5 Removed temp fastsafetensors patch Eugene Rakhmatulin 2026-02-09 10:21:14 -08:00
  • 06e8817f18 Triton 3.6.0 is now default Eugene Rakhmatulin 2026-02-08 22:38:31 -08:00
  • d845cd0401 changed arch to 12.1a again Eugene Rakhmatulin 2026-02-08 14:18:12 -08:00
  • 5bf422a2ca Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-02-08 13:01:17 -08:00
  • 15c1506d0c Merge pull request #32 Eugene Rakhmatulin 2026-02-08 07:17:20 -08:00
  • b7c3cdcfcb Enhancement: add -- pass-through for arbitrary vLLM arguments Raphael Amorim 2026-02-08 02:36:49 -05:00
  • dfb300e51a Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-02-05 13:54:12 -08:00
  • 8cb956b972 Updated networking guide Eugene Rakhmatulin 2026-02-05 13:53:57 -08:00
  • 66210e641d Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-02-04 12:07:06 -08:00
  • f139c4b55d Updated tests Eugene Rakhmatulin 2026-02-04 12:06:30 -08:00
  • c7d45157e0 Merge pull request #19 Eugene Rakhmatulin 2026-02-04 12:03:20 -08:00
  • ec987259a0 Recipes and Launch Script support Eugene Rakhmatulin 2026-02-04 12:01:53 -08:00
  • ef6a5eca29 Merge branch 'main' into pr-19 Eugene Rakhmatulin 2026-02-04 11:36:59 -08:00
  • f7830636af Cleaning up launch-cluster changes Eugene Rakhmatulin 2026-02-04 11:36:55 -08:00
  • b1516f688a fix: Allow PR tests from any branch and add manual trigger Raphael Amorim 2026-02-03 17:35:33 -05:00
  • 28ba6090fc Adding suggestions from Eugr and unit tests Raphael Amorim 2026-02-03 17:32:59 -05:00
  • d8e183cc9b Merge branch 'apply-pr' into pytorch-base Eugene Rakhmatulin 2026-02-03 14:17:46 -08:00
  • c42cc56d34 bugfix Eugene Rakhmatulin 2026-02-03 14:17:30 -08:00
  • 79e646e833 Merge branch 'apply-pr' into pytorch-base Eugene Rakhmatulin 2026-02-03 14:14:45 -08:00
  • d7e9f17c2e vLLM build-time PRs support Eugene Rakhmatulin 2026-02-03 14:14:11 -08:00
  • 1e5aa060b8 Updated README to include networking guide Eugene Rakhmatulin 2026-02-03 14:14:05 -08:00
  • 30f16f1d4e feat: Add recipe-based one-click model deployment system Raphael Amorim 2026-02-03 15:32:28 -05:00
  • ecf7f5f7b5 Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-02-03 12:55:03 -08:00
  • f8eb294c58 Updated README.md and added Networking Guide. Eugene Rakhmatulin 2026-02-03 12:54:38 -08:00
  • 4b9ab0de7c Added ability to launch NGC container in the cluster Eugene Rakhmatulin 2026-02-02 16:57:04 -08:00
  • 997bf9ea0e Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-02-02 12:44:15 -08:00
  • 4634ee92a2 Added a mod for Nemotron Nano Eugene Rakhmatulin 2026-02-02 11:58:07 -08:00
  • 37953478f0 changed arch codes again to be in line with upcoming PR Eugene Rakhmatulin 2026-02-02 09:21:48 -08:00
  • 751bc5a47a Adding sample profile and profile loader Raphael Amorim 2026-01-25 21:22:45 -05:00
  • 3c7f91081d changed arch flags Eugene Rakhmatulin 2026-02-01 16:37:01 -08:00
  • 5f7d480801 Reverted Triton removal to use system triton package Eugene Rakhmatulin 2026-01-31 23:23:59 -08:00
  • 133ed9cfb9 bumped up MXFP4 base image version Eugene Rakhmatulin 2026-01-31 16:12:33 -08:00
  • c81edce091 bumped up MXFP4 base image version Eugene Rakhmatulin 2026-01-31 16:12:33 -08:00
  • 9691eed1b0 Disabled Triton build for now Eugene Rakhmatulin 2026-01-31 00:10:52 -08:00
  • 7c61b4057c Added Triton compilation to custom build Eugene Rakhmatulin 2026-01-30 23:44:20 -08:00
  • 0482435848 Restore previous wheels build Eugene Rakhmatulin 2026-01-30 18:43:39 -08:00
  • a6d6bafa69 Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-01-30 17:06:29 -08:00
  • 4a4b4e7610 Fixed a bug when solo mode failed on a standalone Spark without configured RoCE. Eugene Rakhmatulin 2026-01-30 16:39:11 -08:00
  • a4b524625a using "from scratch" build for wheels to reduce image size Eugene Rakhmatulin 2026-01-30 16:29:47 -08:00
  • 518dc0108b moved deps buster Eugene Rakhmatulin 2026-01-30 15:25:54 -08:00
  • 57c890b10c Reduced MXFP4 container size Eugene Rakhmatulin 2026-01-30 15:18:42 -08:00
  • 008af21383 Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-01-30 13:37:03 -08:00
  • a13c7d3007 cosmetic changes Eugene Rakhmatulin 2026-01-30 13:26:57 -08:00
  • 7dd0642621 Reduced final image size Eugene Rakhmatulin 2026-01-30 13:16:55 -08:00
  • be19675980 Fixed initial vllm source fetch if not using main branch Eugene Rakhmatulin 2026-01-30 11:24:51 -08:00
  • 3a68e1ca46 Fixed #25 Eugene Rakhmatulin 2026-01-30 11:20:29 -08:00
  • af6d5eae32 Temporarily removing incompatible triton-kernels Eugene Rakhmatulin 2026-01-30 11:17:38 -08:00
  • 7d232a305a Reverted to Torch 2.9.1 in the source build to address #24 Eugene Rakhmatulin 2026-01-30 10:43:12 -08:00
  • 34bd3ae39c Fixed fetching vllm source code in MXFP4 version. Eugene Rakhmatulin 2026-01-30 09:07:01 -08:00
  • 458439706a Build flashinfer from source Eugene Rakhmatulin 2026-01-30 09:05:22 -08:00
  • ef0f996df6 Bumped base image version; reverted Triton to 3.5.1 Eugene Rakhmatulin 2026-01-29 23:14:43 -08:00
  • 0ac438b4dd Some optimizations Eugene Rakhmatulin 2026-01-29 22:08:05 -08:00
  • a5b693cc1e Merge branch 'main' into pytorch-base Eugene Rakhmatulin 2026-01-29 18:18:35 -08:00