a5b1c7006e
fix image name
Build and Push spark-vllm / docker (push) Failing after 6m2s
2026-05-11 15:04:37 -05:00
ee6129d54e
push run num
Build and Push spark-vllm / docker (push) Has been cancelled
2026-05-11 15:03:02 -05:00
f30289ec57
fix tag and push
Build and Push spark-vllm / docker (push) Failing after 30s
2026-05-11 15:01:19 -05:00
97e6afcf3b
fix label
Build and Push spark-vllm / docker (push) Failing after 7m12s
2026-05-11 14:50:08 -05:00
eae788259a
run job on arm64
Build and Push spark-vllm / docker (push) Has been cancelled
2026-05-11 14:48:16 -05:00
896cdefedf
build on arm64
Build and Push spark-vllm / docker (push) Failing after 1m30s
2026-05-11 14:40:20 -05:00
d3dbfb682a
set docker platform to arm64
Build and Push spark-vllm / docker (push) Failing after 4m39s
2026-05-11 14:34:17 -05:00
0bb0da779e
run using bash
Build and Push spark-vllm / docker (push) Failing after 15m0s
2026-05-11 13:40:42 -05:00
f307d8dc76
Merge branch 'main' of gitea.corredorconect.com:software-engineering/spark-vllm-docker
Build and Push spark-vllm / docker (push) Failing after 4s
2026-05-11 13:21:53 -05:00
1d0fe50d46
login using action
2026-05-11 13:21:19 -05:00
f24d177802
Update README.md
Build and Push spark-vllm / docker (push) Failing after 4s
2026-05-11 18:20:33 +00:00
bb0d120177
gitea workflow
2026-05-11 13:16:59 -05:00
eugr
ba9dde963f
Fixed 3-node Qwen 397B recipe to prevent OOM and use instanttensor
prebuilt-flashinfer-current
prebuilt-vllm-current
2026-05-10 22:20:49 -07:00
eugr
ae8ac815ac
Adjusted Qwen3.5-397B recipe to fix OOM issue and lower memory requirements
2026-05-09 13:45:15 -07:00
eugr
83a680c87b
Fixed OOM for Qwen3.5-397B
2026-05-09 13:25:31 -07:00
Eugene Rakhmatulin
69ea62294f
remove unnecessary mod from qwen3-coder-next template
2026-05-08 16:32:54 -07:00
Eugene Rakhmatulin
8e548ce664
Fixed typo
2026-05-08 14:59:13 -07:00
Eugene Rakhmatulin
bca64f9a53
Performance regression fix
2026-05-08 13:40:55 -07:00
Eugene Rakhmatulin
29d5904b80
Fix performance regression
2026-05-08 12:56:28 -07:00
Eugene Rakhmatulin
b87854fd4c
Fixed qwen3.6 recipes
2026-05-06 10:56:09 -07:00
Eugene Rakhmatulin
c67c5b5c1e
Add chat template and recipe for Qwen3.6-35B-A3B-FP8 model
2026-05-06 10:32:46 -07:00
Eugene Rakhmatulin
9fbed882bc
Added EXPERIMENTAL mod for b12x - initial support
2026-04-29 14:38:37 -07:00
Eugene Rakhmatulin
97e51d5d23
fixed gemma4 recipe
2026-04-29 12:56:07 -07:00
Eugene Rakhmatulin
87cb9f6e1e
Reverted gemma4 to safetensors. Fixes #214 and #217.
2026-04-29 10:56:40 -07:00
eugr
e3243bf555
Merge pull request #197 from mmonad/minimax-m2.7-awq-recipe
Add recipe for MiniMax-M2.7-AWQ
2026-04-25 19:26:43 -07:00
Eugene Rakhmatulin
43a00ed90f
Fixed #205
2026-04-25 18:39:46 -07:00
eugr
ef9b0e50f4
Merge pull request #210 from Kaweees/main
Update gpu-mem-util-gb: patch with new vLLM default value
2026-04-25 10:00:52 -07:00
Miguel Villa Floran
c1e952de2e
Update gpu-mem-util-gb: patch with new vLLM default value
2026-04-24 11:40:41 -07:00
Eugene Rakhmatulin
b13a3600d3
Remove a dependency
2026-04-23 07:47:23 -07:00
Eugene Rakhmatulin
7dea11bbf0
More robust handling of PRs
2026-04-22 13:18:12 -07:00
Eugene Rakhmatulin
c187912e23
Removed merged PRs
2026-04-21 09:47:26 -07:00
L.B.R.
caa28c8e12
Add recipe for MiniMax-M2.7-AWQ
Add a vLLM serving recipe for the MiniMax M2.7 model using
the cyankiwi/MiniMax-M2.7-AWQ-4bit quantization. Uses the
same minimax_m2 tool-call and reasoning parsers as the
existing M2 recipe, with Ray distributed backend on 2 GPUs.
2026-04-18 22:44:26 +01:00
Eugene Rakhmatulin
5415c1fe9e
Include a PR to fix broken torch bindings (vllm pr 40191)
2026-04-18 09:19:50 -07:00
Eugene Rakhmatulin
d49fac1b8b
Re-enable flashinfer_cutlass
2026-04-16 16:40:56 -07:00
Eugene Rakhmatulin
6b7f8dace6
Fixes #187
2026-04-15 22:32:14 -07:00
Eugene Rakhmatulin
76fbf0d0be
Fix for broken MiniMax M2 parser
2026-04-15 16:31:50 -07:00
Eugene Rakhmatulin
b7830469be
Updated README
2026-04-14 17:23:42 -07:00
Eugene Rakhmatulin
b50fa426c8
Merge pull request #190
2026-04-14 17:18:56 -07:00
Tim Messerschmidt
2c13e1ce25
Add InstantTensor to runtime dependencies
2026-04-14 19:38:36 +02:00
Eugene Rakhmatulin
c026c92bd0
Updated README
2026-04-13 11:27:57 -07:00
Eugene Rakhmatulin
cf4cb35356
added new flashinfer build dependency
2026-04-13 08:47:34 -07:00
Eugene Rakhmatulin
1ad85442ac
Added a helper mod for Qwen3.5-397B recipe
2026-04-12 19:14:23 -07:00
Eugene Rakhmatulin
30919581ee
Included .gitignore in wheels
2026-04-11 17:02:39 -07:00
Eugene Rakhmatulin
b7c8616743
Pinned pytorch version
2026-04-11 11:54:46 -07:00
Eugene Rakhmatulin
8e8e850ef1
fix for new requirements structure
2026-04-10 20:14:47 -07:00
Eugene Rakhmatulin
fc08740fba
Increased uv timeout
2026-04-10 19:38:38 -07:00
Eugene Rakhmatulin
288da8e911
Mod to fix Gemma4 tool parser
2026-04-04 16:48:07 -07:00
Eugene Rakhmatulin
7bc4e4ce5e
Fixes #158 by adding build args to gemma4 recipe
2026-04-04 10:46:06 -07:00
Eugene Rakhmatulin
49d6d9fefd
Removed PR2927 as it's been merged
2026-04-03 16:56:00 -07:00
Eugene Rakhmatulin
4afca860a5
Fix broken compilation (PR 38919)
2026-04-03 10:22:10 -07:00