spark-vllm-docker/mods at c4860b86a279d2d28af1c6520f769f95b9a729d0

software-engineering/spark-vllm-docker

Eugene Rakhmatulin f4ca15ce18 Made autoround mod optional to support latest version of vLLM. Fixes #144.

2026-03-27 09:00:50 -07:00

fix-glm-4.7-flash-AWQ

Now using an opened PR for glm-4.7-flash crash fix in the mod

2026-02-17 12:45:17 -08:00

fix-qwen3-coder-next

Another fix for the Qwen mod as the slow PR was reversed in main

2026-02-13 13:46:00 -08:00

fix-qwen3-next-autoround

Mod for Intel/Qwen3-Coder-Next-INT4-Autoround model

2026-02-24 18:24:42 -08:00

fix-qwen3.5-autoround

Made autoround mod optional to support latest version of vLLM. Fixes #144.

2026-03-27 09:00:50 -07:00

fix-qwen3.5-chat-template

Unsloth chat template for qwen3.5

2026-03-06 23:35:18 -08:00

fix-qwen35-tp4-marlin

Add Qwen3.5-397B INT4-AutoRound TP=4 recipe and Marlin fix

2026-03-09 21:30:28 +00:00

fix-Salyut1-GLM-4.7-NVFP4

initial mod implementation

2025-12-23 13:38:10 -08:00

gpu-mem-util-gb

Experimental mod to support gpu-memory-utilization-gb

2026-03-12 13:37:44 -07:00

Added ability to launch NGC container in the cluster

2026-02-02 16:57:04 -08:00

super nemotron mod & recipe for nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

2026-03-11 20:53:44 +01:00

Major cluster orchestration refactoring to support running without Ray

2026-03-13 11:55:18 -07:00