Commit Graph

111 Commits

Author SHA1 Message Date
Eugene Rakhmatulin
067bbbbb2d Merge branch 'mxfp4' 2026-01-29 14:20:07 -08:00
Eugene Rakhmatulin
9a907caffc mxfp4 dockerfile optimizations 2026-01-29 14:17:36 -08:00
Eugene Rakhmatulin
7a81e90cd2 added -e parameter 2026-01-29 13:06:22 -08:00
Eugene Rakhmatulin
53a8b45bcb Added experimental MXFP4 optimizations 2026-01-29 11:56:17 -08:00
Eugene Rakhmatulin
b58ba7b19a Added cubins and jit-cache 2026-01-29 11:42:04 -08:00
Eugene Rakhmatulin
36e3b7af27 Removed unnecessary dependencies 2026-01-29 09:58:44 -08:00
Eugene Rakhmatulin
e4b57633fe moved everything to uv 2026-01-29 08:34:49 -08:00
Eugene Rakhmatulin
a3afb6f313 Merge branch 'main' into mxfp4 2026-01-28 13:25:26 -08:00
Eugene Rakhmatulin
74c02c37c2 warning message about wheel builds 2026-01-28 13:25:02 -08:00
Eugene Rakhmatulin
cef3727f26 Updated SHA for repos 2026-01-28 13:20:03 -08:00
Eugene Rakhmatulin
6b11902cc8 Updated README 2026-01-26 23:18:27 -08:00
Eugene Rakhmatulin
564afc1f6b Working MXFP4 fork, updated build script 2026-01-26 22:31:46 -08:00
Eugene Rakhmatulin
90c8b30276 Merge branch 'main' into mxfp4 2026-01-26 16:17:58 -08:00
Eugene Rakhmatulin
e817f3dbec Updated Triton version to 3.6.0 2026-01-26 14:24:58 -08:00
Eugene Rakhmatulin
aece2fad78 Initial import of MXFP4 branch 2026-01-24 22:40:36 -08:00
Eugene Rakhmatulin
25a16ef6c2 Fixed #11 and #12 - added a new dependency for OpenCV 2026-01-19 12:07:15 -08:00
Eugene Rakhmatulin
cd7678fe9f Added MIT license 2026-01-13 19:38:24 +00:00
Eugene Rakhmatulin
18a25c8382 Updated README 2026-01-08 14:38:12 -08:00
Eugene Rakhmatulin
4ee090f632 Updated README re: hf-download option 2025-12-24 08:37:33 -08:00
Eugene Rakhmatulin
2a568481f0 Model download support 2025-12-24 00:30:15 -08:00
Eugene Rakhmatulin
04e6d27b84 Updated README re: mods functionality 2025-12-23 18:09:59 -08:00
Eugene Rakhmatulin
9ad61078ce Added multiple mods support 2025-12-23 17:45:55 -08:00
Eugene Rakhmatulin
c90a6d0bde Fixed remote docker execution 2025-12-23 13:49:38 -08:00
Eugene Rakhmatulin
19dec79c5c initial mod implementation 2025-12-23 13:38:10 -08:00
Eugene Rakhmatulin
a9b1bb5947 fixed a bug with numpy version in wheels build when transformers 5 is used. 2025-12-21 22:53:31 -08:00
Eugene Rakhmatulin
1464b0dc8f Display image name in launch-cluster.sh output 2025-12-21 22:44:01 -08:00
Eugene Rakhmatulin
786a50c5c7 Updated README 2025-12-21 22:41:48 -08:00
Eugene Rakhmatulin
1139a37324 Added transformers v5 support 2025-12-21 22:41:03 -08:00
Eugene Rakhmatulin
c37053adf6 Updated README 2025-12-21 14:57:35 -08:00
Eugene Rakhmatulin
82802f0cad Added Quickstart section to README 2025-12-21 14:53:05 -08:00
Eugene Rakhmatulin
11db634aad Switch to uv in the main Dockerfile 2025-12-21 13:28:40 -08:00
Eugene Rakhmatulin
bbd3469549 Support vLLM release wheels 2025-12-21 11:15:52 -08:00
Eugene Rakhmatulin
2aa545a810 Added PSA about build cache 2025-12-21 00:49:59 -08:00
Eugene Rakhmatulin
63a1a6a97c Update README to reflect reduced build time and container size for vLLM 2025-12-20 23:16:12 -08:00
Eugene Rakhmatulin
dfe426e912 Add support for pre-release FlashInfer packages in Docker builds 2025-12-20 23:13:26 -08:00
Eugene Rakhmatulin
1b3968fe98 Merge branch 'flashinfer-0.6.0-pre' 2025-12-20 23:02:58 -08:00
Eugene Rakhmatulin
9f35dbdd2d Reverted back to release flashinfer 2025-12-20 23:01:49 -08:00
Eugene Rakhmatulin
d5d85aaac7 Added optional flashinfer packages, using pre-release flashinfer 2025-12-20 22:56:40 -08:00
Eugene Rakhmatulin
76988e0c75 Added --use-wheels to use precompiled vLLM wheels instead of compiling from the source 2025-12-20 20:25:07 -08:00
Eugene Rakhmatulin
a83200573a Enhance Dockerfile: limit ccache size, enable compression, and optimize git repo size 2025-12-20 15:29:37 -08:00
Eugene Rakhmatulin
fbb1bf73d5 Switching to flashinfer 0.6.x pre-release wheels 2025-12-20 13:28:06 -08:00
Eugene Rakhmatulin
f075801c59 Fixed launch_cluster bug introduced by refactoring 2025-12-19 10:51:50 -08:00
Eugene Rakhmatulin
0cac77c286 Fixed contributor username 2025-12-19 10:41:03 -08:00
Eugene Rakhmatulin
3eb57a6d49 Updated README - autodiscovery in copy ops 2025-12-19 10:39:28 -08:00
Eugene Rakhmatulin
a351f182cc Implement autodiscovery for copy hosts and enhance interface detection in build-and-copy and launch-cluster scripts 2025-12-19 10:36:39 -08:00
Eugene Rakhmatulin
244ad758d2 Updated README 2025-12-19 09:56:24 -08:00
Eugene Rakhmatulin
074316de68 Merge pull request #2 2025-12-19 08:59:29 -08:00
Eugene Rakhmatulin
23858a3c7f Merge branch 'main' into pr-2 2025-12-19 08:51:52 -08:00
Eugene Rakhmatulin
de055928b8 Update CHANGELOG: Document --nccl-debug option for NCCL debug level control 2025-12-18 23:29:03 -08:00
Eugene Rakhmatulin
294d155532 Add NCCL debug level option to launch-cluster.sh 2025-12-18 23:28:12 -08:00