Commit Graph

194 Commits

Author SHA1 Message Date
Eugene Rakhmatulin
f8eb294c58 Updated README.md and added Networking Guide. 2026-02-03 12:54:38 -08:00
Eugene Rakhmatulin
4b9ab0de7c Added ability to launch NGC container in the cluster 2026-02-02 16:57:04 -08:00
Eugene Rakhmatulin
997bf9ea0e Merge branch 'main' into pytorch-base 2026-02-02 12:44:15 -08:00
Eugene Rakhmatulin
4634ee92a2 Added a mod for Nemotron Nano 2026-02-02 11:58:07 -08:00
Eugene Rakhmatulin
37953478f0 changed arch codes again to be in line with upcoming PR 2026-02-02 09:21:48 -08:00
Raphael Amorim
751bc5a47a Adding sample profile and profile loader 2026-02-02 10:25:53 -05:00
Eugene Rakhmatulin
3c7f91081d changed arch flags 2026-02-01 16:37:01 -08:00
Eugene Rakhmatulin
5f7d480801 Reverted Triton removal to use system triton package 2026-01-31 23:23:59 -08:00
Eugene Rakhmatulin
133ed9cfb9 bumped up MXFP4 base image version 2026-01-31 16:17:58 -08:00
Eugene Rakhmatulin
c81edce091 bumped up MXFP4 base image version 2026-01-31 16:12:33 -08:00
Eugene Rakhmatulin
9691eed1b0 Disabled Triton build for now 2026-01-31 00:10:52 -08:00
Eugene Rakhmatulin
7c61b4057c Added Triton compilation to custom build 2026-01-30 23:44:20 -08:00
Eugene Rakhmatulin
0482435848 Restore previous wheels build 2026-01-30 18:43:39 -08:00
Eugene Rakhmatulin
a6d6bafa69 Merge branch 'main' into pytorch-base 2026-01-30 17:06:29 -08:00
Eugene Rakhmatulin
4a4b4e7610 Fixed a bug when solo mode failed on a standalone Spark without configured RoCE. 2026-01-30 16:39:11 -08:00
Eugene Rakhmatulin
a4b524625a using "from scratch" build for wheels to reduce image size 2026-01-30 16:29:47 -08:00
Eugene Rakhmatulin
518dc0108b moved deps buster 2026-01-30 15:25:54 -08:00
Eugene Rakhmatulin
57c890b10c Reduced MXFP4 container size 2026-01-30 15:18:42 -08:00
Eugene Rakhmatulin
008af21383 Merge branch 'main' into pytorch-base 2026-01-30 13:37:03 -08:00
Eugene Rakhmatulin
a13c7d3007 cosmetic changes 2026-01-30 13:26:57 -08:00
Eugene Rakhmatulin
7dd0642621 Reduced final image size 2026-01-30 13:16:55 -08:00
Eugene Rakhmatulin
be19675980 Fixed initial vllm source fetch if not using main branch 2026-01-30 11:24:51 -08:00
Eugene Rakhmatulin
3a68e1ca46 Fixed #25 2026-01-30 11:20:29 -08:00
Eugene Rakhmatulin
af6d5eae32 Temporarily removing incompatible triton-kernels 2026-01-30 11:17:38 -08:00
Eugene Rakhmatulin
7d232a305a Reverted to Torch 2.9.1 in the source build to address #24 2026-01-30 10:43:12 -08:00
Eugene Rakhmatulin
34bd3ae39c Fixed fetching vllm source code in MXFP4 version. 2026-01-30 09:07:01 -08:00
Eugene Rakhmatulin
458439706a Build flashinfer from source 2026-01-30 09:05:22 -08:00
Eugene Rakhmatulin
ef0f996df6 Bumped base image version; reverted Triton to 3.5.1 2026-01-29 23:14:43 -08:00
Eugene Rakhmatulin
0ac438b4dd Some optimizations 2026-01-29 22:08:05 -08:00
Eugene Rakhmatulin
a5b693cc1e Merge branch 'main' into pytorch-base 2026-01-29 18:18:35 -08:00
Eugene Rakhmatulin
ace61c2d55 added new mod for glm4.7-flash-awq, solo model support. 2026-01-29 18:18:00 -08:00
Eugene Rakhmatulin
46fecd172a added missing dependency 2026-01-29 17:01:17 -08:00
Eugene Rakhmatulin
159460af0c Migrated dockerfiles to pytorch-base image 2026-01-29 15:47:07 -08:00
Eugene Rakhmatulin
067bbbbb2d Merge branch 'mxfp4' 2026-01-29 14:20:07 -08:00
Eugene Rakhmatulin
9a907caffc mxfp4 dockerfile optimizations 2026-01-29 14:17:36 -08:00
Eugene Rakhmatulin
7a81e90cd2 added -e parameter 2026-01-29 13:06:22 -08:00
Eugene Rakhmatulin
53a8b45bcb Added experimental MXFP4 optimizations 2026-01-29 11:56:17 -08:00
Eugene Rakhmatulin
b58ba7b19a Added cubins and jit-cache 2026-01-29 11:42:04 -08:00
Eugene Rakhmatulin
36e3b7af27 Removed unnecessary dependencies 2026-01-29 09:58:44 -08:00
Eugene Rakhmatulin
e4b57633fe moved everything to uv 2026-01-29 08:34:49 -08:00
Eugene Rakhmatulin
a3afb6f313 Merge branch 'main' into mxfp4 2026-01-28 13:25:26 -08:00
Eugene Rakhmatulin
74c02c37c2 warning message about wheel builds 2026-01-28 13:25:02 -08:00
Eugene Rakhmatulin
cef3727f26 Updated SHA for repos 2026-01-28 13:20:03 -08:00
Eugene Rakhmatulin
6b11902cc8 Updated README 2026-01-26 23:18:27 -08:00
Eugene Rakhmatulin
564afc1f6b Working MXFP4 fork, updated build script 2026-01-26 22:31:46 -08:00
Eugene Rakhmatulin
90c8b30276 Merge branch 'main' into mxfp4 2026-01-26 16:17:58 -08:00
Eugene Rakhmatulin
e817f3dbec Updated Triton version to 3.6.0 2026-01-26 14:24:58 -08:00
Eugene Rakhmatulin
aece2fad78 Initial import of MXFP4 branch 2026-01-24 22:40:36 -08:00
Eugene Rakhmatulin
25a16ef6c2 Fixed #11 and #12 - added a new dependency for OpenCV 2026-01-19 12:07:15 -08:00
Eugene Rakhmatulin
cd7678fe9f Added MIT license 2026-01-13 19:38:24 +00:00