Commit Graph

36 Commits

Author SHA1 Message Date
Eugene Rakhmatulin
af6d5eae32 Temporarily removing incompatible triton-kernels 2026-01-30 11:17:38 -08:00
Eugene Rakhmatulin
458439706a Build flashinfer from source 2026-01-30 09:05:22 -08:00
Eugene Rakhmatulin
0ac438b4dd Some optimizations 2026-01-29 22:08:05 -08:00
Eugene Rakhmatulin
46fecd172a added missing dependency 2026-01-29 17:01:17 -08:00
Eugene Rakhmatulin
159460af0c Migrated dockerfiles to pytorch-base image 2026-01-29 15:47:07 -08:00
Eugene Rakhmatulin
e817f3dbec Updated Triton version to 3.6.0 2026-01-26 14:24:58 -08:00
Eugene Rakhmatulin
25a16ef6c2 Fixed #11 and #12 - added a new dependency for OpenCV 2026-01-19 12:07:15 -08:00
Eugene Rakhmatulin
1139a37324 Added transformers v5 support 2025-12-21 22:41:03 -08:00
Eugene Rakhmatulin
11db634aad Switch to uv in the main Dockerfile 2025-12-21 13:28:40 -08:00
Eugene Rakhmatulin
dfe426e912 Add support for pre-release FlashInfer packages in Docker builds 2025-12-20 23:13:26 -08:00
Eugene Rakhmatulin
a83200573a Enhance Dockerfile: limit ccache size, enable compression, and optimize git repo size 2025-12-20 15:29:37 -08:00
Eugene Rakhmatulin
fbb1bf73d5 Switching to flashinfer 0.6.x pre-release wheels 2025-12-20 13:28:06 -08:00
Christopher Owen
a13a9f6806 Limit build parallelism to reduce OOM situations 2025-12-18 13:36:35 +01:00
TeskaLabs Admin
f1abfb85b6 Bump of the version 2025-12-16 17:58:48 +00:00
Eugene Rakhmatulin
0606b1b984 Refactor Triton and vLLM reference handling in Dockerfile and build script 2025-12-14 23:28:08 -08:00
eugr
4551795908 Fixed missing Infiniband dependency, added CuDNN 2025-12-14 21:49:50 -08:00
eugr
33720fc9d6 Use no-build-isolation for Triton Kernels build 2025-12-14 18:35:26 -08:00
eugr
dc614dc6ae Separated Triton build into a dedicated phase for better caching 2025-12-14 10:32:28 -08:00
eugr
25f759fec8 Optimized triton caching 2025-12-14 09:26:10 -08:00
eugr
e8a12da072 Build triton from source; add TRITON_SHA argument to specify triton release, and add timing statistics 2025-12-14 00:30:50 -08:00
eugr
a8217a1fd8 Improved dependency handling 2025-12-13 22:41:30 -08:00
eugr
cc3e73feb1 Improved caching 2025-12-13 21:34:57 -08:00
eugr
76a8e92c86 Multistage build with caching 2025-12-13 21:18:26 -08:00
eugr
37c12cf9e4 Removed MiniMax M2 patch since the fix is merged into main 2025-12-11 13:23:30 -08:00
eugr
5fba205db4 Implemented a temporary patch for recently broken MiniMax-M2 (in builds after 12/10) for some quants. 2025-12-11 11:13:05 -08:00
eugr
b10ed739fe formatting changes 2025-11-29 10:04:12 -08:00
eugr
6a66a4b66f Added patch to allow fastsafetensors in cluster config 2025-11-26 21:25:04 -08:00
eugr
549214e6ed Added missing Infiniband and RDMA libraries 2025-11-25 16:14:08 -08:00
eugr
a96a3a2dac Removed temporary patch for NVFP4 quants support as it's been merged into main 2025-11-25 12:48:58 -08:00
eugr
4c976375c5 Added missing dependencies; added dashboard support for Ray clusters 2025-11-24 21:13:06 -08:00
eugr
399948a725 Added missing modules for flashinfer 2025-11-24 17:02:04 -08:00
eugr
d3fd2e69fd Updated Dockerfile with additional deps 2025-11-24 15:47:20 -08:00
eugr
f5141974ae Fixed cluster script and small fix for Dockerfile 2025-11-24 15:45:04 -08:00
eugr
3ecca4d2b7 Updated Dockerfile to include 2 levels of cache busters, added the cluster script and README. 2025-11-24 15:21:08 -08:00
eugr
0ad880e0fe Added clustering script 2025-11-24 11:53:38 -08:00
eugr
4e95bf6fa6 Initial commit 2025-11-24 11:19:37 -08:00