Commit Graph

95 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Eugene Rakhmatulin | b7c8616743 | Pinned pytorch version | 2026-04-11 11:54:46 -07:00 |
| Eugene Rakhmatulin | 8e8e850ef1 | fix for new requirements structure | 2026-04-10 20:14:47 -07:00 |
| Eugene Rakhmatulin | fc08740fba | Increased uv timeout | 2026-04-10 19:38:38 -07:00 |
| Eugene Rakhmatulin | 49d6d9fefd | Removed PR2927 as it's been merged | 2026-04-03 16:56:00 -07:00 |
| Eugene Rakhmatulin | 4afca860a5 | Fix broken compilation (PR 38919) | 2026-04-03 10:22:10 -07:00 |
| Eugene Rakhmatulin | 44808f7018 | Apply vLLM PR 35568 | 2026-04-02 17:13:54 -07:00 |
| Eugene Rakhmatulin | a770865834 | Updated PRs to apply | 2026-04-01 08:31:34 -07:00 |
| Eugene Rakhmatulin | 3a3ab98b3e | Temporarily added PR2897 to Dockerfile | 2026-03-31 22:06:08 -07:00 |
| Eugene Rakhmatulin | 41c0ce2c9a | Fixed FI PR | 2026-03-30 14:25:42 -07:00 |
| Eugene Rakhmatulin | 45494688d1 | Updated README, added NVFP4 fix | 2026-03-30 11:45:40 -07:00 |
| Eugene Rakhmatulin | a3201f8873 | --flashinfer-ref / --apply-flashinfer-pr | 2026-03-29 22:40:35 -07:00 |
| Eugene Rakhmatulin | 32674c2619 | removed temporary patch as it causes more issues. | 2026-03-28 17:49:17 -07:00 |
| Eugene Rakhmatulin | d37217bad0 | moved PR patch before the requirements patching | 2026-03-28 09:22:19 -07:00 |
| Eugene Rakhmatulin | e70c87b4f6 | Added PR38423 (temp) | 2026-03-28 08:50:54 -07:00 |
| Eugene Rakhmatulin | 51d69c5c17 | commenting out non-applicable PRs | 2026-03-27 16:15:54 -07:00 |
| Eugene Rakhmatulin | e6ee108cdf | Temporary patch for NVFP4 | 2026-03-26 11:43:44 -07:00 |
| Eugene Rakhmatulin | 174de6f0a8 | temporary patch for PR38126 | 2026-03-26 08:58:04 -07:00 |
| Eugene Rakhmatulin | c4b078b868 | Merge branch 'main' into 3-node | 2026-03-24 22:21:25 -07:00 |
| Drew Botwinick | 8298c3d7f8 | Merge remote-tracking branch 'upstream/main' (Conflicts: Dockerfile) | 2026-03-24 15:41:09 -05:00 |
| Eugene Rakhmatulin | f8c2653fd3 | Quick fix for NCCL dependency | 2026-03-23 23:20:59 -07:00 |
| Eugene Rakhmatulin | 990a7b3837 | Use mesh-optimized NCCL | 2026-03-23 15:43:18 -07:00 |
| Eugene Rakhmatulin | 7a54657abf | Revert "cuda 13.2 torch". This reverts commit 926dd57a87. | 2026-03-21 15:36:17 -07:00 |
| Eugene Rakhmatulin | 926dd57a87 | cuda 13.2 torch | 2026-03-21 15:15:01 -07:00 |
| Eugene Rakhmatulin | 6e8d85c914 | cleanup | 2026-03-21 15:12:12 -07:00 |
| Drew Botwinick | d6e76f8e2f | add build metadata generation and include in Dockerfiles | 2026-03-21 16:10:04 -05:00 |
| Eugene Rakhmatulin | 8385506c5e | Fixes | 2026-03-20 23:51:21 -07:00 |
| Eugene Rakhmatulin | 8caebe3155 | Reverting back to CUDA image + pytorch from wheels | 2026-03-20 17:03:18 -07:00 |
| Eugene Rakhmatulin | 03b055d7f0 | Major cluster orchestration refactoring to support running without Ray | 2026-03-13 11:55:18 -07:00 |
| Eugene Rakhmatulin | e225c709fb | Revert "fix: add temporary patch for CUDA graphs estimation" as it has been merged to main. This reverts commit 63b2a8dbed. | 2026-03-09 09:46:50 -07:00 |
| Eugene Rakhmatulin | 63b2a8dbed | fix: add temporary patch for CUDA graphs estimation | 2026-03-08 22:43:41 -07:00 |
| Eugene Rakhmatulin | 2d03bc138d | saving flashinfer and vllm commits in wheels directories | 2026-03-05 14:41:25 -08:00 |
| Eugene Rakhmatulin | bbd7db2813 | revert bumping up base image | 2026-03-04 07:29:53 -08:00 |
| Eugene Rakhmatulin | fff1a24982 | Rolling back base image | 2026-03-04 07:19:43 -08:00 |
| Eugene Rakhmatulin | ae19b66fdd | Bumped base image version | 2026-03-03 23:31:51 -08:00 |
| Eugene Rakhmatulin | 5a3536b38e | Fixed a bug where updated tags would cause git fetch to fail | 2026-02-24 20:59:54 -08:00 |
| Eugene Rakhmatulin | 3c27d521bb | Reverting another breaking vLLM PR, fixes #60 | 2026-02-23 09:51:45 -08:00 |
| Eugene Rakhmatulin | c60c16e867 | Temporary patch to reverse PR that fails builds | 2026-02-18 16:20:20 -08:00 |
| Eugene Rakhmatulin | f09c2c3ac8 | Refactoring, updated README | 2026-02-18 15:58:53 -08:00 |
| Eugene Rakhmatulin | ec0f189256 | Initial refactoring to enable separate wheel builds | 2026-02-17 19:15:32 -08:00 |
| Eugene Rakhmatulin | 4214d4fefe | Caching cubins during build for reuse | 2026-02-13 19:30:28 -08:00 |
| Eugene Rakhmatulin | da4185cb12 | Fixed an issue with fetching latest vLLM code | 2026-02-11 22:35:49 -08:00 |
| Eugene Rakhmatulin | 3b1e49dcb0 | Supporting other CUDA archs via --gpu-arch flag | 2026-02-11 13:10:41 -08:00 |
| Eugene Rakhmatulin | ace16f3a8f | Applied new fastsafetensors fix to mxfp4 build; disabled wheel builds by default | 2026-02-09 23:47:06 -08:00 |
| Eugene Rakhmatulin | 2923fe6ea5 | Removed temp fastsafetensors patch | 2026-02-09 10:21:14 -08:00 |
| Eugene Rakhmatulin | 06e8817f18 | Triton 3.6.0 is now default | 2026-02-08 22:38:31 -08:00 |
| Eugene Rakhmatulin | d845cd0401 | changed arch to 12.1a again | 2026-02-08 14:18:12 -08:00 |
| Eugene Rakhmatulin | 79e646e833 | Merge branch 'apply-pr' into pytorch-base | 2026-02-03 14:14:45 -08:00 |
| Eugene Rakhmatulin | d7e9f17c2e | vLLM build-time PRs support | 2026-02-03 14:14:11 -08:00 |
| Eugene Rakhmatulin | 37953478f0 | changed arch codes again to be in line with upcoming PR | 2026-02-02 09:21:48 -08:00 |
| Eugene Rakhmatulin | 3c7f91081d | changed arch flags | 2026-02-01 16:37:01 -08:00 |