Commit Graph

172 Commits

Author SHA1 Message Date
eugr 6a66a4b66f Added patch to allow fastsafetensors in cluster config 2025-11-26 21:25:04 -08:00
eugr 712637a348 Added second RoCE interface to examples 2025-11-26 19:53:37 -08:00
eugr bdf16a0a34 Formatting 2025-11-26 14:02:15 -08:00
eugr cf8e411ad2 Added benchmarking 2025-11-26 14:01:04 -08:00
eugr 676fa2ace9 Formatting fix 2025-11-26 13:52:30 -08:00
eugr 4f27899939 Added some details on networking 2025-11-26 13:50:39 -08:00
eugr 1a4bc1d7aa Typo 2025-11-26 13:44:34 -08:00
eugr 2a7d31ad81 Updated README 2025-11-26 13:30:17 -08:00
eugr 549214e6ed Added missing Infiniband and RDMA libraries 2025-11-25 16:14:08 -08:00
eugr a96a3a2dac Removed temporary patch for NVFP4 quants support as it's been merged into main 2025-11-25 12:48:58 -08:00
eugr a93bd56389 Updated README 2025-11-24 21:44:01 -08:00
eugr 4c976375c5 Added missing dependencies; added dashboard support for Ray clusters 2025-11-24 21:13:06 -08:00
eugr 399948a725 Added missing modules for flashinfer 2025-11-24 17:02:04 -08:00
eugr bd48032c45 Fixed typo in docker command in README 2025-11-24 16:34:19 -08:00
eugr 2cfa1db2cf Updated README 2025-11-24 16:32:47 -08:00
eugr 6d6e4dfe50 Updated README 2025-11-24 16:23:00 -08:00
eugr d3fd2e69fd Updated Dockerfile with additional deps 2025-11-24 15:47:20 -08:00
eugr f5141974ae Fixed cluster script and small fix for Dockerfile 2025-11-24 15:45:04 -08:00
eugr 5c8feb086c Updated README 2025-11-24 15:32:28 -08:00
eugr 3ecca4d2b7 Updated Dockerfile to include 2 levels of cache busters, added the cluster script and README. 2025-11-24 15:21:08 -08:00
eugr 0ad880e0fe Added clustering script 2025-11-24 11:53:38 -08:00
eugr 4e95bf6fa6 Initial commit 2025-11-24 11:19:37 -08:00