Commit Graph

  • ace61c2d55 added new mod for glm4.7-flash-awq, solo model support. Eugene Rakhmatulin 2026-01-29 18:18:00 -08:00
  • 46fecd172a added missing dependency Eugene Rakhmatulin 2026-01-29 17:01:17 -08:00
  • 159460af0c Migrated dockerfiles to pytorch-base image Eugene Rakhmatulin 2026-01-29 15:47:07 -08:00
  • 067bbbbb2d Merge branch 'mxfp4' Eugene Rakhmatulin 2026-01-29 14:20:07 -08:00
  • 9a907caffc mxfp4 dockerfile optimizations Eugene Rakhmatulin 2026-01-29 14:17:36 -08:00
  • 7a81e90cd2 added -e parameter Eugene Rakhmatulin 2026-01-29 13:06:22 -08:00
  • 53a8b45bcb Added experimental MXFP4 optimizations Eugene Rakhmatulin 2026-01-29 11:56:17 -08:00
  • b58ba7b19a Added cubins and jit-cache Eugene Rakhmatulin 2026-01-29 11:42:04 -08:00
  • 36e3b7af27 Removed unnecessary dependencies Eugene Rakhmatulin 2026-01-29 09:58:44 -08:00
  • e4b57633fe moved everything to uv Eugene Rakhmatulin 2026-01-29 08:34:49 -08:00
  • a3afb6f313 Merge branch 'main' into mxfp4 Eugene Rakhmatulin 2026-01-28 13:25:26 -08:00
  • 74c02c37c2 warning message about wheel builds Eugene Rakhmatulin 2026-01-28 13:25:02 -08:00
  • cef3727f26 Updated SHA for repos Eugene Rakhmatulin 2026-01-28 13:20:03 -08:00
  • 6b11902cc8 Updated README Eugene Rakhmatulin 2026-01-26 23:18:27 -08:00
  • 564afc1f6b Working MXFP4 fork, updated build script Eugene Rakhmatulin 2026-01-26 22:31:46 -08:00
  • 90c8b30276 Merge branch 'main' into mxfp4 Eugene Rakhmatulin 2026-01-26 16:17:58 -08:00
  • e817f3dbec Updated Triton version to 3.6.0 Eugene Rakhmatulin 2026-01-26 14:24:58 -08:00
  • aece2fad78 Initial import of MXFP4 branch Eugene Rakhmatulin 2026-01-24 22:40:36 -08:00
  • 25a16ef6c2 Fixed #11 and #12 - added a new dependency for OpenCV Eugene Rakhmatulin 2026-01-19 12:07:15 -08:00
  • cd7678fe9f Added MIT license Eugene Rakhmatulin 2026-01-13 19:38:24 +00:00
  • 18a25c8382 Updated README Eugene Rakhmatulin 2026-01-08 14:38:12 -08:00
  • 4ee090f632 Updated README re: hf-download option Eugene Rakhmatulin 2025-12-24 08:37:33 -08:00
  • 2a568481f0 Model download support Eugene Rakhmatulin 2025-12-24 00:30:15 -08:00
  • 04e6d27b84 Updated README re: mods functionality Eugene Rakhmatulin 2025-12-23 18:09:59 -08:00
  • 9ad61078ce Added multiple mods support Eugene Rakhmatulin 2025-12-23 17:45:55 -08:00
  • c90a6d0bde Fixed remote docker execution Eugene Rakhmatulin 2025-12-23 13:49:38 -08:00
  • 19dec79c5c initial mod implementation Eugene Rakhmatulin 2025-12-23 13:38:10 -08:00
  • a9b1bb5947 fixed a bug with numpy version in wheels build when transformers 5 is used. Eugene Rakhmatulin 2025-12-21 22:53:31 -08:00
  • 1464b0dc8f Display image name in launch-cluster.sh output Eugene Rakhmatulin 2025-12-21 22:44:01 -08:00
  • 786a50c5c7 Updated README Eugene Rakhmatulin 2025-12-21 22:41:48 -08:00
  • 1139a37324 Added transformers v5 support Eugene Rakhmatulin 2025-12-21 22:41:03 -08:00
  • c37053adf6 Updated README Eugene Rakhmatulin 2025-12-21 14:57:35 -08:00
  • 82802f0cad Added Quickstart section to README Eugene Rakhmatulin 2025-12-21 14:53:05 -08:00
  • 11db634aad Switch to uv in the main Dockerfile Eugene Rakhmatulin 2025-12-21 13:28:40 -08:00
  • bbd3469549 Support vLLM release wheels Eugene Rakhmatulin 2025-12-21 11:15:52 -08:00
  • 2aa545a810 Added PSA about build cache Eugene Rakhmatulin 2025-12-21 00:49:59 -08:00
  • 63a1a6a97c Update README to reflect reduced build time and container size for vLLM Eugene Rakhmatulin 2025-12-20 23:16:12 -08:00
  • dfe426e912 Add support for pre-release FlashInfer packages in Docker builds Eugene Rakhmatulin 2025-12-20 23:13:26 -08:00
  • 1b3968fe98 Merge branch 'flashinfer-0.6.0-pre' Eugene Rakhmatulin 2025-12-20 23:02:58 -08:00
  • 9f35dbdd2d Reverted back to release flashinfer Eugene Rakhmatulin 2025-12-20 23:01:49 -08:00
  • d5d85aaac7 Added optional flashinfer packages, using pre-release flashinfer Eugene Rakhmatulin 2025-12-20 22:56:40 -08:00
  • 76988e0c75 Added --use-wheels to use precompiled vLLM wheels instead of compiling from the source Eugene Rakhmatulin 2025-12-20 20:25:07 -08:00
  • a83200573a Enhance Dockerfile: limit ccache size, enable compression, and optimize git repo size Eugene Rakhmatulin 2025-12-20 15:29:37 -08:00
  • fbb1bf73d5 Switching to flashinfer 0.6.x pre-release wheels Eugene Rakhmatulin 2025-12-20 13:28:06 -08:00
  • f075801c59 Fixed launch_cluster bug introduced by refactoring Eugene Rakhmatulin 2025-12-19 10:51:50 -08:00
  • 0cac77c286 Fixed contributor username Eugene Rakhmatulin 2025-12-19 10:41:03 -08:00
  • 3eb57a6d49 Updated README - autodiscovery in copy ops Eugene Rakhmatulin 2025-12-19 10:39:28 -08:00
  • a351f182cc Implement autodiscovery for copy hosts and enhance interface detection in build-and-copy and launch-cluster scripts Eugene Rakhmatulin 2025-12-19 10:36:39 -08:00
  • 244ad758d2 Updated README Eugene Rakhmatulin 2025-12-19 09:56:24 -08:00
  • 074316de68 Merge pull request #2 Eugene Rakhmatulin 2025-12-19 08:59:29 -08:00
  • 23858a3c7f Merge branch 'main' into pr-2 Eugene Rakhmatulin 2025-12-19 08:51:52 -08:00
  • de055928b8 Update CHANGELOG: Document --nccl-debug option for NCCL debug level control Eugene Rakhmatulin 2025-12-18 23:29:03 -08:00
  • 294d155532 Add NCCL debug level option to launch-cluster.sh Eugene Rakhmatulin 2025-12-18 23:28:12 -08:00
  • 0377e9badf Bugfix: don't shut down on exit if cluster is already running Eugene Rakhmatulin 2025-12-18 23:12:39 -08:00
  • 2a2f8f24e2 Allow launch-cluster.sh to be executed in non-TTY environment Eugene Rakhmatulin 2025-12-18 23:02:58 -08:00
  • 8c53179cc2 changed extra docker args variable to VLLM_SPARK_EXTRA_DOCKER_ARGS for consistency Eugene Rakhmatulin 2025-12-18 22:27:27 -08:00
  • cf937af897 Merge pull request #6 Eugene Rakhmatulin 2025-12-18 22:17:12 -08:00
  • cf9da89545 Updated README Eugene Rakhmatulin 2025-12-18 22:03:46 -08:00
  • 8a0cb3c853 Merge branch 'main' into pr-6 Eugene Rakhmatulin 2025-12-18 22:02:13 -08:00
  • 442f7369ad Updated build script to handle BUILD_JOBS argument Eugene Rakhmatulin 2025-12-18 22:02:04 -08:00
  • e6efd668cd Added Table of Contents to README Eugene Rakhmatulin 2025-12-18 15:43:09 -08:00
  • 8be691e806 Fixed issue with argument passing Eugene Rakhmatulin 2025-12-18 15:31:53 -08:00
  • 369283f655 Updated README.md with launch-cluster details. Eugene Rakhmatulin 2025-12-18 15:25:22 -08:00
  • db5c443905 Enhance launch-cluster script with improved node detection and SSH scanning using netcat and Python Eugene Rakhmatulin 2025-12-18 14:52:23 -08:00
  • 6c04ebfca1 Refactor launch-cluster script to include cluster running checks and streamline start process for head and worker nodes Eugene Rakhmatulin 2025-12-18 14:50:26 -08:00
  • f7a15bfaf5 Enhance launch-cluster script with improved SSH connectivity checks for worker nodes Eugene Rakhmatulin 2025-12-18 14:22:48 -08:00
  • 25b1d8eb4f Enhance launch-cluster script with auto-detection for interfaces and nodes Eugene Rakhmatulin 2025-12-18 13:53:28 -08:00
  • a1ed352635 renamed launch-cluster for consistency Eugene Rakhmatulin 2025-12-18 13:11:48 -08:00
  • 20a6699bf7 Add launch_cluster script for managing cluster nodes and actions Eugene Rakhmatulin 2025-12-18 13:11:13 -08:00
  • 1025243316 Added launch_cluster script to simplify launching cluster on nodes. Eugene Rakhmatulin 2025-12-18 13:10:57 -08:00
  • a13a9f6806 Limit build parallelism to reduce OOM situations Christopher Owen 2025-12-18 13:31:54 +01:00
  • 11355677f6 Add parallel copy option to build-and-copy.sh Eric Lewis 2025-12-18 01:24:48 -05:00
  • e67abd5e6e Add multi-host copy support to build-and-copy.sh Eric Lewis 2025-12-18 00:30:27 -05:00
  • e0f6cff132 Merge pull request #1 Eugene Rakhmatulin 2025-12-16 21:32:42 -08:00
  • f1abfb85b6 Bump of the version TeskaLabs Admin 2025-12-16 17:58:48 +00:00
  • 79f6a204d1 Update README.md Eugene Rakhmatulin 2025-12-15 09:51:49 -08:00
  • 0606b1b984 Refactor Triton and vLLM reference handling in Dockerfile and build script Eugene Rakhmatulin 2025-12-14 23:28:08 -08:00
  • 4551795908 Fixed missing Infiniband dependency, added CuDNN eugr 2025-12-14 21:49:50 -08:00
  • 33720fc9d6 Use no-build-isolation for Triton Kernels build eugr 2025-12-14 18:35:26 -08:00
  • dc614dc6ae Separated Triton build into a dedicated phase for better caching eugr 2025-12-14 10:32:28 -08:00
  • 25f759fec8 Optimized triton caching eugr 2025-12-14 09:26:10 -08:00
  • 02f842e1fd Updated README eugr 2025-12-14 00:39:15 -08:00
  • e8a12da072 Build triton from source; add TRITON_SHA argument to specify triton release, and add timing statistics eugr 2025-12-14 00:30:50 -08:00
  • a8217a1fd8 Improved dependency handling eugr 2025-12-13 22:41:30 -08:00
  • cc3e73feb1 Improved caching eugr 2025-12-13 21:34:57 -08:00
  • 76a8e92c86 Multistage build with caching eugr 2025-12-13 21:18:26 -08:00
  • 295e1f2266 Removed MiniMax M2 temporary patch from Dockerfile; updated README.md eugr 2025-12-11 13:24:57 -08:00
  • 37c12cf9e4 Removed MiniMax M2 patch since the fix is merged into main eugr 2025-12-11 13:23:30 -08:00
  • 5fba205db4 Implemented a temporary patch for recently broken MiniMax-M2 (in builds after 12/10) for some quants. eugr 2025-12-11 11:13:05 -08:00
  • 9d351cd6d5 Updated README eugr 2025-12-05 11:32:02 -08:00
  • 270446be27 Add build-and-copy script for automated image building and deployment eugr 2025-12-05 11:28:43 -08:00
  • b10ed739fe formatting changes eugr 2025-11-29 10:04:12 -08:00
  • 6a66a4b66f Added patch to allow fastsafetensors in cluster config eugr 2025-11-26 21:25:04 -08:00
  • 712637a348 Added second RoCE interface to examples eugr 2025-11-26 19:53:37 -08:00
  • bdf16a0a34 Formatting eugr 2025-11-26 14:02:15 -08:00
  • cf8e411ad2 Added benchmarking eugr 2025-11-26 14:01:04 -08:00
  • 676fa2ace9 Formatting fix eugr 2025-11-26 13:52:30 -08:00
  • 4f27899939 Added some details on networking eugr 2025-11-26 13:50:39 -08:00
  • 1a4bc1d7aa Typo eugr 2025-11-26 13:44:34 -08:00
  • 2a7d31ad81 Updated README eugr 2025-11-26 13:30:17 -08:00