Update README.md

Eugene Rakhmatulin
2025-12-15 09:51:49 -08:00
parent 0606b1b984
commit 79f6a204d1


@@ -12,6 +12,12 @@ The Dockerfile builds from the main branch of VLLM, so depending on when you run
## CHANGELOG
+### 2025-12-15
+Updated `build-and-copy.sh` flags:
+- Renamed `--triton-sha` to `--triton-ref` to support branches and tags in addition to commit SHAs.
+- Added `--vllm-ref <ref>`: Specify vLLM commit SHA, branch or tag (defaults to `main`).
### 2025-12-14
Converted to multi-stage Docker build with improved build times and reduced final image size. The builder stage is now separate from the runtime stage, excluding unnecessary build tools from the final image.
@@ -45,35 +51,19 @@ Applied patch to enable FastSafeTensors in cluster configuration (EXPERIMENTAL)
## 1\. Building the Docker Image
+### Building Manually
The Dockerfile includes specific **Build Arguments** to allow you to selectively rebuild layers (e.g., update the vLLM source code without re-downloading PyTorch).
-### Option A: Standard Build (First Time)
-```bash
-docker build -t vllm-node .
-```
-### Option B: Fast Rebuild (Update vLLM Source Only)
-Use this if you want to pull the latest code from GitHub but keep the heavy dependencies (Torch, FlashInfer, system deps) cached.
-```bash
-docker build \
-  --build-arg CACHEBUST_VLLM=$(date +%s) \
-  -t vllm-node .
-```
-### Option C: Full Rebuild (Update All Dependencies)
-Use this to force a re-download of PyTorch, FlashInfer, and system packages.
-```bash
-docker build \
-  --build-arg CACHEBUST_DEPS=$(date +%s) \
-  -t vllm-node .
-```
-### Option D: Using the Build Script (Recommended)
+Using a provided build script is recommended, but if you want to build using `docker build` command, here are the supported build arguments:
+| Argument | Default | Description |
+| :--- | :--- | :--- |
+| `CACHEBUST_DEPS` | `1` | Change this to force a re-download of PyTorch, FlashInfer, and system dependencies. |
+| `CACHEBUST_VLLM` | `1` | Change this to force a fresh git clone and rebuild of vLLM source code. |
+| `TRITON_REF` | `v3.5.1` | Triton commit SHA, branch, or tag to build. |
+| `VLLM_REF` | `main` | vLLM commit SHA, branch, or tag to build. |
+### Using the Build Script (Recommended)
The `build-and-copy.sh` script automates the build process and optionally copies the image to another node. This is the recommended method for building and deploying to multiple Spark nodes.
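For a manual build, the build arguments from the "Building Manually" table can be combined in a single `docker build` invocation. The following is a minimal sketch, not taken from the repository: it assumes the defaults listed in the table, reuses the `$(date +%s)` cache-busting idiom from the earlier README text, and introduces a hypothetical `RUN` variable so the command is only printed unless you opt in to running it.

```shell
# Refs to build; the defaults mirror the build-argument table.
TRITON_REF="${TRITON_REF:-v3.5.1}"
VLLM_REF="${VLLM_REF:-main}"

# Assemble the command as positional parameters so each argument stays intact.
# A fresh CACHEBUST_VLLM value forces a new clone and rebuild of vLLM only.
set -- docker build \
  --build-arg "TRITON_REF=${TRITON_REF}" \
  --build-arg "VLLM_REF=${VLLM_REF}" \
  --build-arg "CACHEBUST_VLLM=$(date +%s)" \
  -t vllm-node .

# Print the command for review. RUN is a hypothetical opt-in switch for this
# sketch, not something the Dockerfile or build script defines.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Invoking it as `VLLM_REF=<ref> RUN=1 sh build.sh` (with the snippet saved under any file name you like) would pin vLLM to that ref and start the build.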
@@ -124,7 +114,7 @@ Using a different username:
**Build with specific Triton commit:**
```bash
-./build-and-copy.sh --triton-sha abc123def456
+./build-and-copy.sh --triton-ref abc123def456
```
**Copy existing image without rebuilding:**
@@ -140,7 +130,8 @@ Using a different username:
| `-t, --tag <tag>` | Image tag (default: 'vllm-node') |
| `--rebuild-deps` | Force rebuild all dependencies (sets CACHEBUST_DEPS) |
| `--rebuild-vllm` | Force rebuild vLLM source only (sets CACHEBUST_VLLM) |
-| `--triton-sha <sha>` | Triton commit SHA (default: auto-detect latest main) |
+| `--triton-ref <ref>` | Triton commit SHA, branch or tag (default: 'v3.5.1') |
+| `--vllm-ref <ref>` | vLLM commit SHA, branch or tag (default: 'main') |
| `-h, --copy-to-host <host>` | Host address to copy the image to after building |
| `-u, --user <user>` | Username for SSH connection (default: current user) |
| `--no-build` | Skip building, only copy existing image (requires `--copy-to-host`) |
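The renamed `--triton-ref` flag and the new `--vllm-ref` flag can be passed together. The sketch below assumes it is run from the directory that contains `build-and-copy.sh`; the `TRITON_REF` and `VLLM_REF` shell variables are illustrative names for this example only, with defaults copied from the options table.

```shell
# Illustrative variables; the defaults match the options table.
TRITON_REF="${TRITON_REF:-v3.5.1}"
VLLM_REF="${VLLM_REF:-main}"

# Build the argument list once so it can be printed and then executed as-is.
set -- ./build-and-copy.sh --triton-ref "$TRITON_REF" --vllm-ref "$VLLM_REF"

# Show the command, then run it only if the script is present and executable.
echo "$*"
if [ -x "$1" ]; then
  "$@"
fi
```

Each ref may be a commit SHA, a branch, or a tag, so the same invocation covers both pinned and floating builds.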