map flashinfer/torch/triton cache directories by default
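
A rough sketch of how default cache mounts like these could be assembled for `docker run`. This is an illustration under stated assumptions, not the code in `launch-cluster.sh`: the function name `build_cache_mounts`, the `CACHE_MOUNT_ARGS` array, and the container home directory `/root` are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual launch-cluster.sh implementation.

# Build `docker run` volume arguments for the default cache directories.
#   $1: "true" to skip mounting (mirrors --no-cache-dirs), anything else to mount.
#   $2: home directory inside the container (assumed to be /root).
# The result is stored in the global array CACHE_MOUNT_ARGS.
build_cache_mounts() {
    local no_cache_dirs="$1"
    local container_home="${2:-/root}"
    local rel
    CACHE_MOUNT_ARGS=()
    if [[ "$no_cache_dirs" == "true" ]]; then
        return 0
    fi
    for rel in .cache/vllm .cache/flashinfer .triton; do
        # Create the host directory first; otherwise Docker creates a
        # missing bind-mount source itself, owned by root.
        mkdir -p "$HOME/$rel"
        CACHE_MOUNT_ARGS+=(-v "$HOME/$rel:$container_home/$rel")
    done
}

build_cache_mounts "false"
printf '%s\n' "${CACHE_MOUNT_ARGS[@]}"
```

The array could then be expanded into the container launch command as `"${CACHE_MOUNT_ARGS[@]}"`; when the flag is set, the array stays empty and no cache mounts are added.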

2026-02-10 16:36:02 -08:00
parent b990a1b8ac
commit 6d3f5dfd5c
2 changed files with 51 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -164,6 +164,17 @@ Don't do it every time you rebuild, because it will slow down compilation times.

 For periodic maintenance, I recommend using a filter: `docker builder prune --filter until=72h`

+### 2026-02-10
+
+#### Cache Directory Mounting
+
+`launch-cluster.sh` now automatically mounts the default cache directories into the container to improve cold-start times:
+- `~/.cache/vllm`
+- `~/.cache/flashinfer`
+- `~/.triton`
+
+To disable this behavior and start with clean caches, pass the `--no-cache-dirs` flag.
+
 ### 2026-02-09

 - Migrated to a new base image with PyTorch 2.10 compiled with Spark support. With this change, wheels build is no longer a recommended way - please use a source build instead.
@@ -732,6 +743,7 @@ You can override the auto-detected values if needed:
 | `--nccl-debug` | NCCL debug level (e.g., INFO, WARN). Defaults to INFO if flag is present but value is omitted. |
 | `--check-config` | Check configuration and auto-detection without launching. |
 | `--solo` | Solo mode: skip autodetection, launch only on current node, do not launch Ray cluster |
+| `--no-cache-dirs` | Do not mount default cache directories (`~/.cache/vllm`, `~/.cache/flashinfer`, `~/.triton`). |
 | `--launch-script` | Path to bash script to execute in the container (from examples/ directory or absolute path). If launch script is specified, action should be omitted. |
 | `-d` | Run in daemon mode (detached). |