Eugene Rakhmatulin | 1464b0dc8f | Display image name in launch-cluster.sh output | 2025-12-21 22:44:01 -08:00
Eugene Rakhmatulin | f075801c59 | Fixed launch_cluster bug introduced by refactoring | 2025-12-19 10:51:50 -08:00
Eugene Rakhmatulin | a351f182cc | Implement autodiscovery for copy hosts and enhance interface detection in build-and-copy and launch-cluster scripts | 2025-12-19 10:36:39 -08:00
Eugene Rakhmatulin | 294d155532 | Add NCCL debug level option to launch-cluster.sh | 2025-12-18 23:28:12 -08:00
Eugene Rakhmatulin | 0377e9badf | Bugfix: don't shut down on exit if cluster is already running | 2025-12-18 23:12:39 -08:00
Eugene Rakhmatulin | 2a2f8f24e2 | Allow launch-cluster.sh to be executed in non-TTY environment | 2025-12-18 23:02:58 -08:00
Eugene Rakhmatulin | 8c53179cc2 | changed extra docker args variable to VLLM_SPARK_EXTRA_DOCKER_ARGS for consistency | 2025-12-18 22:27:27 -08:00
Eugene Rakhmatulin | 8be691e806 | Fixed issue with argument passing | 2025-12-18 15:31:53 -08:00
Eugene Rakhmatulin | 369283f655 | Updated README.md with launch-cluster details. | 2025-12-18 15:25:22 -08:00
Eugene Rakhmatulin | db5c443905 | Enhance launch-cluster script with improved node detection and SSH scanning using netcat and Python | 2025-12-18 14:52:23 -08:00
Eugene Rakhmatulin | 6c04ebfca1 | Refactor launch-cluster script to include cluster running checks and streamline start process for head and worker nodes | 2025-12-18 14:50:26 -08:00
Eugene Rakhmatulin | f7a15bfaf5 | Enhance launch-cluster script with improved SSH connectivity checks for worker nodes | 2025-12-18 14:22:48 -08:00
Eugene Rakhmatulin | 25b1d8eb4f | Enhance launch-cluster script with auto-detection for interfaces and nodes | 2025-12-18 13:53:28 -08:00
Eugene Rakhmatulin | a1ed352635 | renamed launch-cluster for consistency | 2025-12-18 13:11:48 -08:00