Added benchmarking

2025-11-26 14:01:04 -08:00
parent 676fa2ace9
commit cf8e411ad2
1 changed files with 15 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -198,6 +198,21 @@ docker exec -it vllm_node
 And execute vllm command inside.
 ## 5\. Benchmarking
 Follow the guidance in [vLLM Benchmark Suites](https://docs.vllm.ai/en/latest/contributing/benchmarks/) to download the benchmarking dataset, then run a benchmark with a command like the one below. The command assumes you are running on the head node; otherwise, specify the `--host` parameter:
 ```bash 
 vllm bench serve \
  --backend vllm \
  --model RedHatAI/Qwen3-VL-235B-A22B-Instruct-NVFP4 \
  --endpoint /v1/completions \
  --dataset-name sharegpt \
  --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
  --num-prompts 1 \
  --port 8888 
 ```
 Increase `--num-prompts` to benchmark concurrent requests. With `--num-prompts 1`, the command above measures single-request performance.
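To compare single-request and concurrent performance in one pass, the command can be wrapped in a small loop over several `--num-prompts` values. The following is a minimal sketch, not part of vLLM: the `run_bench` helper, the `PROMPT_COUNTS` and `DRY_RUN` variables, and the prompt counts `1 4 16` are illustrative assumptions. It reuses only the flags from the command above and defaults to printing the commands rather than running them.

```bash
#!/usr/bin/env bash
# Sketch: sweep --num-prompts to compare single-request and concurrent load.
# run_bench, PROMPT_COUNTS and DRY_RUN are hypothetical names, not vLLM features.
set -euo pipefail

MODEL="RedHatAI/Qwen3-VL-235B-A22B-Instruct-NVFP4"
DATASET="ShareGPT_V3_unfiltered_cleaned_split.json"
PORT=8888
PROMPT_COUNTS="${PROMPT_COUNTS:-1 4 16}"  # assumed values; adjust to your setup
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print commands, 0 = execute them

run_bench() {
  # Build the benchmark command for the prompt count given in $1.
  local n="$1"
  local cmd=(vllm bench serve
    --backend vllm
    --model "$MODEL"
    --endpoint /v1/completions
    --dataset-name sharegpt
    --dataset-path "$DATASET"
    --num-prompts "$n"
    --port "$PORT")
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[*]}"   # show what would be run
  else
    "${cmd[@]}"        # run the benchmark against the server
  fi
}

for n in $PROMPT_COUNTS; do
  run_bench "$n"
done
```

Save the script under any name (for example `bench_sweep.sh`, a hypothetical filename) and run it with `DRY_RUN=0 bash bench_sweep.sh` on the head node to execute the benchmarks; add `--host` to the `cmd` array when running from another machine.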
 ### Hardware Architecture