fix memory util
@@ -45,7 +45,7 @@ spec:
  - --served-model-name
  - Qwen/Qwen3.6-27B-FP8
  - --gpu-memory-utilization
- - "0.90"
+ - "0.85"
  - --max-model-len
  - "256000"
  - --language-model-only