Commit Graph

43 Commits

SHA1 Message Date
9ea69aedb4 decrease memory 2026-05-11 11:38:22 -05:00
2d2df4bb0a use defaults 2026-05-11 11:36:59 -05:00
db6efb188c fix quantization 2026-05-11 11:36:17 -05:00
ce65c6435b remove option 2026-05-11 11:35:29 -05:00
08106b6693 use correct dtype 2026-05-11 11:34:39 -05:00
719a6e3d11 fix list 2026-05-11 11:33:45 -05:00
855cce3c54 fix list 2026-05-11 11:32:56 -05:00
9594d6311a optimize model 2026-05-11 11:30:01 -05:00
b70f8063a8 fix memory util 2026-05-08 09:42:50 -05:00
2f922279ef add more gpu and change model name served 2026-05-08 09:30:48 -05:00
de12558021 set limit 2026-05-07 16:44:01 -05:00
ab34446bcf set size limit 2026-05-07 16:43:11 -05:00
95bcb06811 fix indent 2026-05-07 16:19:49 -05:00
c096777eca set probes 2026-05-07 16:19:05 -05:00
a092b6ffa5 set probes 2026-05-07 16:16:21 -05:00
64abcb1483 add parameters 2026-05-07 16:05:19 -05:00
6ef281e06f set groupid 2026-05-07 16:01:39 -05:00
62a42ed8f0 change backend 2026-05-07 15:57:11 -05:00
f9d81c3a17 set profile 2026-05-07 15:33:04 -05:00
72bbe8789b disable download 2026-05-07 15:31:43 -05:00
4dc4fc5fdf force backend 2026-05-07 15:29:45 -05:00
6491c6cebe use vllm profile 2026-05-07 15:12:03 -05:00
1177264fd2 set profile 2026-05-07 15:08:50 -05:00
40a7b1e117 fix backendref 2026-05-07 15:04:49 -05:00
c5db9144a9 pass args 2026-05-07 15:03:56 -05:00
1a4e73b755 use vllm openai 2026-05-07 14:59:55 -05:00
2229daa1a3 use hf-api-secret 2026-05-07 14:56:07 -05:00
2e1ab2ea2d try beefier model 2026-05-07 14:54:30 -05:00
5954bb5202 add HF cache 2026-05-07 14:45:23 -05:00
138ebc8f61 add security policy 2026-05-07 11:58:37 -05:00
eca316efa4 add timeouts 2026-05-07 11:42:59 -05:00
e56d29528f set cache reuse to 1 2026-05-07 11:28:09 -05:00
ebcc0cf045 enable cache reuse 2026-05-07 11:26:30 -05:00
0997aa48b7 add max model len 2026-05-07 11:19:11 -05:00
923679cb29 set env 2026-05-07 11:13:23 -05:00
1e7630efe8 remove args for now 2026-05-06 18:00:06 -05:00
af8c9a1254 set profiles 2026-05-06 17:52:14 -05:00
4287242035 set profiles 2026-05-06 17:51:29 -05:00
59af501a82 set default profile 2026-05-06 17:51:00 -05:00
f602e1aec9 set string 2026-05-06 17:48:20 -05:00
da57ec24ee use qwen 2026-05-06 17:46:54 -05:00
066554aa36 use 32b 2026-05-06 17:42:18 -05:00
98f39b7c68 add qwen 2026-05-06 17:33:57 -05:00