fastllm

28 models

Qwen3-Next-80B-A3B-Instruct-UD-Q4_K_L

453
0

Qwen3-Next-80B-A3B-Instruct-UD-Q4_K_M

177
4

Qwen3-Next-80B-A3B-Instruct-int4g-fp16-mixed

71
3

Qwen3-235B-A22B-INT4MIX

license:apache-2.0
36
4

DeepSeek-V3-0324-INT4

license:apache-2.0
28
1

Qwen3-Next-80B-A3B-Instruct-UD-Q2_K_S

26
4

Qwen3-Next-80B-A3B-Instruct-UD-Q6_K_L

25
3

Qwen3-Next-80B-A3B-Thinking-int4g-fp16-mixed

22
0

DeepSeek-R1-0528-INT4

license:apache-2.0
20
2

Qwen3-Next-80B-A3B-Instruct-UD-Q3_K_M

19
0

Qwen3-30B-A3B-FP16INT4

15
0

Kimi-K2-Instruct-INT4MIX

Single CPU: if you are using a single CPU, set the number of threads with the -t parameter (typically the CPU core count minus 2). If inference is extremely slow, the cause may be too many threads; try reducing the count. Multi-socket CPU: on a multi-socket machine, enable the CUDA + NUMA heterogeneous acceleration mode and set the number of threads with the environment variable FASTLLMNUMATHREADS (typically the number of cores per NUMA node minus 2). If performance is extremely slow, the cause may again be too many threads; try reducing the count.
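The "cores minus 2" rule of thumb above can be sketched in Python. This is a minimal illustration and not part of fastllm: the helper names `suggested_threads` and `numa_threads_per_node` are hypothetical, and the core and NUMA node counts are supplied by hand rather than detected from the hardware.

```python
import os


def suggested_threads(cores: int, reserve: int = 2) -> int:
    """Apply the 'cores minus 2' rule of thumb, never going below 1 thread."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max(1, cores - reserve)


def numa_threads_per_node(total_cores: int, numa_nodes: int) -> int:
    """Thread count per NUMA node: (cores per node) minus 2, at least 1."""
    if numa_nodes < 1:
        raise ValueError("numa_nodes must be at least 1")
    return suggested_threads(total_cores // numa_nodes)


if __name__ == "__main__":
    # Single CPU: a starting value for the -t parameter.
    # os.cpu_count() reports logical CPUs, which may exceed physical cores.
    print("single CPU threads:", suggested_threads(os.cpu_count() or 1))

    # Multi-socket example (assumed figures): 64 cores across 2 NUMA nodes.
    print("threads per NUMA node:", numa_threads_per_node(64, 2))
```

The result is only a starting point: as the description notes, if generation is extremely slow, lower the value further. The single-CPU number would be passed via -t, and the per-node number would be assigned to the environment variable named in the description.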

13
3

Qwen3-Next-80B-A3B-Instruct-UD-Q5_K_L

13
1

DeepSeek-R1-INT4

license:apache-2.0
12
0

Qwen3-Next-80B-A3B-Instruct-UD-Q2_K_M

8
2

Qwen3-Next-80B-A3B-Instruct-UD-Q3_K_L

8
0

Qwen3-Next-80B-A3B-Instruct-UD-Q5_K_M

6
0

chatglm2-6b-int4.flm

license:apache-2.0
0
10

chatglm2-6b-fp16.flm

license:apache-2.0
0
9

Qwen-7B-Chat-int8.flm

license:apache-2.0
0
4

chatglm2-6b-int8.flm

license:apache-2.0
0
3

Qwen-7B-Chat-int4.flm

license:apache-2.0
0
3

fastllmdepend-windows

license:apache-2.0
0
3

chatglm-6b-int4.flm

license:apache-2.0
0
2

chatglm-6b-int8.flm

license:apache-2.0
0
2

chatglm-6b-fp16.flm

license:apache-2.0
0
2

Qwen-7B-Chat-fp16.flm

license:apache-2.0
0
2

Qwen1.5B-Chat-72B-int4

0
1