mit-han-lab

96 models

svdq-int4-flux.1-fill-dev

This repository has been deprecated and will be hidden in December 2025. Please use https://huggingface.co/nunchaku-tech/nunchaku-flux.1-fill-dev. The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs Inc. IN NO EVENT SHALL BLACK FOREST LABS INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.

dataset:mit-han-lab/svdquant-datasets
136,327
32

nunchaku-flux.1-kontext-dev

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-flux.1-kontext-dev and will be hidden in December 2025. It contains Nunchaku-quantized versions of FLUX.1-Kontext-dev, which edits images based on text instructions, optimized for efficient inference with minimal performance loss.

- Developed by: Nunchaku Team
- Model type: image-to-image
- License: flux-1-dev-non-commercial-license
- Quantized from model: FLUX.1-Kontext-dev
- `svdq-int4r32-flux.1-kontext-dev.safetensors`: SVDQuant INT4 model, for non-Blackwell GPUs (pre-50-series)
- `svdq-fp4r32-flux.1-kontext-dev.safetensors`: SVDQuant NVFP4 model, for Blackwell GPUs (50-series)
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-kontext-dev.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-kontext-dev.json

The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License.

dataset:mit-han-lab/svdquant-datasets
23,953
157
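The SVDQuant paper cited in these cards absorbs weight outliers into a high-precision low-rank branch so that the residual quantizes well to 4 bits. A toy numpy sketch of that idea follows; the matrix sizes, rank, and the simple symmetric quantizer are illustrative assumptions, not the actual deepcompressor implementation.

```python
import numpy as np

def quantize_int4(w):
    # symmetric 4-bit quantization: round to 15 levels in [-7, 7] * scale
    scale = np.abs(w).max() / 7.0
    return np.round(w / scale).clip(-7, 7) * scale

rng = np.random.default_rng(0)
# well-behaved base weights plus a strong low-rank "outlier" component
base = rng.normal(0, 1, (256, 256))
outliers = rng.normal(0, 8, (256, 4)) @ rng.normal(0, 1, (4, 256))
w = base + outliers

# naive INT4: the outliers inflate the dynamic range, so the step size is huge
err_naive = np.linalg.norm(w - quantize_int4(w))

# SVDQuant-style: keep a rank-32 branch in high precision, quantize the residual
U, S, Vt = np.linalg.svd(w, full_matrices=False)
low_rank = (U[:, :32] * S[:32]) @ Vt[:32]
err_svd = np.linalg.norm(w - (low_rank + quantize_int4(w - low_rank)))

# the low-rank branch absorbs the outliers, so the residual quantizes far better
print(err_naive, err_svd)
```

The rank-32 choice mirrors the `r32` suffix in the checkpoint names above, though the real method chooses the branch to minimize quantization error rather than by a plain SVD of random data.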

dc-ae-f32c32-sana-1.0

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

Figure 1: We address the reconstruction accuracy drop of high spatial-compression autoencoders.
Figure 2: DC-AE delivers significant training and inference speedup without performance drop.
Figure 3: DC-AE enables efficient text-to-image generation on a laptop.

We present Deep Compression Autoencoder (DC-AE), a new family of autoencoder models for accelerating high-resolution diffusion models. Existing autoencoder models have demonstrated impressive results at a moderate spatial compression ratio (e.g., 8x), but fail to maintain satisfactory reconstruction accuracy at high spatial compression ratios (e.g., 64x). We address this challenge with two key techniques: (1) Residual Autoencoding, where we design our models to learn residuals based on space-to-channel transformed features, easing the optimization of high spatial-compression autoencoders; and (2) Decoupled High-Resolution Adaptation, an efficient decoupled three-phase training strategy that mitigates the generalization penalty of high spatial-compression autoencoders. With these designs, we raise the autoencoder's spatial compression ratio to as high as 128 while maintaining reconstruction quality. Applied to latent diffusion models, DC-AE delivers significant speedup without an accuracy drop: on ImageNet 512x512, it provides 19.1x inference speedup and 17.9x training speedup on an H100 GPU for UViT-H while achieving a better FID than the widely used SD-VAE-f8 autoencoder. If DC-AE is useful or relevant to your research, please kindly recognize our contributions by citing our paper.

17,434
13
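The `fNcC` part of the DC-AE checkpoint names encodes the spatial compression factor and the latent channel count: f32c32 downsamples each spatial dimension 32x into 32 latent channels. A small sketch of what that means for the token count a latent diffusion transformer must process (treating each latent pixel as one token, and using the conventional f8/4-channel VAE as the baseline for comparison):

```python
def latent_shape(h, w, f, c):
    # fNcC: downsample spatial dims by f, produce c latent channels
    return (c, h // f, w // f)

# a 1024x1024 image through DC-AE f64c128 vs a conventional f8 VAE
dcae = latent_shape(1024, 1024, 64, 128)  # (128, 16, 16)
f8 = latent_shape(1024, 1024, 8, 4)       # (4, 128, 128)

# spatial token count, one token per latent pixel
tokens_dcae = dcae[1] * dcae[2]
tokens_f8 = f8[1] * f8[2]
print(tokens_f8 // tokens_dcae)  # DC-AE f64 yields 64x fewer tokens than f8
```

Fewer tokens is where the reported training and inference speedups come from, since transformer cost grows rapidly with sequence length.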

nunchaku-flux.1-dev

dataset:mit-han-lab/svdquant-datasets
14,719
59

svdq-int4-flux.1-schnell

dataset:mit-han-lab/svdquant-datasets
4,985
16

StreamingVLM

4,023
8

dc-ae-f64c128-in-1.0-diffusers


2,674
1

dc-ae-f32c32-mix-1.0

2,171
2

dc-ae-f32c32-sana-1.1-diffusers

license:mit
2,128
6

dc-ae-f32c32-in-1.0

2,003
8

nunchaku-flux.1-fill-dev

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-flux.1-fill-dev and will be hidden in December 2025. It contains Nunchaku-quantized versions of FLUX.1-Fill-dev, which fills areas in existing images based on a text description, optimized for efficient inference with minimal performance loss.

- Developed by: Nunchaku Team
- Model type: image-to-image
- License: flux-1-dev-non-commercial-license
- Quantized from model: FLUX.1-Fill-dev
- `svdq-int4r32-flux.1-fill-dev.safetensors`: SVDQuant INT4 model, for non-Blackwell GPUs (pre-50-series)
- `svdq-fp4r32-flux.1-fill-dev.safetensors`: SVDQuant NVFP4 model, for Blackwell GPUs (50-series)
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-fill-dev.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-fill-dev.json

The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License.

dataset:mit-han-lab/svdquant-datasets
1,927
10

svdq-int4-flux.1-dev

This repository has been deprecated and will be hidden in December 2025. Please use https://huggingface.co/nunchaku-tech/nunchaku-flux.1-dev. The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License.

dataset:mit-han-lab/svdquant-datasets
1,896
86

vila-u-7b-256

vila_u_llama
1,556
24

dc-ae-f32c32-sana-1.1

1,541
9

dc-ae-f64c128-in-1.0

1,489
7

Qwen2.5-32B-Eagle-RL

1,305
0

nunchaku-flux.1-schnell

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-flux.1-schnell and will be hidden in December 2025. It contains Nunchaku-quantized versions of FLUX.1-schnell, which generates high-quality images from text prompts, optimized for efficient inference with minimal performance loss.

- Developed by: Nunchaku Team
- Model type: text-to-image
- License: apache-2.0
- Quantized from model: FLUX.1-schnell
- `svdq-int4r32-flux.1-schnell.safetensors`: SVDQuant INT4 model, for non-Blackwell GPUs (pre-50-series)
- `svdq-fp4r32-flux.1-schnell.safetensors`: SVDQuant NVFP4 model, for Blackwell GPUs (50-series)
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-schnell.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-schnell.json

dataset:mit-han-lab/svdquant-datasets
830
7

dc-ae-f64c128-mix-1.0

574
2

nunchaku-flux.1-canny-dev

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-flux.1-canny-dev and will be hidden in December 2025. It contains Nunchaku-quantized versions of FLUX.1-Canny-dev, which generates an image from a text description while following the structure of a given input image, optimized for efficient inference with minimal performance loss.

- Developed by: Nunchaku Team
- Model type: image-to-image
- License: flux-1-dev-non-commercial-license
- Quantized from model: FLUX.1-Canny-dev
- `svdq-int4r32-flux.1-canny-dev.safetensors`: SVDQuant INT4 model, for non-Blackwell GPUs (pre-50-series)
- `svdq-fp4r32-flux.1-canny-dev.safetensors`: SVDQuant NVFP4 model, for Blackwell GPUs (50-series)
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-canny-dev.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-canny-dev.json

The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License.

dataset:mit-han-lab/svdquant-datasets
558
3

dc-ae-f32c32-sana-1.0-diffusers

541
16

svdq-fp4-flux.1-dev

dataset:mit-han-lab/svdquant-datasets
540
15

Qwen2-VL-1.5B-Instruct

license:apache-2.0
242
1

nunchaku-flux.1-depth-dev

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-flux.1-depth-dev and will be hidden in December 2025. It contains Nunchaku-quantized versions of FLUX.1-Depth-dev, which generates an image from a text description while following the structure of a given input image, optimized for efficient inference with minimal performance loss.

- Developed by: Nunchaku Team
- Model type: image-to-image
- License: flux-1-dev-non-commercial-license
- Quantized from model: FLUX.1-Depth-dev
- `svdq-int4r32-flux.1-depth-dev.safetensors`: SVDQuant INT4 model, for non-Blackwell GPUs (pre-50-series)
- `svdq-fp4r32-flux.1-depth-dev.safetensors`: SVDQuant NVFP4 model, for Blackwell GPUs (50-series)
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-depth-dev.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-depth-dev.json

The FLUX.1 [dev] Model is licensed by Black Forest Labs Inc. under the FLUX.1 [dev] Non-Commercial License.

dataset:mit-han-lab/svdquant-datasets
228
4

Qwen2.5-7B-Eagle-RL

224
0

svdq-int4-flux.1-depth-dev

dataset:mit-han-lab/svdquant-datasets
200
5

nunchaku-shuttle-jaguar

dataset:mit-han-lab/svdquant-datasets
195
5

svdq-fp4-flux.1-schnell

dataset:mit-han-lab/svdquant-datasets
169
2

svdq-int4-flux.1-canny-dev

dataset:mit-han-lab/svdquant-datasets
153
5

dc-ae-f32c32-in-1.0-256px

135
2

svdq-flux.1-schnell-pix2pix-turbo

license:apache-2.0
129
1

opt-1.3b-smoothquant

license:mit
115
3

svdq-fp4-flux.1-fill-dev

dataset:mit-han-lab/svdquant-datasets
87
2

dc-ae-f128c512-in-1.0

84
2

dc-ae-lite-f32c32-sana-1.1-diffusers


64
4

dc-ae-f64c128-in-1.0-uvit-h-in-512px-train2000k

53
4

svdq-int4-sana-1600m

dataset:mit-han-lab/svdquant-datasets
52
2

nunchaku-sana

dataset:mit-han-lab/svdquant-datasets
49
0

svdq-fp4-flux.1-depth-dev

dataset:mit-han-lab/svdquant-datasets
39
1

svdq-fp4-shuttle-jaguar

dataset:mit-han-lab/svdquant-datasets
36
1

dc-ae-f32c32-in-1.0-diffusers


31
1

dc-ae-f64c128-in-1.0-uvit-h-in-512px

30
3

dc-ae-f64c128-mix-1.0-diffusers

30
3

dc-ae-f128c512-mix-1.0

27
5

dc-ae-f32c32-mix-1.0-diffusers

26
2

svdq-int4-shuttle-jaguar

dataset:mit-han-lab/svdquant-datasets
26
2

dc-ae-f128c512-mix-1.0-diffusers

21
3

opt-125m-smoothquant

license:mit
21
0

dc-ae-f32c32-in-1.0-usit-2b-in-512px

20
1

svdq-fp4-flux.1-canny-dev

dataset:mit-han-lab/svdquant-datasets
19
3

dc-ae-lite-f32c32-sana-1.1


18
2

dc-ae-f128c512-in-1.0-diffusers

11
2

Llama-3-8B-Instruct-QServe

llama
8
1

opt-13b-smoothquant

license:mit
6
2

Llama-3-8B-Instruct-QServe-g128

llama
6
2

nunchaku-t5

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku-t5 and will be hidden in December 2025. It contains Nunchaku-quantized versions of T5-XXL, the text encoder that maps prompts to embeddings; quantization reduces its memory footprint.

- Developed by: Nunchaku Team
- Model type: text-generation
- License: apache-2.0
- Quantized from model: t5v11xxl
- `awq-int4-flux.1-t5xxl.safetensors`: AWQ-quantized W4A16 T5-XXL model for FLUX.1
- Inference engine: nunchaku
- Quantization library: deepcompressor
- Paper: SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
- Demo: svdquant.mit.edu
- Diffusers usage: see flux.1-dev-qencoder.py; check our tutorial for more advanced usage
- ComfyUI usage: see nunchaku-flux.1-dev-qencoder.json

dataset:mit-han-lab/svdquant-datasets
4
18
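A W4A16 checkpoint such as the `awq-int4` file above stores weights as 4-bit integers with per-group scales while activations stay in 16-bit. Here is a minimal numpy sketch of plain group-wise symmetric W4 quantization; AWQ itself additionally applies activation-aware per-channel scaling before this step, and the group size and shapes below are illustrative assumptions.

```python
import numpy as np

def quant_w4_groupwise(w, group=64):
    # per-group symmetric 4-bit weight quantization (W4): one fp scale per group
    flat = w.reshape(-1, group)
    scales = np.abs(flat).max(axis=1, keepdims=True) / 7.0
    q = np.round(flat / scales).clip(-7, 7).astype(np.int8)
    return q, scales

def dequant(q, scales, shape):
    # activations multiply the dequantized fp weights, hence "A16"
    return (q * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, (128, 512)).astype(np.float32)

q, s = quant_w4_groupwise(w)
w_hat = dequant(q, s, w.shape)
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(q.dtype, rel_err)  # 4-bit codes, small relative reconstruction error
```

Storing 4-bit codes plus a scale per 64 weights is roughly a 4x memory reduction over fp16, which is the point of quantizing the (large) T5-XXL encoder.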

Llama-3-8B-Instruct-QServe-W8A8

llama
4
1

opt-6.7b-smoothquant

license:mit
4
0

opt-30b-smoothquant

license:mit
3
3

vicuna-13b-v1.3-4bit-g128-awq

llama
3
0

Mistral-7B-v0.1-QServe

license:apache-2.0
3
0

Llama-3-8B-QServe-g128

llama
3
0

dc-ae-f32c32-in-1.0-dit-xl-in-512px

2
9

Llama-2-7B-QServe-g128

llama
2
1

Llama-2-13B-QServe-g128

llama
2
1

dc-ae-f32c32-in-1.0-uvit-s-in-512px

2
1

dc-ae-f32c32-in-1.0-usit-h-in-512px

2
1

dc-ae-f32c32-in-1.0-sit-xl-in-512px

2
1

Llama-3-8B-Instruct-Gradient-1048k-w8a8-per-channel-kv8-per-tensor

llama
2
0

dc-ae-f32c32-in-1.0-uvit-h-in-512px

1
2

dc-ae-f64c128-in-1.0-uvit-2b-in-512px-train2000k

1
2

Mistral-7B-v0.1-QServe-g128

license:apache-2.0
1
1

Yi-34B-QServe-g128

llama
1
1

dc-ae-f64c128-in-1.0-uvit-2b-in-512px


1
1

dc-ae-f32c32-in-1.0-dit-xl-in-512px-trainbs1024

1
1

Llama-2-7B-QServe

llama
1
0

Llama-3-8B-QServe

llama
1
0

vicuna-7b-v1.5-QServe

llama
1
0

vicuna-13b-v1.5-QServe

llama
1
0

vicuna-7b-v1.5-QServe-g128

llama
1
0

Llama-3-8B-Instruct-Gradient-1048k-w8a8kv4-per-channel

llama
1
0

Llama-3-8B-Instruct-Gradient-4194k-w8a8kv4-per-channel

llama
1
0

dc-ae-f32c32-in-1.0-uvit-2b-in-512px

1
0

dc-ae-f32c32-in-1.0-sana-cls-xl-in-512px

1
0

nunchaku

This repository has been migrated to https://huggingface.co/nunchaku-tech/nunchaku and will be hidden in December 2025. It provides pre-built nunchaku wheels for both Linux and Windows. For details on the available wheels, please visit our GitHub Releases page.

license:apache-2.0
0
103

efficientvit-sam

license:apache-2.0
0
41

hart-0.7b-1024px

license:mit
0
13

fastcomposer

license:mit
0
7

tinychatengine-model-zoo

0
3

nunchaku-artifacts

0
3

opt-66b-smoothquant

license:mit
0
2

opt-2.7b-smoothquant

license:mit
0
1

vicuna-7b-v1.3-4bit-g128-awq

llama
0
1

Llama-2-13B-QServe

llama
0
1

Yi-34B-QServe

llama
0
1

litepose

license:mit
0
1

VILA1.5-13B-QServe-W8A8

llava_llama
0
1