PromptEnhancer
PromptEnhancer 32B
PromptEnhancerV2 is a multimodal language model fine-tuned for text-to-image prompt enhancement and rewriting. It restructures user input prompts while preserving the original intent, producing clearer, layered, and logically consistent prompts suitable for downstream image generation tasks. PromptEnhancerV2 is a specialized text-to-image prompt rewriting model that employs chain-of-thought reasoning to enhance user prompts. - Model type: Vision-Language Model for Prompt Enhancement - Language(s) (NLP): Chinese (zh), English (en) - License: Apache-2.0 - Finetuned from model: Qwen/Qwen2.5-VL-32B-Instruct - Repository: https://github.com/ximinng/PromptEnhancer - Paper: https://arxiv.org/abs/2509.04545 - Homepage: https://hunyuan-promptenhancer.github.io/ The model is evaluated on the T2I-Keypoints-Eval dataset, which contains diverse text-to-image prompts across various categories and languages. If you find this model useful, please consider citing:
PromptEnhancer-Img2img-Edit
PromptEnhancerV2 is a multimodal language model fine-tuned for image-to-image editing instruction enhancement and rewriting. It refines editing instructions by leveraging both the input text and the provided image, preserving the original intent while producing clearer, structured, and logically consistent prompts suitable for downstream image editing tasks. PromptEnhancerV2 (Img2Img Edit) is a specialized vision-language prompt rewriting model that employs chain-of-thought reasoning to enhance user editing instructions with visual context. - Model type: Vision-Language Model for Prompt Enhancement - Language(s) (NLP): Chinese (zh), English (en) - License: Apache-2.0 - Finetuned from model: Qwen/Qwen2.5-VL-32B-Instruct - Repository: https://github.com/ximinng/PromptEnhancer - Paper: https://arxiv.org/abs/2509.04545 - Homepage: https://hunyuan-promptenhancer.github.io/ If you find this model useful, please consider citing: