PromptEnhancer

2 models • 1 total models in database
Sort by:

PromptEnhancer 32B

PromptEnhancerV2 is a multimodal language model fine-tuned for text-to-image prompt enhancement and rewriting. It restructures user input prompts while preserving the original intent, producing clearer, layered, and logically consistent prompts suitable for downstream image generation tasks. PromptEnhancerV2 is a specialized text-to-image prompt rewriting model that employs chain-of-thought reasoning to enhance user prompts. - Model type: Vision-Language Model for Prompt Enhancement - Language(s) (NLP): Chinese (zh), English (en) - License: Apache-2.0 - Finetuned from model: Qwen/Qwen2.5-VL-32B-Instruct - Repository: https://github.com/ximinng/PromptEnhancer - Paper: https://arxiv.org/abs/2509.04545 - Homepage: https://hunyuan-promptenhancer.github.io/ The model is evaluated on the T2I-Keypoints-Eval dataset, which contains diverse text-to-image prompts across various categories and languages. If you find this model useful, please consider citing:

NaNK
906
11

PromptEnhancer-Img2img-Edit

PromptEnhancerV2 is a multimodal language model fine-tuned for image-to-image editing instruction enhancement and rewriting. It refines editing instructions by leveraging both the input text and the provided image, preserving the original intent while producing clearer, structured, and logically consistent prompts suitable for downstream image editing tasks. PromptEnhancerV2 (Img2Img Edit) is a specialized vision-language prompt rewriting model that employs chain-of-thought reasoning to enhance user editing instructions with visual context. - Model type: Vision-Language Model for Prompt Enhancement - Language(s) (NLP): Chinese (zh), English (en) - License: Apache-2.0 - Finetuned from model: Qwen/Qwen2.5-VL-32B-Instruct - Repository: https://github.com/ximinng/PromptEnhancer - Paper: https://arxiv.org/abs/2509.04545 - Homepage: https://hunyuan-promptenhancer.github.io/ If you find this model useful, please consider citing:

NaNK
225
6