Step-Audio-EditX-AWQ-4bit
by stepfun-ai · Audio Model · 4B params · 2 downloads
Early-stage Edge AI: runs on mobile, laptop, or server (9GB+ RAM).
Quick Summary
AWQ 4-bit quantized build of Step-Audio-EditX, a speech model for zero-shot voice cloning and audio editing: emotion and style transfer, paralinguistic tags, denoising, voice activity detection (VAD), and speed control.
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 4GB+ RAM
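A quick way to check whether a machine meets the 4GB minimum before pulling the weights is a sketch like the following (it reads /proc/meminfo, so it is Linux-only):

```shell
#!/bin/sh
# Read total system memory from /proc/meminfo (Linux-only) and compare it
# against the 4GB+ minimum suggested above.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
total_gb=$((total_kb / 1024 / 1024))
echo "Total RAM: ${total_gb} GiB"
if [ "$total_gb" -lt 4 ]; then
    echo "Below the 4GB+ minimum recommended for this model" >&2
fi
```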
Code Examples
Local Inference Demo
# zero-shot cloning
# Prompt text (Chinese): "I always feel like someone is following me; I can hear strange footsteps."
# Generated text (Chinese): "Too bad there are no what-ifs; what has happened has happened."
# The path of the generated audio file is output/fear_zh_female_prompt_cloned.wav
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "我总觉得,有人在跟着我,我能听到奇怪的脚步声。" \
--prompt-audio "examples/fear_zh_female_prompt.wav" \
--generated-text "可惜没有如果,已经发生的事情终究是发生了。" \
--edit-type "clone" \
--output-dir ./output
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "His political stance was conservative, and he was particularly close to Margaret Thatcher." \
--prompt-audio "examples/zero_shot_en_prompt.wav" \
--generated-text "Underneath the courtyard is a large underground exhibition room which connects the two buildings. " \
--edit-type "clone" \
--output-dir ./output
# edit
# There will be one or multiple wave files corresponding to each edit iteration, for example: output/fear_zh_female_prompt_edited_iter1.wav, output/fear_zh_female_prompt_edited_iter2.wav, ...
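Because each edit run can emit several iteration files, a small helper like this can pick out the highest-numbered one (a sketch; the filenames follow the pattern described above, and the placeholder files stand in for what tts_infer.py would write):

```shell
#!/bin/sh
# Create placeholder files matching the documented naming pattern so the
# sketch is runnable without the model; in practice tts_infer.py writes them.
mkdir -p output
touch output/fear_zh_female_prompt_edited_iter1.wav \
      output/fear_zh_female_prompt_edited_iter2.wav \
      output/fear_zh_female_prompt_edited_iter3.wav

# Version-sort so e.g. iter10 ranks after iter2, then keep the last
# iteration, which has had the edit applied the most times.
last=$(ls output/*_edited_iter*.wav | sort -V | tail -n 1)
echo "Last iteration: $last"
```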
# emotion; fear
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "我总觉得,有人在跟着我,我能听到奇怪的脚步声。" \
--prompt-audio "examples/fear_zh_female_prompt.wav" \
--edit-type "emotion" \
--edit-info "fear" \
--output-dir ./output
# emotion; happy
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "You know, I just finished that big project and feel so relieved. Everything seems easier and more colorful, what a wonderful feeling!" \
--prompt-audio "examples/en_happy_prompt.wav" \
--edit-type "emotion" \
--edit-info "happy" \
--output-dir ./output
# style; whisper
# For the whisper style, set the edit iteration number greater than 1 for better results.
# Prompt text (Chinese): "For example, during work breaks, do some simple stretching exercises to relax your body; this will give you more energy."
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "比如在工作间隙,做一些简单的伸展运动,放松一下身体,这样,会让你更有精力." \
--prompt-audio "examples/whisper_prompt.wav" \
--edit-type "style" \
--edit-info "whisper" \
--output-dir ./output
# paralinguistic
# Supported tags: Breathing, Laughter, Surprise-oh, Confirmation-en, Uhm, Surprise-ah, Surprise-wa, Sigh, Question-ei, Dissatisfaction-hnn
# Prompt text (Chinese): "I think this plan is probably feasible, but it still needs more careful consideration." The generated text inserts [Uhm] before the second clause.
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "我觉得这个计划大概是可行的,不过还需要再仔细考虑一下。" \
--prompt-audio "examples/paralingustic_prompt.wav" \
--generated-text "我觉得这个计划大概是可行的,[Uhm]不过还需要再仔细考虑一下。" \
--edit-type "paralinguistic" \
--output-dir ./output
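Since a tag outside the supported list will not render, it can help to validate the tag before splicing it into --generated-text. A minimal sketch (the `tag` variable is a placeholder for whichever tag you intend to use):

```shell
#!/bin/sh
# Supported paralinguistic tags, per the list above.
supported="Breathing Laughter Surprise-oh Confirmation-en Uhm Surprise-ah Surprise-wa Sigh Question-ei Dissatisfaction-hnn"

tag="Uhm"   # the tag to be spliced into --generated-text as [Uhm]

# Match the tag as a whole word inside the space-delimited list.
case " $supported " in
    *" $tag "*) valid=1; echo "ok: [$tag] is a supported tag" ;;
    *)          valid=0; echo "error: [$tag] is not supported" >&2 ;;
esac
```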
# denoise
# Prompt text is not needed.
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-audio "examples/denoise_prompt.wav" \
--edit-type "denoise" \
--output-dir ./output
# vad
# Prompt text is not needed.
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-audio "examples/vad_prompt.wav" \
--edit-type "vad" \
--output-dir ./output
# speed
# supported edit-info: faster, slower, more faster, more slower
# Prompt text (Chinese): "Last time you said your shoes rubbed your feet a bit, so I bought you a pair of soft insoles."
python3 tts_infer.py \
--model-path where_you_download_dir \
--tokenizer-path where_you_download_dir \
--prompt-text "上次你说鞋子有点磨脚,我给你买了一双软软的鞋垫。" \
--prompt-audio "examples/speed_prompt.wav" \
--edit-type "speed" \
--edit-info "more faster" \
--output-dir ./output
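To compare all four speed settings at once, the command above can be wrapped in a loop over the supported edit-info values. This sketch echoes each command rather than running it, so it is safe to execute without the model weights:

```shell
#!/bin/sh
# Iterate over the supported speed values listed above; two of them contain
# a space, so each value is kept quoted as it is passed along.
for speed in faster slower "more faster" "more slower"; do
    echo python3 tts_infer.py \
        --model-path where_you_download_dir \
        --tokenizer-path where_you_download_dir \
        --prompt-audio "examples/speed_prompt.wav" \
        --edit-type speed \
        --edit-info "\"$speed\"" \
        --output-dir ./output
done
```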