TheStageAI

34 models

| Model | Tags | Downloads | Likes |
|-------|------|-----------|-------|
| thewhisper-large-v3 | license:mit | 1,458 | 2 |
| thewhisper-large-v3-turbo | license:cc-by-4.0 | 1,314 | 21 |
| Elastic-whisper-large-v3-turbo | | 388 | 2 |
| Elastic-whisper-large-v3 | | 326 | 2 |
| neutts | | 320 | 0 |
| Elastic-Wan2.2-T2V-A14B-Diffusers | | 202 | 1 |
| Elastic-FLUX.1-schnell | | 198 | 4 |
| Elastic-FLUX.1-dev | | 168 | 3 |
| wake-word | | 125 | 0 |
| Elastic-Llama-3.1-8B-Instruct | base_model:meta-llama/Llama-3.1-8B-Instruct | 106 | 4 |
| silero-vad | | 86 | 0 |
| Elastic-Mistral-7B-Instruct-v0.3 | | 72 | 5 |
| Elastic-Qwen2.5-7B-Instruct | | 58 | 2 |
| embeddinggemma-300m | | 23 | 0 |
| Elastic-musicgen-large | license:apache-2.0 | 21 | 7 |
| speaker-segmentation | | 20 | 0 |
| Qwen2.5-1.5B-Instruct | | 19 | 0 |
| Elastic-DeepSeek-R1-Distill-Qwen-7B | license:apache-2.0 | 18 | 2 |

Elastic-stable-diffusion-3.5-large

Elastic model: the fastest self-serving models. Stable Diffusion 3.5 Large.

Elastic models are the models produced by TheStage AI ANNA: Automated Neural Networks Accelerator. ANNA allows you to control model size, latency, and quality with a simple slider movement. For each model, ANNA produces a series of optimized models:

- XL: mathematically equivalent neural network, optimized with our DNN compiler.
- S: the fastest model, with accuracy degradation of less than 2%.

Our goals:

- Provide the fastest models and service for self-hosting.
- Provide flexibility in cost vs. quality selection for inference.
- Provide clear quality and latency benchmarks.
- Provide the interface of the HF libraries, transformers and diffusers, with a single line of code.
- Provide models supported on a wide range of hardware, pre-compiled and requiring no JIT.

> It's important to note that the specific quality degradation can vary from model to model. For instance, an S model can show as little as 0.5% degradation.

Currently, our demo model supports resolutions from 512x512 to 1024x1024 and batch sizes 1-4. This will be updated in the near future. To infer our models, you just need to replace the `diffusers` import with `elasticmodels.diffusers`.

System requirements:

- GPUs: H100, B200
- CPU: AMD, Intel
- Python: 3.10-3.12

To work with our models, run the install lines in your terminal. Then go to app.thestage.ai, log in, and generate an API token from your profile page. Set up the API token in your environment.

Benchmarking is one of the most important procedures during model acceleration. We aim to provide clear performance metrics for models using our algorithms. For quality evaluation we used PSNR and SSIM, computed against the outputs of the original model.
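The install and token-setup lines the card refers to are not preserved in this page capture. The following is a hypothetical sketch only; the package name and the environment variable name are assumptions, not confirmed commands:

```shell
# HYPOTHETICAL: package name inferred from the `elasticmodels.diffusers` import
# path mentioned in the card; verify against the official docs before use.
pip install elasticmodels

# Log in at app.thestage.ai, create an API token on your profile page, then
# export it (variable name assumed, not confirmed):
export THESTAGE_API_TOKEN="<your-token>"
```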
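The quality evaluation described above can be sketched in pure Python. This is a minimal sketch assuming images flattened to equal-length pixel lists; note that the reference SSIM implementation averages the statistic over local 11x11 windows, so this single-window version is only an approximation:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB; inf for identical images."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(max_val * max_val / mse)

def ssim_global(ref, test, max_val=255.0):
    """Single-window SSIM over the whole image; identical inputs give 1.0."""
    n = len(ref)
    mu_x, mu_y = sum(ref) / n, sum(test) / n
    var_x = sum((r - mu_x) * (r - mu_x) for r in ref) / n
    var_y = sum((t - mu_y) * (t - mu_y) for t in test) / n
    cov = sum((r - mu_x) * (t - mu_y) for r, t in zip(ref, test)) / n
    # Standard SSIM stabilization constants.
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    )
```

For identical images PSNR is infinite and SSIM is 1.0, which matches the "Original" column of the quality table below.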
Quality metrics:

| Metric/Model | S | XL | Original |
|--------------|-------|-------|----------|
| PSNR | 20.78 | 29.13 | inf |
| SSIM | 0.81 | 0.95 | 1.0 |

Time in seconds to generate one 1024x1024 image:

| GPU/Model | S | XL | Original |
|-----------|------|------|----------|
| H100 | 3.10 | 3.80 | 6.55 |
| B200 | 1.76 | 2.27 | 4.81 |

Subscribe for updates: TheStageAI X

Contact email: [email protected]

| Model | Tags | Downloads | Likes |
|-------|------|-----------|-------|
| Elastic-stable-diffusion-3.5-large | license:apache-2.0 | 18 | 0 |
| Elastic-DeepSeek-R1-Distill-Qwen-14B | license:apache-2.0 | 17 | 2 |
| Elastic-MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS | license:apache-2.0 | 16 | 4 |
| yolo11-detection | | 16 | 0 |
| Elastic-Mistral-Nemo-Instruct-2407 | license:apache-2.0 | 13 | 1 |
| Elastic-DeepSeek-R1-Distill-Llama-8B | base_model:deepseek-ai/DeepSeek-R1-Distill-Llama-8B | 12 | 2 |
| Elastic-Qwen2.5-14B-Instruct | license:apache-2.0 | 12 | 1 |
| Elastic-Llama-3.2-1B-Instruct | base_model:meta-llama/Llama-3.2-1B-Instruct | 10 | 3 |
| Elastic-stable-diffusion-xl-base-1.0 | | 10 | 0 |
| Wan2.2-T2V-A14B | license:apache-2.0 | 10 | 0 |
| Elastic-mochi-1-preview | license:apache-2.0 | 9 | 2 |
| Elastic-Z-Image-Turbo | | 8 | 0 |
| Elastic-LTX-2 | | 7 | 0 |
| Elastic-Mistral-Small-3.1-24B-Instruct-2503 | license:apache-2.0 | 4 | 3 |
| whisper-medium | license:mit | 0 | 1 |
| mochi-preview | | 0 | 1 |