stabilityai

✓ Verified · AI Startup

Creators of Stable Diffusion and Stable LM

111 models • 36 total models in database

sd-turbo

SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation. We release SD-Turbo as a research artifact, and to study small, distilled text-to-image models. For increased quality and prompt understanding, we recommend SDXL-Turbo.

Please note: For commercial use, please refer to https://stability.ai/license.

### Model Description

SD-Turbo is a distilled version of Stable Diffusion 2.1, trained for real-time synthesis. SD-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the technical report), which allows sampling large-scale foundational image diffusion models in 1 to 4 steps at high image quality. This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal, and combines this with an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.

- Developed by: Stability AI
- Funded by: Stability AI
- Model type: Generative text-to-image model
- Finetuned from model: Stable Diffusion 2.1

For research purposes, we recommend our `generative-models` GitHub repository (https://github.com/Stability-AI/generative-models), which implements the most popular diffusion frameworks (both training and inference).

- Repository: https://github.com/Stability-AI/generative-models
- Paper: https://stability.ai/research/adversarial-diffusion-distillation
- Demo (for the bigger SDXL-Turbo): http://clipdrop.co/stable-diffusion-turbo

### Evaluation

The charts above evaluate user preference for SD-Turbo over other single- and multi-step models. Evaluated at a single step, SD-Turbo is preferred by human voters in terms of image quality and prompt following over LCM-Lora XL and LCM-Lora 1.5. Note: For increased quality, we recommend the bigger version SDXL-Turbo. For details on the user study, we refer to the research paper.
The model is intended for both non-commercial and commercial usage. Possible research areas and tasks include:

- Research on generative models.
- Research on real-time applications of generative models.
- Research on the impact of real-time generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.

For commercial use, please refer to https://stability.ai/membership.

SD-Turbo does not make use of `guidance_scale` or `negative_prompt`; we disable it with `guidance_scale=0.0`. The model preferably generates images of size 512x512, but higher image sizes work as well. A single step is enough to generate high-quality images. When using SD-Turbo for image-to-image generation, make sure that `num_inference_steps * strength` is greater than or equal to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, e.g. 0.5 * 2.0 = 1 step in our example below.

### Out-of-Scope Use

The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model. The model should not be used in any way that violates Stability AI's Acceptable Use Policy.

### Limitations

- The quality and prompt alignment is lower than that of SDXL-Turbo.
- The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
- The model cannot render legible text.
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

Check out https://github.com/Stability-AI/generative-models
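The single-step sampling and image-to-image step rule above can be sketched with the 🧨 diffusers `AutoPipelineForText2Image` API. This is a minimal sketch, not the card's original snippet; the heavy imports and the pipeline call live inside `generate()` so the step-count helper works without downloading any weights.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Image-to-image runs for int(num_inference_steps * strength) steps,
    so make sure num_inference_steps * strength >= 1."""
    return int(num_inference_steps * strength)

def generate(prompt: str):
    # Heavy dependencies imported lazily; calling this downloads the SD-Turbo weights.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")
    # SD-Turbo ignores guidance and negative prompts: disable guidance
    # with guidance_scale=0.0 and sample in a single step.
    return pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]

# The card's image-to-image example: 2 steps at strength 0.5 -> 1 actual step.
print(effective_steps(2, 0.5))  # -> 1
```

`effective_steps` just spells out the card's rounding rule; the pipeline call itself follows the standard diffusers usage for this checkpoint.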

2,754,387
426

stable-diffusion-xl-base-1.0

---
license: openrail++
tags:
- text-to-image
- stable-diffusion
---

2,710,209
7,121

stable-diffusion-2-1

OpenRail license. Tags: stable-diffusion, text-to-image. Pinned: true.

652,497
4,037

stable-diffusion-2-base

644,755
357

stable-diffusion-3-medium-diffusers

--- license: other license_name: stabilityai-nc-research-community license_link: LICENSE tags: - text-to-image - stable-diffusion extra_gated_prompt: >- By clicking "Agree", you agree to the [License Agreement](https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/LICENSE) and acknowledge Stability AI's [Privacy Policy](https://stability.ai/privacy-policy). extra_gated_fields: Name: text Email: text Country: country Organization or Affiliation: text Receive email updates and pro

491,657
419

stable-diffusion-xl-refiner-1.0

---
license: openrail++
tags:
- stable-diffusion
- image-to-image
---

435,581
1,990

sdxl-turbo

---
pipeline_tag: text-to-image
inference: false
license: other
license_name: sai-nc-community
license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
---

407,843
2,479

sdxl-vae

---
license: mit
tags:
- stable-diffusion
- stable-diffusion-diffusers
inference: false
---

388,599
714

stable-video-diffusion-img2vid

---
pipeline_tag: image-to-video
license: other
license_name: stable-video-diffusion-community
license_link: LICENSE.md
---

381,214
989

stable-diffusion-2-1-base

287,210
695

stable-diffusion-2

217,836
1,922

stable-diffusion-3.5-large

--- license: other license_name: stabilityai-ai-community license_link: LICENSE.md tags: - text-to-image - stable-diffusion - diffusers inference: true extra_gated_prompt: >- By clicking "Agree", you agree to the [License Agreement](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md) and acknowledge Stability AI's [Privacy Policy](https://stability.ai/privacy-policy). extra_gated_fields: Name: text Email: text Country: country Organization or Affiliation: text Rec

207,412
3,207

stable-diffusion-3.5-medium

--- license: other license_name: stabilityai-ai-community license_link: LICENSE.md tags: - text-to-image - stable-diffusion - diffusers inference: true extra_gated_prompt: >- By clicking "Agree", you agree to the [License Agreement](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/LICENSE.md) and acknowledge Stability AI's [Privacy Policy](https://stability.ai/privacy-policy). extra_gated_fields: Name: text Email: text Country: country Organization or Affiliation: text Re

206,457
851

sd-vae-ft-mse

---
license: mit
tags:
- stable-diffusion
- stable-diffusion-diffusers
inference: false
---

175,698
395

stable-diffusion-2-inpainting

155,784
612

stable-video-diffusion-img2vid-xt

---
pipeline_tag: image-to-video
license: other
license_name: stable-video-diffusion-community
license_link: LICENSE.md
---

111,815
3,185

stable-audio-open-1.0

35,212
1,336

stable-virtual-camera

34,000
219

stable-video-diffusion-img2vid-xt-1-1

29,783
953

stable-diffusion-3.5-large-controlnet-depth

22,585
12

stable-diffusion-3.5-large-controlnet-canny

21,044
12

sd-vae-ft-ema

license:mit
20,868
132

stablelm-3b-4e1t

`StableLM-3B-4E1T` is a 3 billion parameter decoder-only language model pre-trained on 1 trillion tokens of diverse English and code datasets for 4 epochs. Get started generating text with `StableLM-3B-4E1T` by using the following code snippet:

### Model Details

- Developed by: Stability AI
- Model type: `StableLM-3B-4E1T` models are auto-regressive language models based on the transformer decoder architecture.
- Language(s): English
- Library: GPT-NeoX
- License: Model checkpoints are licensed under the Creative Commons license (CC BY-SA-4.0). Under this license, you must give credit to Stability AI, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests that Stability AI endorses you or your use.
- Contact: For questions and comments about the model, please email `[email protected]`

### Model Architecture

The model is a decoder-only transformer similar to the LLaMA (Touvron et al., 2023) architecture with the following modifications:

| Parameters | Hidden Size | Layers | Heads | Sequence Length |
|---------------|-------------|--------|-------|-----------------|
| 2,795,443,200 | 2560 | 32 | 32 | 4096 |

- Position Embeddings: Rotary Position Embeddings (Su et al., 2021) applied to the first 25% of head embedding dimensions for improved throughput following Black et al. (2022).
- Normalization: LayerNorm (Ba et al., 2016) with learned bias terms, as opposed to RMSNorm (Zhang & Sennrich, 2019).
- Tokenizer: GPT-NeoX (Black et al., 2022).

### Training

For complete dataset and training details, please see the StableLM-3B-4E1T Technical Report. The dataset is comprised of a filtered mixture of open-source large-scale datasets available on the HuggingFace Hub: Falcon RefinedWeb extract (Penedo et al., 2023), RedPajama-Data (Together Computer, 2023) and The Pile (Gao et al., 2020), both without the Books3 subset, and StarCoder (Li et al., 2023). Given the large amount of web data, we recommend fine-tuning the base StableLM-3B-4E1T for your downstream tasks.

The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW, and trained using the NeoX tokenizer with a vocabulary size of 50,257. We outline the complete hyperparameter choices in the project's GitHub repository (config).

- Hardware: `StableLM-3B-4E1T` was trained on the Stability AI cluster across 256 NVIDIA A100 40GB GPUs (AWS P4d instances). Training began on August 23, 2023, and took approximately 30 days to complete.
- Software: We use a fork of `gpt-neox` (EleutherAI, 2021), train under 2D parallelism (Data and Tensor Parallel) with ZeRO-1 (Rajbhandari et al., 2019), and rely on flash-attention as well as SwiGLU and Rotary Embedding kernels from FlashAttention-2 (Dao et al., 2023).

### Use and Limitations

The model is intended to be used as a foundational base model for application-specific fine-tuning. Developers must evaluate and fine-tune the model for safe performance in downstream applications.

### Limitations and Bias

As a base model, this model may exhibit unreliable, unsafe, or other undesirable behaviors that must be corrected through evaluation and fine-tuning prior to deployment. The pre-training dataset may have contained offensive or inappropriate content, even after applying data cleansing filters, which can be reflected in the model-generated text. We recommend that users exercise caution when using these models in production systems. Do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.

### Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric | Value |
|---------------------------------|------:|
| Avg. | 46.58 |
| AI2 Reasoning Challenge (25-Shot) | 46.59 |
| HellaSwag (10-Shot) | 75.94 |
| MMLU (5-Shot) | 45.23 |
| TruthfulQA (0-shot) | 37.20 |
| Winogrande (5-shot) | 71.19 |
| GSM8k (5-shot) | 3.34 |
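The "Get started" snippet the card refers to did not survive extraction. Below is a minimal sketch of the standard `transformers` usage for this checkpoint (the model id comes from the card; the sampling parameters are illustrative), plus a small helper spelling out the rotary-embedding fraction from the architecture notes. The heavy imports live inside `generate()` so the helper runs without downloading weights.

```python
def rotary_dims(hidden_size: int, num_heads: int, fraction: float = 0.25) -> int:
    """Rotary embeddings are applied to the first 25% of each head's dimensions."""
    return int((hidden_size // num_heads) * fraction)

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Heavy dependencies imported lazily; calling this downloads ~3B parameters.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
    model = AutoModelForCausalLM.from_pretrained(
        "stabilityai/stablelm-3b-4e1t", torch_dtype=torch.bfloat16
    )
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                         do_sample=True, temperature=0.75, top_p=0.95)
    return tok.decode(out[0], skip_special_tokens=True)

# Hidden size 2560 over 32 heads -> 80-dim heads, rotary on the first 20 dims.
print(rotary_dims(2560, 32))  # -> 20
```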

license:cc-by-sa-4.0
16,556
311

stable-diffusion-x4-upscaler

Stable Diffusion x4 upscaler model card

This model card focuses on the model associated with the Stable Diffusion Upscaler, available here. This model is trained for 1.25M steps on a 10M subset of LAION containing images `>2048x2048`. The model was trained on crops of size `512x512` and is a text-guided latent upscaling diffusion model. In addition to the textual input, it receives a `noise_level` as an input parameter, which can be used to add noise to the low-resolution input according to a predefined diffusion schedule.

- Use it with the `stablediffusion` repository: download the `x4-upscaler-ema.ckpt` here.
- Use it with 🧨 `diffusers`

### Model Details

- Developed by: Robin Rombach, Patrick Esser
- Model type: Diffusion-based text-to-image generation model
- Language(s): English
- License: CreativeML Open RAIL++-M License
- Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (OpenCLIP-ViT/H).
- Resources for more information: GitHub Repository.
- Cite as:

      @InProceedings{Rombach_2022_CVPR,
          author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
          title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
          booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
          month     = {June},
          year      = {2022},
          pages     = {10684-10695}
      }

Use the 🤗 Diffusers library to run Stable Diffusion 2 in a simple and efficient manner.

Notes:

- Despite not being a dependency, we highly recommend you install xformers for memory-efficient attention (better performance).
- If you have low GPU RAM available, make sure to add `pipe.enable_attention_slicing()` after sending the pipeline to `cuda` for less VRAM usage (at the cost of speed).

### Direct Use

The model is intended for research purposes only.
Possible research areas and tasks include:

- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.

### Misuse, Malicious Use, and Out-of-Scope Use

Note: This section is originally taken from the DALLE-MINI model card, was used for Stable Diffusion v1, but applies in the same way to Stable Diffusion v2.

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.

#### Out-of-Scope Use

The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.

#### Misuse and Malicious Use

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating demeaning, dehumanizing, or otherwise harmful representations of people or their environments, cultures, religions, etc.
- Intentionally promoting or propagating discriminatory content or harmful stereotypes.
- Impersonating individuals without their consent.
- Sexual content without consent of the people who might see it.
- Mis- and disinformation
- Representations of egregious violence and gore
- Sharing of copyrighted or licensed material in violation of its terms of use.
- Sharing content that is an alteration of copyrighted or licensed material in violation of its terms of use.
### Limitations

- The model does not achieve perfect photorealism
- The model cannot render legible text
- The model does not perform well on more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”
- Faces and people in general may not be generated properly.
- The model was trained mainly with English captions and will not work as well in other languages.
- The autoencoding part of the model is lossy
- The model was trained on a subset of the large-scale dataset LAION-5B, which contains adult, violent and sexual content. To partially mitigate this, we have filtered the dataset using LAION's NSFW detector (see Training section).

### Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases. Stable Diffusion v2 was primarily trained on subsets of LAION-2B(en), which consists of images that are limited to English descriptions. Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for. This affects the overall output of the model, as white and western cultures are often set as the default. Further, the ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts. Stable Diffusion v2 mirrors and exacerbates biases to such a degree that viewer discretion must be advised irrespective of the input or its intent.

### Training Data

The model developers used the following dataset for training the model:

- LAION-5B and subsets (details below). The training data is further filtered using LAION's NSFW detector, with a "punsafe" score of 0.1 (conservative). For more details, please refer to LAION-5B's NeurIPS 2022 paper and reviewer discussions on the topic.

### Training Procedure

Stable Diffusion v2 is a latent diffusion model which combines an autoencoder with a diffusion model that is trained in the latent space of the autoencoder.
During training:

- Images are encoded through an encoder, which turns images into latent representations. The autoencoder uses a relative downsampling factor of 8 and maps images of shape H x W x 3 to latents of shape H/f x W/f x 4.
- Text prompts are encoded through the OpenCLIP-ViT/H text-encoder.
- The output of the text encoder is fed into the UNet backbone of the latent diffusion model via cross-attention.
- The loss is a reconstruction objective between the noise that was added to the latent and the prediction made by the UNet. We also use the so-called v-objective, see https://arxiv.org/abs/2202.00512.

The checkpoints were trained as follows:

- `512-base-ema.ckpt`: 550k steps at resolution `256x256` on a subset of LAION-5B filtered for explicit pornographic material, using the LAION-NSFW classifier with `punsafe=0.1` and an aesthetic score >= `4.5`. 850k steps at resolution `512x512` on the same dataset with resolution `>= 512x512`.
- `768-v-ema.ckpt`: Resumed from `512-base-ema.ckpt` and trained for 150k steps using a v-objective on the same dataset. Resumed for another 140k steps on a `768x768` subset of our dataset.
- `512-depth-ema.ckpt`: Resumed from `512-base-ema.ckpt` and finetuned for 200k steps. Added an extra input channel to process the (relative) depth prediction produced by MiDaS (`dpt_hybrid`), which is used as an additional conditioning. The additional input channels of the U-Net which process this extra information were zero-initialized.
- `512-inpainting-ema.ckpt`: Resumed from `512-base-ema.ckpt` and trained for another 200k steps. Follows the mask-generation strategy presented in LAMA which, in combination with the latent VAE representations of the masked image, is used as an additional conditioning. The additional input channels of the U-Net which process this extra information were zero-initialized. The same strategy was used to train the 1.5-inpainting checkpoint.
- `x4-upscaling-ema.ckpt`: Trained for 1.25M steps on a 10M subset of LAION containing images `>2048x2048`.
The model was trained on crops of size `512x512` and is a text-guided latent upscaling diffusion model. In addition to the textual input, it receives a `noise_level` as an input parameter, which can be used to add noise to the low-resolution input according to a predefined diffusion schedule.

- Hardware: 32 x 8 x A100 GPUs
- Optimizer: AdamW
- Gradient Accumulations: 1
- Batch: 32 x 8 x 2 x 4 = 2048
- Learning rate: warmup to 0.0001 for 10,000 steps and then kept constant

### Evaluation Results

Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0) and 50 DDIM sampling steps show the relative improvements of the checkpoints. Evaluated using 50 DDIM steps and 10,000 random prompts from the COCO2017 validation set, evaluated at 512x512 resolution. Not optimized for FID scores.

### Environmental Impact

Stable Diffusion v1 Estimated Emissions: Based on that information, we estimate the following CO2 emissions using the Machine Learning Impact calculator presented in Lacoste et al. (2019). The hardware, runtime, cloud provider, and compute region were utilized to estimate the carbon impact.

- Hardware Type: A100 PCIe 40GB
- Hours used: 200,000
- Cloud Provider: AWS
- Compute Region: US-east
- Carbon Emitted (Power consumption x Time x Carbon produced based on location of power grid): 15,000 kg CO2 eq.

### Citation

    @InProceedings{Rombach_2022_CVPR,
        author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
        title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
        booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
        month     = {June},
        year      = {2022},
        pages     = {10684-10695}
    }

This model card was written by: Robin Rombach, Patrick Esser and David Ha, and is based on the Stable Diffusion v1 and DALL-E Mini model card.
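The upscaler usage described in this card (a text prompt, a low-resolution image, and the `noise_level` input) can be sketched with diffusers' `StableDiffusionUpscalePipeline`. This is a hedged sketch following the standard diffusers usage for this checkpoint; `noise_level=20` is an illustrative value, and the heavy imports live inside `upscale()` so the size helper runs without downloading weights.

```python
def upscaled_size(width: int, height: int, factor: int = 4) -> tuple:
    """The x4 upscaler multiplies each spatial dimension by 4."""
    return (width * factor, height * factor)

def upscale(prompt: str, low_res_img):
    # Heavy dependencies imported lazily; calling this downloads the upscaler weights.
    import torch
    from diffusers import StableDiffusionUpscalePipeline

    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
    ).to("cuda")
    pipe.enable_attention_slicing()  # lower VRAM usage, at the cost of speed
    # noise_level adds noise to the low-res input according to the
    # predefined diffusion schedule described in the card.
    return pipe(prompt=prompt, image=low_res_img, noise_level=20).images[0]

# A 128x128 input comes back at 512x512.
print(upscaled_size(128, 128))  # -> (512, 512)
```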

16,265
711

TripoSR

> Try our new model: SF3D, with several improvements such as faster generation and more game-ready assets. The model is available here, and we also have a demo.

TripoSR is a fast and feed-forward 3D generative model developed in collaboration between Stability AI and Tripo AI. We closely follow the LRM network architecture for the model design, and TripoSR incorporates a series of technical advancements over the LRM model in terms of both data curation as well as model and training improvements. For more technical details and evaluations, please refer to our tech report.

- Developed by: Stability AI, Tripo AI
- Model type: Feed-forward 3D reconstruction from a single image
- License: MIT
- Hardware: We train `TripoSR` for 5 days on 22 GPU nodes, each with 8 A100 40GB GPUs
- Repository: https://github.com/VAST-AI-Research/TripoSR
- Tech report: https://arxiv.org/abs/2403.02151
- Demo: https://huggingface.co/spaces/stabilityai/TripoSR

We use renders from the Objaverse dataset, utilizing our enhanced rendering method that more closely replicates the distribution of images found in the real world, significantly improving our model's ability to generalize. We selected a carefully curated subset of the Objaverse dataset for the training data, which is available under the CC-BY license. For usage instructions, please refer to our TripoSR GitHub repository.

The model should not be used to intentionally create or disseminate 3D models that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.

license:mit
15,708
567

stable-diffusion-3.5-large-turbo

14,450
642

stable-cascade

This model is built upon the Würstchen architecture, and its main difference to other models like Stable Diffusion is that it works in a much smaller latent space. Why is this important? The smaller the latent space, the faster you can run inference and the cheaper the training becomes. How small is the latent space? Stable Diffusion uses a compression factor of 8, resulting in a 1024x1024 image being encoded to 128x128. Stable Cascade achieves a compression factor of 42, meaning that it is possible to encode a 1024x1024 image to 24x24 while maintaining crisp reconstructions. The text-conditional model is then trained in the highly compressed latent space. Previous versions of this architecture achieved a 16x cost reduction over Stable Diffusion 1.5. Therefore, this kind of model is well suited for usages where efficiency is important. Furthermore, all known extensions like finetuning, LoRA, ControlNet, IP-Adapter, LCM etc. are possible with this method as well.

Stable Cascade is a diffusion model trained to generate images given a text prompt.

- Developed by: Stability AI
- Funded by: Stability AI
- Model type: Generative text-to-image model

For research purposes, we recommend our `StableCascade` Github repository (https://github.com/Stability-AI/StableCascade).

- Repository: https://github.com/Stability-AI/StableCascade
- Paper: https://openreview.net/forum?id=gU58d5QeGv

### Model Overview

Stable Cascade consists of three models: Stage A, Stage B and Stage C, representing a cascade to generate images, hence the name "Stable Cascade". Stages A & B are used to compress images, similar to the job of the VAE in Stable Diffusion. However, with this setup, a much higher compression of images can be achieved. While the Stable Diffusion models use a spatial compression factor of 8, encoding an image with resolution of 1024 x 1024 to 128 x 128, Stable Cascade achieves a compression factor of 42.
This encodes a 1024 x 1024 image to 24 x 24, while being able to accurately decode the image. This comes with the great benefit of cheaper training and inference. Furthermore, Stage C is responsible for generating the small 24 x 24 latents given a text prompt. The following picture shows this visually.

For this release, we are providing two checkpoints for Stage C, two for Stage B and one for Stage A. Stage C comes in a 1 billion and a 3.6 billion parameter version, but we highly recommend using the 3.6 billion version, as most work was put into its finetuning. The two versions for Stage B amount to 700 million and 1.5 billion parameters. Both achieve great results, however the 1.5 billion excels at reconstructing small and fine details. Therefore, you will achieve the best results if you use the larger variant of each. Lastly, Stage A contains 20 million parameters and is fixed due to its small size.

According to our evaluation, Stable Cascade performs best in both prompt alignment and aesthetic quality in almost all comparisons. The above picture shows the results from a human evaluation using a mix of parti-prompts (link) and aesthetic prompts. Specifically, Stable Cascade (30 inference steps) was compared against Playground v2 (50 inference steps), SDXL (50 inference steps), SDXL Turbo (1 inference step) and Würstchen v2 (30 inference steps).

Note: In order to use the `torch.bfloat16` data type with the `StableCascadeDecoderPipeline` you need to have PyTorch 2.2.0 or higher installed. This also means that using the `StableCascadeCombinedPipeline` with `torch.bfloat16` requires PyTorch 2.2.0 or higher, since it calls the `StableCascadeDecoderPipeline` internally. If it is not possible to install PyTorch 2.2.0 or higher in your environment, the `StableCascadeDecoderPipeline` can be used on its own with the `torch.float16` data type. You can download the full precision or bf16 variant weights for the pipeline and cast the weights to `torch.float16`.
Using the Lite Version of the Stage B and Stage C models: loading the original-format checkpoints is supported via the `from_single_file` method in the `StableCascadeUNet`.

The model is intended for research purposes for now. Possible research areas and tasks include:

- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.

The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model. The model should not be used in any way that violates Stability AI's Acceptable Use Policy.

### Limitations

- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

Check out https://github.com/Stability-AI/StableCascade
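The three-stage setup above can be sketched with diffusers' Stable Cascade pipelines: Stage C behind `StableCascadePriorPipeline`, Stages B and A behind `StableCascadeDecoderPipeline`. The repository ids and the two-stage call follow standard diffusers usage; treat the guidance and step values as illustrative. A small helper spells out the compression arithmetic from the card, and the heavy imports live inside `generate()` so the helper runs without downloading weights.

```python
def latent_size(image_size: int, compression_factor: int = 42) -> int:
    """Stable Cascade compresses each spatial dimension by a factor of 42."""
    return image_size // compression_factor

def generate(prompt: str):
    # Heavy dependencies imported lazily; calling this downloads both stages' weights.
    import torch
    from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

    prior = StableCascadePriorPipeline.from_pretrained(
        "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
    ).to("cuda")
    decoder = StableCascadeDecoderPipeline.from_pretrained(
        "stabilityai/stable-cascade", torch_dtype=torch.float16
    ).to("cuda")

    # Stage C generates the small 24x24 latents from the text prompt...
    prior_out = prior(prompt=prompt, guidance_scale=4.0, num_inference_steps=20)
    # ...and Stages B and A decode them back into a full-resolution image.
    return decoder(
        image_embeddings=prior_out.image_embeddings.to(torch.float16),
        prompt=prompt, guidance_scale=0.0, num_inference_steps=10,
    ).images[0]

# A 1024x1024 image is encoded to 24x24 latents.
print(latent_size(1024))  # -> 24
```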

12,985
1,321

sd-x2-latent-upscaler

This model card focuses on the latent diffusion-based upscaler developed by Katherine Crowson in collaboration with Stability AI. This model was trained on a high-resolution subset of the LAION-2B dataset. It is a diffusion model that operates in the same latent space as the Stable Diffusion model, which is decoded into a full-resolution image. To use it with Stable Diffusion, you can take the generated latent from Stable Diffusion and pass it into the upscaler before decoding with your standard VAE. Or you can take any image, encode it into the latent space, use the upscaler, and decode it.

Note: This upscaling model is designed explicitly for Stable Diffusion, as it can upscale Stable Diffusion's latent denoised image embeddings. This allows for very fast text-to-image + upscaling pipelines, as all intermediate states can be kept on the GPU. For more information, see the example below. This model works on all Stable Diffusion checkpoints.

(Figure: Image by Tanishq Abraham from Stability AI, originating from this tweet — original output image vs. 2x upscaled output image.)

### Model Details

- Developed by: Katherine Crowson
- Model type: Diffusion-based latent upscaler
- Language(s): English
- License: CreativeML Open RAIL++-M License

Use the 🤗 Diffusers library to run the latent upscaler on top of any `StableDiffusionUpscalePipeline` checkpoint to enhance its output image resolution by a factor of 2.

Notes:

- Despite not being a dependency, we highly recommend you install xformers for memory-efficient attention (better performance).
- If you have low GPU RAM available, make sure to add `pipe.enable_attention_slicing()` after sending the pipeline to `cuda` for less VRAM usage (at the cost of speed).

### Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.

### Misuse, Malicious Use, and Out-of-Scope Use

Note: This section is originally taken from the DALLE-MINI model card, was used for Stable Diffusion v1, but applies in the same way to Stable Diffusion v2.

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.

#### Out-of-Scope Use

The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.

#### Misuse and Malicious Use

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating demeaning, dehumanizing, or otherwise harmful representations of people or their environments, cultures, religions, etc.
- Intentionally promoting or propagating discriminatory content or harmful stereotypes.
- Impersonating individuals without their consent.
- Sexual content without consent of the people who might see it.
- Mis- and disinformation
- Representations of egregious violence and gore
- Sharing of copyrighted or licensed material in violation of its terms of use.
- Sharing content that is an alteration of copyrighted or licensed material in violation of its terms of use.
### Limitations

- The model does not achieve perfect photorealism
- The model cannot render legible text
- The model does not perform well on more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”
- Faces and people in general may not be generated properly.
- The model was trained mainly with English captions and will not work as well in other languages.
- The autoencoding part of the model is lossy
- The model was trained on a subset of the large-scale dataset LAION-5B, which contains adult, violent and sexual content. To partially mitigate this, we have filtered the dataset using LAION's NSFW detector (see Training section).

### Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases. Stable Diffusion v2 was primarily trained on subsets of LAION-2B(en), which consists of images that are limited to English descriptions. Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for. This affects the overall output of the model, as white and western cultures are often set as the default. Further, the ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts. Stable Diffusion v2 mirrors and exacerbates biases to such a degree that viewer discretion must be advised irrespective of the input or its intent.
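The latent hand-off this card describes (generate with `output_type="latent"`, upscale, then let the upscaler's VAE decode) can be sketched with diffusers. This is a hedged sketch: the base checkpoint chosen here is arbitrary, since the card notes the upscaler works on all Stable Diffusion checkpoints, and the heavy imports live inside the function so the size helper runs without downloading weights.

```python
def upscaled_size(width: int, height: int) -> tuple:
    """The latent upscaler doubles each spatial dimension."""
    return (width * 2, height * 2)

def generate_and_upscale(prompt: str):
    # Heavy dependencies imported lazily; calling this downloads two sets of weights.
    import torch
    from diffusers import (StableDiffusionLatentUpscalePipeline,
                           StableDiffusionPipeline)

    base = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
    ).to("cuda")
    upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
        "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
    ).to("cuda")

    # Keep the denoised latents on the GPU instead of decoding them...
    low_res_latents = base(prompt, output_type="latent").images
    # ...and feed them straight to the upscaler before the final VAE decode.
    return upscaler(
        prompt=prompt, image=low_res_latents,
        num_inference_steps=20, guidance_scale=0,
    ).images[0]

# A 512x512 generation decodes to 1024x1024 after upscaling.
print(upscaled_size(512, 512))  # -> (1024, 1024)
```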

12,261
187

stable-diffusion-3-medium

12,027
4,866

stablelm-zephyr-3b

11,356
258

stable-code-3b

dataset:bigcode/commitpackft
8,018
657

stable-diffusion-3.5-large-tensorrt

---
pipeline_tag: text-to-image
inference: false
library_name: tensorrt
license: other
license_name: stabilityai-ai-community
license_link: LICENSE.md
tags:
- tensorrt
- sd3.5-large
- text-to-image
- onnx
- model-optimizer
- fp8
- quantization
extra_gated_prompt: >-
  By clicking "Agree", you agree to the [License Agreement](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md) and acknowledge Stability AI's [Privacy Policy](https://stability.ai/privacy-policy).
extra

5,571
43

stablelm-2-1_6b

Please note: for commercial use, please refer to https://stability.ai/license. `Stable LM 2 1.6B` is a 1.6 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs. Get started generating text with `Stable LM 2 1.6B` via the Hugging Face `transformers` library. Developed by: Stability AI Model type: `Stable LM 2 1.6B` models are auto-regressive language models based on the transformer decoder architecture. Language(s): English Paper: Stable LM 2 1.6B Technical Report Library: GPT-NeoX License: Stability AI Community License. Commercial License: to use this model commercially, please refer to https://stability.ai/license Contact: For questions and comments about the model, please email `[email protected]` The model is a decoder-only transformer similar to the LLaMA (Touvron et al., 2023) architecture with the following modifications:

| Parameters | Hidden Size | Layers | Heads | Sequence Length |
|----------------|-------------|--------|-------|-----------------|
| 1,644,417,024 | 2048 | 24 | 32 | 4096 |

Position Embeddings: Rotary Position Embeddings (Su et al., 2021) applied to the first 25% of head embedding dimensions for improved throughput following Black et al. (2022). Normalization: LayerNorm (Ba et al., 2016) with learned bias terms, as opposed to RMSNorm (Zhang & Sennrich, 2019). Biases: We remove all bias terms from the feed-forward networks and multi-head self-attention layers, except for the biases of the query, key, and value projections (Bai et al., 2023). Tokenizer: We use Arcade100k, a BPE tokenizer extended from OpenAI's `tiktoken.cl100k_base`. We split digits into individual tokens following findings by Liu & Low (2023).
The dataset is comprised of a filtered mixture of open-source large-scale datasets available on the HuggingFace Hub: Falcon RefinedWeb extract (Penedo et al., 2023), RedPajama-Data (Together Computer, 2023) and The Pile (Gao et al., 2020), both without the Books3 subset, and StarCoder (Li et al., 2023). We further supplement our training with multi-lingual data from CulturaX (Nguyen et al., 2023) and, in particular, from its OSCAR corpora, as well as restructured data in the style of Yuan & Liu (2022). Given the large amount of web data, we recommend fine-tuning the base `Stable LM 2 1.6B` for your downstream tasks. The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW, and trained using the Arcade100k tokenizer with a vocabulary size of 100,352. We outline the complete hyperparameter choices in the project's GitHub repository config. The final checkpoint of pre-training, before cooldown, is provided in the `global_step420000` branch. Hardware: `Stable LM 2 1.6B` was trained on the Stability AI cluster across 512 NVIDIA A100 40GB GPUs (AWS P4d instances). Software: We use a fork of `gpt-neox` (EleutherAI, 2021), train under 2D parallelism (Data and Tensor Parallel) with ZeRO-1 (Rajbhandari et al., 2019), and rely on flash-attention as well as SwiGLU and Rotary Embedding kernels from FlashAttention-2 (Dao et al., 2023). The model is intended to be used as a foundational base model for application-specific fine-tuning. Developers must evaluate and fine-tune the model for safe performance in downstream applications. For commercial use, please refer to https://stability.ai/membership. Limitations and Bias: As a base model, this model may exhibit unreliable, unsafe, or other undesirable behaviors that must be corrected through evaluation and fine-tuning prior to deployment.
The pre-training dataset may have contained offensive or inappropriate content, even after applying data cleansing filters, which can be reflected in the model-generated text. We recommend that users exercise caution when using these models in production systems. Do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.
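The card above mentions getting started with a code snippet, but the snippet itself did not survive extraction. The following is a minimal sketch of base-model text generation with `transformers`; the sampling parameters are illustrative assumptions, and the imports are deferred into the function so the sketch can be loaded without `torch` installed.

```python
# Hedged sketch: text generation with Stable LM 2 1.6B via Hugging Face
# transformers. Running generate() requires torch + transformers (and
# ideally a GPU); the imports are deferred so the module itself is light.

MODEL_ID = "stabilityai/stablelm-2-1_6b"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    # Sampling settings are illustrative, not the card's official values.
    tokens = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    return tokenizer.decode(tokens[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The weather is always wonderful in"))
```

As a base model without instruction tuning, it continues the given text rather than answering questions; fine-tune it for downstream tasks as the card recommends.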

4,610
192

stablelm-2-zephyr-1_6b

`Stable LM 2 Zephyr 1.6B` is a 1.6 billion parameter instruction tuned language model inspired by HuggingFaceH4's Zephyr 7B training pipeline. The model is trained on a mix of publicly available datasets and synthetic datasets, utilizing Direct Preference Optimization (DPO). `StableLM 2 Zephyr 1.6B` uses a ChatML-style instruction format, which is also available through the tokenizer's `apply_chat_template` method. Developed by: Stability AI Model type: `StableLM 2 Zephyr 1.6B` model is an auto-regressive language model based on the transformer decoder architecture. Language(s): English Paper: Stable LM 2 1.6B Technical Report Library: Alignment Handbook Finetuned from model: https://huggingface.co/stabilityai/stablelm-2-1_6b License: StabilityAI Non-Commercial Research Community License. If you want to use this model for your commercial products or purposes, please contact us here to learn more. Contact: For questions and comments about the model, please email `[email protected]` The dataset is comprised of a mixture of open large-scale datasets available on the HuggingFace Hub:

1. SFT Datasets
- HuggingFaceH4/ultrachat_200k
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Open-Orca/SlimOrca
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0

2. Preference Datasets
- allenai/ultrafeedback_binarized_cleaned
- Intel/orca_dpo_pairs

| Model | Size | MT-Bench |
|-------------------------|------|----------|
| Mistral-7B-Instruct-v0.2| 7B | 7.61 |
| Llama2-Chat | 70B | 6.86 |
| stablelm-zephyr-3b | 3B | 6.64 |
| MPT-30B-Chat | 30B | 6.39 |
| stablelm-2-zephyr-1.6b | 1.6B | 5.42 |
| Falcon-40B-Instruct | 40B | 5.17 |
| Qwen-1.8B-Chat | 1.8B | 4.95 |
| dolphin-2.6-phi-2 | 2.7B | 4.93 |
| phi-2 | 2.7B | 4.29 |
| TinyLlama-1.1B-Chat-v1.0| 1.1B | 3.46 |

| Model | Size | Average | ARC Challenge (acc_norm) | HellaSwag (acc_norm) | MMLU (acc_norm) | TruthfulQA (mc2) | Winogrande (acc) | GSM8K (acc) |
|----------------------------------------|------|---------|--------------------------|----------------------|-----------------|------------------|------------------|-------------|
| microsoft/phi-2 | 2.7B | 61.32% | 61.09% | 75.11% | 58.11% | 44.47% | 74.35% | 54.81% |
| stabilityai/stablelm-2-zephyr-1_6b | 1.6B | 49.89% | 43.69% | 69.34% | 41.85% | 45.21% | 64.09% | 35.18% |
| microsoft/phi-1_5 | 1.3B | 47.69% | 52.90% | 63.79% | 43.89% | 40.89% | 72.22% | 12.43% |
| stabilityai/stablelm-2-1_6b | 1.6B | 45.54% | 43.43% | 70.49% | 38.93% | 36.65% | 65.90% | 17.82% |
| mosaicml/mpt-7b | 7B | 44.28% | 47.70% | 77.57% | 30.80% | 33.40% | 72.14% | 4.02% |
| KnutJaegersberg/Qwen-1_8B-Llamaified | 1.8B | 44.75% | 37.71% | 58.87% | 46.37% | 39.41% | 61.72% | 24.41% |
| openlm-research/open_llama_3b_v2 | 3B | 40.28% | 40.27% | 71.60% | 27.12% | 34.78% | 67.01% | 0.91% |
| tiiuae/falcon-rw-1b | 1B | 37.07% | 35.07% | 63.56% | 25.28% | 35.96% | 62.04% | 0.53% |
| TinyLlama/TinyLlama-1.1B-3T | 1.1B | 36.40% | 33.79% | 60.31% | 26.04% | 37.32% | 59.51% | 1.44% |

Hardware: `StableLM 2 Zephyr 1.6B` was trained on the Stability AI cluster across 8 nodes with 8 NVIDIA A100 80GB GPUs per node. Code Base: We used our internal scripts for the SFT steps and the HuggingFace Alignment Handbook scripts for DPO training.
The model is intended to be used in chat-like applications. Developers must evaluate the model for safety performance in their specific use case. Read more about safety and limitations below. Limitations and Bias: This model is not trained against adversarial inputs. We strongly recommend pairing this model with an input and output classifier to prevent harmful responses. Through our internal red teaming, we discovered that while the model will not output harmful information if not prompted to do so, it will hallucinate many facts. It is also willing to output potentially harmful content or misinformation when the user requests it. Using this model will require guardrails around your inputs and outputs to ensure that any outputs returned are not misinformation or harmful. Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model. Finally, do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.
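The ChatML-style turn layout this card describes can be sketched in pure Python. In practice, prefer the tokenizer's `apply_chat_template`, which is authoritative; this offline version only illustrates the documented `<|user|>` / `<|assistant|>` / `<|endoftext|>` layout.

```python
# Minimal sketch of the StableLM 2 Zephyr chat format described above.
# For real inference, use tokenizer.apply_chat_template instead; this
# pure-Python version is for illustration only.

def format_zephyr_prompt(messages, add_generation_prompt=True):
    """Render a list of {'role', 'content'} dicts as Zephyr-style ChatML."""
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|endoftext|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model completes it.
        parts.append("<|assistant|>\n")
    return "".join(parts)

if __name__ == "__main__":
    print(format_zephyr_prompt(
        [{"role": "user", "content": "Which famous math number begins with 1.6?"}]
    ))
```

Passing the rendered string through the model's tokenizer then yields the input IDs for generation.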

4,503
186

sdxl-turbo-ryzen-ai

4,438
10

stable-fast-3d

3,615
704

stablelm-tuned-alpha-3b

license:cc-by-nc-sa-4.0
2,838
110

stable-diffusion-2-1-unclip

2,793
291

stable-code-instruct-3b

2,710
177

stable-diffusion-2-depth

2,218
391

stablelm-base-alpha-7b-v2

license:cc-by-sa-4.0
1,970
45

stable-cascade-prior

1,928
30

stablelm-tuned-alpha-7b

`StableLM-Tuned-Alpha` is a suite of 3B and 7B parameter decoder-only language models built on top of the `StableLM-Base-Alpha` models and further fine-tuned on various chat and instruction-following datasets. Get started chatting with `StableLM-Tuned-Alpha` using the Hugging Face `transformers` library. StableLM Tuned should be used with prompts formatted as `<|SYSTEM|>...<|USER|>...<|ASSISTANT|>...`, where the system prompt is the `# StableLM Tuned (Alpha version)` preamble describing the assistant as helpful and harmless. Developed by: Stability AI Model type: StableLM-Tuned-Alpha models are auto-regressive language models based on the NeoX transformer architecture. Language(s): English Library: HuggingFace Transformers License: Fine-tuned checkpoints (`StableLM-Tuned-Alpha`) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca. Contact: For questions and comments about the model, please email `[email protected]`

| Parameters | Hidden Size | Layers | Heads | Sequence Length |
|------------|-------------|--------|-------|-----------------|
| 3B | 4096 | 16 | 32 | 4096 |
| 7B | 6144 | 16 | 48 | 4096 |

`StableLM-Tuned-Alpha` models are fine-tuned on a combination of five datasets: Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's `text-davinci-003` engine; GPT4All Prompt Generations, which consists of 400k prompts and responses generated by GPT-4; Anthropic HH, made up of preferences about AI assistant helpfulness and harmlessness; DataBricks Dolly, comprising 15k instructions/responses generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization; and ShareGPT Vicuna (English subset), a dataset of conversations retrieved from ShareGPT. Models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (FP16), and optimized with AdamW.
We outline the following hyperparameters:

| Parameters | Batch Size | Learning Rate | Warm-up | Weight Decay | Betas |
|------------|------------|---------------|---------|--------------|-------------|
| 3B | 256 | 2e-5 | 50 | 0.01 | (0.9, 0.99) |
| 7B | 128 | 2e-5 | 100 | 0.01 | (0.9, 0.99) |

These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license. Although the aforementioned datasets help to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly. This work would not have been possible without the helpful hand of Dakota Mahan (@dmayhem93).
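The `<|SYSTEM|>` / `<|USER|>` / `<|ASSISTANT|>` prompt layout mentioned in the card can be sketched as a small builder. The system prompt text below is reproduced from the published StableLM-Tuned-Alpha model card; verify it against the card before relying on the exact wording.

```python
# Sketch of the StableLM-Tuned-Alpha prompt format. The SYSTEM_PROMPT text
# is quoted from the published model card; check it against the source.

SYSTEM_PROMPT = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

def build_prompt(user_message: str) -> str:
    """Wrap a user message in the tuned model's special-token format,
    leaving the assistant turn open for generation."""
    return f"{SYSTEM_PROMPT}<|USER|>{user_message}<|ASSISTANT|>"

if __name__ == "__main__":
    print(build_prompt("What's your mood today?"))
```

During generation, decoding is typically stopped when the model emits one of the special tokens again (the original snippet used a custom `StoppingCriteria` for this).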

license:cc-by-nc-sa-4.0
1,914
360

StableBeluga-13B

llama
1,734
113

japanese-stablelm-base-alpha-7b

license:apache-2.0
1,671
121

stable-audio-open-small

1,492
233

StableBeluga-7B

llama
1,428
128

stablelm-base-alpha-3b

license:cc-by-sa-4.0
1,201
82

japanese-stablelm-instruct-gamma-7b

This is a 7B-parameter decoder-only Japanese language model fine-tuned on instruction-following datasets, built on top of the base model Japanese Stable LM Base Gamma 7B. If you are in search of a smaller model, please check Japanese StableLM-3B-4E1T Instruct. Developed by: Stability AI Model type: `Japanese Stable LM Instruct Gamma 7B` model is an auto-regressive language model based on the transformer decoder architecture. Language(s): Japanese License: This model is licensed under Apache License, Version 2.0. Contact: For questions and comments about the model, please join Stable Community Japan. For future announcements / information about Stability AI models, research, and events, please follow https://twitter.com/StabilityAIJP. For details, please see Mistral AI's paper and release blog post. - Japanese translation of the Databricks Dolly-15k dataset - Japanese translation of the subset of the Anthropic HH dataset - Wikinews subset of the izumi-lab/llm-japanese-dataset The model is intended to be used by all individuals as a foundational model for application-specific fine-tuning without strict limitations on commercial use. The pre-training dataset may have contained offensive or inappropriate content even after applying data cleansing filters which can be reflected in the model-generated text. We recommend users exercise reasonable caution when using these models in production systems. Do not use the model for any applications that may cause harm or distress to individuals or groups. The fine-tuning was carried out by Fujiki Nakamura. Other aspects, including data preparation and evaluation, were handled by the Language Team of Stability AI Japan, notably Meng Lee, Makoto Shing, Paul McCann, Naoki Orii, and Takuya Akiba. This model is based on Mistral-7B-v0.1 released by the Mistral AI team. We are grateful to the Mistral AI team for providing such an excellent base model. 
We are grateful for the contributions of the EleutherAI Polyglot-JA team in helping us to collect a large amount of pre-training data in Japanese. Polyglot-JA members include Hyunwoong Ko (Project Lead), Fujiki Nakamura (who originally started this project when he committed to the Polyglot team), Yunho Mo, Minji Jung, KeunSeok Im, and Su-Kyeong Jang. We are also appreciative of AI Novelist/Sta (Bit192, Inc.) and the numerous contributors from Stable Community Japan for assisting us in gathering a large amount of high-quality Japanese textual data for model training.
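Since this card lists instruction datasets (the Japanese Dolly translation and others) but the usage example did not survive extraction, here is a hypothetical prompt builder. The field labels (指示/入力/応答) follow the common Japanese Alpaca-style convention; treat the exact wording as an assumption and check the official usage example.

```python
# Hypothetical Alpaca-style Japanese instruction prompt builder.
# Assumption: the model follows the common 指示 (instruction) / 入力 (input) /
# 応答 (response) convention; verify against the official usage example.

def build_japanese_prompt(instruction: str, user_input: str = "") -> str:
    prompt = (
        "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。"
        "要求を適切に満たす応答を書きなさい。\n\n"
        f"### 指示:\n{instruction}\n\n"
    )
    if user_input:
        # The input section is optional and only added when context exists.
        prompt += f"### 入力:\n{user_input}\n\n"
    prompt += "### 応答:\n"
    return prompt
```

The model then completes the text after `### 応答:` with its answer.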

license:apache-2.0
1,196
53

japanese-stablelm-base-beta-70b

llama
1,181
17

japanese-stablelm-instruct-beta-70b

llama
1,178
26

japanese-stablelm-base-gamma-7b

license:apache-2.0
1,131
25

stablelm-base-alpha-7b

license:cc-by-sa-4.0
1,098
209

sv4d2.0

1,058
61

StableBeluga1-Delta

llama
853
57

StableBeluga2

Use Stable Chat (Research Preview) to test Stability AI's best language models for free. `Stable Beluga 2` is a Llama2 70B model fine-tuned on an Orca-style dataset. Start chatting with `Stable Beluga 2` using the Hugging Face `transformers` library. Stable Beluga 2 should be used with the `### System:` / `### User:` / `### Assistant:` prompt format. Related models: StableBeluga 1 - Delta, StableBeluga 13B, StableBeluga 7B. Developed by: Stability AI Model type: Stable Beluga 2 is an auto-regressive language model fine-tuned on Llama2 70B. Language(s): English Library: HuggingFace Transformers License: Fine-tuned checkpoints (`Stable Beluga 2`) are licensed under the STABLE BELUGA NON-COMMERCIAL COMMUNITY LICENSE AGREEMENT Contact: For questions and comments about the model, please email `[email protected]` `Stable Beluga 2` is trained on our internal Orca-style dataset. Models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (BF16), and optimized with AdamW. We outline the following hyperparameters:

| Dataset | Batch Size | Learning Rate | Learning Rate Decay | Warm-up | Weight Decay | Betas |
|-------------------|------------|---------------|---------------------|---------|--------------|-------------|
| Orca pt1 packed | 256 | 3e-5 | Cosine to 3e-6 | 100 | 1e-6 | (0.9, 0.95) |
| Orca pt2 unpacked | 512 | 3e-5 | Cosine to 3e-6 | 100 | 1e-6 | (0.9, 0.95) |

Beluga is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Beluga's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of Beluga, developers should perform safety testing and tuning tailored to their specific applications of the model.
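The Beluga prompt format referenced above can be sketched as a small builder. The `### System:` / `### User:` / `### Assistant:` layout matches the published Stable Beluga card; the example system message is an illustrative assumption.

```python
# Sketch of the Stable Beluga prompt format described on the card.
# The section headers are from the published card; the sample system
# message below is only an illustration.

def build_beluga_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Beluga prompt, leaving the assistant
    section open for the model to complete."""
    return f"### System:\n{system}\n\n### User:\n{user}\n\n### Assistant:\n"

if __name__ == "__main__":
    print(build_beluga_prompt(
        "You are Stable Beluga, an AI that follows instructions well.",
        "Write me a poem about machine learning.",
    ))
```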

llama
840
883

stablecode-completion-alpha-3b-4k

license:apache-2.0
773
279

stable-diffusion-3.5-controlnets-tensorrt

766
3

stable-point-aware-3d

690
315

tiny-random-stablelm-2

628
3

Japanese Stable Clip Vit L 16

605
27

japanese-stablelm-3b-4e1t-instruct

license:apache-2.0
543
29

japanese-stablelm-3b-4e1t-base

license:apache-2.0
512
17

japanese-stablelm-instruct-alpha-7b

425
96

japanese-stablelm-instruct-alpha-7b-v2

license:apache-2.0
401
22

stable-diffusion-xl-base-0.9

389
1,409

japanese-stablelm-instruct-beta-7b

llama
389
17

ar-stablelm-2-chat

388
18

stablelm-2-1_6b-chat

`Stable LM 2 Chat 1.6B` is a 1.6 billion parameter instruction tuned language model inspired by HuggingFaceH4's Zephyr 7B training pipeline. The model is trained on a mix of publicly available datasets and synthetic datasets, utilizing Direct Preference Optimization (DPO). `StableLM 2 1.6B Chat` uses the ChatML format, available through the tokenizer's `apply_chat_template` method. Developed by: Stability AI Model type: `StableLM 2 Chat 1.6B` model is an auto-regressive language model based on the transformer decoder architecture. Language(s): English Paper: Stable LM 2 1.6B Technical Report Library: Alignment Handbook Finetuned from model: https://huggingface.co/stabilityai/stablelm-2-1_6b License: StabilityAI Non-Commercial Research Community License. If you want to use this model for your commercial products or purposes, please contact us here to learn more. Contact: For questions and comments about the model, please email `[email protected]` The dataset is comprised of a mixture of open large-scale datasets available on the HuggingFace Hub:

1. SFT Datasets
- HuggingFaceH4/ultrachat_200k
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Open-Orca/SlimOrca
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0
- teknium/OpenHermes-2.5

2. Preference Datasets
- allenai/ultrafeedback_binarized_cleaned
- Intel/orca_dpo_pairs
- argilla/dpo-mix-7k

| Model | Size | MT-Bench |
|-------------------------|------|----------|
| Mistral-7B-Instruct-v0.2| 7B | 7.61 |
| Llama2-Chat | 70B | 6.86 |
| stablelm-zephyr-3b | 3B | 6.64 |
| MPT-30B-Chat | 30B | 6.39 |
| stablelm-2-1_6b-chat | 1.6B | 5.83 |
| stablelm-2-zephyr-1.6b | 1.6B | 5.42 |
| Falcon-40B-Instruct | 40B | 5.17 |
| Qwen-1.8B-Chat | 1.8B | 4.95 |
| dolphin-2.6-phi-2 | 2.7B | 4.93 |
| phi-2 | 2.7B | 4.29 |
| TinyLlama-1.1B-Chat-v1.0| 1.1B | 3.46 |

| Model | Size | Average | ARC Challenge (acc_norm) | HellaSwag (acc_norm) | MMLU (acc_norm) | TruthfulQA (mc2) | Winogrande (acc) | GSM8K (acc) |
|----------------------------------------|------|---------|--------------------------|----------------------|-----------------|------------------|------------------|-------------|
| microsoft/phi-2 | 2.7B | 61.32% | 61.09% | 75.11% | 58.11% | 44.47% | 74.35% | 54.81% |
| stabilityai/stablelm-2-1_6b-chat | 1.6B | 50.80% | 43.94% | 69.22% | 41.59% | 46.52% | 64.56% | 38.96% |
| stabilityai/stablelm-2-zephyr-1_6b | 1.6B | 49.89% | 43.69% | 69.34% | 41.85% | 45.21% | 64.09% | 35.18% |
| microsoft/phi-1_5 | 1.3B | 47.69% | 52.90% | 63.79% | 43.89% | 40.89% | 72.22% | 12.43% |
| stabilityai/stablelm-2-1_6b | 1.6B | 45.54% | 43.43% | 70.49% | 38.93% | 36.65% | 65.90% | 17.82% |
| mosaicml/mpt-7b | 7B | 44.28% | 47.70% | 77.57% | 30.80% | 33.40% | 72.14% | 4.02% |
| KnutJaegersberg/Qwen-1_8B-Llamaified | 1.8B | 44.75% | 37.71% | 58.87% | 46.37% | 39.41% | 61.72% | 24.41% |
| openlm-research/open_llama_3b_v2 | 3B | 40.28% | 40.27% | 71.60% | 27.12% | 34.78% | 67.01% | 0.91% |
| tiiuae/falcon-rw-1b | 1B | 37.07% | 35.07% | 63.56% | 25.28% | 35.96% | 62.04% | 0.53% |
| TinyLlama/TinyLlama-1.1B-3T | 1.1B | 36.40% | 33.79% | 60.31% | 26.04% | 37.32% | 59.51% | 1.44% |

The model is intended to be used in chat-like applications.
Developers must evaluate the model for safety performance in their specific use case. Read more about safety and limitations below. Limitations and Bias: This model is not trained against adversarial inputs. We strongly recommend pairing this model with an input and output classifier to prevent harmful responses. Through our internal red teaming, we discovered that while the model will not output harmful information if not prompted to do so, it will hallucinate many facts. It is also willing to output potentially harmful content or misinformation when the user requests it. Using this model will require guardrails around your inputs and outputs to ensure that any outputs returned are not misinformation or harmful. Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model. Finally, do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.

387
33

japanese-stablelm-instruct-ja_vocab-beta-7b

llama
364
11

japanese-stablelm-base-beta-7b

llama
356
14

stable-diffusion-2-1-unclip-small

350
35

japanese-stable-vlm

310
51

japanese-stablelm-base-ja_vocab-beta-7b

llama
310
7

Stable Diffusion Xl 1.0 Tensorrt

This repository hosts the TensorRT versions (sdxl, sdxl-lcm, sdxl-lcmlora) of Stable Diffusion XL 1.0, created in collaboration with NVIDIA. The optimized versions give substantial improvements in speed and efficiency. See the usage instructions for how to run the SDXL pipeline with the ONNX files hosted in this repository. - Developed by: Stability AI - Model type: Diffusion-based text-to-image generative model - License: CreativeML Open RAIL++-M License - Model Description: This is a conversion of the SDXL base 1.0 and SDXL refiner 1.0 models for NVIDIA TensorRT optimized inference

Latency:

| Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
|-------------|--------------------------|-----------------------------|------------------------|
| A10 | 9399 ms | 8160 ms | ~13% |
| A100 | 3704 ms | 2742 ms | ~26% |
| H100 | 2496 ms | 1471 ms | ~41% |

Throughput:

| Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
|-------------|--------------------------|-----------------------------|------------------------|
| A10 | 0.10 images/sec | 0.12 images/sec | ~20% |
| A100 | 0.27 images/sec | 0.36 images/sec | ~33% |
| H100 | 0.40 images/sec | 0.68 images/sec | ~70% |

Timings for the Latent Consistency Model (LCM) version, 4 steps at 1024x1024:

| Accelerator | CLIP | Unet | VAE | Total |
|-------------|---------|-----------|-----------|-----------|
| A100 | 1.08 ms | 192.02 ms | 228.34 ms | 426.16 ms |
| H100 | 0.78 ms | 102.8 ms | 126.95 ms | 234.22 ms |

1. Follow the setup instructions on launching a TensorRT NGC container. The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner` specific to the accelerator being run on; these are reused for later invocations. For the LCM variants, the first invocation produces plan files in `--engine-dir` specific to the accelerator being run on, which are likewise reused for later invocations.

289
151

stablelm-2-12b-chat-GGUF

281
1

stablelm-base-alpha-3b-v2

license:cc-by-sa-4.0
227
27

japanese-stable-diffusion-xl

171
101

stablelm-2-12b-chat

`Stable LM 2 12B Chat` is a 12 billion parameter instruction tuned language model trained on a mix of publicly available datasets and synthetic datasets, utilizing Direct Preference Optimization (DPO). `StableLM 2 12B Chat` uses the ChatML instruction format, which is also available through the tokenizer's `apply_chat_template` method. StableLM 2 12B Chat also supports function calling. Developed by: Stability AI Model type: `StableLM 2 12B Chat` model is an auto-regressive language model based on the transformer decoder architecture. Language(s): English Paper: Stable LM 2 Chat Technical Report Library: Alignment Handbook Finetuned from model: License: StabilityAI Non-Commercial Research Community License. If you want to use this model for your commercial products or purposes, please contact us here to learn more. Contact: For questions and comments about the model, please email `[email protected]`. The dataset is comprised of a mixture of open large-scale datasets available on the HuggingFace Hub as well as an internal safety dataset:

1. SFT Datasets
- HuggingFaceH4/ultrachat_200k
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Open-Orca/SlimOrca
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0
- teknium/OpenHermes-2.5
- glaiveai/glaive-function-calling-v2

2. Safety Datasets
- Anthropic/hh-rlhf
- Internal Safety Dataset

| Model | Parameters | MT Bench (Inflection-corrected) |
|---------------------------------------|------------|---------------------------------|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 8.48 ± 0.06 |
| stabilityai/stablelm-2-12b-chat | 12B | 8.15 ± 0.08 |
| Qwen/Qwen1.5-14B-Chat | 14B | 7.95 ± 0.10 |
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 7.82 ± 0.03 |
| mistralai/Mistral-7B-Instruct-v0.2 | 7B | 7.48 ± 0.02 |
| meta-llama/Llama-2-70b-chat-hf | 70B | 7.29 ± 0.05 |

| Model | Parameters | Average | ARC Challenge (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) | Winogrande (5-shot) | GSM8K (5-shot) |
| -------------------------------------- | ---------- | ------- | ----------------------- | ------------------- | ------------- | ------------------- | ------------------- | -------------- |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 72.71 | 70.14 | 87.55 | 71.40 | 64.98 | 81.06 | 61.11 |
| stabilityai/stablelm-2-12b-chat | 12B | 68.45 | 65.02 | 86.06 | 61.14 | 62.00 | 78.77 | 57.70 |
| Qwen/Qwen1.5-14B | 14B | 66.70 | 56.57 | 81.08 | 69.36 | 52.06 | 73.48 | 67.63 |
| mistralai/Mistral-7B-Instruct-v0.2 | 7B | 65.71 | 63.14 | 84.88 | 60.78 | 60.26 | 77.19 | 40.03 |
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 62.41 | 58.45 | 83.48 | 60.68 | 52.07 | 74.19 | 45.56 |
| Qwen/Qwen1.5-14B-Chat | 14B | 62.37 | 58.79 | 82.33 | 68.52 | 60.38 | 73.32 | 30.86 |
| google/gemma-7b | 8.5B | 63.75 | 61.09 | 82.20 | 64.56 | 44.79 | 79.01 | 50.87 |
| stabilityai/stablelm-2-12b | 12B | 63.53 | 58.45 | 84.33 | 62.09 | 48.16 | 78.10 | 56.03 |
| mistralai/Mistral-7B-v0.1 | 7B | 60.97 | 59.98 | 83.31 | 64.16 | 42.15 | 78.37 | 37.83 |
| meta-llama/Llama-2-13b-hf | 13B | 55.69 | 59.39 | 82.13 | 55.77 | 37.38 | 76.64 | 22.82 |
| meta-llama/Llama-2-13b-chat-hf | 13B | 54.92 | 59.04 | 81.94 | 54.64 | 41.12 | 74.51 | 15.24 |

The model is intended to be used in chat-like applications.
Developers must evaluate the model for safety performance in their specific use case. Read more about safety and limitations below. We strongly recommend pairing this model with an input and output classifier to prevent harmful responses. Using this model will require guardrails around your inputs and outputs to ensure that any outputs returned are not hallucinations. Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model. Finally, do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.
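The chat usage described above can be sketched with the real `apply_chat_template` API. The sampling parameters are illustrative assumptions, imports are deferred so the sketch loads without `torch`, and the function-calling message schema is not reproduced here (consult the model card for it).

```python
# Hedged sketch: chat inference with StableLM 2 12B Chat via
# tokenizer.apply_chat_template. Running chat() requires torch +
# transformers and substantial GPU memory (12B parameters).

MODEL_ID = "stabilityai/stablelm-2-12b-chat"

def build_messages(user_prompt: str):
    # ChatML-style messages consumed by apply_chat_template.
    return [{"role": "user", "content": user_prompt}]

def chat(user_prompt: str, max_new_tokens: int = 128) -> str:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True, return_tensors="pt"
    )
    # Sampling settings are illustrative, not the card's official values.
    output = model.generate(inputs, max_new_tokens=max_new_tokens, do_sample=True)
    # Strip the prompt tokens, keeping only the assistant's reply.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```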

167
87

stablelm-2-12b

`Stable LM 2 12B` is a 12.1 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs. Please note: for commercial use, please refer to https://stability.ai/license. Get started generating text with `Stable LM 2 12B` via the Hugging Face `transformers` library. Developed by: Stability AI Model type: `Stable LM 2 12B` models are auto-regressive language models based on the transformer decoder architecture. Language(s): English Paper: Stable LM 2 Technical Report Library: GPT-NeoX License: Stability AI Community License. Commercial License: to use this model commercially, please refer to https://stability.ai/license Contact: For questions and comments about the model, please email `[email protected]` The model is a decoder-only transformer with the following architecture:

| Parameters | Hidden Size | Layers | Heads | KV Heads | Sequence Length |
|----------------|-------------|--------|-------|----------|-----------------|
| 12,143,605,760 | 5120 | 40 | 32 | 8 | 4096 |

Position Embeddings: Rotary Position Embeddings (Su et al., 2021) applied to the first 25% of head embedding dimensions for improved throughput following Black et al. (2022). Parallel Layers: Parallel attention and feed-forward residual layers with a single input LayerNorm (Wang, 2021). Normalization: LayerNorm (Ba et al., 2016) without biases. Furthermore, we apply per-head QK normalization (Dehghani et al., 2023, Wortsman et al., 2023). Biases: We remove all bias terms from the feed-forward networks and grouped-query self-attention layers. Tokenizer: We use Arcade100k, a BPE tokenizer extended from OpenAI's `tiktoken.cl100k_base`. We split digits into individual tokens following findings by Liu & Low (2023).
The dataset is comprised of a filtered mixture of open-source large-scale datasets available on the HuggingFace Hub: Falcon RefinedWeb extract (Penedo et al., 2023), RedPajama-Data (Together Computer, 2023) and The Pile (Gao et al., 2020), both without the Books3 subset, and StarCoder (Li et al., 2023). We further supplement our training with multi-lingual data from CulturaX (Nguyen et al., 2023) and, in particular, from its OSCAR corpora, as well as restructured data in the style of Yuan & Liu (2022). Given the large amount of web data, we recommend fine-tuning the base `Stable LM 2 12B` for your downstream tasks. The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW, and trained using the Arcade100k tokenizer with a vocabulary size of 100,352. We outline the complete hyperparameter choices in the project's GitHub repository config. Hardware: `Stable LM 2 12B` was trained on the Stability AI cluster across 384 NVIDIA H100 GPUs (AWS P5 instances). Software: We use a fork of `gpt-neox` (EleutherAI, 2021), train under 2D parallelism (Data and Tensor Parallel) with ZeRO-1 (Rajbhandari et al., 2019), and rely on flash-attention as well as SwiGLU and Rotary Embedding kernels from FlashAttention-2 (Dao et al., 2023). The model is intended to be used as a foundational base model for application-specific fine-tuning. Developers must evaluate and fine-tune the model for safe performance in downstream applications. For commercial use, please refer to https://stability.ai/membership. Limitations and Bias: As a base model, this model may exhibit unreliable, unsafe, or other undesirable behaviors that must be corrected through evaluation and fine-tuning prior to deployment. The pre-training dataset may have contained offensive or inappropriate content, even after applying data cleansing filters, which can be reflected in the model-generated text.
We recommend that users exercise caution when using these models in production systems. Do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.

163
120

stablecode-completion-alpha-3b

license:apache-2.0
155
117

stable-diffusion-3.5-medium-tensorrt

147
7

sv4d

143
305

stable-codec-speech-16k

99
22

japanese-instructblip-alpha

92
53

stable-diffusion-3.5-large-controlnet-blur

80
12

codellama13b_instruct_260k_synthesis

llama
79
1

stable-diffusion-xl-refiner-0.9

63
332

sdxl-turbo-tensorrt

61
37

stablecode-instruct-alpha-3b

59
303

stable-diffusion-3.5-controlnets

28
170

stable-diffusion-3-medium-tensorrt

22
150

stable-video-diffusion-img2vid-xt-1-1-tensorrt

This repository hosts the TensorRT version of the Stable Video Diffusion (SVD) 1.1 Image-to-Video model. Please see Stable Video Diffusion (SVD) 1.1 Image-to-Video for the full model details. This model is intended for research purposes only and should not be used in any way that violates Stability AI's Acceptable Use Policy.

| Stage | A100 80GB PCI | A100 80GB SXM | H100 80GB PCI |
|-------------|---------------|---------------|---------------|
| VAE Encoder | 66.70 ms | 65.68 ms | 49.07 ms |
| CLIP | 105.41 ms | 53.20 ms | 91.32 ms |
| UNet x 25 | 30,367.73 ms | 27,489.88 ms | 19,102.98 ms |
| VAE Decoder | 4,663.63 ms | 4,544.12 ms | 3,382.62 ms |
| Total E2E | 35,258.38 ms | 32,166.41 ms | 22,644.73 ms |

1. Clone TensorRT and this repo, then launch the NGC container.
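The latency table above is dominated by the 25-step UNet loop; a quick sanity check with the A100 80GB PCI column (numbers taken directly from the table) makes the breakdown explicit:

```python
# Per-stage latencies (ms) for A100 80GB PCI, from the table above.
stages = {
    "VAE Encoder": 66.70,
    "CLIP": 105.41,
    "UNet x 25": 30367.73,
    "VAE Decoder": 4663.63,
}
total_e2e = 35258.38  # reported end-to-end latency (ms)

unet_share = stages["UNet x 25"] / total_e2e
overhead = total_e2e - sum(stages.values())  # E2E minus listed stages

print(f"UNet share of E2E: {unet_share:.1%}")
print(f"Unaccounted overhead: {overhead:.2f} ms")
```

The UNet accounts for roughly 86% of end-to-end time, which is why reducing denoising steps (as in the Turbo models) matters far more than optimizing the VAE or CLIP stages.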

19
25

sp4d

16
6

japanese-stablelm-2-instruct-1_6b

12
27

ar-stablelm-2-base

6
6

sd-vae-ft-mse-original

Utilizing

These weights are intended to be used with the original CompVis Stable Diffusion codebase. If you are looking for the model to use with the 🧨 diffusers library, come here.

Decoder Finetuning

We publish two kl-f8 autoencoder versions, finetuned from the original kl-f8 autoencoder on a 1:1 ratio of LAION-Aesthetics and LAION-Humans, an unreleased subset containing only SFW images of humans. The intent was to fine-tune on the Stable Diffusion training set (the autoencoder was originally trained on OpenImages) but also to enrich the dataset with images of humans to improve the reconstruction of faces. The first, ft-EMA, was resumed from the original checkpoint, trained for 313,198 steps, and uses EMA weights. It uses the same loss configuration as the original checkpoint (L1 + LPIPS). The second, ft-MSE, was resumed from ft-EMA, also uses EMA weights, and was trained for another 280k steps using a different loss with more emphasis on MSE reconstruction (MSE + 0.1 * LPIPS). It produces somewhat "smoother" outputs. The batch size for both versions was 192 (16 A100s, batch size 12 per GPU). To keep compatibility with existing models, only the decoder part was finetuned; the checkpoints can be used as a drop-in replacement for the existing autoencoder.
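The ft-MSE loss configuration (MSE + 0.1 * LPIPS) can be sketched as a weighted sum of a pixel-space term and a perceptual term. This is a minimal illustrative sketch: `perceptual_fn` below is a stand-in callable, not the actual LPIPS network used in training:

```python
import numpy as np

def reconstruction_loss(x, x_hat, perceptual_fn, lpips_weight=0.1):
    """ft-MSE-style reconstruction loss sketch: pixel MSE plus a
    down-weighted perceptual term. `perceptual_fn(x, x_hat)` is a
    stand-in for a real LPIPS distance."""
    mse = np.mean((x - x_hat) ** 2)
    return mse + lpips_weight * perceptual_fn(x, x_hat)
```

Weighting the perceptual term at 0.1 biases the decoder toward pixel-accurate (hence "smoother") reconstructions, which matches the card's description of ft-MSE relative to the L1 + LPIPS configuration of ft-EMA.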
Evaluation

COCO 2017 (256x256, val, 5000 images)

| Model | Train Steps | rFID | PSNR | SSIM | PSIM | Link | Comments |
|----------|---------|------|--------------|---------------|---------------|------|----------|
| original | 246803 | 4.99 | 23.4 +/- 3.8 | 0.69 +/- 0.14 | 1.01 +/- 0.28 | https://ommer-lab.com/files/latent-diffusion/kl-f8.zip | as used in SD |
| ft-EMA | 560001 | 4.42 | 23.8 +/- 3.9 | 0.69 +/- 0.13 | 0.96 +/- 0.27 | https://huggingface.co/stabilityai/sd-vae-ft-ema-original/resolve/main/vae-ft-ema-560000-ema-pruned.ckpt | slightly better overall, with EMA |
| ft-MSE | 840001 | 4.70 | 24.5 +/- 3.7 | 0.71 +/- 0.13 | 0.92 +/- 0.27 | https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.ckpt | resumed with EMA from ft-EMA, emphasis on MSE (rec. loss = MSE + 0.1 LPIPS), smoother outputs |

LAION-Aesthetics 5+ (256x256, subset, 10000 images)

| Model | Train Steps | rFID | PSNR | SSIM | PSIM | Link | Comments |
|----------|-----------|------|--------------|---------------|---------------|------|----------|
| original | 246803 | 2.61 | 26.0 +/- 4.4 | 0.81 +/- 0.12 | 0.75 +/- 0.36 | https://ommer-lab.com/files/latent-diffusion/kl-f8.zip | as used in SD |
| ft-EMA | 560001 | 1.77 | 26.7 +/- 4.8 | 0.82 +/- 0.12 | 0.67 +/- 0.34 | https://huggingface.co/stabilityai/sd-vae-ft-ema-original/resolve/main/vae-ft-ema-560000-ema-pruned.ckpt | slightly better overall, with EMA |
| ft-MSE | 840001 | 1.88 | 27.3 +/- 4.7 | 0.83 +/- 0.11 | 0.65 +/- 0.34 | https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.ckpt | resumed with EMA from ft-EMA, emphasis on MSE (rec. loss = MSE + 0.1 LPIPS), smoother outputs |

Visual

Visualization of reconstructions on 256x256 images from the COCO2017 validation dataset. 256x256: ft-EMA (left), ft-MSE (middle), original (right)

license:mit
2
1,388

stable-codec-speech-16k-base

2
2

ar-stablelm-2-chat-gguf

1
1

control-lora

Introduction

By adding low-rank parameter-efficient fine-tuning to ControlNet, we introduce Control-LoRAs. This approach offers a more efficient and compact method to bring model control to a wider variety of consumer GPUs.

- Rank 256 files (reducing the original `4.7GB` ControlNet models down to `~738MB` Control-LoRA models)
- Experimental Rank 128 files (reducing the model down to `~377MB`)

Each Control-LoRA has been trained on a diverse range of image concepts and aspect ratios.

Depth: This Control-LoRA utilizes a grayscale depth map for guided generation. Depth estimation is an image processing technique that determines the distance of objects in a scene, providing a depth map that highlights variations in proximity. The model was trained on the depth results of MiDaS `dpt_beit_large_512`. It was further finetuned on the `Portrait Depth Estimation` model available in the ClipDrop API by Stability AI.

Canny: Canny Edge Detection is an image processing technique that identifies abrupt changes in intensity to highlight edges in an image. This Control-LoRA uses the edges from an image to generate the final image.

Recolor and Sketch: These two Control-LoRAs can be used to colorize images. Recolor is designed to colorize black and white photographs. Sketch is designed to color in drawings input as a white-on-black image (either hand-drawn, or created with a `pidi` edge model).

Revision: Revision is a novel approach of using images to prompt SDXL. It uses pooled CLIP embeddings to produce images conceptually similar to the input. It can be used either in addition to, or as a replacement for, text prompts. Revision also includes a blending function for combining multiple image or text concepts, as either positive or negative prompts.

Control-LoRAs have been implemented in ComfyUI and StableSwarmUI. Basic ComfyUI workflows (using the base model only) are available in this HF repo. Custom nodes from Stability are available here.
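The size reductions quoted above follow directly from low-rank factorization: a LoRA replaces a full weight update with two thin matrices whose parameter count scales with the rank. A small illustrative calculation (the layer dimensions here are hypothetical, not the actual SDXL ControlNet shapes):

```python
def lora_params(d_in, d_out, rank):
    """Parameter count of a LoRA pair (A: d_in x rank, B: rank x d_out)
    and its ratio to the full d_in x d_out matrix it replaces."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return lora, lora / full

# Hypothetical 1280x1280 projection layer at the two released ranks.
for rank in (128, 256):
    n, ratio = lora_params(1280, 1280, rank)
    print(f"rank {rank}: {n:,} params ({ratio:.1%} of full matrix)")
```

At rank 128 such a layer stores 20% of the full matrix's parameters, and 40% at rank 256, which is consistent with the ~4.7GB ControlNet shrinking to ~377MB and ~738MB respectively.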

0
953

Stable Zero123

Please note: for commercial use, please refer to https://stability.ai/license

Stable Zero123 is a model for view-conditioned image generation based on Zero123. With improved data rendering and model conditioning strategies, our model demonstrates improved performance when compared to the original Zero123 and its subsequent iteration, Zero123-XL. By using Score Distillation Sampling (SDS) along with the Stable Zero123 model, we can produce high-quality 3D models from any input image. The process can also extend to text-to-3D generation by first generating a single image using SDXL and then using SDS on Stable Zero123 to generate the 3D object.

To enable open research in 3D object generation, we've improved the open-source code of threestudio by supporting Zero123 and Stable Zero123. To use Stable Zero123 for object 3D mesh generation in threestudio, you can follow these steps:

1. Install threestudio using their instructions.
2. Download the Stable Zero123 checkpoint `stable_zero123.ckpt` into the `load/zero123/` directory.
3. Take an image of your choice, or generate it from text using your favourite AI image generator such as Stable Assistant (https://stability.ai/stable-assistant), e.g. "A simple 3D render of a friendly dog".
4. Remove its background using Stable Assistant (https://stability.ai/stable-assistant).
5. Save it to `load/images/`, preferably with `rgba.png` as the suffix.
6. Run Zero-1-to-3 with the Stable Zero123 checkpoint.

- Developed by: Stability AI
- Model type: latent diffusion model.
- Finetuned from model: lambdalabs/sd-image-variations-diffusers
- License: we released 2 versions of Stable Zero123. Stable Zero123 included some CC-BY-NC 3D objects, so it cannot be used commercially, but can be used for research purposes. It is released under the Stability AI Non-Commercial Research Community License. Stable Zero123C ("C" for "Commercially-available") was only trained on CC-BY and CC0 3D objects. It is released under the Stability AI Community License.

You can read more about the license here. According to our internal tests, both models perform similarly in terms of prediction visual quality.

We use renders from the Objaverse dataset, utilizing our enhanced rendering method.

- Hardware: `Stable Zero123` was trained on the Stability AI cluster on a single node with 8 A100 80GB GPUs.
- Code Base: we use our modified version of the original zero123 repository.

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive, or content that propagates historical or current stereotypes.
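The Score Distillation Sampling step mentioned above optimizes scene parameters by following the denoising direction of a frozen diffusion model. The core identity is that the gradient is proportional to `eps_pred - eps`: the teacher's predicted noise minus the noise actually added. This is a deliberately tiny toy sketch of that loop (the "teacher" here is a hand-written stand-in, not a diffusion UNet):

```python
import numpy as np

# Toy SDS sketch: optimize parameters `theta` (a stand-in for a 3D scene's
# rendered image) so a frozen "teacher" stops correcting it.
rng = np.random.default_rng(0)
target = np.full(16, 0.5)   # the image the toy teacher prefers
theta = np.zeros(16)        # parameters being optimized

def teacher_eps_pred(x_noisy, eps_true, x):
    # Stand-in for a diffusion model's noise prediction: it reproduces the
    # injected noise plus a correction that pulls x toward `target`.
    return eps_true + (x - target)

for step in range(200):
    eps = rng.standard_normal(16)
    x_noisy = theta + eps                                 # toy forward diffusion
    grad = teacher_eps_pred(x_noisy, eps, theta) - eps    # SDS gradient, w(t)=1
    theta -= 0.05 * grad                                  # gradient descent step
```

Because the injected noise cancels in `eps_pred - eps`, only the teacher's correction term drives `theta`, which converges to the teacher's preferred image; in real SDS the same cancellation lets a 2D diffusion prior supervise 3D parameters through a differentiable renderer.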

0
747

sv3d

0
720

cosxl

0
240

sd-vae-ft-ema-original

license:mit
0
158

japanese-stablelm-2-base-1_6b

0
13

stable-diffusion-3.5-large-turbo_amdgpu

0
10

stable-diffusion-3-medium_amdgpu

0
9

arcade100k

0
3

sdxl-turbo_amdgpu

0
3

stable-diffusion-3.5-medium_amdgpu

0
3

stable-diffusion-3.5-large_amdgpu

0
2