# NSFW-API
## NSFW_Wan_14b
### Model Description

NSFW Wan 14B T2V is a massive, 14-billion-parameter text-to-video generation model fine-tuned specifically for generating Not Safe For Work (NSFW) content. It was trained with a single, unified methodology to build a solid understanding across the entire NSFW spectrum and to generate videos with coherent motion natively.

The primary goal of this model is to provide a research and creative tool capable of generating thematically relevant short video clips from text prompts within the adult-content domain. It aims to understand and render a wide array of NSFW scenarios, aesthetics, and actions described in natural language with high fidelity and temporal consistency.

### Model Details

- Architecture: Text-to-Video Transformer
- Parameters: 14 Billion
- Type: Text-to-Video (T2V)
- Specialization: NSFW Content Generation

---

### Training Methodology

Unlike previous multi-phase approaches, the 14B model was trained with a single, unified configuration from the ground up to ensure maximum quality and stability from the very first epoch.

- Mixed Dataset: The model was trained on a mixed dataset of 30k video clips and 20k still images simultaneously. This provides constant spatial regularization, preventing the anatomical drift and quality collapse that can occur in phased training; the model learns aesthetics and motion in parallel.
- Stable Configuration: The entire 15-epoch run used a stable learning rate with an initial warmup, batch sizes optimized for the 14B architecture, and a training schedule designed for steady, progressive learning.
- Training Specifications: Training was conducted on 17-frame video clips at 480p resolution.
- Outcome: The result is a series of 15 high-quality, coherent checkpoints. The model demonstrates vastly improved spatial quality, stable motion, and reliable NSFW fidelity without the legacy artifacts associated with older training methods.
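For readers who want to reproduce a comparable setup, the settings above can be sketched as a trainer configuration. This is purely illustrative: the key names follow the conventions of community trainers such as diffusion-pipe and are assumptions, not the authors' actual files; only the numeric values (15 epochs, 480p, 17-frame clips, mixed video/image data) come from the text.

```toml
# Hypothetical sketch only -- key names are assumed, exact LR/warmup/batch
# values were not published. Numeric values below mirror the description above.

epochs = 15                 # "entire 15-epoch run"
warmup_steps = 200          # "stable learning rate with an initial warmup" (step count assumed)

[optimizer]
type = "adamw"
lr = 1e-5                   # stable learning rate; exact value assumed

[dataset]
resolutions = [480]         # 480p training resolution
frame_buckets = [1, 17]     # still images (1 frame) mixed with 17-frame video clips
```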
### Recommended Checkpoint

We strongly recommend `wan14Be15.safetensors` for all general use cases and LoRA training. This final checkpoint represents the most refined state of the model.

### Training Data

The model was trained on a dataset comprising the top 1,000 posts from approximately 1,250 distinct NSFW subreddits. This dataset was carefully curated to capture a broad spectrum of adult themes, visual styles, character archetypes, specific kinks, and actions prevalent in these online communities. The video portion of the dataset was sourced from similar communities.

The captions associated with the training data leverage the language and tagging conventions found within these subreddits. For insights into effective prompting strategies for specific styles or content, refer to the `prompting-guide.json` file included in this repository.

Note: Due to the nature of the source material, the training dataset inherently contains explicit adult content.

### Files

- `wan14Be1.safetensors` ... (and all intermediate epochs)
- `wan14Be15.safetensors`
- `prompting-guide.json`: This JSON file contains an analysis of common keywords, phrases, and descriptive language associated with the content from the various source subreddits. It is designed to help users craft more effective prompts.

### Usage

This model is intended for generating short video clips (typically a few seconds) from descriptive text prompts.

1. Select a Checkpoint: We recommend the final `wan14Be15.safetensors` checkpoint for the best balance of training and quality.
2. No Helper LoRA Needed: The model generates motion natively; no external motion LoRAs are required.
3. Craft Your Prompt: Use natural language to describe the desired scene, subjects, actions, and style.
4. Consult `prompting-guide.json`: For best results, especially when targeting specific sub-community styles or niche fetishes, refer to `prompting-guide.json` for the terminology and phrasing most likely to elicit the desired output.
5. Generate: Use your preferred inference pipeline compatible with this model architecture.

### The Ideal Base for LoRA Fine-Tuning

While NSFW Wan 14B T2V is a capable standalone model, its greatest strength lies in its efficacy as a foundational base for training specialized LoRAs (Low-Rank Adaptations). We highly recommend using `wan14Be15.safetensors` as the base for all LoRA training. Its robust, unified training provides a strong and stable understanding of:

- Core NSFW Anatomy & Aesthetics: The mixed-data training provides a strong, coherent grasp of anatomy and visual styles from the start.
- Coherent Motion & Actions: The video component provides foundational knowledge of common sexual acts and temporal consistency.

Because the base model has a coherent understanding of anatomy and motion from the outset, you can focus your LoRA training dataset exclusively on the specific niche concept, character, artistic style, or unique action you want to master. This leads to more efficient LoRA training and superior results.

### Community

Connect with other users, share your creations, get help with prompting, discuss fine-tuning, and contribute to the community. We encourage active participation and feedback to help improve future iterations and resources!

### Limitations

- NSFW Focus: The model's knowledge is heavily biased towards the content prevalent in the NSFW subreddits it was trained on. It will likely perform poorly on SFW (Safe For Work) prompts.
- Specificity & Artifacts: While the model demonstrates high quality, it may still produce visual artifacts, anatomical inaccuracies, or fail to perfectly capture highly complex or nuanced prompts. Video generation is an evolving field.
- Bias: The training data reflects the content, biases, preferences, and potentially problematic depictions present in the source NSFW communities. The model may generate content that perpetuates these biases.
- Safety: This model does not have built-in safety filters.
Users are responsible for the ethical application of the model.
- Temporal Coherence: Coherence is significantly improved, but very long or complex actions might still exhibit some temporal inconsistencies.

### Ethical Considerations & Responsible Use

- Age Restriction: This model is intended for adult users (18+/21+ depending on local regulations) only.
- Consent and Harm: This model generates fictional, synthetic media. It must not be used to create non-consensual depictions of real individuals, to impersonate, defame, or harass, or to generate content that could cause harm.
- Legal Use: Users are solely responsible for ensuring that their use of this model and the content they generate complies with all applicable local, national, and international laws and regulations.
- Distribution: Exercise extreme caution and responsibility when distributing content generated by this model. Be mindful of platform terms of service and legal restrictions regarding adult content.
- No Endorsement: The creators of this model do not endorse or condone the creation or distribution of illegal, unethical, or harmful content.

We strongly recommend users familiarize themselves with responsible AI practices and the potential societal impacts of generative NSFW media.

### Disclaimer

The outputs of this model are entirely synthetic and computer-generated. They do not depict real people or events unless explicitly prompted to do so with user-provided data (which is not the intended use of this pre-trained model). The developers of this model are not responsible for the outputs created by users.
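The usage steps above suggest mining `prompting-guide.json` for community-specific terminology. The guide's actual schema is not documented in this card, so the layout below is a hypothetical stand-in used only to illustrate the idea of folding its keywords into a prompt:

```python
import json

# Hypothetical schema -- the real prompting-guide.json structure is not
# documented here; this stand-in only illustrates the workflow.
guide_json = """
{
  "communities": {
    "example_community": {
      "keywords": ["cinematic lighting", "close-up"],
      "phrases": ["slow camera pan"]
    }
  }
}
"""

def build_prompt(guide: dict, community: str, subject: str) -> str:
    """Append a community's common keywords and phrases to a base subject."""
    entry = guide["communities"][community]
    return ", ".join([subject] + entry["keywords"] + entry["phrases"])

guide = json.loads(guide_json)
print(build_prompt(guide, "example_community", "two people dancing in a dim room"))
# prints: two people dancing in a dim room, cinematic lighting, close-up, slow camera pan
```

The same pattern applies whatever the real schema turns out to be: load the JSON once, then concatenate the relevant terminology onto your natural-language description.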
## NSFW_Wan_1.3b
> 🚨 IMPORTANT UPDATE: New Experimental Checkpoints Available! 🚨
>
> A new, experimental set of checkpoints (`wan1.3Bexpe1` through `wan1.3Bexpe14`) has been released. These were trained using a revised methodology to fix significant image-quality degradation issues (e.g., body horror, artifacts) found in the original `e4-e20` checkpoints.
>
> We strongly recommend new users start with the experimental `wan1.3Bexpe14.safetensors` checkpoint. For more details, see the section below titled "The 'Fix' - Experimental Epochs". Your feedback on these new models is crucial and will help determine whether they replace the original series.

### Model Description

NSFW Wan 1.3b T2V is a powerful, 1.3-billion-parameter text-to-video generation model fine-tuned specifically for generating Not Safe For Work (NSFW) content. The model has undergone multiple training methodologies to build a solid understanding across the entire NSFW spectrum and to generate videos with coherent motion natively.

The primary goal of this model is to provide a research and creative tool capable of generating thematically relevant short video clips from text prompts within the adult-content domain. It aims to understand and render a wide array of NSFW scenarios, aesthetics, and actions described in natural language, now with improved temporal consistency.

### Model Details

- Architecture: Text-to-Video Transformer
- Parameters: 1.3 Billion
- Type: Text-to-Video (T2V)
- Specialization: NSFW Content Generation

### 🚨 The "Fix" - Experimental Epochs (Recommended)

After user feedback and internal review revealed significant image-quality degradation and "body horror" artifacts in the original training run (specifically after epoch 3), a new training procedure was designed and executed.

#### The Problem with the Original Run

The original two-phase approach, while sound in theory, suffered in practice.
The initial image-only training phase (epochs 1-10) was too aggressive, causing "catastrophic forgetting": the model's understanding of coherent anatomy (faces, hands, etc.) collapsed. The subsequent video-only training (epochs 11-20) could not fully recover from this damage, resulting in outputs that were often distorted or of low quality.

#### The Revised Training Solution

A new, single-run training configuration was developed to address these flaws from the ground up:

- Mixed Dataset: Instead of separate phases, the new run was trained on a mixed dataset of 30k video clips and 20k still images simultaneously. This provided constant spatial regularization, preventing the anatomical drift and quality collapse seen previously.
- Stable Configuration: A more conservative learning rate (LR), smaller batch sizes, and a shorter overall training schedule were used to ensure the model learned the new concepts without destroying its foundational knowledge.
- Outcome: The result is a series of new experimental epochs that demonstrate vastly improved spatial quality, stable motion, and reliable NSFW fidelity without the "glitching" or "body horror" of the original run.

We strongly recommend using `wan1.3Bexpe14.safetensors` for all general use cases and LoRA training. This checkpoint represents the best trade-off between explicit content generation and visual coherence from the new, improved training run.

### Original Training Methodology (Legacy)

Note: This section describes the original training process, which produced the `e1` through `e20` checkpoints. This process had known flaws that led to quality degradation. For the best results, please use the new experimental models described above.

The model's original training was split into two distinct phases to first build a strong aesthetic foundation and then learn motion.

- Epochs 1-10 (Image-Trained): The initial checkpoints were fine-tuned primarily on a massive NSFW image dataset. These epochs excel at style and detail but have limited native motion capabilities.
Quality degrades significantly after epoch 3.
- Epochs 11-20 (Video-Trained): These later checkpoints were trained exclusively on a video dataset. This phase taught the model temporal coherence and motion. The result is a model that can generate quality video directly, without the need for any helper LoRAs. `wan1.3Be20.safetensors` is the best of this original series.

### Training Data

The model was trained on a dataset comprising the top 1,000 posts from approximately 1,250 distinct NSFW subreddits. This dataset was carefully curated to capture a broad spectrum of adult themes, visual styles, character archetypes, specific kinks, and actions prevalent in these online communities. The second phase of training utilized a video dataset sourced from similar communities.

The captions associated with the training data leverage the language and tagging conventions found within these subreddits. For insights into effective prompting strategies for specific styles or content, refer to the `prompting-guide.json` file included in this repository.

Note: Due to the nature of the source material, the training dataset inherently contains explicit adult content.

### Files

Experimental (Recommended):
- `wan1.3Bexpe1.safetensors` ... (and all intermediate epochs)
- `wan1.3Bexpe14.safetensors`

Original (Legacy):
- `wan1.3Be1.safetensors` ... (and all intermediate epochs)
- `wan1.3Be20.safetensors`

- `prompting-guide.json`: This JSON file contains an analysis of common keywords, phrases, and descriptive language associated with the content from the various source subreddits. It is designed to help users craft more effective prompts.

### Usage

This model is intended for generating short video clips (typically a few seconds) from descriptive text prompts.

1. Select a Checkpoint: We now strongly recommend the experimental `wan1.3Bexpe14.safetensors`. This checkpoint comes from the revised training run and offers superior visual quality and motion coherence.
2.
No Helper LoRA Needed: With the video-trained checkpoints (`e11-e20` and all `exp` models), you do not need the old `NSFWWan1.3bmotionhelper` LoRA. The model generates motion natively.
3. Craft Your Prompt: Use natural language to describe the desired scene, subjects, actions, and style.
4. Consult `prompting-guide.json`: For best results, especially when targeting specific sub-community styles or niche fetishes, refer to `prompting-guide.json` for the terminology and phrasing most likely to elicit the desired output.
5. Generate: Use your preferred inference pipeline compatible with this model architecture.

### The Ideal Base for LoRA Fine-Tuning

While NSFW Wan 1.3B T2V is a capable standalone model, its greatest strength lies in its efficacy as a foundational base for training specialized LoRAs (Low-Rank Adaptations). We highly recommend using the new `wan1.3Bexpe14.safetensors` as the base for all LoRA training. Its improved, more stable training provides a robust understanding of:

- Core NSFW Anatomy & Aesthetics: The mixed-data training provides a strong, non-degraded grasp of anatomy and visual styles.
- Coherent Motion & Actions: The video component provides foundational knowledge of common sexual acts and temporal consistency.

Because this new base model is not "damaged," you don't need to waste training cycles teaching your LoRA to fix underlying anatomical problems. You can focus your LoRA training dataset exclusively on the specific niche concept, character, artistic style, or unique action you want to master. This leads to more efficient LoRA training and superior results.

### Community

Connect with other users, share your creations, get help with prompting, discuss the new experimental models, and contribute to the community. We encourage active participation and feedback to help improve future iterations and resources! Your feedback on the experimental models is especially valuable.
### Limitations

- NSFW Focus: The model's knowledge is heavily biased towards the content prevalent in the NSFW subreddits it was trained on. It will likely perform poorly on SFW (Safe For Work) prompts.
- Specificity & Artifacts: While greatly improved in the experimental checkpoints, the model may still produce visual artifacts, anatomical inaccuracies, or fail to perfectly capture highly complex or nuanced prompts. Video generation is an evolving field.
- Bias: The training data reflects the content, biases, preferences, and potentially problematic depictions present in the source NSFW communities. The model may generate content that perpetuates these biases.
- Safety: This model does not have built-in safety filters. Users are responsible for the ethical application of the model.
- Temporal Coherence: Coherence is significantly improved, but very long or complex actions might still exhibit some temporal inconsistencies.

### Ethical Considerations & Responsible Use

- Age Restriction: This model is intended for adult users (18+/21+ depending on local regulations) only.
- Consent and Harm: This model generates fictional, synthetic media. It must not be used to create non-consensual depictions of real individuals, to impersonate, defame, or harass, or to generate content that could cause harm.
- Legal Use: Users are solely responsible for ensuring that their use of this model and the content they generate complies with all applicable local, national, and international laws and regulations.
- Distribution: Exercise extreme caution and responsibility when distributing content generated by this model. Be mindful of platform terms of service and legal restrictions regarding adult content.
- No Endorsement: The creators of this model do not endorse or condone the creation or distribution of illegal, unethical, or harmful content.

We strongly recommend users familiarize themselves with responsible AI practices and the potential societal impacts of generative NSFW media.

### Disclaimer

The outputs of this model are entirely synthetic and computer-generated.
They do not depict real people or events unless explicitly prompted to do so with user-provided data (which is not the intended use of this pre-trained model). The developers of this model are not responsible for the outputs created by users.
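Because this card mixes two checkpoint series with similar file names (`wan1.3Be<N>` for the legacy run, `wan1.3Bexpe<N>` for the experimental run), scripts that manage downloads may want to parse names programmatically. The helper below is a convenience sketch, not part of the release; it encodes only the naming pattern visible in the file list above.

```python
import re

# Naming convention from the Files list: "wan1.3Be<N>" (legacy run) and
# "wan1.3Bexpe<N>" (experimental run), both with a .safetensors suffix.
PATTERN = re.compile(r"wan1\.3B(?P<series>exp)?e(?P<epoch>\d+)\.safetensors")

def parse_checkpoint(name: str):
    """Return (series, epoch) for a checkpoint file name."""
    m = PATTERN.fullmatch(name)
    if not m:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    series = "experimental" if m.group("series") else "legacy"
    return series, int(m.group("epoch"))

assert parse_checkpoint("wan1.3Bexpe14.safetensors") == ("experimental", 14)
assert parse_checkpoint("wan1.3Be20.safetensors") == ("legacy", 20)
```

Sorting parsed names by `(series, epoch)` makes it easy to select the recommended latest experimental checkpoint automatically.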
## NSFW Wan UMT5 XXL

## NSFW_Wan_1.3b_motion_helper
### Model Description

The NSFW Wan 1.3b Motion Helper LoRA is a Low-Rank Adaptation designed to significantly enhance video quality, motion consistency, and overall visual style when used with the NSFW Wan 1.3B T2V text-to-video model. It is specifically optimized for, and trained with, the `wan1.3Be10.safetensors` checkpoint of the base model.

### Model Details

- Type: LoRA
- Base Model Compatibility: NSFW Wan 1.3B T2V
- Recommended Base Checkpoint: `wan1.3Be10.safetensors`
- Purpose: To improve video quality, temporal coherence, motion rendering, and stylistic elements in generations from the base NSFW Wan 1.3B T2V model.

### Purpose

As detailed in the main NSFW Wan 1.3B T2V model card, the base checkpoint was primarily fine-tuned on images. While capable, this can sometimes lead to:

- Deterioration in video quality.
- Inconsistencies in motion or temporal coherence.
- Less refined stylistic output in video format.

This LoRA was trained specifically to address these limitations, acting as a "motion and quality" enhancement layer on top of the base model's understanding of NSFW content.

### Usage

1. Download the Base Model: Ensure you have the NSFW Wan 1.3B T2V model. The `wan1.3Be10.safetensors` checkpoint is highly recommended for use with this LoRA.
2. Download this LoRA: Get the `NSFWWan1.3bmotionhelper.safetensors` file.
3. Load in Your Pipeline: In your preferred text-to-video generation environment, load the base model checkpoint (e.g., `wan1.3Be10.safetensors`) and then apply this LoRA.
4. Prompt as Usual: Craft your text prompts as you normally would for the base model. Refer to the `prompting-guide.json` in the base model repository for guidance.
5. Generate: Enjoy improved video outputs!

### Expected Benefits

Using this LoRA with the recommended base checkpoint should result in:

- Smoother and more natural motion.
- Better temporal consistency across frames.
- Enhanced visual fidelity and detail in video sequences.
- Improved overall style and aesthetic coherence for NSFW video content.
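Mechanically, "applying" a LoRA like this one means adding a small low-rank update to the frozen base weights at load time. A toy NumPy sketch of that update (shapes, rank, and scaling here are illustrative, not the actual Wan 1.3B layer dimensions):

```python
import numpy as np

# LoRA idea: a frozen base weight W is augmented by (alpha / r) * B @ A,
# where A and B are small trainable matrices with rank r << min(out, in).
rng = np.random.default_rng(0)
out_dim, in_dim, r, alpha = 64, 64, 4, 8   # toy dimensions, not Wan's real ones

W = rng.standard_normal((out_dim, in_dim))   # frozen base weight
A = rng.standard_normal((r, in_dim)) * 0.01  # trainable down-projection
B = np.zeros((out_dim, r))                   # trainable up-projection (zero-initialized)

W_adapted = W + (alpha / r) * (B @ A)        # applied when the LoRA is loaded/merged

# With B zero-initialized (standard practice), the adapter starts as a no-op:
assert np.allclose(W_adapted, W)
```

This is why a LoRA file is tiny compared to the base checkpoint: only the low-rank factors are stored, and inference pipelines merge them into the matching base layers.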
### Relationship to the Base Model

This LoRA is intended to be used with the main NSFW Wan 1.3B T2V model. You can find the base model, its various epoch checkpoints, and the `prompting-guide.json` in the base model repository.

### Limitations

- This LoRA enhances the output of the base model but is still bound by its inherent capabilities and limitations (e.g., understanding of highly complex or abstract prompts).
- The focus remains on NSFW content; performance on SFW prompts will likely be poor.
- Visual artifacts or inaccuracies may still occur, though they should be reduced compared to using the base model for video without this LoRA.
- All limitations and biases mentioned in the base model card apply.

### Ethical Considerations & Responsible Use

This LoRA is an add-on for a model designed to generate NSFW content. All ethical considerations, responsible-use guidelines, and limitations outlined in the main NSFW Wan 1.3B T2V model card apply fully when using this LoRA.

- Intended for adult users (18+/21+ depending on local regulations) only.
- Must not be used to create non-consensual depictions of real individuals, to impersonate, defame, or harass, or to generate content that could cause harm.
- Users are solely responsible for ensuring that their use of this LoRA and the content they generate complies with all applicable laws and regulations.
- Exercise extreme caution and responsibility when distributing generated content.
- The creators do not endorse or condone the creation or distribution of illegal, unethical, or harmful content.

Please refer to the NSFW Wan 1.3B T2V model card for a more comprehensive discussion of ethical considerations.

### Disclaimer

The outputs generated using this LoRA in conjunction with the base model are entirely synthetic and computer-generated. They do not depict real people or events unless explicitly prompted to do so with user-provided data. The developers of this LoRA and the base model are not responsible for the outputs created by users.
## NSFW-Wan-14b-Revealing-Boobs

## NSFW-Wan-14b-Missionary-Sex-French-Kissing-Position

## NSFW-Wan-14b-Spooning-Leg-Lifted-Sex-Position

## NSFW-Wan-14b-Cumshot-Facials

## NSFW-Wan-14b-Panty-Peel

## NSFW_Segmentation
Multi-head release of single-task segmentation models targeting NSFW anatomy. Each checkpoint runs independently and produces binary masks for the specified classes.

| File | Backbone | Task | Classes | Mask mAP@0.5 | Mask mAP@0.5:0.95 |
| --- | --- | --- | --- | --- | --- |
| `nsfw-seg-breast-s.pt` | YOLO11s | Breast anatomy | breast, areola, nipple | 0.895 | 0.636 |
| `nsfw-seg-breast-x.pt` | YOLO11x | Breast anatomy | breast, areola, nipple | 0.929 | 0.702 |
| `nsfw-seg-vagina-s.pt` | YOLO11s | Vagina | vagina | 0.995 | 0.871 |
| `nsfw-seg-vagina-x.pt` | YOLO11x | Vagina | vagina | 0.995 | 0.918 |
| `nsfw-seg-penis-s.pt` | YOLO11s | Penis | penis | 0.995 | 0.975 |
| `nsfw-seg-penis-x.pt` | YOLO11x | Penis | penis | 0.995 | 0.987 |

### Description

- Backbones: YOLO11s and YOLO11x segmentation heads (Ultralytics 8.3.204).
- Weights exported as `.pt` checkpoints compatible with `ultralytics>=8.3`.
- One model per label space; load the checkpoint that matches your target anatomy.

### Intended Use

- Automatic instance segmentation of NSFW anatomical structures in moderated, research, or medical-support workflows.
- Inputs: RGB images.
- Outputs: Binary masks aligned with the class taxonomy above.

### Data Summary

- Training data consisted of curated, privately held NSFW image sets with polygon masks (YOLO segmentation format).
- Train/validation splits were normalized and merged after preprocessing; metrics reflect held-out validation imagery.
- Datasets are not included in this release.

### Metrics

- Evaluated with `yolo segment val` at 832 px resolution, confidence threshold 0.1.
- Numbers in the table refer to the best checkpoint per task.

### Limitations

- Models are not a substitute for clinical assessment.
- Domain shift (lighting, camera quality, demographics) may impact performance.
- No safety filtering is applied; downstream systems must implement access controls.
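Since the checkpoints target `ultralytics>=8.3`, they load with the standard Ultralytics API. The commented lines below show the call shape (file and image names are placeholders), and the pure-NumPy helper, run here on a synthetic stand-in mask, illustrates basic downstream handling of the binary masks (pixel area and bounding box):

```python
import numpy as np

# Real inference (commented out -- requires the released weights and
# `pip install "ultralytics>=8.3"`; names are placeholders):
# from ultralytics import YOLO
# result = YOLO("nsfw-seg-breast-x.pt").predict("image.jpg", imgsz=832, conf=0.1)[0]
# masks = result.masks.data.cpu().numpy() > 0.5   # (N, H, W) binary masks

# Synthetic stand-in for one predicted binary mask, for illustration:
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:7] = True   # a 3x4 rectangle of foreground pixels

def mask_stats(m):
    """Return pixel area and (x1, y1, x2, y2) bounding box of a binary mask."""
    ys, xs = np.nonzero(m)
    return int(m.sum()), (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

area, box = mask_stats(mask)
print(area, box)  # 12 (3, 2, 6, 4)
```

The 832 px image size and 0.1 confidence threshold mirror the validation settings listed under Metrics; downstream systems may want a higher threshold for precision-sensitive use.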
### Support

For integration questions or feedback, open an issue on the hosting repository and mention the checkpoint name in the subject line.