adamo1139

117 models

Yi-34B-200K-AEZAKMI-v2

This model is licensed under Apache 2.0 and is tagged as a large language model (LLM).

llama
720
12

Yi-34B-AEZAKMI-v1

llama
720
2

Mistral-7B-AEZAKMI-v1

697
1

Yi-6B-200K-AEZAKMI-v2

llama
593
1

Danube3-4b-4chan-HESOYAM-2510

llama
592
0

yi-34b-200k-rawrr-dpo-1

llama
578
2

Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3

llama
572
1

Mistral-7B-AEZAKMI-v2

license:apache-2.0
572
0

Yi-34B-200K-AEZAKMI-RAW-1701

llama
565
1

aya-expanse-8b-ungated

license:cc-by-nc-4.0
392
1

Danube3-4b-4chan-HESOYAM-2510-GGUF

license:apache-2.0
226
0

magnum-v2-4b-gguf-lowctx

195
0

Poziomka-Baza

This is a merge of the checkpoints located here: adamo1139/poziomkapretrainhf. Sample Polish prompts (kept verbatim, since the model is Polish-only): ` Jeśli kiedyś będziesz przejeżdzał koło Krakowa, koniecznie odwiedź naszą Pizzerię. Oferujemy`, ` Jesień tuż przed nami. To czas na odświeżenie garderoby i zakup`, ` Szkoła Podstawowa nr.1 imienia Jana Pawła II w Kutnie zorganizowała`.

license:apache-2.0
148
0

stable-diffusion-3.5-large-turbo-ungated

119
5

stable-diffusion-3.5-medium-ungated

51
1

danube3-4b-hesoyam-2208-gguf

license:apache-2.0
45
0

danube3-4b-turtle-2208-gguf

license:apache-2.0
44
0

GPT-OSS-20B-HESOYAM-1108-WIP-CHATML-GGUF

43
0

poziomka-sft-full-2025-10-20-4

42
0

Apertus-8B-Instruct-2509-ungated

license:apache-2.0
34
3

stable-diffusion-3.5-large-ungated

33
7

danube3-4b-aezakmi-2408-gguf

license:apache-2.0
32
0

Bielik-4.5B-v3-Instruct-ungated

Bielik 4.5B v3 Instruct BF16 checkpoint without gating mechanism, making it easier and faster to work with.

llama
30
0

danube3-500m-turtle-2108-gguf

license:apache-2.0
27
0

Danube3-500M-4chan-archive-0709-GGUF

license:apache-2.0
26
0

poziomka-lora-instruct-alpha

Poziomka base, trained for 1200 steps on adamo1139/temppoziomkasft at 2k context with LoRA rank 32.

26
0

danube3-500m-hesoyam-2108-gguf

license:apache-2.0
25
0

Mistral-7B-4chan-GaLore-SFT-0911-GGUF

25
0

Experimental-DeepSeek-V2-Coder-Lite-JUMP-alpha1-GGUF

24
0

Experimental-DeepSeek-Coder-V2-Lite-Jump-alpha2-GGUF

19
0

DeepSeek-V2-Lite-Chat-ARM-GGUF

16
0

DeepSeek-V2-Lite-ARM-GGUF

15
0

Bielik-Guard-0.1B-v1.0-ungated

Bielik Guard, codenamed Sójka, without the gating mechanism.

license:apache-2.0
13
0

Hermes-3-Llama-3.1-8B-FP8-Dynamic

llama
12
0

DeepSeek-R1-0528-AWQ

license:mit
11
4

danube3-4b-aezakmi-toxic-2908-gguf

license:apache-2.0
11
0

Mistral-7b-Magpie-Qwen2-2211-GGUF

11
0

PS_AD_O365_Mistral_superCOT_7B_03_QLoRA_GGUF

8
0

Yi-34B-200K-HESOYAM-TURTLE-0208-4CHAN

llama
8
0

PS_AD_O365_SpicyBoros_7B_01_GGUF

license:llama2
7
0

Apertus-70B-Instruct-2509-ungated

license:apache-2.0
6
2

Yi 34B 200K AEZAKMI RAW TOXIC XLCTX 2303

This model was renamed from adamo1139/Yi-34B-200K-AEZAKMI-XLCTX-v3 to adamo1139/Yi-34B-200K-AEZAKMI-RAW-TOXIC-XLCTX-2303 on 2024-03-30. I am not happy with how often this model starts enumerating lists, and I plan to improve the toxic DPO dataset to fix it. Because of this, I don't think it deserves to be called AEZAKMI v3; it will just be the next testing iteration of AEZAKMI RAW TOXIC. I think I will upload one EXL2 quant before moving on to a different training run.

This is the Yi-34B 200K XLCTX base model fine-tuned on the RAWrrv2 (DPO), AEZAKMI-3-6 (SFT) and unalignment/toxic-dpo-0.1 (DPO) datasets. Training took around 20-30 hours total on an RTX 3090 Ti; all fine-tuning was done locally. It's like airoboros but with less gptslop, no refusals, and less of the typical language used by RLHF'd OpenAI models, with extra spiciness. Say goodbye to "It's important to remember"!

Prompt format is standard ChatML. Don't expect it to be good at math or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot. The cost of this fine-tune is about $5-$10 in electricity. The base model used for fine-tuning was the Yi-34B-200K model shared by 01.ai, the newer version with improved long-context needle-in-a-haystack retrieval. They didn't give it a new name, and giving it numbers would mess up the AEZAKMI naming scheme by adding a second number, so I will be calling it XLCTX.

I had to lower `max_position_embeddings` in config.json and `model_max_length` for training to start; otherwise I was OOMing straight away. This attempt had both `max_position_embeddings` and `model_max_length` set to 4096, which worked perfectly fine. I then reverted them to 200000 when uploading. I think it should keep the long-context capabilities of the base model. In my testing it seems less unhinged than adamo1139/Yi-34b-200K-AEZAKMI-RAW-TOXIC-2702 and maybe a touch less uncensored, but still very much uncensored, even with the default system prompt "A chat."

If you want to see the training scripts, let me know and I will upload them. LoRAs are uploaded here: adamo1139/Yi-34B-200K-AEZAKMI-XLCTX-v3-LoRA. EXL2 quants are coming soon; I think I will start by uploading a 4bpw quant in a few days. I recommend using the ChatML format, as it was used during the fine-tune. You can set a different system message; the model was trained on the SystemChat dataset, so it should respect system prompts fine. This model loves making numbered lists, to exhaustion. It's more of an assistant feel than a human feel, at least with the system prompt "A chat." Long context wasn't tested yet; it should work fine, though. Feel free to give me feedback about it.

Thanks to the unsloth and Hugging Face teams for the software packages used during fine-tuning. Thanks to Jon Durbin, abacusai, huggingface, sandex, NobodyExistsOnTheInternet, and Nous-Research for open-sourcing the datasets I included in the AEZAKMI dataset. AEZAKMI is basically a mix of open-source datasets I found on HF, so without them this would not be possible at all.

Open LLM Leaderboard Evaluation Results (detailed results can be found here):

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 64.39 |
| AI2 Reasoning Challenge (25-Shot) | 64.85 |
| HellaSwag (10-Shot)               | 84.76 |
| MMLU (5-Shot)                     | 74.48 |
| TruthfulQA (0-shot)               | 37.14 |
| Winogrande (5-shot)               | 81.06 |
| GSM8k (5-shot)                    | 44.05 |
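The card recommends ChatML with a system prompt such as "A chat.". Here is a minimal sketch of assembling such a prompt; the template below is generic ChatML, and the helper name and example turn are illustrative rather than taken from the model card:

```python
def chatml_prompt(system, turns):
    """Assemble a ChatML-formatted prompt string.

    `turns` is a list of (role, content) pairs, e.g. ("user", "Hi").
    The trailing assistant header asks the model to generate a reply.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = chatml_prompt("A chat.", [("user", "Tell me about Krakow.")])
print(prompt)
```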

llama
6
2

GPT-OSS-20B-HESOYAM-1108-WIP-CHATML

5
2

BasicEconomics-SpicyBoros-2.2-7B-QLORA-v0.2-GGUF

license:llama2
5
0

Mistral-7B-Magpie-Ultra-GaLore-0711-GGUF

5
0

Yi-6B-200K-AEZAKMI-v2-6bpw-exl2

llama
4
3

BasicEconomics-SpicyBoros-2.2-7B-QLORA-v0.1-GGUF

license:llama2
4
1

BasicEconomics-SpicyBoros-2.2-7B-QLORA-v0.3-GGUF

license:llama2
4
0

BasicEconomics-Mistral-7B-QLORA-v0.4-GGUF

license:apache-2.0
4
0

PS_AD_O365_CodeLlama_7B_05_QLoRA_GGUF

license:llama2
4
0

PS_AD_O365_Dolphin_7B_06_QLoRA_GGUF

license:apache-2.0
4
0

Qwen2-VL-2B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

> [!Important]
> This is the base pretrained model of Qwen2-VL-2B without instruction tuning.

- SoTA understanding of images of various resolutions and aspect ratios: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc.
- Understanding of videos over 20 minutes long: Qwen2-VL can understand videos over 20 minutes long for high-quality video-based question answering, dialog, content creation, etc.
- Agent that can operate your mobile devices, robots, etc.: with its complex reasoning and decision-making abilities, Qwen2-VL can be integrated with devices like mobile phones and robots for automatic operation based on the visual environment and text instructions.
- Multilingual support: to serve global users, besides English and Chinese, Qwen2-VL now supports understanding text in other languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
- Naive dynamic resolution: unlike before, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, offering a more human-like visual processing experience.
- Multimodal Rotary Position Embedding (M-ROPE): decomposes the positional embedding into parts to capture 1D textual, 2D visual, and 3D video positional information, enhancing multimodal processing capabilities.

We have three models, with 2, 7 and 72 billion parameters. This repo contains the pretrained 2B Qwen2-VL model. The code for Qwen2-VL is in the latest Hugging Face `transformers`; we advise you to install the latest version with `pip install -U transformers`, or you might encounter errors. If you find our work helpful, feel free to cite us.
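As a rough illustration of naive dynamic resolution, the sketch below estimates how many visual tokens an image of a given size would produce. It assumes the 14-pixel ViT patch size and 2x2 patch merging from the released Qwen2-VL configs (an assumption, since this page does not state them), so each 28x28-pixel block maps to one token:

```python
import math

def visual_token_count(height, width, patch=14, merge=2):
    """Approximate visual token count under dynamic resolution.

    Assumes a `patch`-pixel ViT patch and `merge`x`merge` patch
    merging, so each (patch*merge)-pixel square becomes one token.
    These defaults are taken from the public Qwen2-VL configs, not
    from this page.
    """
    block = patch * merge  # pixels per final visual token, per side
    return math.ceil(height / block) * math.ceil(width / block)

print(visual_token_count(448, 448))   # 16x16 blocks -> 256 tokens
print(visual_token_count(1344, 672))  # a larger image -> more tokens
```

The point of the sketch is that the token budget scales with image area instead of every image being resized to one fixed resolution.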

license:apache-2.0
4
0

Yi-34B-200K-AEZAKMI-RAW-2301

llama
3
3

Mistral-Small-24B-Instruct-2501-ungated

license:apache-2.0
3
1

EuroLLM 9B Instruct Ungated

llama
3
1

PS_AD_O365_Mistral_7B_02_GGUF

license:apache-2.0
3
0

Yi-1.5-9B-base-mirror

llama
3
0

DeepSeek-V2.5-1210-AWQ

3
0

Qwen-Image-Edit-fused-anime-lightning-8steps

Qwen Image Edit with the flymyAI Anime LoRA (trained for the non-Edit Qwen Image model) fused in, alongside the Lightning 8-step LoRA, also fused in.

3
0

poziomka-malutka

Poziomka-malutka is a language model trained exclusively on Polish. The model has seen 5 billion tokens and was trained from scratch using Megatron-LM. It uses the BailingV2MoE architecture, with 128 experts, 2 of which are active per token. It is a `base`-type model, so it does not support a chat template.

license:apache-2.0
3
0

poziomka_5_2309_4_iter35200-alpha

The model is still training as of the 26th of September; this is an intermediate checkpoint, with 75.4% of the run already completed.

3
0

Qwen2-VL-7B-Sydney

license:apache-2.0
2
5

Yi-34b-200K-AEZAKMI-RAW-TOXIC-2702

llama
2
4

Yi-6B-200K-AEZAKMI-v2-LoRA

llama
2
1

Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2

llama
2
1

yi-34b-200k-aezakmi-v2-rawrr-v1-run1-experimental-LoRA

llama
2
1

Yi-6B-200K-rawrr1-run2-LORA-DPO-experimental

llama
2
0

Yi-1.5-34B-32K-uninstruct1-1106

llama
2
0

Yi-1.5-34B-32K-rebased-1406

llama
2
0

Experimental-Yi-Coder-9B-JUMP-0509-alpha

llama
2
0

Yi-9K-200K-AEZKAMI-RAW-TOXIC-GGUF

2
0

DeepSeek-R1-Distill-Qwen-1.5B-5bpw-exl2

2
0

Apertus-70B-2509-ungated

license:apache-2.0
2
0

szypulka4_15_09_2025_test

2
0

yi-34b-200k-rawrr-dpo-2

llama
1
2

Yi-34B-200K-AEZAKMI-v2-exl2-4.65bpw

llama
1
1

Yi-34B-200K-AEZAKMI-v2-LoRA

llama
1
1

Yi-34B-AEZAKMI-v1-exl2-4.65bpw

llama
1
0

Yi-6B-200K-AEZAKMI-v2-rawrr1-DPO-LoRA

llama
1
0

Yi-34B-200K-AEZAKMI-RAW-2901

llama
1
0

Yi-34B-200K-AEZAKMI-RAW-2901-4-65bpw-EXL2

llama
1
0

Yi-34B-200K-XLCTX-AEZAKMI-RAW-2904

llama
1
0

Yi-34B-200K-HESOYAM-0905

llama
1
0

Yi-1.5-34B-base-mirror

llama
1
0

Yi-34B-200K-HESOYAM-2206

llama
1
0

Yi-34B-200K-HESOYAM-rawrr_stage2-2306

llama
1
0

yi-34b-200k-uninstruct1-3007

llama
1
0

Experimental-DeepSeek-V2-Coder-Lite-JUMP-alpha1-LORA

llama-factory
1
0

OpenHermes-2.5-Mistral-7B-FP8-Dynamic

license:apache-2.0
1
0

Hermes-3-Llama-3.1-8B-Static-FP8-KV

llama
1
0

aya-expanse-32b-ungated

license:cc-by-nc-4.0
1
0

DeepSeek-R1-Distill-Qwen-1.5B-6bpw-exl2

1
0

DeepSeek-R1-Zero-AWQ

This is a 4-bit AWQ quantization of the DeepSeek-R1-Zero 671B model, suitable for GPU nodes like 8xA100/8xH20/8xH100 with vLLM and SGLang. You can run this model on 8x H100 80GB using vLLM with `vllm serve adamo1139/DeepSeek-R1-Zero-AWQ --tensor-parallel-size 8`.
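`vllm serve` exposes an OpenAI-compatible HTTP API (by default on port 8000), so once the server above is running, a chat request can be sketched as below; the endpoint path and port are vLLM defaults rather than something stated on this page:

```python
import json
import urllib.request

# Minimal sketch of querying the OpenAI-compatible endpoint that
# `vllm serve` exposes (default http://localhost:8000/v1). The model
# field must match the served model id.
payload = {
    "model": "adamo1139/DeepSeek-R1-Zero-AWQ",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "max_tokens": 256,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# With a running server, uncomment to send the request:
# resp = urllib.request.urlopen(req)
# print(json.load(resp)["choices"][0]["message"]["content"])
```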

license:mit
1
0

Apertus-8B-2509-ungated

license:apache-2.0
1
0

szypulka3_14_09_2025_test

MoE model based on the Ling V2 MoE architecture, with 32 experts, 2 activated, no shared experts, and no dense layers, trained on about 100M tokens of the FineWeb-2 pol-Latn split, using the tokenizer from EuroLLM 1.7B. This is just a test to validate the pipeline before training a bigger 4B, 256-expert model on 100B tokens.
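The "32 experts, 2 activated" routing can be sketched in plain Python. The router logits below are made up for illustration; in a real model the router is a learned projection of the token's hidden state:

```python
import math

def top2_route(router_logits):
    """Pick the two highest-scoring experts and renormalize their
    softmax weights so the two gate values sum to 1 (a common MoE
    convention; this is a sketch, not the exact Ling V2 router)."""
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top2 = sorted(range(len(probs)), key=lambda i: -probs[i])[:2]
    norm = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / norm) for i in top2]

# 32 router logits, one per expert; experts 1 and 3 score highest.
routing = top2_route([0.1, 2.0, -1.0, 1.5] + [0.0] * 28)
print(routing)  # two (expert_index, gate_weight) pairs
```

Each token's output is then the gate-weighted sum of the two selected experts' outputs, which is what keeps active compute small relative to total parameters.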

1
0

szypulka5_15_09_2025_test

1
0

szypulka6_15_09_2025_test

1
0

Stable Diffusion 3 Medium Ungated

Same as the official repo; all hashes match. Just ungated. You can also download via torrent.

Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource efficiency. For more technical details, please refer to the research paper.

Please note: this model is released under the Stability Non-Commercial Research Community License. For a Creator License or an Enterprise License, visit Stability.ai or contact us for commercial licensing details.

- Developed by: Stability AI
- Model type: MMDiT text-to-image generative model
- Model description: This is a model that can be used to generate images based on text prompts. It is a Multimodal Diffusion Transformer (https://arxiv.org/abs/2403.03206) that uses three fixed, pretrained text encoders (OpenCLIP-ViT/G, CLIP-ViT/L and T5-xxl).
- Non-commercial use: Stable Diffusion 3 Medium is released under the Stability AI Non-Commercial Research Community License. The model is free to use for non-commercial purposes such as academic research.
- Commercial use: This model is not available for commercial use without a separate commercial license from Stability. We encourage professional artists, designers, and creators to use our Creator License. Please visit https://stability.ai/license to learn more.

For local or self-hosted use, we recommend ComfyUI for inference. Stable Diffusion 3 Medium is available on our Stability API Platform. Stable Diffusion 3 models and workflows are available on Stable Assistant and on Discord via Stable Artisan.

- ComfyUI: https://github.com/comfyanonymous/ComfyUI
- StableSwarmUI: https://github.com/Stability-AI/StableSwarmUI
- Tech report: https://stability.ai/news/stable-diffusion-3-research-paper
- Demo: Hugging Face Space is coming soon...

We used synthetic data and filtered publicly available data to train our models. The model was pre-trained on 1 billion images. The fine-tuning data includes 30M high-quality aesthetic images focused on specific visual content and style, as well as 3M preference-data images.

We have prepared three packaging variants of the SD3 Medium model, each equipped with the same set of MMDiT and VAE weights, for user convenience:

- `sd3_medium.safetensors` includes the MMDiT and VAE weights but does not include any text encoders.
- `sd3_medium_incl_clips_t5xxlfp8.safetensors` contains all necessary weights, including an fp8 version of the T5-XXL text encoder, offering a balance between quality and resource requirements.
- `sd3_medium_incl_clips.safetensors` includes all necessary weights except for the T5-XXL text encoder. It requires minimal resources, but the model's performance will differ without the T5-XXL text encoder.

The `text_encoders` folder contains three text encoders and their original model card links for user convenience. All components within the `text_encoders` folder (and their equivalents embedded in other packagings) are subject to their respective original licenses. The example workflows folder contains example Comfy workflows.

Intended uses include the following: generation of artworks and use in design and other artistic processes; applications in educational or creative tools; research on generative models, including understanding the limitations of generative models. All uses of the model should be in accordance with our Acceptable Use Policy. The model was not trained to produce factual or true representations of people or events. As such, using the model to generate such content is out of scope for this model.

As part of our safety-by-design and responsible AI deployment approach, we implement safety measures throughout the development of our models, from the time we begin pre-training a model through the ongoing development, fine-tuning, and deployment of each model. We have implemented a number of safety mitigations intended to reduce the risk of severe harms; however, we recommend that developers conduct their own testing and apply additional mitigations based on their specific use cases. For more about our approach to safety, please visit our Safety page.

Our evaluation methods include structured evaluations and internal and external red-teaming for specific, severe harms such as child sexual abuse and exploitation, extreme violence and gore, sexually explicit content, and non-consensual nudity. Testing was conducted primarily in English and may not cover all possible harms. As with any model, the model may at times produce inaccurate, biased, or objectionable responses to user prompts.

- Harmful content: We used filtered datasets when training our models and implemented safeguards that attempt to strike the right balance between usefulness and preventing harm. However, this does not guarantee that all possible harmful content has been removed. The model may at times generate toxic or biased content. All developers and deployers should exercise caution and implement content safety guardrails based on their specific product policies and application use cases.
- Misuse: Technical limitations and developer and end-user education can help mitigate malicious applications of models. All users are required to adhere to our Acceptable Use Policy, including when applying fine-tuning and prompt engineering mechanisms. Please reference the Stability AI Acceptable Use Policy for information on violative uses of our products.
- Privacy violations: Developers and deployers are encouraged to adhere to privacy regulations with techniques that respect data privacy.

Please report any issues with the model or contact us. Safety issues: [email protected]. Security issues: [email protected]. Privacy issues: [email protected]. License and general: https://stability.ai/license. Enterprise license: https://stability.ai/enterprise

0
31

Meta_Spirit-LM-ungated

0
18

Llama-3-8B-AEZAKMI-run1

llama
0
3

BasicEconomics-SpicyBoros-2.2-7B-QLORA-v0.1

0
2

yi-34b-200k-aezakmi-raw-toxic-2702-4.65bpw-exl2

llama
0
2

Yi-34B-Spicyboros-2-2-run3-QLoRA

llama
0
1

Yi-34B-AEZAKMI-v1-LoRA

llama
0
1

Yi-34B-200K-AEZAKMI-RAW-2301-LoRA

license:apache-2.0
0
1

LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702

llama
0
1

yi-34b-200k-aezakmi-raw-1902-exl2

llama
0
1

Llama-3-8B-AEZAKMI-run1-LoRA

0
1

Yi-34B-200K-XLCTX

llama
0
1

Yi-34B-200K-XLCTX-AEZAKMI-RAW-TOXIC-0205

llama
0
1

Yi-34B-200K-XLCTX-RAW-ORPO-0805-GaLore

llama
0
1

Yi-9B-200K-AEZAKMI-RAW-TOXIC-1

llama
0
1

Yi-34B-200K-HESOYAM-TURTLE-2606

llama
0
1

PowerShell-TheStack-DeepSeek-Coder-6.7B-0.1e

0
1

Yi-1.5-34B-32K-Magpie-Ultra-0611

llama
0
1

DeepSeek-R1-Distill-Qwen-1.5B-4bpw-exl2

0
1

Mistral-Small-24B-Base-2501-ungated

license:apache-2.0
0
1