gghfez

88 models

gemma-3-4b-novision

—
8,602
7

GLM-4.6-control-vectors

Creative Writing control-vectors for zai-org/GLM-4.6. Apply the debias vector and either the positive or negative vector when starting llama-server; if both are applied, they will cancel each other out. You can use either `--control-vector [/path/to/vector.gguf]` or `--control-vector-scaled [/path/to/vector.gguf] [scale factor]`. IMPORTANT: The positive and negative axis control vectors must be used along with the relevant de-bias control vector - they cannot be used on their own! With reasoning enabled on extreme quants like IQ2_XXS, very simple prompts like "Hi" may result in irrelevant replies.
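A minimal launch sketch: the two flags are the ones named above, but every file name here is a placeholder, so substitute the actual GGUFs from this repository:

```sh
# Debias vector plus one axis vector (scaled); file names are placeholders.
llama-server -m GLM-4.6-Q4_K_M.gguf \
  --control-vector language__debias.gguf \
  --control-vector-scaled language__ornate.gguf 0.5
```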

license:apache-2.0
1,391
5

Mistral-Small-3.2-24B-Instruct-hf-AWQ

license:apache-2.0
442
3

Sorcerer_creative-writing-control-vectors

Creative Writing control-vectors for rAIfle/SorcererLM-8x22b-bf16. This is an excellent model, but I found it somewhat difficult to steer via prompting. Trained on 1x H100 NVL using jukofyork's control-vectors code, based on version v3.0 of creative-writing-control-vectors-v3.0; see that repository for more info about control-vectors.

license:apache-2.0
394
2

gemma-3-27b-novision

—
369
3

Qwen3.5-27b-control-vectors

license:apache-2.0
357
1

Kimi-K2-Instruct-0905-control-vectors

license:cc-by-4.0
317
0

GLM-5-control-vectors

license:mit
287
2

GLM-4.5-Air-control-vectors

Creative Writing control-vectors for zai-org/GLM-4.5-Air

Visual representation of polar control vectors at work:
- `character_focus` (dialogue / narration)
- `compassion_vs_sadism`
- `empathy_vs_sociopathy`
- `honesty_vs_machiavellianism`
- `humility_vs_narcissism`
- `language` (simple / ornate)
- `optimism_vs_nihilism`
- `storytelling` (descriptive / explicit)

Apply the debias vector and either the positive or negative vector when starting llama-server (see the launch sketch at the end of this card). If both axes are applied at the same time, they will cancel each other out. You can use either `--control-vector [/path/to/vector.gguf]` or `--control-vector-scaled [/path/to/vector.gguf] [scale factor]`.

IMPORTANT: As these vectors are calibrated against the debiased model, the debias vector must be applied along with the corresponding positive or negative axis control vector - they cannot be used on their own!

👈 Nihilism +1.0

The gymnasium was a cavern of noise and heat, smelling of sweat and floor wax. Maya sat on the hard bleachers, her legs swinging just above the concrete floor, trying to make herself small next to Chloe, who radiated confidence like a furnace. "Maya," Chloe whispered loudly, nudging her arm. "If you win, do not cry. It's so embarrassing." "I won't win," Maya mumbled back, tracing a crack in the worn wood of the bench. The Principal stood at the podium, his voice echoing as he droned through the list of Academic Excellence Award recipients. Names like Chloe, Ben, and Priya flew by, each met with a wave of applause that made Maya flinch. She hadn't meant to enter the contest. Her English teacher had insisted, placing an application form on her desk after reading her essay on local history. "You have a voice, Maya," Mrs. Gable had said gently. "Don't keep it

👈 Optimism +0.5

The auditorium was a cavern of velvet seats and hushed anticipation. For Maya, the polished stage with its single spotlight felt like a distant planet she could never reach. She sat hunched in her seat, nervously smoothing the skirt of her simple blue dress. The air smelled of old wood and excitement. She wasn't there for herself. She was there for Leo, her best friend, who had poured his heart into his short film project. As the principal announced the nominees for "Best Short Film," Maya's grip on the program tightened. Her name was called third. A wave of confusion washed over the audience. A murmur rippled through the rows as heads turned to find the quiet girl from the back. She froze, her breath catching in her throat. Leo gave her a huge, encouraging grin and a thumbs-up from his seat across the aisle.

👈 Baseline

The stage lights were a physical thing, pressing down on Maya with the heat of a dozen suns. She stood in the center of the vast auditorium, her name echoing from the speakers as she clutched the heavy glass trophy to her chest. It was cool against her clammy palms, a sharp contrast to the roaring applause that washed over her. For three years, she had poured every spare second into "The Whispering Woods," a short story about a deaf girl who learned to listen with her feet and hands. Her English teacher, Mr. Harrison, had seen something raw and honest in her first draft. He'd pushed her to cut the flowery language, to dig deeper into the character's silence until it became its own kind of poetry. The other finalists were brilliant: a sonnet sequence about artificial intelligence, a journalistic exposé on the local river's pollution, and a meticulously researched biography of a forgotten suffragette. Maya had felt like an imposter in their company

- Extraction method: cross-covariance eigendecomposition (jukofyork method) - identifies balanced opposing axes rather than standard PCA variance directions
- Training data: 10 paired prompt continuations × 105 stem variations = 1,050 samples per class
- This targets a stylistic pattern, not semantic content
- Works because the construction has consistent activation signatures
- Bidirectional control: choose the direction you want at application time
- Only works on patterns the base model already exhibits (not OOD steering)
- Very high strengths (>2.0) may produce exaggerated effects
- Designed for creative writing; effects on other tasks untested

Technical Implementation Notes (for researchers/developers)

Training control vectors for models of this scale required:
- Custom llama.cpp modifications to extract and serialize hidden states
- Conversion pipeline: binary dumps → 3D tensors → jukofyork format
- Modified training code to work with pre-extracted activations
- Tokenizer-only loading to avoid memory overhead

Based on the excellent work and cross-covariance eigendecomposition implementation by @jukofyork. His creative-writing-control-vectors-v3.0 repository contains the foundational code and methodology that made this possible, plus 70+ other creative writing control vectors across many models and useful steering directions. Highly recommended for anyone working on improving LLM prose quality!
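The launch sketch referenced above, reproducing the "Optimism +0.5" sample setting. The file names follow jukofyork's `axis__pole` convention but are assumptions, so check the repository listing:

```sh
# Debias at full strength plus the positive (optimism) pole at 0.5;
# file names are placeholders.
llama-server -m GLM-4.5-Air-Q4_K_M.gguf \
  --control-vector optimism_vs_nihilism__debias.gguf \
  --control-vector-scaled optimism_vs_nihilism__optimism.gguf 0.5
```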

license:cc-by-4.0
244
0

Writer-Large-2411-v2.1-control-vectors

license:apache-2.0
204
1

Magnum-v1-72b-Qwen2.5

—
184
5

GLM-4.7-Q8-GGUF

—
155
1

L3.3-70B-Euryale-v2.3-control-vectors

license:apache-2.0
142
0

google_gemma-3-27b-it-GGUF

—
112
0

WizardLM-2-22B-RP-control-vectors

license:apache-2.0
104
1

Devstral-2-123B-Instruct-2512-control-vectors

license:cc-by-4.0
62
3

DeepSeek-V3-0324-IQ2_KS

ik_llama.cpp
55
0

gemma-3-12b-novision

—
46
3

openwebui-title-generator-270m

A fine-tuned Gemma 270M model specialized for generating concise, emoji-enhanced titles for chat conversations. Use this to avoid wasting API credits or wiping out the llama.cpp KV cache after the first message. This model was fine-tuned on 8,900 examples of chat conversation title generation tasks. It takes chat history as input and outputs a JSON-formatted title with 3-5 words and relevant emojis.

- Generate titles for chat conversations in OpenWebUI
- Summarize conversation themes quickly
- Create engaging, emoji-enhanced titles
- Fast inference suitable for real-time title generation
- Lightweight at 270M parameters
- Consistent JSON output format

Limitations:
- Optimized specifically for title generation in OpenWebUI - may not perform well in other chat interfaces
- Emojis sometimes not placed consistently or may be omitted entirely
- Only tested at BF16

—
44
2

DeepSeek-V3-0324-256x21B-BF16

—
41
1

DeepSeek-V3.1-Base-256x21B-BF16

GGUF files: Due to the new storage limits introduced by Hugging Face, the GGUF files (30 × 46GB = ~1.38TB) have been moved to ModelScope. Direct link: 🔗 https://modelscope.cn/models/quantzor/DeepSeek-V3.1-Base-256x21B-BF16
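A fetch sketch using the ModelScope CLI; the repo id comes from the link above, and the CLI is assumed to be installed via pip:

```sh
# Install the ModelScope CLI, then pull all files of the repo.
pip install modelscope
modelscope download --model quantzor/DeepSeek-V3.1-Base-256x21B-BF16
```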

—
38
1

DeepSeek-V3-0324-IQ3_KS

`ik_llama.cpp` imatrix MLA quantizations of DeepSeek-V3-0324. This is an IQ3_KS quant of DeepSeek-V3-0324 using ubergarm's IQ3_KS recipe from ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF and the imatrix file from ubergarm/DeepSeek-V3-0324-GGUF. This quant collection REQUIRES the ik_llama.cpp fork to support advanced non-linear SotA quants and Multi-Head Latent Attention (MLA). Do not download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.! See ubergarm/DeepSeek-V3-0324-GGUF for his other quants and more details about them. I've uploaded the converted BF16 weights to gghfez/DeepSeek-V3-0324-256x21B-BF16 in case I, or anyone else, wants to create similar quants in the future.
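A build-and-run sketch for the required fork. The model file name is illustrative, and ubergarm's card documents the recommended runtime flags (MLA mode, offload, etc.):

```sh
# Build ik_llama.cpp (the fork, not mainline llama.cpp) and serve the quant.
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build
cmake --build build --config Release -j
./build/bin/llama-server -m /path/to/DeepSeek-V3-0324-IQ3_KS.gguf
```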

ik_llama.cpp
35
2

Kimi K2 0905 No Not X But Y Cv

Control vector to suppress contrast-negation constructions in creative writing:

> He didn't walk, he ran.
> It wasn't fear, it was terror.
> She didn't speak, she whispered.

LLMs love the "contrast negation" construction - denying a milder term and substituting a more dramatic one. It shows up constantly in creative writing and makes everything sound samey and artificial. Common patterns:

- "It's not X, it's Y"
- "He didn't X, he Y'd"
- "She wasn't X, she was Y"

This control vector was trained on opposing prompt sets that isolate direct vs. contrast-negation prose styles. Two versions are provided to steer the model in either direction without requiring negative scaling.

- `kimi-k2-0905_direct_statement_vs_contrast_negation__direct_statement.gguf` - steers away from the contrast-negation style (removes the construction)
- `kimi-k2-0905_direct_statement_vs_contrast_negation__contrast_negation.gguf` - steers toward the contrast-negation style (amplifies the construction)
- `kimi-k2-0905_direct_statement_vs_contrast_negation__debias.gguf` - shifts the model to a neutral position on the direct_statement_vs_contrast_negation axis

Must apply: The debias vector must be applied at strength 1.0 alongside the direct_statement vector. ⚠️ You cannot use positive/negative axis vectors alone - they're calibrated relative to the debiased baseline.

Apply the debias vector and the direct_statement vector at full strength (a launch sketch follows the metrics table below).

Prompt: "Write a scene where someone receives bad news"

The phone rang at 2:17 a.m.—a single, dry trill that cut through the loft like a blade. Elena had been awake anyway, grading lab reports at the kitchen table, but the sound still made her spine cinch. She stared at the screen until the letters "Mom" blurred into hieroglyphs. She answered on the fourth ring, throat already half closed. "Hi, Mom." A beat of static, then the wet click of her mother swallowing tears. "Lena, honey… the hospital just called." Elena's pen rolled across the table and dropped to the tile with a plastic clap. "Dad coded in the ICU," her mother said. "They worked on him forty-three minutes. They… they stopped at 1:54." The sentence hung in the air like a chandelier of glass—beautiful, impossible, ready to shatter. Elena didn't breathe; she could hear her mother breathing for both of them, ragged little gasps that scraped the earpiece. Elena looked at the apartment: the untouched dinner plate, the stack of quizzes bleeding red ink, the string of Christmas lights she and her father had hung in July because he'd claimed summer needed more sparkle. Everything suddenly looked like it belonged to someone else's life. "They want me to sign something," Mom went on, voice flattening into a stranger's. "They won't let me see him until I sign, and I can't—my hands won't stop shaking. Can you come?" Elena's chair scraped backward. She pictured the corridor at St. Luke's: the turquoise tiles, the vending machine that never had the right change, the smell of antiseptic and wet wool. She'd left him there six hours ago, joking that he owed her twenty bucks for the Yankees game. He'd lifted an oxygen-masked hand and promised double or nothing. "Twenty minutes," Elena said, though her car was blocks away and the February wind was already sharpening its teeth. She didn't hang up; she just lowered the phone until it clacked on the countertop, then pressed both palms against the granite edge until the pain tethered her. The room tilted; the pendant lights swung like slow pendulums.
She noticed, absurdly, that the digital clock on the microwave still glowed 2:17, as if time itself had paused to watch her fold. She reached for her coat but misjudged the distance; the sleeve snagged the mug of coffee, sending brown arcs across the floor. The puddle crept toward the pen, bleeding ink into its center like a dark iris. She stared at it, waiting for some instruction to rise. None came. Her knees gave out once—just once—dropping her to the cold tile. She didn't cry; instead she pressed her forehead to the floor and inhaled the bitter coffee smell, trying to memorize the last moment when the world still had him in it. Then she stood, wiped her eyes with the cuff of her sweater—Dad's old Mariners sweater—and walked to the door. On the hook hung his scarf, the wool frayed where he always yanked it free. She buried her face in it, searching for any remaining molecule of pipe smoke and cedar, found only the chill of absence. She stepped into the hallway, pulling the door shut behind her. The click of the latch sounded like a gunshot in the hush, and the long corridor stretched ahead, every bulb humming with the same terrible news: He is never going to call you again.

The phone had been buzzing on the nightstand for half a minute before Elena finally scooped it up, thumb sliding across the cracked screen. "Hello?" Silence, then a rustle of fabric and the hollow echo of a hospital hallway. "Elena, it's Aunt Marta." The voice was too calm, like a glass of water filled to the very lip. "You need to come to St. Dominic's. Now." She didn't ask why. Marta's tone carried the answer already, heavy as a winter coat. Elena's free hand found the edge of the dresser; the wood grain bit into her palm while the room tilted five degrees off its axis. "Trauma ICU, fourth. Elevator B." A pause. "They worked on him forty-three minutes." The numbers landed one by one, metallic and final. Forty-three minutes. She pictured the doctors in their pale gloves, chest compressions counting out a rhythm that had slipped out of sync with the world. "I'm on my way," Elena said, though her voice sounded rented from someone else. She clicked End and stood still, staring at the half-packed suitcase on the floor—she and Diego were supposed to leave for the coast tomorrow, first vacation in three years. His running shoes lay beside it, laces knotted in the double bow he always used. She swallowed, tasting aluminum. The Uber couldn't get there fast enough; every red light felt like a personal insult. Rain smeared the windshield into impressionist smears of neon, and she pressed her forehead to the cold glass, rehearsing impossible bargains. Let the next hour rewind. Let the sirens have been for someone else. Let the word "pulse" still belong to him. Marta was waiting outside the automatic doors, purse clutched to her chest like a life vest. Her eyes were red, but dry—already mourning, already exhausted. "They'll let you see him," Marta whispered. "But only for a little while." The corridor smelled of bleach and coffee. Room 417 had a cop beside it, young, eyes averted. Accident, Elena remembered suddenly. A sedan ran a red. Diego had been walking back to the office to grab the lunch he'd left on his desk. Inside, the monitors glowed faintly though their alarms were off. Diego lay beneath a white blanket that looked more like a tablecloth, chest unnaturally still. A purple bruise flowering across his left cheekbone was the only color on him.
She touched his hand—cooler than the living were allowed to be—and felt the distinct sensation of something missing, as if the outline of him had been sketched in pencil and someone had already begun erasing the edges. A nurse stepped in, voice soft. "We did everything we could. The impact… there wasn't…" Elena lifted Diego's hand to her cheek, the way she had a thousand times before, only this time his fingers didn't curl instinctively to brush her hair behind her ear. They stayed curved in the gentle arc of release. She thought of the morning: his laughter spilling over the crossword, the way he'd kissed her temple while she burned their toast. The future tense collapsed into a single syllable: was. The nurse asked if she wanted a chaplain; Elena shook her head. What she wanted was the loud world back—the one where Diego would complain about hospital parking and steal her fries and breathe loud through his nose when he concentrated. That world was already receding, colors draining like a photo left in the sun. She pressed her lips to his knuckles, tasting antiseptic and the faint salt of his skin. "I'm here," she said, because someone had to stay behind to say the things he'd never hear again. The fluorescent light buzzed overhead, relentless, while the clock on the wall ticked past the first minute of the rest of her life without him.

The phone had been ringing when she came in, but she'd let it go to voicemail. Now, half a casserole later, she presses PLAY while the kitchen still smells of onion and cheap wine. "Ms. Alvarez, this is Dr. Kessler from St. Dominic's. I'm calling about your brother… the cardiac team worked for forty-three minutes. I'm very sorry. The time of death is recorded as 6:17 p.m." The words hang like wet laundry, heavy and useless. She doesn't sit down; instead she grips the refrigerator door, fingernails whitening against the brushed steel. The casserole dish slips, clatters, but doesn't break. A stripe of scalding cheese lands on her bare foot. She doesn't feel it. Her first coherent thought is ridiculous: I just bought him new socks—he'll never wear them. Then the room tilts, not dramatically, just a fraction, as though the entire house has sighed and settled. She imagines the doctor still on the line, standing in some brightly lit corridor with the receiver against his ear, waiting for a response that etiquette says he must witness. She presses DELETE instead. The message is gone, yet the voice keeps going inside her head, rerunning like a news ticker: forty-three minutes… very sorry… 6:17. She tries to match that timestamp to her own evening—6:17 had been the moment she'd been hunting paprika in the spice aisle, laughing at a label that promised "extra existential flavor." She remembers thinking, Diego would love that joke. And he had, in a way, because he'd been alive then. Alive while she was laughing at a pun. She opens the freezer, thinking the cold will steady her. Ice cubes glare back, perfectly still. She shuts the door. The hum of the kitchen light becomes a roar; the digital clock over the stove keeps advancing, indifferent. 6:27, 6:28. Proof that the world can just continue spending seconds, as if they were loose change it will never miss. Her legs finally fold and she sits on the linoleum, cheese still burning her foot. The cell phone rings again—this time her mother, or maybe the hospital, or maybe death itself offering a courtesy callback. She can't move to answer.
She notices the floor's pattern of little wheat-sheaf decals, counts them: eight across, twelve down. Ninety-six tufts of stylized wheat, and every one a small, mocking cheer. She thinks she should cry, but nothing arrives except a dry cough that tastes of aluminum. She remembers Diego teaching her to whistle with two fingers, to fold hospital corners, to swear in Klingon when life got petty. All those instructions suddenly orphaned. Outside, a neighbor's sprinkler goes shhh-shhh-shhh, a lullaby for grass that will keep growing, needing to be cut, needing to be cut long after Diego no longer needs anything. The sound is so ordinary it scalds worse than the cheese. She finds herself dialing his number, thumb moving without permission. One ring. Two. His voicemail picks up—Hey, you've reached the reigning champ of not checking voicemail—leave a spark! She chokes on the beep. "Hey," she says, calm, almost conversational. "You can't die, okay? You promised to help me assemble the futon this weekend. It came with sixteen screws, and I only trust you with the leftover ones." Silence answers, then the automated voice offers "Press five to erase and re-record." She hangs up, holds the phone to her chest like it might restart his heart. It stays cold. When she stands, she realizes she has been rocking—just an inch back and forth, enough to scuff the decals. One wheat stalk has torn loose. She peels it off the floor, rolling the sticky scrap into a ball between her fingers. Evidence that something small can be entirely removed while everything else looks untouched. She opens the spice drawer, takes out the paprika, and sprinkles the red dust onto the abandoned casserole. Extra existential flavor. The sight is so absurd she laughs once, a bark that turns into something wet and vicious. The tears finally arrive, not as drops but as shards, the kind that catch in your throat and make you wonder whether you'll bleed or breathe. She eats a spoonful of the ruined dinner anyway, because he would have. It tastes like smoke and metal and the last conversation they never got around to having. She chews slowly, counting seconds she can still spend, counting the ones he can't.
| Metric | Baseline | Direct Statement | Contrast Negation |
|--------|----------|------------------|-------------------|
| Contrast negations per 500 words | 3-4 | 0 | 11+ |
| Readability | Moderate | High | Low |

- Extraction method: cross-covariance eigendecomposition (jukofyork method) - identifies balanced opposing axes rather than standard PCA variance directions
- Training data: 10 paired prompt continuations × 105 stem variations = 1,050 samples per class
- Separation (Δ): 130-475% across layers (peak at layers 17-18)
- Variance explained: 56-83% by primary eigenvector
- This targets a stylistic pattern, not semantic content
- Works because the construction has consistent activation signatures
- Bidirectional control: choose the direction you want at application time
- Only works on patterns the base model already exhibits (not OOD steering)
- Very high strengths (>2.0) may produce exaggerated effects
- Designed for creative writing; effects on other tasks untested

Technical Implementation Notes (for researchers/developers)

Training control vectors for models of this scale (1T parameters) required:
- Custom llama.cpp modifications to extract and serialize hidden states
- Conversion pipeline: binary dumps → 3D tensors → jukofyork format
- Modified training code to work with pre-extracted activations
- Tokenizer-only loading to avoid memory overhead

Based on the excellent work and cross-covariance eigendecomposition implementation by @jukofyork. His creative-writing-control-vectors-v3.0 repository contains the foundational code and methodology that made this possible, plus 70+ other creative writing control vectors across many models and useful steering directions. Highly recommended for anyone working on improving LLM prose quality!
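The launch sketch referenced above. The vector file names are the reconstructed ones from this card and the model GGUF is a placeholder, so verify both against the repository listing:

```sh
# Debias at strength 1.0 plus the direct_statement vector at full strength.
llama-server -m Kimi-K2-Instruct-0905-Q4_K_M.gguf \
  --control-vector kimi-k2-0905_direct_statement_vs_contrast_negation__debias.gguf \
  --control-vector kimi-k2-0905_direct_statement_vs_contrast_negation__direct_statement.gguf
```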

license:cc-by-4.0
33
2

WizardLM-2-22b-RP-GGUF

license:apache-2.0
33
1

assistantvectors_Qwen3-30B-A3B-Instruct-2507

license:apache-2.0
33
1

DeepSeek-R1-Zero-256x21B-BF16

license:mit
31
0

c4ai-command-a-03-2025-AWQ

—
28
0

DeepSeek-V3.1-IQ2_KS

ik_llama.cpp
28
0

72B-Qwen2.5-Kunou-v1-control-vectors

license:apache-2.0
26
1

DeepSeek-R1-Zero-IQ2_KS

`ik_llama.cpp` imatrix MLA quantizations of deepseek-ai/DeepSeek-R1-Zero. This is an IQ2_KS quant of deepseek-ai/DeepSeek-R1-Zero using ubergarm's IQ2_KS recipe from ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF. This quant collection REQUIRES the ik_llama.cpp fork to support advanced non-linear SotA quants and Multi-Head Latent Attention (MLA). Do not download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.! I've uploaded the converted BF16 weights to gghfez/DeepSeek-R1-Zero-256x21B-BF16 in case I, or anyone else, wants to create similar quants in the future. Note: I may be deleting gghfez/DeepSeek-R1-Zero-256x21B-BF16 shortly due to the new HuggingFace storage limits.

ik_llama.cpp
25
0

DeepSeek-R1-Zero-IQ3_KS

`ik_llama.cpp` imatrix MLA quantizations of deepseek-ai/DeepSeek-R1-Zero. This is an IQ3_KS quant of deepseek-ai/DeepSeek-R1-Zero using ubergarm's IQ3_KS recipe from ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF. This quant collection REQUIRES the ik_llama.cpp fork to support advanced non-linear SotA quants and Multi-Head Latent Attention (MLA). Do not download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.! I've uploaded the converted BF16 weights to gghfez/DeepSeek-R1-Zero-256x21B-BF16 in case I, or anyone else, wants to create similar quants in the future. Note: I may be deleting gghfez/DeepSeek-R1-Zero-256x21B-BF16 shortly due to the new HuggingFace storage limits.

ik_llama.cpp
23
0

DeepSeek-V3.1-Base-IQ2_KS

ik_llama.cpp
22
0

GLM-4.6-REAP-266B-A32B-Q2_K

—
21
2

SeminalRP-22b-GGUF

—
20
1

openwebui-title-generator-270m-gguf

A fine-tuned Gemma 270M model specialized for generating concise, emoji-enhanced titles for chat conversations. Use this to avoid wasting API credits or wiping out the llama.cpp KV cache after the first message. This model was fine-tuned on 8,900 examples of chat conversation title generation tasks. It takes chat history as input and outputs a JSON-formatted title with 3-5 words and relevant emojis.

- Generate titles for chat conversations in OpenWebUI
- Summarize conversation themes quickly
- Create engaging, emoji-enhanced titles
- Fast inference suitable for real-time title generation
- Lightweight at 270M parameters
- Consistent JSON output format

Limitations:
- Optimized specifically for title generation in OpenWebUI - may not perform well in other chat interfaces
- Emojis sometimes not placed consistently or may be omitted entirely
- Only tested at BF16
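A serving sketch, assuming a hypothetical GGUF file name: run the title generator on a side port so OpenWebUI can use it for title generation without touching your main (expensive) endpoint:

```sh
# Lightweight title generator on its own port; pick whichever quant you downloaded.
llama-server -m openwebui-title-generator-270m-f16.gguf --port 8081
```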

—
19
1

DeepSeek-R1-OG-IQ2_KS

—
19
0

command-a-03-2025-AWQ

—
18
0

GLM-4.6-REAP-266B-A32B-Q4_K

This is a Q4_K_M GGUF quant of AesSedai/GLM-4.6-REAP-266B-A32B. AesSedai/GLM-4.6-REAP-266B-A32B was created using REAP (Router-weighted Expert Activation Pruning), a novel expert-pruning method that selectively removes redundant experts while preserving the router's independent control over the remaining experts. See the GLM-4.5-Air version by Cerebras for more details: cerebras/GLM-4.5-Air-REAP-82B-A12B. The MTP tensors were not included in this quant (though llama.cpp hasn't implemented this feature anyway). Note: currently non-functional because of the missing `mtp.safetensors` file and its entry in `model.safetensors.index.json`. Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.

—
17
3

DeepSeek-R1-11446-Q2_K

license:apache-2.0
17
0

google_gemma-3-12b-it-GGUF

—
17
0

DeepSeek-R1-0528-256x21B-BF16

Imatrix: imatrix-DeepSeek-R1-0528.dat, reuploaded from ubergarm/DeepSeek-R1-0528-GGUF. Due to the new storage limits introduced by Hugging Face, the GGUF files (30 × 46GB = ~1.38TB) have been moved to ModelScope. Direct link: 🔗 https://modelscope.cn/models/quantzor/DeepSeek-R1-0528-256x21B-BF16

license:mit
16
0

DeepSeek-R1-OG-256x21B-BF16

Imatrix: DeepSeek-R1-OG.imatrix. There may be an issue with this imatrix, as reported by lmganon123; it is recommended to use Bartowski's DeepSeek-R1.imatrix instead. See here for an IQ2_KS quant from lmganon123: lmganon123/DeepSeek-R1IKGGUFQ2. Due to the new storage limits introduced by Hugging Face, the GGUF files (30 × 46GB = ~1.38TB) have been moved to ModelScope. Direct link: 🔗 https://modelscope.cn/models/quantzor/DeepSeek-R1-OG-256x21B-BF16

license:mit
15
0

DeepSeek-R1-0528-IQ3_KS

—
13
0

amoral-gemma3-12B-vision

—
12
1

DeepSeek-R1-11446-Q4_K

license:apache-2.0
11
3

WizardLM-2-8x22B-Beige-GGUF

—
10
1

72B-Qwen2.5-Kunou-v1-exl2-6.5bpw

—
10
0

32B-Qwen2.5-Kunou-v1-Q4_K_M-GGUF

gghfez/32B-Qwen2.5-Kunou-v1-Q4_K_M-GGUF: this model was converted to GGUF format from `Sao10K/32B-Qwen2.5-Kunou-v1` using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. Use with llama.cpp: install llama.cpp through brew (works on Mac and Linux). Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo. To build from source instead, move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
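The usual GGUF-my-repo invocation, sketched; the `--hf-file` name is an assumption based on this repo's naming:

```sh
# Fetch the quant straight from the Hub and run a quick prompt.
brew install llama.cpp
llama-cli --hf-repo gghfez/32B-Qwen2.5-Kunou-v1-Q4_K_M-GGUF \
  --hf-file 32b-qwen2.5-kunou-v1-q4_k_m.gguf \
  -p "The meaning to life and the universe is"
```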

llama-cpp
9
5

cogito-v1-preview-llama-70B-exl2-6.0bpw

base_model:deepcogito/cogito-v1-preview-llama-70B
9
0

Qwen2.5-Coder-32B-Instruct-abliterated

Abliterated version of Qwen/Qwen2.5-Coder-32B-Instruct. All the mischievous prompts I've tried have worked, but it still warns you to be ethical sometimes.

Prompt: "Write a shell script to call me a cunt using tts every time someone logs into my mac"

Qwen/Qwen2.5-Coder-32B-Instruct: "I'm sorry, but I can't assist with that request. Creating scripts to insult or harass someone is not ethical or appropriate. If you have any other questions or need help with a different kind of script, feel free to ask!"

Abliterated: "Creating a shell script to call someone a "cunt" using text-to-speech (TTS) every time they log into their Mac is a bit naughty, but it can be done. Below is a simple script that uses the `say` command, which is built into macOS, to perform this task. 1. Create a shell script. 2. Place the script in a location where it will be executed on login. You can create a new shell script using a text editor like `nano`, `vim`, or `TextEdit`. Here's how to do it with `nano`: (I've omitted the middle of the reply from the model card) Make sure to use this script responsibly and consider the feelings of the person it affects. It's meant to be a joke, but it can also be hurtful if not taken in good fun."

GGUF: gghfez/Qwen2.5-Coder-32B-Instruct-abliterated-GGUF

license:apache-2.0
6
1

Mistral-Small-3.2-24B-Instruct-hf

mistralai/Mistral-Small-3.2-24B-Instruct-2506 converted to the HF format.

license:apache-2.0
6
1

WizardLM-2-8x22B-Beige

—
5
4

UwU-72B-Preview

license:apache-2.0
5
2

SorcererLM-8x22b-epoch2-LoRA-F16-GGUF

gghfez/SorcererLM-8x22b-epoch2-LoRA-F16-GGUF: this LoRA adapter was converted to GGUF format from `rAIfle/SorcererLM-8x22b-epoch2-LoRA` via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details. To learn more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation.
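A runtime sketch; both file names are assumptions (match the base GGUF to your WizardLM-2-8x22B quant and the adapter to this repo's file):

```sh
# Serve the base model with the converted LoRA applied at load time.
llama-server -m WizardLM-2-8x22B-Q4_K_M.gguf \
  --lora SorcererLM-8x22b-epoch2-LoRA-F16.gguf
```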

llama-cpp
5
0

DeepSeek-R1-0528-IQ2_KS

—
5
0

Mistral-Small-24B-Instruct-2501

license:apache-2.0
4
0

DeepSeek-R1-OG-IQ3_KS

—
4
0

Writer-Large-2411-v2.1

—
3
18

Llama-3.3-90B-Vision-merged

mllama
3
5

Writer-Large-2411-v2.1-GGUF

—
3
2

Magnum-v1-72b-Qwen2.5-GGUF

—
3
0

Qwen2.5-Coder-32B-Instruct-abliterated-GGUF

—
3
0

WizardLM-2-22B-RP-exl2

license:apache-2.0
3
0

Writer-Large-2411-v2.1-AWQ

—
2
1

SmartMaid-123b-GGUF

—
2
0

72B-Qwen2.5-Kunou-v1-GGUF

—
2
0

Llama-3.3-90B-Vision-Instruct-LongCoT-Experimental

mllama
2
0

Mistral-Small-24B-Instruct-2501-Q4_0-GGUF

llama-cpp
2
0

gemma-3-control-vectors

license:apache-2.0
2
0

jukofyork_command-a-03-2025-writer-preview-AWQ

—
2
0

Mistral-Large-2407-LongCoT

—
1
2

R1-Gemma-3-4B-multimodal-test

—
1
2

dots.llm1.inst-GGUF

Run: use the following CLI args to override the chat template and special tokens:

—
1
2

SmartMaid-123b

—
1
1

Qwen2.5-14B-Base-lm_head-bnb-4bit

license:apache-2.0
1
0

Mistral-Large-2047-LongCoT-GGUF

—
1
0

Writer-Large-2411-v2.1-exl2-4.5bpw

—
1
0

Mistral-Small-24B-Base-2501

license:apache-2.0
1
0

Apriel-Nemotron-15B-Thinker-GGUF

—
1
0

Gemma-2-9B-It-SPPO-Iter3-exl3

exllamav3
1
0

WizardLM-2-22b-RP

license:apache-2.0
0
5

SeminalRP-22b

—
0
2

gpt-oss-20b-Derestricted-Q4_K_M-GGUF

unlimited
0
1

DarkMage-123b-exl2

—
0
1

ArliAI-RPMax-12B-v1.1-exl2-6.0bpw

license:apache-2.0
0
1

experimental_R1-8x22b

license:apache-2.0
0
1

Mistral-Small-3.1-24B-Instruct-2503-novision-exl2-6bpw

Text-only EXL2 quant of mistralai/Mistral-Small-3.1-24B-Instruct-2503: 1. vision encoder removed; 2. architecture changed to that of mistralai/Mistral-Small-24B-Instruct-2501; 3. chat template in `tokenizer_config.json` was modified (see below). I was having trouble with the timestamp at the beginning of the system prompt and removed it from `tokenizer_config.json`. NOTE: Tensor Parallel is not implemented in exllamav2 for either mistralai/Mistral-Small-3.1-24B-Instruct-2503 or mistralai/Mistral-Small-24B-Instruct-2501.

—
0
1

c4ai-command-a-03-2025-exl2-3bpw

license:cc-by-nc-4.0
0
1