zerofata

79 models

MS3.2-PaintedFantasy-v4.1-24B-GGUF

license:mit
1,020 downloads · 0 likes


Q3.5-BlueStar-v2-27B-GGUF

license:mit
642 downloads · 0 likes

MS3.2-PaintedFantasy-v2-24B

This is an uncensored creative model intended to excel at character-driven RP / ERP. Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), but pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO
- SFT with RP/ERP, stories and in-character assistant data.
- DPO focused on reducing repetition, misgendered characters and slop.
- KTO focused on further reducing repetition and slop.

Not optimized for cost / performance efficiency, YMMV.
SFT (1xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: yourentity
wandb_name: runname
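A hedged sketch of what one training record in this messages format could look like: the SFT config above maps a `messages` list with user/assistant/system roles, one JSON object per line in the dataset file. The actual dataset is not public; this record is purely illustrative.

```python
import json

# Illustrative record in the messages format the SFT dataset config expects.
# Names and dialogue here are made up, not taken from the real dataset.
record = {
    "messages": [
        {"role": "system", "content": "Roleplay as Kael, a weary caravan guard."},
        {"role": "user", "content": "The gates are closing. Do we run for it?"},
        {"role": "assistant", "content": "Kael shakes his head. \"We walk. Running gets you shot.\""},
    ]
}

# dataset.jsonl holds one such JSON object per line
line = json.dumps(record, ensure_ascii=False)
print(json.loads(line)["messages"][-1]["role"])  # assistant
```

With train_on_inputs: false, only the assistant turns contribute to the loss; the system and user turns are masked out.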

license:apache-2.0
265 downloads · 34 likes

L3.3 GeneticLemonade Final 70B

Inspired to learn how to merge by the Nevoria series from SteelSkull. This model is the second result of the Genetic Lemonade series.

Designed for RP and creative writing, all three models are focused around striking a balance between writing style, creativity and intelligence. The basic differences between the models are below.

| Version | Strength | Weakness |
|---------|----------|----------|
| Unleashed | Well balanced | Somewhat censored |
| Final | Fully uncensored | Least intelligent |
| Sunset | Well balanced, most intelligent | GPTisms / weakest writing style |

Llam@ception is recommended for sane defaults if unsure; import the presets into SillyTavern and they're plug-n-play.

Sampler Settings
- Temp: 0.9-1.0
- MinP: 0.03-0.05
- Dry: 0.8, 1.75, 4

Temperature last, neutralize other samplers. This model natively strikes a balance of creativity & intelligence.

Use Llama-3-Instruct-Names, but you will need to uncheck "System same as user".

- Static quants by mradermacher
- iMatrix quants by mradermacher

The base aims to build a strong general-purpose model using high-performing models that are trained on various datasets from different languages / cultures. This is to reduce the chance of the same datasets appearing multiple times and to build natural creativity into L3.3. The second merge aims to impart specific RP / creative writing knowledge, again focusing on finding high-performing models that use (or likely use) different datasets.
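The sampler settings above can be expressed as a request payload for an OpenAI-compatible chat endpoint. This is a hedged sketch: the `min_p` and `dry_*` parameter names follow llama.cpp-server-style extensions and are assumptions about your backend, and the model id and messages are illustrative.

```python
# Hedged sketch: card-recommended samplers as an OpenAI-compatible payload.
# Check which extended sampler names (min_p, dry_*) your backend accepts.
payload = {
    "model": "L3.3-GeneticLemonade-Final-70B",  # hypothetical model id
    "messages": [
        {"role": "system", "content": "You are the narrator of a fantasy RP."},
        {"role": "user", "content": "Describe the harbor at dusk."},
    ],
    "temperature": 0.95,     # card recommends 0.9-1.0, applied last
    "min_p": 0.04,           # card recommends 0.03-0.05
    "dry_multiplier": 0.8,   # card's "Dry: 0.8, 1.75, 4"
    "dry_base": 1.75,
    "dry_allowed_length": 4,
}
# e.g. requests.post("http://localhost:8080/v1/chat/completions", json=payload)
```

"Temperature last" means the backend should apply temperature after the truncation samplers, which is why the other samplers are left neutralized.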

llama
180 downloads · 11 likes

GLM-4.5-Iceblink-v3-106B-A12B-GGUF

license:mit
155 downloads · 3 likes

MS3.2 PaintedFantasy Visage V3 34B

No layer left behind edition. Upscale redone with the missing final layer included.
The original upscales were always missing a layer, but I never troubleshot which layer it was. Turns out it was the final layer. That's kind of an important one.

This model is an uncensored creative writing and RP model. Compared to the older version, it is smarter and I think has a bit less repetition, though the old V2 version is slightly more creative due to the instability it had.

Creation Process: Upscale > CPT > SFT > DPO
- Pretrained on approx 300MB of light novel and FineWeb-2 corpus.
- SFT on approx 8 million tokens: SFW / NSFW RP, stories and creative instruct data.
- DPO on a high quality RP / NSFW dataset with a focus on improving instruction following, reducing repetition and fixing common model mistakes.

Merge configurations used during the model creation process.

Upscale (Passthrough)

base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [0, 29]
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [10, 40]

Not optimized for cost / performance efficiency, YMMV.
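The layer arithmetic behind the passthrough upscale can be sketched as follows, assuming mergekit's half-open layer_range convention ([start, end) yields end - start layers per slice); the [10, 40] slice implies the 24B base has 40 layers.

```python
# Hedged sketch: layer count of the passthrough upscale, assuming half-open
# layer_range semantics (end - start layers per slice).
slices = [(0, 29), (10, 40)]  # layer_range values from the merge config
layers = sum(end - start for start, end in slices)
print(layers)  # 29 + 30 = 59 layers in the upscale, versus 40 in the base
```

Layers 10-28 of the base are duplicated by the overlap, which is how a 24B model grows to roughly 34B parameters.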
Pretrain (4xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv3upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-PT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1

SFT (4xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-SFT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2

DPO (2xH200)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.085

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/approvedautomatedl3dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# LORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2-DPO-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-DPO
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2-DPO-2

128 downloads · 26 likes

MS3.2-PaintedFantasy-v4-24B

license:mit
101 downloads · 17 likes

MS3.2 PaintedFantasy Visage V4 34B


license:mit
88 downloads · 9 likes

GLM 4.5 Iceblink 106B A12B

An experimental GLM 4.5 Air finetune. I had this one in the works for a while, but was struggling to find the right hyperparameters to get this model to behave nicely. Thank you to TheDrummer for helping me out with them.

This model is a creative writing and RP model. It's pretty verbose.
The intent is to keep the behavior of the original model, but to slightly improve writing, dialogue & creativity.

Creation Process: SFT

SFT on approx 10 million tokens, SFW / NSFW RP, stories, creative instruct & chat data.

MoE models are brutal to train even with a small dataset like mine, so I took a different approach from usual. I used a very low LR in an effort to avoid having to apply DPO / KTO training afterwards. I think there's likely a better config to be found, but experimentation with the model to find it is quite draining.

Not optimized for cost / performance efficiency, YMMV.

SFT (4xH200)

base_model: zai-org/GLM-4.5-Air

eot_tokens:
  - "<|user|>"
  - "<|endoftext|>"
special_tokens:
  eos_token: "<|user|>"

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path: ./last_run_prepared
train_on_inputs: false  # Only train on assistant responses
eval_sample_packing: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.1
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 4.5e-6
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0
val_set_size: 0.02

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: false
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true
cut_cross_entropy: false

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
eval_steps: 35
save_total_limit: 18  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./GLM-AIR-SFTv2-5
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: GLM-AIR-SFT
# wandb_entity: your_entity
wandb_name: GLM-AIR-SFTv2-5
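As a rough sanity check on the config above, the batch geometry can be worked out from the listed values. This assumes pure data parallelism across the 4 H200s; tokens per step is an upper bound, since sample packing rarely fills every sequence exactly:

```python
# Hypothetical back-of-envelope math for the SFT config above; the 4-GPU
# data-parallel assumption and the ~10M-token dataset size come from the text.
micro_batch_size = 2
gradient_accumulation_steps = 4
num_gpus = 4  # assumption: one data-parallel rank per H200

effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 32 sequences per optimizer step

sequence_len = 8192
tokens_per_step = effective_batch * sequence_len
print(tokens_per_step)  # 262144 tokens per step (upper bound with packing)

dataset_tokens = 10_000_000
num_epochs = 3
steps_per_epoch = dataset_tokens // tokens_per_step
print(steps_per_epoch * num_epochs)  # ~114 optimizer steps across all epochs
```

At only ~114 optimizer steps, the `warmup_ratio: 0.05` amounts to a handful of warmup steps, which is consistent with the very conservative 4.5e-6 learning rate described above.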

license:mit
74
13

L3.3-GeneticLemonade-Opus-70B

Felt like making a merge. This model combines three individually solid, stable and distinctly different RP models:

zerofata/GeneticLemonade-Unleashed-v3 - Creative, generalist RP / ERP model.
Delta-Vector/Plesio-70B - Unique prose and unique dialogue RP / ERP model.
TheDrummer/Anubis-70B-v1.1 - Character portrayal, neutrally aligned RP / ERP model.

Play with these; they are not the 'best' settings, just a stable baseline.

Recommended Samplers

Llama-3-Instruct-Names, but you will need to uncheck "System same as user".

models:
  - model: zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
  - model: Delta-Vector/Plesio-70B
  - model: TheDrummer/Anubis-70B-v1.1
base_model: shisa-ai/shisa-v2-llama3.3-70b
merge_method: sce
parameters:
  select_topk: 0.16
out_dtype: bfloat16
tokenizer:
  source: base
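The `select_topk: 0.16` parameter controls how aggressively the SCE method filters the donor models' weight deltas. The sketch below illustrates the idea in pure Python; it is NOT mergekit's actual implementation, just the intuition of keeping the highest-variance fraction of deltas:

```python
import random
import statistics

# Loose illustration of select_topk in SCE-style merging: keep only the
# fraction of weights where the donor models disagree most, and fuse those.
# All names and the toy data here are illustrative, not mergekit internals.
random.seed(0)
n = 1000
base = [random.gauss(0, 1) for _ in range(n)]
donors = [[b + random.gauss(0, s) for b in base] for s in (0.1, 0.2, 0.3)]

deltas = [[m[i] - base[i] for m in donors] for i in range(n)]  # task vectors per weight
variance = [statistics.pvariance(d) for d in deltas]           # disagreement per weight

k = int(0.16 * n)                                              # select_topk: 0.16
keep = sorted(range(n), key=lambda i: variance[i])[-k:]        # top 16% positions

merged = list(base)
for i in keep:                                                 # fuse only selected weights
    merged[i] += sum(deltas[i]) / len(deltas[i])

print(k)  # 160 of 1000 weights eligible for merging
```

A low `select_topk` like 0.16 keeps the merge close to the base model and only pulls in the spots where the three RP models differ most.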

llama
66
12

MS3.2-PaintedFantasy-v3-24B

license:apache-2.0
49
8

L3.3-GeneticLemonade-Final-v2-70B

Wasn't intending to release another model (so soon at least), but I was testing out some new dataset ideas and thought this model came out pretty nice.

zerofata/GeneticLemonade-Final SFT QLora finetune.

This is an uncensored creative model intended to excel at character driven RP / ERP. This model is designed to provide longer, narrative heavy responses where characters are portrayed accurately and proactively.

Compared to Unleashed v3, this model has significantly reduced positivity bias and arguably a nicer writing style. The tradeoff is that it's swipe-heavy, makes a few more logical errors and can be a bit too concise at times.

Play with these; they are not the 'best' settings, just a stable baseline.

Recommended Samplers

Llama-3-Instruct-Names, but you will need to uncheck "System same as user".

This model was trained using a dataset of approx 4.3 million tokens: 700 RP conversations, 2000 creative writing / instruct samples and about 400 summaries. The bulk of this data has been made public. This model didn't take well to my existing DPO dataset, so it hasn't been used here.
Axolotl configs

Not optimized for cost / performance efficiency, YMMV.

SFT (1xH200)

llama
47
9

L3.3-GeneticLemonade-Unleashed-v3-70B

An experimental release. zerofata/GeneticLemonade-Unleashed SFT+DPO QLora finetune.

This is a creative model intended to excel at character driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing. This model is designed to provide longer, narrative heavy responses where characters are portrayed accurately and proactively.

Play with these; they are not the 'best' settings, just a stable baseline. Something interesting to note is this model supports higher temps than would normally be recommended for other L3 models.

Recommended Samplers

Llama-3-Instruct-Names, but you will need to uncheck "System same as user".

The model first went through SFT with a small synthetic dataset of 2.9 million tokens, approximately 750 conversations. Primarily RP data with small amounts of random instruct / assistant data and creative writing.
The model then went through DPO training using approx 1100 chosen examples from the SFT dataset that were of exceptional quality or showed verifiable instruction following. Rejected samples were generated using another Llama 3.3 finetune that is known for poor instruction following.

Axolotl configs

Neither is optimized for cost / performance efficiency, YMMV.

SFT (1xH200)
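Preference data for a DPO run like the one described above is typically a JSONL file of prompt / chosen / rejected triples. The record below is a hypothetical sketch of that shape — the field names are illustrative, not the exact dataset schema used here:

```python
import json

# Hypothetical shape of one preference pair matching the process described
# above: "chosen" drawn from the strongest SFT samples, "rejected" generated
# by a weaker Llama 3.3 finetune. Field names are illustrative only.
pair = {
    "prompt": [{"role": "user", "content": "Stay in character as the ship's captain."}],
    "chosen": [{"role": "assistant", "content": "The captain narrows her eyes at the horizon..."}],
    "rejected": [{"role": "assistant", "content": "Sure! Here are some tips for roleplaying a captain:"}],
}
line = json.dumps(pair)  # one line of the DPO JSONL
roundtrip = json.loads(line)
print(roundtrip["chosen"][0]["role"])  # assistant
```

With ~1100 such pairs, the chosen side rewards in-character, instruction-following replies while the rejected side penalizes the weaker model's failure modes.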

llama
39
22

MS3.2-PaintedFantasy-24B

Experimental release.

This is an uncensored creative model intended to excel at character driven RP / ERP. This model is designed to provide longer, narrative heavy responses where characters are portrayed accurately and proactively.

Training process: Pretrain > SFT > DPO > DPO 2

Did a small pretrain on some light novels and Frieren wiki data as a test. It hasn't seemed to hurt the model, and the model has shown some small improvements in the lore of series that were included.

The model then went through the standard SFT using a dataset of approx 3.6 million tokens: 700 RP conversations, 1000 creative writing / instruct samples and about 100 summaries. The bulk of this data has been made public.

Finally, DPO was used to make the model a little more consistent. The first stage of DPO focused on instruction following and the second tried to burn out some Mistral-isms.

Not optimized for cost / performance efficiency, YMMV.
SFT (1xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./MS3-2-Pretrain/merged
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 2
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 5
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./MS3-2-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: MS3-2-SFT
wandb_entity: your_entity
wandb_name: run_name
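The same back-of-envelope batch math applies to this config. On a single H100 there is no data parallelism, so the effective batch is just micro batch times accumulation; token counts are upper bounds under packing, and the ~3.6M-token dataset figure comes from the description above:

```python
# Hypothetical batch geometry for the MS3.2 SFT config above.
micro_batch_size = 4
gradient_accumulation_steps = 2
effective_batch = micro_batch_size * gradient_accumulation_steps  # single GPU
print(effective_batch)  # 8 sequences per optimizer step

sequence_len = 8192
tokens_per_step = effective_batch * sequence_len
print(tokens_per_step)  # 65536 tokens per step (upper bound with packing)

dataset_tokens = 3_600_000
num_epochs = 3
print(dataset_tokens // tokens_per_step * num_epochs)  # ~162 optimizer steps total
```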

license:apache-2.0
31
31

L3.3 GeneticLemonade Unleashed 70B

llama
27
11

GLM-4.5-Iceblink-v3-106B-A12B

license:mit
27
0

MS3.2-PaintedFantasy-v3-24B-exl3-6bpw

license:apache-2.0
26
0

GLM-4.5-Iceblink-v2-106B-A12B

Another re-attempt at GLM 4.5 Air. This time using a different training framework, some updated data and better hyperparameters.

This model is a creative writing and RP model. It's pretty verbose. The intent is to keep the behavior of the original model, but to improve writing, dialogue & creativity. Compared to the original Iceblink, the effect on this one is more pronounced, with hopefully minimal impact on the intelligence.

Creation Process: SFT

SFT on approx 13 million tokens, SFW / NSFW RP, stories, creative instruct & chat data. Some of the SFW datasets are public and can be found in the model datasets list.

I've switched over from Axolotl to MS-Swift w/ Megatron to train MoE models now. There's a roughly 5-10x speedup in training the models, thanks to escaping the naive MoE implementation in TRL. The training time for this run took only 40 minutes, excluding environment setup time.

A low LR for GLM Air appears to be king. Going any higher, I've found it extremely easy to begin overcooking the model.

Not optimized for cost / performance efficiency, YMMV.
SFT (8xH200)

PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=8 \
WANDB_API_KEY=wandb_key \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
megatron sft \
  --load '/workspace/glm-4.5-air-mcore' \
  --dataset '/workspace/joined_dataset_cleaned_modified.jsonl' \
  --load_from_cache_file true \
  --train_type lora \
  --lora_rank 256 \
  --lora_alpha 16 \
  --use_rslora true \
  --target_modules all-linear \
  --split_dataset_ratio 0.01 \
  --moe_permute_fusion true \
  --tensor_model_parallel_size 8 \
  --expert_tensor_parallel_size 1 \
  --expert_model_parallel_size 8 \
  --moe_grouped_gemm true \
  --moe_shared_expert_overlap true \
  --moe_aux_loss_coeff 6e-5 \
  --micro_batch_size 4 \
  --global_batch_size 32 \
  --recompute_granularity full \
  --recompute_method uniform \
  --recompute_num_layers 1 \
  --max_epochs 2 \
  --cross_entropy_loss_fusion true \
  --lr 6e-6 \
  --lr_warmup_fraction 0.05 \
  --min_lr 6e-7 \
  --save megatron_output/Iceblink-v3-SFT-3 \
  --eval_interval 20 \
  --save_interval 25 \
  --finetune true \
  --packing true \
  --max_length 10280 \
  --num_workers 8 \
  --dataset_num_proc 8 \
  --no_save_optim true \
  --no_save_rng true \
  --sequence_parallel true \
  --wandb_project Megatron-Air-SFT \
  --wandb_exp_name Iceblink-v3-SFT-3 \
  --attention_backend flash

A shoutout to the people in BeaverAI discord that helped me test this model and my intermediate versions: ddh0 (Madison), Ambius, Dysfunctional & my dude.
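A few numbers fall straight out of the command's flags. Token counts are upper bounds since packing rarely fills every sequence, and the ~13M-token dataset size is the approximate figure given above:

```python
import math

# Back-of-envelope figures for the Megatron run above, from the flag values.
global_batch_size = 32
max_length = 10280
tokens_per_step = global_batch_size * max_length
print(tokens_per_step)  # 328960 tokens per optimizer step (upper bound)

dataset_tokens = 13_000_000  # approximate, per the description
max_epochs = 2
total_steps = dataset_tokens * max_epochs // tokens_per_step
print(total_steps)  # ~79 optimizer steps for the whole run

# With use_rslora, the adapter scale is lora_alpha / sqrt(lora_rank);
# the chosen rank-256 / alpha-16 pair lands exactly on a scale of 1.0.
print(16 / math.sqrt(256))  # 1.0
```

Very few total steps plus the 6e-6 peak LR is consistent with the note above that GLM Air overcooks easily at higher learning rates.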

license:mit
24
4

MS3.2-PaintedFantasy-Visage-v4-34b-exl3-6bpw

Magistral 24B Upscaled to 34B.
The latest Magistral model seems pretty good and has some refreshing prose. This is an uncensored creative writing and RP model. It uses a new (still work-in-progress) dataset I've been curating based on real character cards. It has some structural repetition; at this point that's a calling card of Mistral models. I think it's better than v3, though.

Creation Process: Upscale > CPT > SFT > Merge

After upscaling, the model was pretrained on approx. 100MB of light novels and a subset of DCLM records, then SFT on approx. 10 million tokens of SFW / NSFW RP, stories and creative instruct. I've removed some chat data that I think hurt more than helped and replaced it with conversations from real character cards.

I did some experimenting with LoRA methods, particularly DoRA vs rsLoRA. With DoRA the writing was fantastic, but the model wasn't able to handle its own creativity, even with further RLHF applied. rsLoRA took the data far better, but was significantly less adept at writing. I merged the two models together, using the stable version as the base, which seems to have successfully combined the positives of both models.
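The final merge is a SLERP between the two checkpoints with a per-layer-group interpolation schedule (`t: [0, 0, 0, 0.1, 0.2]`), i.e. early layers stay entirely on the stable base and only later layers drift toward the DoRA model. As a toy sketch of what SLERP does to a pair of weight vectors (a simplified stand-in for mergekit's implementation, not the actual code):

```python
import math

def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors (plain lists)."""
    norm_a = math.sqrt(sum(x * x for x in a)) + eps
    norm_b = math.sqrt(sum(x * x for x in b)) + eps
    dot = sum((x / norm_a) * (y / norm_b) for x, y in zip(a, b))
    theta = math.acos(max(-1.0, min(1.0, dot)))  # angle between the vectors
    if theta < eps:  # nearly colinear: fall back to plain lerp
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    wa = math.sin((1 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

base = [0.2, -0.5, 0.1, 0.9]    # toy "stable" (rsLoRA) checkpoint weights
other = [0.4, 0.3, -0.2, 0.6]   # toy "creative" (DoRA) checkpoint weights

# t=0 keeps the base exactly; a small t nudges toward the other model.
print(slerp(base, other, 0.0))
print(slerp(base, other, 0.1))
```

Unlike a straight average, SLERP interpolates along the arc between the two weight directions, which is why small `t` values can blend in a little of the second model without washing out the base.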
Upscale

base_model: Darkhn/Magistral-Small-2509-Text-Only
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: Darkhn/Magistral-Small-2509-Text-Only
        layer_range: [0, 29]
  - sources:
      - model: Darkhn/Magistral-Small-2509-Text-Only
        layer_range: [10, 40]

Slerp Merge

models:
  - model: ApocalypseParty/Magi-PT-2-SFT-1-DPO-3
  - model: ApocalypseParty/Magi-PT-2-SFT-2
merge_method: slerp
base_model: ApocalypseParty/Magi-PT-2-SFT-2
parameters:
  t: [0, 0, 0, 0.1, 0.2]
dtype: bfloat16

Pretrain (2xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ApocalypseParty/magistral-34b
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/text_files_minimal_dataset.jsonl
    type: completion
  - path: ./data/filtered_results.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 64
lora_alpha: 64
lora_dropout: 0.0
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 3e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.0
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 16384
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Magi-PT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Magi-PT
# wandb_entity: your_entity
wandb_name: Magi-PT-2

SFT (2xH100)

base_model: ApocalypseParty/Magi-PT-2
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
load_in_8bit: true
load_in_4bit: false
deepspeed: deepspeed_configs/zero1.json
dataset_prepared_path: last_run_prepared
val_set_size: 0
output_dir: ./Magi-PT-2-SFT-2
lora_r: 128
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 1e-5
gradient_checkpointing: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
warmup_ratio: 0.05
evals_per_epoch: 1
saves_per_epoch: 2
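For context on the adapter sizes in these configs (`lora_r: 64` for pretraining, `lora_r: 128` for SFT): each adapted linear layer of shape (d_in, d_out) adds r * (d_in + d_out) trainable weights. A rough illustration below; the layer shapes are invented for the example, not read from the actual 34B model:

```python
# Estimate LoRA trainable parameters: r * (d_in + d_out) per adapted linear layer.
def lora_params(layer_shapes, r):
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)

# Hypothetical shapes standing in for one transformer block's linear layers.
block = [
    (5120, 5120),   # q_proj
    (5120, 1024),   # k_proj
    (5120, 1024),   # v_proj
    (5120, 5120),   # o_proj
    (5120, 14336),  # gate_proj
    (5120, 14336),  # up_proj
    (14336, 5120),  # down_proj
]

per_block_r64 = lora_params(block, r=64)
per_block_r128 = lora_params(block, r=128)
print(per_block_r64, per_block_r128)
```

The count scales linearly with rank, so the r=128 SFT adapter is exactly twice the size of the r=64 pretraining adapter over the same target modules.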

license:mit
23
0

MS3.2-PaintedFantasy-v3-24B-exl3-4bpw

license:apache-2.0
21
1

MS3.2-PaintedFantasy-v3-24B-exl3-5bpw

license:apache-2.0
21
0

llama-3.3-70b-joyous-exl3-4.25bpw

llama
19
2

Q3.5-BlueStar-v2-27B

license:mit
19
0

L3.3-Cu-Mai-R1-70b-4.5bpw-hb6-exl2

llama
19
0

MS3.2-PaintedFantasy-Visage-33B_exl3_6bpw

license:apache-2.0
18
1

MS3.2-PaintedFantasy-v3-24B-exl3-3.5bpw

license:apache-2.0
15
0

Q3.5-BlueStar-27B

license:mit
13
0

MS3.2-PaintedFantasy-Visage-v4-34b-exl3-4bpw

@keyframes fallingleaf { 0% { transform: translateY(-20px) rotate(0deg); opacity: 1; } 100% { transform: translateY(100vh) rotate(360deg); opacity: 0.3; } } .container { --primary-accent: #7BC4C4; --secondary-accent: #9FD4D4; --accent-warm: #E8A5B5; --red-deep: #2C5F6F; --gold-bright: #B4DDD4; --ink-dark: #000000; --bg-main: #0A1A1A; --bg-container: #0F2525; --bg-card: rgba(15, 30, 35, 0.95); --text-main: #D5E8E8; --text-muted: #9CB9C4; --white: #FFFFFF; --border-color: #6B9BAA; --font-title: 'Inter', sans-serif; --font-body: 'Source Sans Pro', sans-serif; --font-code: 'JetBrains Mono', monospace; font-family: var(--font-body); color: var(--text-main); line-height: 1.7; font-weight: 400; max-width: 1100px; margin: 20px auto; padding: 50px 40px; background-color: var(--bg-main); background-image: radial-gradient(ellipse at top, rgba(20, 45, 60, 0.3), transparent 60%), radial-gradient(ellipse at bottom, rgba(0, 0, 0, 0.8), transparent 50%), linear-gradient(180deg, rgba(15, 30, 40, 0.6) 0%, rgba(12, 25, 35, 0.7) 25%, rgba(10, 20, 28, 0.8) 50%, rgba(8, 16, 22, 0.85) 75%, rgba(6, 12, 18, 0.9) 100%); min-height: calc(100vh - 40px); position: relative; z-index: 2; border-radius: 3px; box-shadow: 0 10px 60px rgba(0, 0, 0, 0.8), 0 2px 10px rgba(0, 0, 0, 0.9), inset 0 0 120px rgba(20, 40, 55, 0.1); border: 2px solid rgba(30, 60, 75, 0.4); } .container::before { content: ''; position: absolute; top: 0; left: 0; right: 0; bottom: 0; background-image: radial-gradient(circle 3px at 15% 12%, rgba(123, 196, 196, 0.4) 0%, transparent 100%), radial-gradient(circle 2px at 25% 18%, rgba(159, 212, 212, 0.35) 0%, transparent 100%), radial-gradient(circle 4px at 45% 25%, rgba(232, 165, 181, 0.4) 0%, transparent 100%), radial-gradient(circle 3px at 65% 15%, rgba(123, 196, 196, 0.4) 0%, transparent 100%), radial-gradient(circle 2px at 85% 20%, rgba(196, 240, 234, 0.35) 0%, transparent 100%), radial-gradient(circle 3px at 35% 35%, rgba(232, 165, 181, 0.35) 0%, transparent 100%), 
radial-gradient(circle 2px at 75% 40%, rgba(180, 221, 221, 0.35) 0%, transparent 100%), radial-gradient(circle 4px at 20% 50%, rgba(123, 196, 196, 0.45) 0%, transparent 100%), radial-gradient(circle 3px at 90% 55%, rgba(196, 240, 234, 0.4) 0%, transparent 100%), radial-gradient(circle 2px at 50% 60%, rgba(232, 165, 181, 0.3) 0%, transparent 100%); pointer-events: none; border-radius: 3px; } .container .title-container { background: radial-gradient(circle at 15% 25%, rgba(30, 65, 80, 0.4) 0%, transparent 40%), radial-gradient(circle at 85% 70%, rgba(123, 196, 196, 0.2) 0%, transparent 45%), radial-gradient(circle at 50% 50%, rgba(20, 45, 60, 0.35) 0%, transparent 60%), linear-gradient(135deg, rgba(20, 45, 60, 0.3) 0%, rgba(15, 35, 50, 0.4) 50%, rgba(12, 28, 40, 0.5) 100%); margin-bottom: 50px; border: 3px solid; border-image: linear-gradient(135deg, rgba(123, 196, 196, 0.5) 0%, rgba(159, 212, 212, 0.6) 50%, rgba(123, 196, 196, 0.5) 100%) 1; border-radius: 0; padding: 60px 50px 70px; text-align: center; position: relative; overflow: hidden; box-shadow: 0 10px 50px rgba(0, 0, 0, 0.7), 0 4px 20px rgba(0, 0, 0, 0.8), inset 0 2px 8px rgba(123, 196, 196, 0.15); } .container .title-container::before { content: ''; position: absolute; top: -50%; right: -10%; width: 300px; height: 300px; background: radial-gradient(circle, rgba(44, 95, 111, 0.3) 0%, transparent 70%); border-radius: 40% 60% 70% 30%; filter: blur(40px); } .container .title-container::after { content: ''; position: absolute; bottom: -30%; left: -5%; width: 250px; height: 250px; background: radial-gradient(circle, rgba(232, 165, 181, 0.2) 0%, transparent 70%); border-radius: 60% 40% 30% 70%; filter: blur(35px); } .container .title-container .title-wrapper { position: relative; } .container .title-main { font-size: 3.5rem; font-weight: 800; margin: 0; letter-spacing: 6px; text-transform: uppercase; font-family: var(--font-title); position: relative; line-height: 1.2; z-index: 1; } .container .title-main 
.title-prefix { display: block; background: linear-gradient(135deg, #6EC5C5 0%, #E8A5B5 35%, #9FD4D4 70%, #B4E8DD 100%); background-clip: text; -webkit-background-clip: text; -webkit-text-fill-color: transparent; filter: drop-shadow(0 3px 12px rgba(110, 197, 197, 0.6)) drop-shadow(0 0 20px rgba(232, 165, 181, 0.4)); margin-bottom: 15px; font-size: 3.5rem; letter-spacing: 6px; font-weight: 700; position: relative; padding-bottom: 12px; font-family: 'Cinzel', serif; text-transform: uppercase; } .container .title-main .title-prefix::after { content: ''; position: absolute; bottom: 0; left: 50%; transform: translateX(-50%); width: 80px; height: 4px; background: linear-gradient(90deg, transparent 0%, rgba(159, 212, 212, 0.8) 35%, rgba(232, 165, 181, 0.8) 65%, transparent 100%); border-radius: 2px; box-shadow: 0 0 10px rgba(232, 165, 181, 0.5); } .container .title-main .title-version { display: inline-block; background: linear-gradient(135deg, #8DD4D4 0%, #A8E0E0 50%, #C4F0EA 100%); background-clip: text; -webkit-background-clip: text; -webkit-text-fill-color: transparent; filter: drop-shadow(0 2px 8px rgba(141, 212, 212, 0.6)) drop-shadow(0 0 15px rgba(168, 224, 224, 0.5)); font-size: 2.5rem; letter-spacing: 4px; font-weight: 700; position: relative; padding: 5px 15px; border: 2px solid transparent; border-image: linear-gradient(135deg, rgba(141, 212, 212, 0.4), rgba(168, 224, 224, 0.6), rgba(141, 212, 212, 0.4)) 1; font-family: 'Cinzel', serif; text-transform: uppercase; } .container .title-main .title-version::before { content: ''; position: absolute; top: -8px; left: -8px; right: -8px; bottom: -8px; background: linear-gradient(135deg, rgba(141, 212, 212, 0.15), rgba(44, 95, 111, 0.1)); z-index: -1; transform: skew(-2deg); } .container .lemonade-text { background: linear-gradient(135deg, var(--red-deep), var(--primary-accent)); background-clip: text; -webkit-background-clip: text; -webkit-text-fill-color: transparent; filter: drop-shadow(0 2px 6px rgba(232, 107, 142, 
0.6)); } .container .title-subtitle { padding-left: 0; margin-top: 15px; } .container .subtitle-text { color: var(--text-muted); font-size: 1.2rem; font-family: var(--font-body); font-style: italic; font-weight: 400; letter-spacing: 2px; text-transform: uppercase; opacity: 0.9; text-shadow: 0 0 10px rgba(159, 212, 212, 0.3); } .container img { max-width: 100%; border: 2px solid rgba(44, 95, 111, 0.5); margin-bottom: 40px; box-shadow: 0 8px 30px rgba(0, 0, 0, 0.7), 0 3px 12px rgba(123, 196, 196, 0.4), 0 0 20px rgba(232, 165, 181, 0.3); border-radius: 2px; position: relative; } .container img::after { content: ''; position: absolute; inset: -5px; background: linear-gradient(135deg, rgba(123, 196, 196, 0.2), rgba(232, 165, 181, 0.15)); filter: blur(8px); z-index: -1; } .container .section-container { margin-bottom: 35px; padding: 30px; background: linear-gradient(135deg, rgba(25, 55, 70, 0.85) 0%, rgba(30, 65, 80, 0.9) 50%, rgba(25, 55, 70, 0.85) 100%); border: 2px solid transparent; border-image: linear-gradient(135deg, rgba(123, 196, 196, 0.6), rgba(159, 212, 212, 0.7), rgba(123, 196, 196, 0.6)) 1; border-radius: 0; box-shadow: 0 8px 35px rgba(0, 0, 0, 0.7), inset 0 2px 8px rgba(123, 196, 196, 0.3), 0 0 25px rgba(123, 196, 196, 0.3), inset 0 -1px 20px rgba(159, 212, 212, 0.12); position: relative; } .container .section-container::before { content: ''; position: absolute; top: 0; left: 0; right: 0; bottom: 0; background: radial-gradient(circle at 20% 30%, rgba(123, 196, 196, 0.25) 0%, transparent 50%), radial-gradient(circle at 80% 70%, rgba(159, 212, 212, 0.15) 0%, transparent 50%), radial-gradient(circle at 50% 10%, rgba(232, 165, 181, 0.1) 0%, transparent 60%); pointer-events: none; z-index: 0; } .container .section-container:last-of-type { margin-bottom: 0; } .container .section-header { display: flex; align-items: center; padding: 0 0 18px 0; border-bottom: 2px solid rgba(123, 196, 196, 0.4); margin-bottom: 25px; position: relative; z-index: 1; } .container 
.section-title { font-family: var(--font-title); background: linear-gradient(135deg, #9FD4D4 0%, #B4E8DD 100%); background-clip: text; -webkit-background-clip: text; -webkit-text-fill-color: transparent; font-size: 1.5rem; margin: 0 !important; padding: 0 !important; letter-spacing: 2px; font-weight: 700; text-transform: uppercase; border: none !important; position: relative; display: inline-block; filter: drop-shadow(0 2px 4px rgba(123, 196, 196, 0.5)); } .container .section-title::after { content: ''; position: absolute; bottom: -18px; left: 0; width: 50%; height: 2px; background-image: linear-gradient(to right, rgba(159, 212, 212, 0.7), transparent); opacity: 0.8; } .container .section-content { padding: 0; position: relative; z-index: 1; } .container .subheading { color: var(--secondary-accent); font-size: 1.25rem; margin-top: 28px; margin-bottom: 18px; font-weight: 600; display: block; text-transform: uppercase; letter-spacing: 1.5px; font-family: var(--font-title); border-bottom: 2px solid transparent; border-image: linear-gradient(to right, var(--primary-accent), var(--secondary-accent), transparent) 1; padding-bottom: 10px; text-shadow: 0 0 10px rgba(159, 212, 212, 0.4); } .container .data-box { background: linear-gradient(135deg, rgba(20, 40, 50, 0.6) 0%, rgba(25, 50, 60, 0.7) 100%); padding: 22px; border: 2px solid rgba(107, 155, 170, 0.3); border-left: 5px solid var(--primary-accent); margin-bottom: 20px; box-shadow: 0 4px 15px rgba(0, 0, 0, 0.5), inset 0 1px 2px rgba(44, 95, 111, 0.3), 0 0 10px rgba(232, 107, 142, 0.15); border-radius: 0; font-size: 1rem; position: relative; } .container .data-box::after { content: ''; position: absolute; top: 0; right: 0; width: 100px; height: 100px; background: radial-gradient(circle at top right, rgba(123, 196, 196, 0.08) 0%, transparent 60%); pointer-events: none; border-radius: 0 0 0 100%; } .container .data-row { display: flex; align-items: center; margin-bottom: 6px; padding: 5px 0; } .container 
.data-row:last-child { margin-bottom: 0; } .container .data-arrow { color: var(--secondary-accent); font-weight: bold; margin-right: 12px; font-family: var(--font-code); font-size: 1.1rem; text-shadow: 0 0 8px rgba(159, 212, 212, 0.5); } .container .data-label { color: var(--text-main); font-weight: 600; font-family: var(--font-body); margin-right: 10px; min-width: 90px; } .container a { color: var(--primary-accent); text-decoration: none; font-weight: 600; transition: all .2s; text-shadow: 0 0 10px rgba(232, 107, 142, 0.4); } .container .data-row a { border-bottom: 1px dotted var(--primary-accent); } .container a:hover { text-decoration: none; color: var(--secondary-accent); transform: translateY(-1px); filter: drop-shadow(0 2px 6px rgba(159, 212, 212, 0.6)); text-shadow: 0 0 15px rgba(159, 212, 212, 0.8); } .container .data-row a:hover { border-bottom-style: solid; border-bottom-color: var(--secondary-accent); } .container .dropdown-container { margin-top: 20px; } .container .dropdown-summary { cursor: pointer; padding: 10px 0; color: var(--text-muted); font-size: 1.1rem; font-weight: 700; text-transform: none; font-family: var(--font-title); letter-spacing: 1px; list-style: none; transition: color 0.2s ease; } .container .dropdown-summary:hover { color: var(--primary-accent); } .container .dropdown-arrow { color: var(--primary-accent); margin-right: 10px; transition: transform 0.2s ease; } .container .dropdown-content { margin-top: 15px; padding: 25px; background: linear-gradient(135deg, rgba(15, 30, 35, 0.8) 0%, rgba(20, 40, 50, 0.85) 100%); border: 2px solid rgba(107, 155, 170, 0.3); border-radius: 0; box-shadow: 0 4px 15px rgba(0, 0, 0, 0.6), inset 0 1px 3px rgba(44, 95, 111, 0.3), 0 0 10px rgba(232, 107, 142, 0.15); } .container .config-title { color: var(--secondary-accent); font-size: 1rem; margin-bottom: 10px; font-family: var(--font-body); text-transform: uppercase; letter-spacing: 1px; font-weight: 700; text-shadow: 0 0 10px rgba(159, 212, 212, 0.4); } 
.container pre { background: linear-gradient(135deg, rgba(20, 5, 5, 0.9) 0%, rgba(30, 8, 8, 0.95) 100%); padding: 22px; border: 2px solid rgba(139, 0, 0, 0.3); white-space: pre-wrap; word-wrap: break-word; color: var(--text-main); border-radius: 0; box-shadow: inset 0 2px 6px rgba(0, 0, 0, 0.5), 0 4px 12px rgba(0, 0, 0, 0.6); } .container pre code { background: none; color: inherit; padding: 0; border-radius: 0; } .container code { font-family: var(--font-code); color: var(--secondary-accent); background: rgba(159, 212, 212, 0.15); padding: 3px 7px; border-radius: 2px; box-shadow: 0 1px 3px rgba(159, 212, 212, 0.3); } body { background: linear-gradient(180deg, #0A1F2E 0%, #0D2838 15%, #102D3D 30%, #0D2838 50%, #0A1F2E 70%, #071520 85%, #040D15 100%); background-attachment: fixed; margin: 0; padding: 0; position: relative; overflow-x: hidden; } body::before { content: ''; position: fixed; left: -10%; bottom: 0; width: 25%; height: 100%; background-image: radial-gradient(ellipse 80px 200px at 20% 90%, rgba(0, 0, 0, 0.95), transparent), radial-gradient(ellipse 100px 300px at 15% 85%, rgba(0, 0, 0, 0.9), transparent), radial-gradient(ellipse 60px 250px at 25% 88%, rgba(0, 0, 0, 0.92), transparent), radial-gradient(ellipse 70px 180px at 10% 92%, rgba(0, 0, 0, 0.93), transparent), linear-gradient(to top, rgba(0, 0, 0, 0.8) 0%, transparent 60%); pointer-events: none; z-index: 1; } body::after { content: ''; position: fixed; right: -10%; bottom: 0; width: 25%; height: 100%; background-image: radial-gradient(ellipse 90px 220px at 80% 88%, rgba(0, 0, 0, 0.95), transparent), radial-gradient(ellipse 110px 320px at 85% 83%, rgba(0, 0, 0, 0.9), transparent), radial-gradient(ellipse 65px 260px at 75% 86%, rgba(0, 0, 0, 0.92), transparent), radial-gradient(ellipse 80px 190px at 90% 90%, rgba(0, 0, 0, 0.93), transparent), linear-gradient(to top, rgba(0, 0, 0, 0.8) 0%, transparent 60%); pointer-events: none; z-index: 1; } Magistral 24B Upscaled to 34B. 
The latest Magistral model seems pretty good and has some refreshing prose. This is an uncensored creative writing and RP model. It uses a new (still work-in-progress) dataset I've been curating based on real character cards. It has some structural repetition; at this point that's a calling card of Mistral models. I think it's better than v3, though.

Creation Process: Upscale > CPT > SFT > Merge

After upscaling, the model was pretrained on approx. 100MB of light novels and a subset of DCLM records. SFT used approx. 10 million tokens of SFW / NSFW RP, stories and creative instruct data. I've removed some chat data that I think hurt more than helped and replaced it with conversations from real character cards.

I experimented with LoRA methods, particularly DoRA vs rsLoRA. With DoRA the writing was fantastic, but the model wasn't able to handle its own creativity, even with further RLHF applied. rsLoRA took the data far better, but was significantly less adept at writing. I merged the two models together, using the stable version as the base, which seems to have successfully combined the positives of both.
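The upscale duplicates a band of middle layers: with passthrough slices [0, 29] and [10, 40] over the original 40-layer model, the result is 59 layers, with layers 10-28 appearing twice. A quick sketch of that arithmetic (the helper name is mine, not mergekit's):

```python
def upscaled_depth(slices):
    """Total layer count after a passthrough upscale.

    Each (start, end) half-open range contributes end - start layers;
    overlapping ranges mean those layers appear twice in the output.
    """
    return sum(end - start for start, end in slices)

slices = [(0, 29), (10, 40)]
print(upscaled_depth(slices))                      # 59 layers, up from 40
print(len(set(range(0, 29)) & set(range(10, 40)))) # 19 duplicated layers
```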
Upscale

base_model: Darkhn/Magistral-Small-2509-Text-Only
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: Darkhn/Magistral-Small-2509-Text-Only
        layer_range: [0, 29]
  - sources:
      - model: Darkhn/Magistral-Small-2509-Text-Only
        layer_range: [10, 40]

Slerp Merge

models:
  - model: ApocalypseParty/Magi-PT-2-SFT-1-DPO-3
  - model: ApocalypseParty/Magi-PT-2-SFT-2
merge_method: slerp
base_model: ApocalypseParty/Magi-PT-2-SFT-2
parameters:
  t: [0, 0, 0, 0.1, 0.2]
dtype: bfloat16

Pretrain (2xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ApocalypseParty/magistral-34b
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/textfilesminimaldataset.jsonl
    type: completion
  - path: ./data/filteredresults.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 64
lora_alpha: 64
lora_dropout: 0.0
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 3e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.0
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 16384
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Magi-PT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Magi-PT
# wandb_entity: your_entity
wandb_name: Magi-PT-2

SFT (2xH100)

base_model: ApocalypseParty/Magi-PT-2
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
load_in_8bit: true
load_in_4bit: false
deepspeed: deepspeed_configs/zero1.json
dataset_prepared_path: last_run_prepared
val_set_size: 0
output_dir: ./Magi-PT-2-SFT-2
lora_r: 128
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 1e-5
gradient_checkpointing: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
warmup_ratio: 0.05
evals_per_epoch: 1
saves_per_epoch: 2
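The slerp merge above combines the two checkpoints along the great-circle arc between their weight tensors rather than linearly. A minimal per-tensor sketch of the standard slerp formula (mergekit's actual implementation differs in details):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the flattened tensors instead of a straight line.
    """
    a = v0.ravel().astype(np.float64)
    b = v1.ravel().astype(np.float64)
    # Angle between the two tensors, via clipped cosine similarity
    cos = np.clip(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps), -1.0, 1.0)
    theta = np.arccos(cos)
    if abs(theta) < eps:  # near-parallel tensors: fall back to lerp
        out = (1 - t) * a + t * b
    else:
        s = np.sin(theta)
        out = (np.sin((1 - t) * theta) / s) * a + (np.sin(t * theta) / s) * b
    return out.reshape(v0.shape).astype(v0.dtype)
```

With the layer-grouped schedule `t: [0, 0, 0, 0.1, 0.2]`, most of the network stays at the base (t = 0) checkpoint and only the deeper layer groups blend in a small fraction of the other model.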

license:mit
13
0

MS3.2-PaintedFantasy-v4.1-24B

license:mit
11
0

MS3.2-PaintedFantasy-v2-24b-exl3-4bpw

This is an uncensored creative model intended to excel at character-driven RP / ERP. Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), but pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO

SFT with RP/ERP, stories and in-character assistant data. DPO focused on reducing repetition, misgendered characters and slop. KTO focused on further reducing repetition and slop. Not optimized for cost / performance efficiency; YMMV.
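For reference, the DPO stage trains on preference pairs (e.g. a repetitive completion marked as rejected against a cleaner chosen one). A minimal sketch of the per-pair DPO objective, not the exact trainer used here:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Pushes the policy to raise the chosen response's log-probability
    relative to the frozen reference model, and lower the rejected one's.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1 / (1 + math.exp(-logits)))  # -log(sigmoid(logits))
```

When the policy still matches the reference model the loss sits at log 2 ≈ 0.693; widening the chosen/rejected margin drives it toward zero, which is what penalizes the repetitive rejected samples.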
SFT (1xH100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: your_entity
wandb_name: run_name
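The dataset block in the config above expects one JSON object per line in ./dataset.jsonl, each carrying a messages list of system/user/assistant turns mapped through the tokenizer's chat template. A sketch of the record shape (the content itself is illustrative):

```python
import json

# One training record in the shape the dataset config describes:
# a "messages" list of role/content turns, with roles restricted to
# system / user / assistant.
record = {
    "messages": [
        {"role": "system", "content": "You are Aria, a wandering bard."},
        {"role": "user", "content": "Play something for the road."},
        {"role": "assistant", "content": "Aria tunes her lute and begins to sing."},
    ]
}

line = json.dumps(record)  # one record per line in dataset.jsonl
```

With train_on_inputs: false, loss is computed only on the assistant turns of each record.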

license:apache-2.0
11
0

MS3.2-PaintedFantasy-Visage-v4-34b-exl3-5bpw


license:mit
11
0

MS3.2-PaintedFantasy-v2-24b-exl3-3bpw

This is an uncensored creative model intended to excel at character-driven RP / ERP. Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), though it pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO

SFT with RP/ERP, stories and in-character assistant data. DPO focused on reducing repetition, misgendered characters and slop. KTO focused on further reducing repetition and slop.

Not optimized for cost / performance efficiency, YMMV.
SFT (1x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: yourentity
wandb_name: runname
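For reference, each line of a chat_template-type dataset like ./dataset.jsonl above is a single JSON object whose messages array follows the role mapping in the config. A hypothetical record (content invented for illustration; shown wrapped across lines here, but each record is one line in the .jsonl file):

```json
{
  "messages": [
    {"role": "system", "content": "You are Kaelen, a weary caravan guard. Stay in character."},
    {"role": "user", "content": "The gates are closing. Can we still make it inside?"},
    {"role": "assistant", "content": "Kaelen spits into the dust and shoulders his pack. \"If we run. Keep up.\""}
  ]
}
```

With train_on_inputs: false, only the assistant turns in each record contribute to the training loss.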

license:apache-2.0
10
0

MS3.2-PaintedFantasy-Visage-v2-33B-exl3-4bpw

A surprisingly difficult model to work with.
Removing the repetition came at the expense of the unique creativity the original upscale had, so I decided to upscale Painted Fantasy v2, heal it, and then merge the original upscale back in. The result is a smarter, uncensored, creative model that excels at character-driven RP / ERP, with characters portrayed creatively and proactively.

Creation process: Upscale > PT > SFT > KTO > DPO

Pretrained on approx. 300MB of light novels, stories and FineWeb-2 corpus. SFT on approx. 8 million tokens of SFW / NSFW RP, stories and creative instruct data. KTO on anti-repetition data created from the SFT datasets; rejected examples were generated by MS3.2 with repetition_penalty=0.9 and OOC commands encouraging it to misgender, impersonate the user, etc. DPO on a high quality, unreleased RP / NSFW dataset, with rejected samples created by the same method as for KTO.

The resulting model was not repetitive, but had lost some of the spark the original upscale had, so the original upscale was merged back in, taking care not to reintroduce repetition.

Merge configurations used during the model creation process:

Initial Upscale (Passthrough)

base_model: zerofata/MS3.2-PaintedFantasy-v2-24B
dtype: bfloat16
slices:
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [0, 29]
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [10, 39]

Final Merge (Slerp)

models:
  - model: zerofata/MS3.2-PaintedFantasy-Visage-33B
  - model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
merge_method: slerp
base_model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
parameters:
  t: [0.4, 0.2, 0, 0.2, 0.4]
dtype: bfloat16

Not optimized for cost / performance efficiency, YMMV.
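The slerp t values are anchor points rather than per-layer values: mergekit spreads the anchors evenly across the layer stack and linearly interpolates between them, so the middle layers stay closest to the base model (t near 0) while the outermost layers blend about 40% toward the other model. A minimal sketch of that interpolation, assuming evenly spaced anchors and a hypothetical 59-layer stack (the function is illustrative, not a mergekit API):

```python
def t_for_layers(anchors, num_layers):
    """Linearly interpolate slerp `t` anchor values across a layer stack."""
    segments = len(anchors) - 1
    ts = []
    for i in range(num_layers):
        x = i / (num_layers - 1)                 # layer position in [0, 1]
        seg = min(int(x * segments), segments - 1)
        frac = x * segments - seg                # position within this segment
        ts.append(anchors[seg] + frac * (anchors[seg + 1] - anchors[seg]))
    return ts

ts = t_for_layers([0.4, 0.2, 0, 0.2, 0.4], 59)
# first/last layers: t = 0.4; exact midpoint layer: t = 0 (pure base model)
```

In this merge t=0 means the healed PT/SFT/KTO/DPO model and t=1 the original upscale, so the schedule preserves the healed model's behavior in the middle layers while reintroducing the upscale's character at the edges.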
Pretrain (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv2upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-PT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1

SFT (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/automateddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/handcrafteddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/instructdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/storiesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwclaudedataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/summariesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-SFT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2

KTO (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: kto
rl_beta: 0.1
kto_desirable_weight: 1.25
kto_undesirable_weight: 1.0

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetkto.jsonl
    type: llama3.argilla
  - path: ./approvedrpdatasetkto.jsonl
    type: llama3.argilla
  - path: ./instructdatasetkto.jsonl
    type: llama3.argilla
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses
remove_unused_columns: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 15
weight_decay: 0.001
max_grad_norm: 0.01

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 100
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-KTO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-KTO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-KTO-1

DPO (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.1

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 2e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-DPO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-DPO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-DPO-1
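The DPO stage consumes preference pairs: a shared conversation prefix plus a chosen and a rejected assistant completion, matching the field_chosen / field_rejected mapping in the config. A hypothetical record (content invented for illustration; the exact shape of the chosen/rejected fields is an assumption about the chat_template.default format, and each record is one line in the .jsonl file):

```json
{
  "messages": [
    {"role": "system", "content": "Roleplay as Mira. Never speak for the user."},
    {"role": "user", "content": "I hand her the letter without a word."}
  ],
  "chosen": {"role": "assistant", "content": "Mira turns the envelope over twice before breaking the seal."},
  "rejected": {"role": "assistant", "content": "You hand me the letter and I read it. You hand me the letter and I read it."}
}
```

The rejected side here mirrors how the card describes generating rejections: deliberately repetitive or rule-breaking completions.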

license:apache-2.0
10
0

MS3.2-PaintedFantasy-Visage-v3-34B-exl3-4bpw

No layer left behind edition. Upscale redone with the missing final layer included.
The original upscales were always missing a layer, but I never troubleshot which one. Turns out it was the final layer. That's kind of an important one.

This is an uncensored creative writing and RP model. Compared to the older version it is smarter and, I think, has a bit less repetition. The old V2 version is slightly more creative, though, due to the instability it had.

Creation process: Upscale > CPT > SFT > DPO

Pretrained on approx. 300MB of light novel and FineWeb-2 corpus. SFT on approx. 8 million tokens of SFW / NSFW RP, stories and creative instruct data. DPO on a high quality RP / NSFW dataset with a focus on improving instruction following, reducing repetition and fixing common model mistakes.

Merge configuration used during the model creation process:

Upscale (Passthrough)

base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [0, 29]
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [10, 40]

Not optimized for cost / performance efficiency, YMMV.
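A quick way to see the off-by-one: mergekit's layer_range is half-open, so [start, end] selects layers start through end-1. On a 40-layer base, the old [10, 39] slice stopped at layer 38 and dropped layer 39 (the final layer), while [10, 40] keeps it, which is also why this upscale is one layer deeper than the older one. A small sketch (the function is illustrative, not a mergekit API):

```python
def passthrough_layers(slices):
    """Expand mergekit-style half-open layer ranges into a flat layer list."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))  # [start, end) selects start..end-1
    return layers

old = passthrough_layers([(0, 29), (10, 39)])  # original upscale: 58 layers
new = passthrough_layers([(0, 29), (10, 40)])  # redone upscale: 59 layers

assert 39 not in old and 39 in new  # layer 39, the final layer, was missing
```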
Pretrain (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv3upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-PT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1

SFT (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-SFT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2

DPO (2x H200)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.085

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/approvedautomatedl3dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2-DPO-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-DPO
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2-DPO-2
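For reference, the objective the DPO stage optimizes can be sketched as below, with beta = 0.085 as in the config above. This is a generic illustration of the DPO loss for one preference pair, not axolotl's actual implementation:

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.085):
    """Sketch of the DPO loss for a single preference pair.

    Arguments are summed log-probs of the chosen / rejected responses
    under the policy and the frozen reference model; beta plays the
    role of rl_beta in the config above.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): loss falls as the policy prefers the
    # chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Zero margin gives -log(0.5); a positive margin gives a smaller loss.
print(round(dpo_loss(0, 0, 0, 0), 4))          # 0.6931
print(dpo_loss(-10, -30, -12, -25) < 0.6931)   # True (margin = +7)
```

A small beta like 0.085 flattens the sigmoid, so the policy is allowed to drift further from the reference before the gradient saturates.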

9
1

L3.3-GeneticLemonade-Unleashed-70B-4.5bpw-h6-exl2

llama
8
0

MS3.2-PaintedFantasy-Visage-33B_exl3_4bpw

license:apache-2.0
8
0

MS3.2-PaintedFantasy-Visage-33B_exl3_5bpw

license:apache-2.0
8
0

MS3.2-PaintedFantasy-v2-24b-exl3-3.5bpw

This is an uncensored creative model intended to excel at character-driven RP / ERP.

Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), but pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO

SFT with RP / ERP, stories and in-character assistant data.
DPO focused on reducing repetition, misgendered characters and slop.
KTO focused on further reducing repetition and slop.

Not optimized for cost / performance efficiency, YMMV.
SFT (1x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: yourentity
wandb_name: runname
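The "only train on assistant responses" setting that appears throughout these configs means the loss is computed only on assistant tokens; everything else is masked out of the labels. A minimal sketch of that masking (hypothetical token IDs; real trainers derive the assistant spans from the tokenizer's chat template):

```python
IGNORE_INDEX = -100  # ignored by cross-entropy loss in PyTorch-style trainers

def mask_labels(token_ids, is_assistant):
    """Copy token_ids into labels, masking every non-assistant position."""
    return [tok if asst else IGNORE_INDEX
            for tok, asst in zip(token_ids, is_assistant)]

# Hypothetical 6-token turn: 3 user/system tokens, then 3 assistant tokens.
labels = mask_labels([11, 12, 13, 21, 22, 23],
                     [False, False, False, True, True, True])
print(labels)  # [-100, -100, -100, 21, 22, 23]
```

The pretrain stage uses `type: completion` instead, where every token contributes to the loss.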

license:apache-2.0
8
0

MS3.2-PaintedFantasy-Visage-v2-33B-exl3-6bpw

A surprisingly difficult model to work with.
Removing the repetition came at the expense of the unique creativity the original upscale had, so I decided to upscale Painted Fantasy v2, heal it, and then merge the original upscale back in. The result is a smarter, uncensored, creative model that excels at character-driven RP / ERP where characters are portrayed creatively and proactively.

Creation Process: Upscale > PT > SFT > KTO > DPO

Pretrained on approx. 300MB of light novels, stories and FineWeb-2 corpus.
SFT on approx. 8 million tokens of SFW / NSFW RP, stories and creative instruct data.
KTO on anti-repetition data created from the SFT datasets. Rejected examples were generated by MS3.2 with repetition_penalty=0.9 and OOC commands encouraging it to misgender characters, impersonate the user, etc.
DPO on a high quality, unreleased RP / NSFW dataset, with rejected samples created the same way as for KTO.

The resulting model was non-repetitive but had lost some of the spark the original upscale had, so the original upscale was merged back in, taking care not to reintroduce repetition.

Merge configurations used during the model creation process.

Initial Upscale (Passthrough)

base_model: zerofata/MS3.2-PaintedFantasy-v2-24B
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [0, 29]
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [10, 39]

Final Merge (Slerp)

models:
  - model: zerofata/MS3.2-PaintedFantasy-Visage-33B
  - model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
merge_method: slerp
base_model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
parameters:
  t: [0.4, 0.2, 0, 0.2, 0.4]
dtype: bfloat16

Not optimized for cost / performance efficiency, YMMV.
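In a slerp merge, each tensor is spherically interpolated between the two models, and the `t` list is a per-depth schedule (assuming t = 0 keeps the base model, the healed model's middle layers are left untouched while the outer layers lean 40% toward the original upscale). A generic slerp sketch, my own illustration rather than mergekit's exact code:

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:  # nearly colinear: fall back to plain lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

print(slerp([1.0, 0.0], [0.0, 1.0], 0.0))  # [1.0, 0.0] -> stays on the base
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)   # halfway along the arc
```

Unlike linear interpolation, slerp preserves the norm of the interpolated direction, which is often gentler on attention weights.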
Pretrain (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv2upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-PT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1

SFT (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/automateddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/handcrafteddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/instructdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/storiesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwclaudedataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/summariesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-SFT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2

KTO (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: kto
rl_beta: 0.1
kto_desirable_weight: 1.25
kto_undesirable_weight: 1.0

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetkto.jsonl
    type: llama3.argilla
  - path: ./approvedrpdatasetkto.jsonl
    type: llama3.argilla
  - path: ./instructdatasetkto.jsonl
    type: llama3.argilla
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses
remove_unused_columns: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 15
weight_decay: 0.001
max_grad_norm: 0.01

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 100
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-KTO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-KTO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-KTO-1

DPO (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.1

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 2e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-DPO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-DPO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-DPO-1
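The KTO stage above up-weights desirable examples (desirable weight 1.25 vs. undesirable weight 1.0), which matters when the two classes are imbalanced. A simplified sketch of that asymmetry; the real KTO loss also subtracts a reference-model KL baseline, which is folded into `reward_gap` here:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kto_term(reward_gap, desirable, beta=0.1,
             desirable_weight=1.25, undesirable_weight=1.0):
    """Per-example KTO-style loss term. reward_gap is the policy-vs-
    reference log-ratio minus the batch KL baseline (simplified)."""
    if desirable:
        return desirable_weight * (1.0 - sigmoid(beta * reward_gap))
    return undesirable_weight * (1.0 - sigmoid(-beta * reward_gap))

# Same |gap| on both sides, but the desirable side is up-weighted 1.25x.
ratio = kto_term(5.0, desirable=True) / kto_term(-5.0, desirable=False)
print(round(ratio, 6))  # 1.25
```

Unlike DPO, KTO needs only per-example "good / bad" labels rather than paired chosen/rejected responses, which fits the anti-repetition data described above.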

license:apache-2.0
8
0

MS3.2-PaintedFantasy-Visage-33B

.container .dropdown-summary { cursor: pointer; color: var(--secondary-accent); font-size: 1.2rem; font-family: var(--font-heading); font-weight: 600; text-transform: uppercase; letter-spacing: 1px; list-style: none; padding: 5px 0; display: flex; align-items: center; } .container .dropdown-summary::-webkit-details-marker { display: none; } .container .dropdown-arrow { color: var(--primary-accent); margin-right: 15px; transition: transform 0.2s ease; display: inline-block; } .container .dropdown-container[open] .dropdown-arrow { transform: rotate(90deg); } .container .dropdown-content { margin-top: 15px; padding: 20px; background-color: var(--bg-main); border-left: 3px solid var(--primary-accent); } .container .config-title { color: var(--secondary-accent); font-size: 1rem; margin-bottom: 10px; font-family: var(--font-heading); text-transform: uppercase; letter-spacing: 1px; } .container pre { background-color: var(--bg-main); padding: 15px; border: 1px solid rgba(248, 96, 44, 0.4); white-space: pre-wrap; word-wrap: break-word; color: var(--text-main); } .container code { font-family: var(--font-code); } Another experimental release. Mistral Small 3.2 24B upscaled by 18 layers to create a 33.6B model. This model then went through pretraining, SFT & DPO. Can't guarantee the Mistral 3.2 repetition issues are fixed, but this model seems to be less repetitive than my previous attempt. This is an uncensored creative model intended to excel at character driven RP / ERP where characters are portrayed creatively and proactively. Creation process: Upscale > Pretrain > SFT > DPO All training was qlora (including pretrain). Pretrained on 177MB of data. Dataset consisteted mostly of Light Novels, NSFW stories, SFW stories and filled out with general corpus text from Huggingface FineWeb-2 dataset. The model then went through SFT using a dataset of approx 3.6 million tokens, 700 RP conversations, 1000 creative writing / instruct samples and about 100 summaries. 
The bulk of this data has been made public. Finally, DPO was used to make the model more consistent.

Upscale (Passthrough)

```yaml
base_model: anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
        layer_range: [0, 29]
  - sources:
      - model: anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
        layer_range: [10, 39]
```

Not optimized for cost / performance efficiency, YMMV.

SFT (1x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./UpscaleMistral-PT/merged
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 4
gradient_accumulation_steps: 2
learning_rate: 1.5e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 5
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./UpscaleMistral-PT-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: MS3-2-SFT
wandb_entity: yourentity
wandb_name: runname
```
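A quick sketch (mine, not from the card) of how the passthrough slices above produce the upscaled layer count, assuming mergekit's half-open `layer_range: [start, end)` convention:

```python
# Hypothetical sketch: layer arithmetic for the passthrough upscale.
# Each slice contributes (end - start) layers under the half-open convention.
slices = [(0, 29), (10, 39)]

total_layers = sum(end - start for start, end in slices)
# layers 10..28 appear in both slices, i.e. they are duplicated in the merge
duplicated = len(set(range(0, 29)) & set(range(10, 39)))

print(total_layers)  # 58: the base model's 40 layers plus the +18 described above
print(duplicated)    # 19 duplicated layers (the final base layer falls outside both ranges)
```

The net gain is 18 rather than 19 because the second slice stops one layer short of the end of the base model.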

license:apache-2.0

MS3.2-PaintedFantasy-v2-24b-exl3-6bpw

This is an uncensored creative model intended to excel at character-driven RP / ERP. Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), but pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO

- SFT with RP/ERP, stories and in-character assistant data.
- DPO focused on reducing repetition, misgendered characters and slop.
- KTO focused on further reducing repetition and slop.

Not optimized for cost / performance efficiency, YMMV.
SFT (1x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: yourentity
wandb_name: runname
```
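For readers tuning the training parameters above: the samples seen per optimizer step fall out of the micro-batch size, gradient accumulation and GPU count. This is the standard formula, not something stated in the card, and the helper name is mine:

```python
def effective_batch_size(micro_batch_size: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to each optimizer update."""
    return micro_batch_size * grad_accum_steps * num_gpus

# This card's SFT run: 1x H100, micro batch 8, no accumulation
print(effective_batch_size(8, 1, 1))  # 8
# The 33B upscale's SFT run: micro batch 4, accumulation 2 - same effective batch
print(effective_batch_size(4, 2, 1))  # 8
```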

license:apache-2.0

MS3.2-PaintedFantasy-Visage-v2-33B-exl3-3bpw

A surprisingly difficult model to work with. Removing the repetition was coming at the expense of the unique creativity the original upscale had. I decided on upscaling Painted Fantasy v2, healing it and then merging the original upscale back in.

The result is a smarter, uncensored, creative model that excels at character-driven RP / ERP where characters are portrayed creatively and proactively.

Creation process: Upscale > PT > SFT > KTO > DPO

- Pretrained on approx 300MB of light novels, stories and FineWeb-2 corpus.
- SFT on approx 8 million tokens: SFW / NSFW RP, stories and creative instruct data.
- KTO on anti-rep data created from the SFT datasets. Rejected examples were generated by MS3.2 with repetition_penalty=0.9 and OOC commands encouraging it to misgender characters, impersonate the user, etc.
- DPO on a high quality, unreleased RP / NSFW dataset, using rejected samples created the same way as for KTO.

The resulting model was non-repetitive, but had lost some of the spark the original upscale had. I merged the original upscale back in, making sure not to reintroduce repetition.

Merge configurations used during the model creation process.

Initial Upscale (Passthrough)

```yaml
base_model: zerofata/MS3.2-PaintedFantasy-v2-24B
dtype: bfloat16
slices:
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [0, 29]
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [10, 39]
```

Final Merge (Slerp)

```yaml
models:
  - model: zerofata/MS3.2-PaintedFantasy-Visage-33B
  - model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
merge_method: slerp
base_model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
parameters:
  t: [0.4, 0.2, 0, 0.2, 0.4]
dtype: bfloat16
```

Not optimized for cost / performance efficiency, YMMV.
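The Slerp merge above blends the two checkpoints with a per-layer-group t of [0.4, 0.2, 0, 0.2, 0.4]: middle layers stay closest to the healed base (t=0) while the outer layers take more of the original upscale. A minimal sketch of spherical linear interpolation on flattened weight vectors (the standard slerp formula, for illustration only; this is not mergekit's actual implementation):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Interpolate along the arc between two weight vectors; t=0 returns v0."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # clamp to avoid math domain errors from floating point drift
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1 + eps)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:  # nearly parallel vectors: plain lerp is fine
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t=0 (the middle entry of the gradient) keeps the base model unchanged
base_kept = slerp(0.0, [1.0, 0.0], [0.0, 1.0])
# t=0.4 (the outer layer groups) pulls noticeably toward the second model
blended = slerp(0.4, [1.0, 0.0], [0.0, 1.0])
```

Unlike plain linear interpolation, slerp preserves the norm of the blended vector, which is one reason it is a popular merge method for model weights.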
Pretrain (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv2upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-PT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1
```

SFT (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/automateddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/handcrafteddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/instructdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/storiesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwclaudedataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/summariesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-SFT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2
```

KTO (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: kto
rl_beta: 0.1
kto_desirable_weight: 1.25
kto_undesirable_weight: 1.0

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetkto.jsonl
    type: llama3.argilla
  - path: ./approvedrpdatasetkto.jsonl
    type: llama3.argilla
  - path: ./instructdatasetkto.jsonl
    type: llama3.argilla
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses
remove_unused_columns: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 15
weight_decay: 0.001
max_grad_norm: 0.01

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 100
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-KTO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-KTO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-KTO-1
```

DPO (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.1

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 2e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-DPO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-DPO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-DPO-1
```
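For context on the DPO stage above: the standard DPO objective, which the rl_beta value parameterizes, scores the policy's preference margin between chosen and rejected completions against a frozen reference model. A toy sketch of that formulation (standard DPO, not code from this card; the function name and numbers are mine):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair, given summed log-probs."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid(beta * margin))

# Policy identical to the reference: margin is 0, loss is log(2)
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy prefers the chosen completion more than the reference does: lower loss
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

A small beta (0.1 here) tolerates larger drift from the reference before the loss saturates, which fits a light preference-tuning pass on top of SFT.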

license:apache-2.0

MS3.2-PaintedFantasy-Visage-33B_exl3_3bpw

license:apache-2.0

L3.3-GeneticLemonade-Opus-70B_exl3-4.25bpw

llama

nomad-llama-70b-4.25bpw-exl3

llama

MS3.2-PaintedFantasy-v2-24b-exl3-5bpw

This is an uncensored creative model intended to excel at character-driven RP / ERP. Version 2 feels quite different from the original, with a heavy focus on reducing repetition across conversations and improving instruction following. It has a pretty unique writing style and sense of creativity (IMO), but pays the price with intermittent brain farts.

Training process: SFT > DPO > KTO

- SFT with RP / ERP, stories and in-character assistant data.
- DPO focused on reducing repetition, misgendered characters and slop.
- KTO focused on further reducing repetition and slop.

Not optimized for cost / performance efficiency, YMMV.
SFT (1x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 8
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: true

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./PT-SFT1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: PF-SFT
wandb_entity: yourentity
wandb_name: runname
```
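The `chat_template` dataset entry above expects an OpenAI-style `messages` list per JSONL line. As a rough illustration (the record text is invented; only the field layout follows the `field_messages` / role mapping in the config), one line of `dataset.jsonl` could be produced like this:

```python
import json

# Hypothetical record: field names follow the config's mapping
# (field_messages: messages, role/content keys, user/assistant/system roles).
record = {
    "messages": [
        {"role": "system", "content": "You are Mira, a wandering bard."},
        {"role": "user", "content": "Play something cheerful."},
        {"role": "assistant", "content": "Mira grins and strikes up a quick jig."},
    ]
}

# One line of dataset.jsonl; with train_on_inputs: false, loss is only
# computed on the assistant turn.
line = json.dumps(record)
```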

license:apache-2.0
6
0

MS3.2-PaintedFantasy-Visage-v4-34b-exl3-3bpw

license:mit
6
0

L3.3-GeneticLemonade-Unleashed-70B-6bpw-h8-exl2

llama
5
0

MS3.2-PaintedFantasy-24B_exl3-5bpw

license:apache-2.0
5
0

edens-fall-l3.3-70b-0.3c-exl3-4.25bpw

llama
5
0

MS3.2-PaintedFantasy-Visage-v2-33B-exl3-5bpw

A surprisingly difficult model to work with.
Removing the repetition came at the expense of the unique creativity the original upscale had, so I decided to upscale Painted Fantasy v2, heal it, and then merge the original upscale back in. The result is a smarter, uncensored, creative model that excels at character-driven RP / ERP where characters are portrayed creatively and proactively.

Creation process: Upscale > PT > SFT > KTO > DPO

- Pretrained on approx. 300MB of light novels, stories and FineWeb-2 corpus.
- SFT on approx. 8 million tokens of SFW / NSFW RP, stories and creative instruct data.
- KTO on anti-rep data created from the SFT datasets. Rejected examples were generated by MS3.2 with repetition_penalty=0.9 and OOC commands encouraging it to misgender, impersonate the user, etc.
- DPO on a high-quality, unreleased RP / NSFW dataset, with rejected samples created the same way as for KTO.

The resulting model was non-repetitive but had lost some of the spark the original upscale had, so the original upscale was merged back in, taking care not to reintroduce repetition.

Merge configurations used during the model creation process.

Initial Upscale (Passthrough)

```yaml
base_model: zerofata/MS3.2-PaintedFantasy-v2-24B
dtype: bfloat16
slices:
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [0, 29]
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [10, 39]
```

Final Merge (Slerp)

```yaml
models:
  - model: zerofata/MS3.2-PaintedFantasy-Visage-33B
  - model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
merge_method: slerp
base_model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
parameters:
  t: [0.4, 0.2, 0, 0.2, 0.4]
dtype: bfloat16
```

Not optimized for cost / performance efficiency, YMMV.
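The slerp `t` gradient above means the merge leans most on the original upscale (t = 0.4) at both ends of the network and keeps the healed base model (t = 0) around the middle layers. A pure-Python sketch of how such an anchor list could be spread across a 58-layer stack (an assumption about mergekit's evenly spaced interpolation, not its exact code):

```python
def spread_t(anchors, n_layers):
    """Linearly interpolate evenly spaced anchor values across layers."""
    ts = []
    for layer in range(n_layers):
        x = layer / (n_layers - 1) * (len(anchors) - 1)  # position in anchor space
        lo = int(x)
        hi = min(lo + 1, len(anchors) - 1)
        frac = x - lo
        ts.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return ts

per_layer_t = spread_t([0.4, 0.2, 0.0, 0.2, 0.4], 58)
```

With this schedule the middle layers stay closest to the anti-rep healed train, while the outer layers reintroduce the upscale's character.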
Pretrain (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv2upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-PT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1
```

SFT (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/automateddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/handcrafteddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/instructdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/storiesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwclaudedataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/summariesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-SFT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2
```

KTO (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: kto
rl_beta: 0.1
kto_desirable_weight: 1.25
kto_undesirable_weight: 1.0

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetkto.jsonl
    type: llama3.argilla
  - path: ./approvedrpdatasetkto.jsonl
    type: llama3.argilla
  - path: ./instructdatasetkto.jsonl
    type: llama3.argilla
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses
remove_unused_columns: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 15
weight_decay: 0.001
max_grad_norm: 0.01

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 100
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-KTO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-KTO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-KTO-1
```

DPO (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.1

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 2e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-DPO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-DPO
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2-DPO-1
```
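The rejected samples for the KTO and DPO stages above were generated with repetition_penalty=0.9. In the usual HF-transformers formulation, a penalty below 1.0 boosts tokens that have already appeared rather than suppressing them, which is what makes the rejected outputs degenerately repetitive. A minimal sketch of that rule (illustrative, not the library's actual implementation):

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    # HF-style rule: positive logits are divided by the penalty and negative
    # logits multiplied by it, so penalty < 1.0 makes seen tokens MORE likely.
    out = list(logits)
    for tok in set(seen_token_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

boosted = apply_repetition_penalty([2.0, -1.0, 0.5], seen_token_ids=[0, 1], penalty=0.9)
```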

license:apache-2.0
5
0

MS3.2 PaintedFantasy Visage V3 34B Exl3 5bpw

No layer left behind edition. Upscale redone with the missing final layer included.
The original upscales were always missing a layer, but I never troubleshot to identify which one. Turns out it was the final layer. That's kind of an important one. This is an uncensored creative writing and RP model. Compared to the older version it is smarter and, I think, has a bit less repetition, though the old V2 version is slightly more creative due to the instability it had.

Creation process: Upscale > CPT > SFT > DPO

- Pretrained on approx. 300MB of light novel and FineWeb-2 corpus.
- SFT on approx. 8 million tokens of SFW / NSFW RP, stories and creative instruct data.
- DPO on a high-quality RP / NSFW dataset focused on improving instruction following, reducing repetition and fixing common model mistakes.

Merge configuration used during the model creation process.

Upscale (Passthrough)

```yaml
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [0, 29]
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [10, 40]
```

Not optimized for cost / performance efficiency, YMMV.
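Assuming mergekit's half-open `layer_range` convention ([a, b] takes layers a through b-1), a quick check shows the fix: the older [10, 39] slice stopped one short of the 40-layer base's final layer, while [10, 40] keeps it.

```python
def upscaled_layer_count(slices):
    # Each (start, end) pair is half-open, mergekit-style.
    return sum(end - start for start, end in slices)

old_upscale = upscaled_layer_count([(0, 29), (10, 39)])  # earlier upscales: layer 39 dropped
new_upscale = upscaled_layer_count([(0, 29), (10, 40)])  # this upscale: final layer included
print(old_upscale, new_upscale)  # 58 59
```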
Pretrain (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv3upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-PT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1

SFT (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-SFT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2

DPO (2x H200)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.085

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/approvedautomatedl3dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2-DPO-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-DPO
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2-DPO-2
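For reference, the objective behind the DPO stage above scores each preference pair as -log σ(β · margin), where the margin compares the policy-vs-reference log-probability gap of the chosen response against that of the rejected one. A minimal scalar sketch (plain Python for illustration, not the Axolotl/TRL implementation; the default beta here mirrors rl_beta: 0.085 from the config):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.085):
    """Scalar DPO loss for one preference pair:
    -log(sigmoid(beta * [(chosen gap) - (rejected gap)])),
    where each gap is the policy log-prob minus the reference log-prob."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# No preference signal yet: loss starts at log(2)
start = dpo_loss(0.0, 0.0, 0.0, 0.0)
# Policy already favors the chosen response: loss falls below log(2)
improved = dpo_loss(-10.0, -30.0, -20.0, -20.0)
```

A small beta like 0.085 keeps the per-pair gradient gentle, which fits the stated goal of nudging style (repetition, instruction following) without drifting far from the SFT model.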

NaNK
4
1

L3.3-GeneticLemonade-Unleashed-v2.1-70B-4.5bpw-hb6-exl2

NaNK
llama
4
0

MS3.2-PaintedFantasy-24B_exl3-3bpw

NaNK
license:apache-2.0
4
0

tesseract-v2.0-llama-70b_exl3-4.25bpw

NaNK
llama
4
0

MS3.2-PaintedFantasy-Visage-v3-34B-exl3-3bpw

No layer left behind edition. Upscale redone with the missing final layer included.

NaNK
4
0

MS3.2-PaintedFantasy-Visage-v2-33B

A surprisingly difficult model to work with.
Removing the repetition came at the expense of the unique creativity the original upscale had. I settled on upscaling Painted Fantasy v2, healing it, and then merging the original upscale back in. The result is a smarter, uncensored, creative model that excels at character-driven RP / ERP where characters are portrayed creatively and proactively.

Creation Process: Upscale > PT > SFT > KTO > DPO

- Pretrained on approx. 300 MB of light novels, SFW / NSFW stories and FineWeb-2 corpus.
- SFT on approx. 8 million tokens: SFW / NSFW RP, stories and creative instruct data.
- KTO on anti-repetition data created from the SFT datasets. Rejected examples were generated by MS3.2 with repetition_penalty=0.9 and OOC commands encouraging it to misgender, impersonate the user, etc.
- DPO on an unreleased high-quality RP / NSFW dataset, with rejected samples created the same way as for KTO.

The resulting model was non-repetitive but had lost some of the spark the original upscale had, so I merged the original upscale back in, taking care not to reintroduce repetition.

Merge configurations used during the model creation process.

Initial Upscale (Passthrough)

base_model: zerofata/MS3.2-PaintedFantasy-v2-24B
dtype: bfloat16
slices:
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [0, 29]
  - sources:
      - model: zerofata/MS3.2-PaintedFantasy-v2-24B
        layer_range: [10, 39]

Final Merge (Slerp)

models:
  - model: zerofata/MS3.2-PaintedFantasy-Visage-33B
  - model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
merge_method: slerp
base_model: ../axolotl/Visage-V2-PT-1-SFT-2-KTO-1-DPO-1/merged
parameters:
  t: [0.4, 0.2, 0, 0.2, 0.4]
dtype: bfloat16

Not optimized for cost / performance efficiency, YMMV.
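The slerp step interpolates each tensor pair along the arc between the two models rather than along a straight line, and mergekit spreads the t list across the layer stack: with t: [0.4, 0.2, 0, 0.2, 0.4] and t=0 meaning "keep the base (healed) model", the outer layers lean toward the original upscale while the middle layers stay untouched. A minimal per-vector sketch of the interpolation itself (illustrative only, not mergekit's implementation):

```python
import math

def slerp(t, a, b):
    """Spherical linear interpolation between two flat vectors:
    t=0 returns a, t=1 returns b, intermediate t follows the arc."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard acos domain
    omega = math.acos(cos_omega)
    if omega < 1e-8:  # nearly parallel vectors: plain lerp is fine
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(omega)
    return [(math.sin((1 - t) * omega) * x + math.sin(t * omega) * y) / s
            for x, y in zip(a, b)]

# Orthogonal toy vectors: the halfway point lands at 45 degrees
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Since slerp(0, a, b) == a, the t = 0 entry in the schedule means the middle layers are exactly the healed DPO model's tensors.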
Pretrain (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv2upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-PT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1

SFT (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/automateddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/handcrafteddataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/instructdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/storiesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/cwclaudedataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
  - path: ./data/summariesdataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 2
micro_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V2-SFT
# wandb_entity: yourentity
wandb_name: Visage-V2-PT-1-SFT-2

KTO (4x H100)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V2-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: kto
rl_beta: 0.1
kto_desirable_weight: 1.25
kto_undesirable_weight: 1.0

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./handcrafteddatasetkto.jsonl
    type: llama3.argilla
  - path: ./approvedrpdatasetkto.jsonl
    type: llama3.argilla
  - path: ./instructdatasetkto.jsonl
    type: llama3.argilla
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses
remove_unused_columns: False

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 5e-6
optimizer: adamw_8bit
lr_scheduler: cosine
warmup_steps: 15
weight_decay: 0.001
max_grad_norm: 0.01

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# CHECKPOINTING
# ====================
save_steps: 100
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V2-PT-1-SFT-2-KTO-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project:
Visage-V2-KTO # wandbentity: yourentity wandbname: Visage-V2-PT-1-SFT-2-KTO-1 DPO 4H100 # ==================== # MODEL CONFIGURATION # ==================== basemodel: ./Visage-V2-PT-1-SFT-2/merged modeltype: MistralForCausalLM tokenizertype: AutoTokenizer chattemplate: mistralv7tekken # ==================== # RL/DPO CONFIGURATION # ==================== rl: dpo rlbeta: 0.1 # ==================== # DATASET CONFIGURATION # ==================== datasets: - path: ./handcrafteddatasetmistralrep.jsonl type: chattemplate.default fieldmessages: messages fieldchosen: chosen fieldrejected: rejected messagepropertymappings: role: role content: content roles: system: ["system"] user: ["user"] assistant: ["assistant"] datasetpreparedpath: trainoninputs: false # Only train on assistant responses # ==================== # QLORA CONFIGURATION # ==================== adapter: qlora loadin4bit: true lorar: 16 loraalpha: 32 loradropout: 0.1 loratargetlinear: true # loramodulestosave: # Uncomment only if you added NEW tokens # ==================== # TRAINING PARAMETERS # ==================== numepochs: 1 microbatchsize: 2 gradientaccumulationsteps: 1 learningrate: 2e-6 optimizer: adamw8bit lrscheduler: cosine warmupsteps: 5 weightdecay: 0.01 maxgradnorm: 1.0 # ==================== # SEQUENCE CONFIGURATION # ==================== sequencelen: 8192 padtosequencelen: true # ==================== # HARDWARE OPTIMIZATIONS # ==================== bf16: auto tf32: false flashattention: true gradientcheckpointing: offload deepspeed: deepspeedconfigs/zero1.json plugins: - axolotl.integrations.liger.LigerPlugin - axolotl.integrations.cutcrossentropy.CutCrossEntropyPlugin cutcrossentropy: true ligerrope: true ligerrmsnorm: true ligerlayernorm: true ligergluactivation: true ligercrossentropy: false # Cut Cross Entropy overrides this ligerfusedlinearcrossentropy: false # Cut Cross Entropy overrides this # ==================== # CHECKPOINTING # ==================== savesteps: 10 savetotallimit: 10 
loadbestmodelatend: true metricforbestmodel: evalloss greaterisbetter: false # ==================== # LOGGING & OUTPUT # ==================== outputdir: ./Visage-V2-PT-1-SFT-2-DPO-1 loggingsteps: 2 savesafetensors: true # ==================== # WANDB TRACKING # ==================== wandbproject: Visage-V2-DPO # wandbentity: yourentity wandbname: Visage-V2-PT-1-SFT-2-DPO-1
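The three stages above use quite different batch geometries. Assuming all four GPUs of the "4xH100" runs participate in data parallelism (an assumption; the configs don't state the world size explicitly), the effective global batch size per optimizer step can be sketched as:

```python
def effective_batch_size(micro_batch_size, grad_accum_steps, num_gpus):
    """Global batch size per optimizer step under data parallelism."""
    return micro_batch_size * grad_accum_steps * num_gpus

# Figures taken from the configs above
sft = effective_batch_size(2, 1, 4)  # SFT: micro batch 2, no accumulation
kto = effective_batch_size(4, 4, 4)  # KTO: micro batch 4, grad accum 4
dpo = effective_batch_size(2, 1, 4)  # DPO: micro batch 2, no accumulation
print(sft, kto, dpo)  # 8 64 8
```

The KTO stage trains on a much larger effective batch than the SFT and DPO stages, which is common for preference-optimization runs where per-step gradient noise matters.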

license:apache-2.0
3
7

MS3.2 PaintedFantasy Visage V3 34B Exl3 6bpw

No layer left behind edition. Upscale redone with the missing final layer included.
The original upscales were always missing a layer, but I never troubleshot which layer it was. Turns out it was the final layer. That's kind of an important one.

This model is an uncensored creative writing and RP model. Compared to the older version it is smarter and, I think, has a bit less repetition. The old V2 version is slightly more creative, though, due to the instability it had.

Creation Process: Upscale > CPT > SFT > DPO

Pretrained on approx 300MB of light novel and FineWeb-2 corpus. SFT on approx 8 million tokens of SFW / NSFW RP, stories and creative instruct data. DPO on a high quality RP / NSFW dataset with a focus on improving instruction following, reducing repetition and fixing common model mistakes.

Merge configuration used during the model creation process.

Upscale (Passthrough)

```yaml
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
merge_method: passthrough
dtype: bfloat16
slices:
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [0, 29]
  - sources:
      - model: ConicCat/Mistral-Small-3.2-AntiRep-24B
        layer_range: [10, 40]
```

Not optimized for cost / performance efficiency, YMMV.
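Assuming mergekit's `layer_range` is end-exclusive ([start, end), the usual convention, but worth verifying against your mergekit version), the passthrough slices above can be sanity-checked with a few lines of arithmetic:

```python
def merged_layer_count(layer_ranges):
    """Decoder layers produced by a mergekit passthrough merge.

    Assumes each layer_range is half-open [start, end).
    """
    return sum(end - start for start, end in layer_ranges)

# Slices from the upscale config: [0, 29] and [10, 40]
total = merged_layer_count([(0, 29), (10, 40)])
print(total)  # 59
```

Stacking 29 + 30 layers yields 59, versus the 40 layers of the 24B donor model, which is roughly where the 34B parameter count comes from.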
Pretrain 4H100

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv3upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretrain_dataset_v5_stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-PT
wandb_entity: your_entity
wandb_name: Visage-V3-PT-1
```

SFT 4H100

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-SFT
wandb_entity: your_entity
wandb_name: Visage-V3-PT-1-SFT-2
```

DPO 2H200

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.085

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/handcrafted_dataset_mistral_rep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/approved_automated_l3_dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2-DPO-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-DPO
wandb_entity: your_entity
wandb_name: Visage-V3-PT-1-SFT-2-DPO-2
```
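For intuition about the DPO stage above (which uses a beta of 0.085), the textbook DPO objective per preference pair is -log sigmoid(beta * margin), where the margin compares how much more the trained policy prefers the chosen response than the frozen reference model does. A minimal pure-Python sketch, not the actual trainer implementation:

```python
import math

def dpo_loss(beta, policy_chosen, ref_chosen, policy_rejected, ref_rejected):
    """Textbook DPO loss for one preference pair.

    Arguments are summed token log-probs of the chosen/rejected responses
    under the trained policy and the frozen reference (SFT) model.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid

# With no preference shift the loss sits at log(2); a positive margin
# (policy favors the chosen response more than the reference does) lowers it.
baseline = dpo_loss(0.085, 0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(0.085, -10.0, -12.0, -15.0, -13.0)
print(round(baseline, 4), improved < baseline)  # 0.6931 True
```

A smaller beta (0.085 here versus the 0.1 used for V2) softens the implicit KL constraint less per unit of margin, letting larger log-prob gaps accumulate before the loss saturates.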

3
2

L3.3-Genetic-Lemonade-Sunset-70B-4.5bpw-hb6-exl2

llama
3
0

L3.3-GeneticLemonade-Unleashed-v2-70b_6bpw-hb8-exl2

An experimental release. zerofata/GeneticLemonade-Unleashed qlora trained on a test dataset. Performance is improved from the original in my testing, but there are possibly (likely?) areas where the model will underperform, which I am looking for feedback on.

This is a creative model intended to excel at character driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing.

Play with these; they are not the 'best' settings, just a stable baseline.

Recommended Samplers

Llama-3-Instruct-Names, but you will need to uncheck "System same as user".

The model was trained on a tiny synthetic dataset of 450k tokens, approximately 130 conversations. Data was generated by script and then manually reviewed / edited. The dataset is approximately 60% SFW and 40% NSFW: 90% multi-turn RP conversations, 5% creative writing and 5% miscellaneous.
It is an experiment to see how models perform when provided with small amounts of high quality synthetic data, as opposed to human data.
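The stated percentages imply rough per-category token budgets. A quick back-of-the-envelope sketch (approximate figures only, taken from the card):

```python
def category_tokens(total_tokens, shares):
    """Rough token counts per category from stated percentages."""
    return {name: round(total_tokens * pct) for name, pct in shares.items()}

# Figures stated on the card: ~450k tokens over ~130 conversations
content = category_tokens(450_000, {"rp": 0.90, "creative_writing": 0.05, "misc": 0.05})
rating = category_tokens(450_000, {"sfw": 0.60, "nsfw": 0.40})
avg_conversation_len = 450_000 // 130
print(content["rp"], rating["sfw"], avg_conversation_len)  # 405000 270000 3461
```

So each conversation averages roughly 3.5k tokens, consistent with multi-turn RP logs rather than single exchanges.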

llama
3
0

L3.3-GeneticLemonade-Final-v2-70B_4.65bpw-hb6-exl2

llama
3
0

Tesseract-V0.2-Llama-70b_exl3-4.5bpw

llama
3
0

Tesseract-V0.2-Llama-70b_exl3-4.25bpw

llama
3
0

MS3.2-PaintedFantasy-24B_exl3-6bpw

license:apache-2.0
3
0

Austral-70b-Winton_exl3-4.25bpw

llama
3
0

MS3.2-PaintedFantasy-24B_exl3-4bpw

license:apache-2.0
2
0

MS3.2-PaintedFantasy-Visage-v3-34B-exl3-4.25bpw

.container { --primary-accent: #C0C0C0; --secondary-accent: #4A9EFF; --glow-primary: rgba(192, 192, 192, 0.6); --glow-secondary: rgba(74, 158, 255, 0.6); --bg-main: #0B0A18; --bg-container: #110F24; --bg-card: rgba(20, 18, 40, 0.7); --text-main: #DCDCDC; --text-muted: #9E9E9E; --white: #FFFFFF; --border-color: #3C3A50; --font-title: 'Cinzel', serif; --font-body: 'EB Garamond', serif; --font-code: 'Courier New', monospace; font-family: var(--font-body); color: var(--text-main); line-height: 1.6; font-weight: 400; max-width: 1100px; margin: 20px auto; padding: 25px; background-color: var(--bg-main); background-image: linear-gradient(rgba(11, 10, 24, 0.95), rgba(11, 10, 24, 0.95)), url('https://www.transparenttextures.com/patterns/stardust.png'); min-height: calc(100vh - 40px); border-radius: 8px; box-shadow: 0 0 25px rgba(0,0,0,0.7); border: 1px solid var(--border-color); } .container .title-container { background: linear-gradient(135deg, rgba(20, 18, 40, 0.8), rgba(30, 28, 50, 0.6)); margin-bottom: 30px; border: 1px solid var(--border-color); border-radius: 6px; padding: 25px; text-align: center; position: relative; box-shadow: 0 5px 15px rgba(0,0,0,0.4); overflow: hidden; } .container .title-main { color: var(--white); font-size: 2.5rem; font-weight: 700; margin: 0; letter-spacing: 4px; display: block; text-transform: uppercase; text-shadow: 0 0 4px var(--glow-primary), 0 0 8px var(--glow-primary), 0 0 12px var(--glow-primary); font-family: var(--font-title); } .container .lemonade-text { color: var(--secondary-accent); text-shadow: 0 0 8px var(--glow-secondary); } .container .title-subtitle { padding-left: 0; margin-top: 15px; } .container .subtitle-text { color: var(--text-muted); font-size: 1.2rem; font-family: var(--font-body); font-style: italic; font-weight: 400; letter-spacing: 2px; text-transform: uppercase; opacity: 0.8; } .container img { max-width: 100%; border: 2px solid var(--border-color); margin-bottom: 40px; box-shadow: 0 5px 15px rgba(0,0,0,0.5); 
border-radius: 4px; } .container .section-container { margin-bottom: 25px; padding-bottom: 25px; border-bottom: 1px dashed var(--border-color); } .container .section-container:last-of-type { border-bottom: none; padding-bottom: 0; margin-bottom: 0; } .container .section-header { display: flex; align-items: center; padding: 0 0 15px 0; } .container .section-title { font-family: var(--font-title); background: linear-gradient(45deg, var(--secondary-accent), var(--primary-accent)); background-clip: text; -webkit-background-clip: text; -webkit-text-fill-color: transparent; font-size: 1.4rem; margin: 0 !important; padding: 0 0 10px 0 !important; letter-spacing: 1px; font-weight: 700; text-transform: uppercase; border: none !important; position: relative; display: inline-block; } .container .section-title::after { content: ''; position: absolute; bottom: 0; left: 0; width: 100%; height: 2px; background-image: linear-gradient(to right, var(--secondary-accent), var(--primary-accent)); box-shadow: 0 0 6px var(--glow-secondary), 0 0 6px var(--glow-primary); border-radius: 2px; } .container .section-content { padding: 20px 0 0 0; } .container .subheading { color: var(--secondary-accent); font-size: 1.1rem; margin-top: 20px; margin-bottom: 12px; font-weight: 700; display: block; text-transform: uppercase; letter-spacing: 2px; font-family: var(--font-title); border-bottom: 1px solid var(--secondary-accent); padding-bottom: 6px; text-shadow: 0 0 4px var(--glow-secondary); } .container .data-box { background-color: var(--bg-card); padding: 15px; border: 1px solid var(--border-color); border-left: 2px solid var(--primary-accent); margin-bottom: 15px; box-shadow: inset 0 0 6px rgba(0,0,0,0.4); border-radius: 4px; font-size: 1rem; } .container .data-row { display: flex; align-items: center; margin-bottom: 6px; padding: 5px 0; } .container .data-row:last-child { margin-bottom: 0; } .container .data-arrow { color: var(--secondary-accent); font-weight: bold; margin-right: 10px; 
font-family: var(--font-code); font-size: 1rem; } .container .data-label { color: var(--white); font-weight: 600; font-family: var(--font-body); margin-right: 8px; min-width: 80px; } .container a { color: var(--primary-accent); text-decoration: none; font-weight: 600; transition: all .2s; } .container .data-row a { border-bottom: 1px dotted var(--primary-accent); } .container a:hover { text-decoration: none; color: var(--white); text-shadow: 0 0 5px var(--glow-primary); } .container .data-row a:hover { border-bottom-style: solid; } .container .dropdown-container { margin-top: 20px; } .container .dropdown-summary { cursor: pointer; padding: 10px 0; color: var(--text-muted); font-size: 1.1rem; font-weight: 700; text-transform: none; font-family: var(--font-title); letter-spacing: 1px; list-style: none; transition: color 0.2s ease; } .container .dropdown-summary:hover { color: var(--primary-accent); } .container .dropdown-arrow { color: var(--secondary-accent); margin-right: 10px; transition: transform 0.2s ease; } .container .dropdown-content { margin-top: 15px; padding: 20px; background-color: var(--bg-card); border: 1px solid var(--border-color); border-radius: 4px; } .container .config-title { color: var(--text-muted); font-size: 1rem; margin-bottom: 10px; font-family: var(--font-body); text-transform: uppercase; letter-spacing: 1px; font-weight: 700; } .container pre { background-color: #1c1c1c; padding: 15px; border: 1px solid var(--border-color); white-space: pre-wrap; word-wrap: break-word; color: #c5c8c6; border-radius: 4px; box-shadow: inset 0 0 5px rgba(0,0,0,0.5); } .container pre code { background: none; color: inherit; padding: 0; border-radius: 0; } .container code { font-family: var(--font-code); color: var(--primary-accent); background: var(--border-color); padding: 2px 5px; border-radius: 4px; } No layer left behind edition. Upscale redone with the missing final layer included. 
The original upscales were always missing a layer, but I never troubleshooted to identify what layer was missing. Turns out it was the final layer. That's kind of an important one. This model is an uncensored, creative writing and RP model. Compared to the older version, it is smarter and I think has a bit less repetition. The old V2 version though is slightly more creative due to the instability it had. Creation Process: Upscale > CPT > SFT > DPO Pretrained on approx 300MB of light novel and FineWeb-2 corpus. SFT on approx 8 million tokens, SFW / NSFW RP, stories and creative instruct data. DPO on a high quality RP / NSFW dataset with a focus on improving instruction following, reducing repetition and fixing common model mistakes. Merge configurations used during the model creation process. Upscale (Passthrough) basemodel: ConicCat/Mistral-Small-3.2-AntiRep-24B mergemethod: passthrough dtype: bfloat16 slices: - sources: - model: ConicCat/Mistral-Small-3.2-AntiRep-24B layerrange: [0, 29] - sources: - model: ConicCat/Mistral-Small-3.2-AntiRep-24B layerrange: [10, 40] Not optimized for cost / performance efficiency, YMMV. 
Pretrain (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ../mergekit/pfv3upscale
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/pretraindatasetv5stripped.jsonl
    type: completion
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 4e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 12288
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 40
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1
logging_steps: 2
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-PT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1
```

SFT (4x H100)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dataset.jsonl
    type: chat_template
    split: train
    chat_template_strategy: tokenizer
    field_messages: messages
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# QLORA CONFIGURATION
# ====================
adapter: qlora
load_in_4bit: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.1
lora_target_linear: true
lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 3
micro_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1e-5
optimizer: paged_adamw_8bit
lr_scheduler: rex
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE & PACKING
# ====================
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
flash_attention: true
gradient_checkpointing: offload
deepspeed: deepspeed_configs/zero1.json
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this

# ====================
# EVALUATION & CHECKPOINTING
# ====================
save_strategy: steps
save_steps: 20
save_total_limit: 5  # Keep best + last few checkpoints
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-SFT
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2
```

DPO (2x H200)

```yaml
# ====================
# MODEL CONFIGURATION
# ====================
base_model: ./Visage-V3-PT-1-SFT-2/merged
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken

# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.085

# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/handcrafteddatasetmistralrep.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/approvedautomatedl3dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path:
train_on_inputs: false  # Only train on assistant responses

# ====================
# LORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true
lora_modules_to_save:  # Uncomment only if you added NEW tokens

# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_steps: 5
weight_decay: 0.01
max_grad_norm: 1.0

# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 8192
pad_to_sequence_len: true

# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json

# ====================
# CHECKPOINTING
# ====================
save_steps: 10
save_total_limit: 10
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false

# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Visage-V3-PT-1-SFT-2-DPO-2
logging_steps: 1
save_safetensors: true

# ====================
# WANDB TRACKING
# ====================
wandb_project: Visage-V3-DPO
wandb_entity: yourentity
wandb_name: Visage-V3-PT-1-SFT-2-DPO-2
```
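One detail worth noting across the three configs: despite the different GPU counts in the stage headings (4x H100, 4x H100, 2x H200), the effective global batch size, micro_batch_size × gradient_accumulation_steps × number of GPUs, works out the same at every stage. A quick sanity check (GPU counts are taken from the headings, not from the configs themselves):

```python
# Effective global batch size per training stage:
# micro_batch_size * gradient_accumulation_steps * num_gpus
stages = {
    "pretrain": {"micro_batch_size": 4, "gradient_accumulation_steps": 1, "num_gpus": 4},
    "sft":      {"micro_batch_size": 4, "gradient_accumulation_steps": 1, "num_gpus": 4},
    "dpo":      {"micro_batch_size": 2, "gradient_accumulation_steps": 4, "num_gpus": 2},
}

for name, cfg in stages.items():
    global_bs = (cfg["micro_batch_size"]
                 * cfg["gradient_accumulation_steps"]
                 * cfg["num_gpus"])
    print(f"{name}: global batch size = {global_bs}")
# All three stages land on a global batch size of 16.
```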


L3.3-GeneticLemonade-Final-70B-6bpw-h8-exl2


L3.3-Genetic-Lemonade-Sunset-70B


L3.3-GeneticLemonade-Unleashed-v2-70B-4.5bpw-hb6-exl2


L3.3-GeneticLemonade-Unleashed-v2.1-70B-4bpw-hb6-exl2

An experimental release. A QLoRA finetune of zerofata/GeneticLemonade-Unleashed trained on a test dataset. Performance is improved over the original in my testing, but there are likely areas where the model will underperform, and I am looking for feedback on those.

This is a creative model intended to excel at character-driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing.

Recommended samplers: play with these; they are not the 'best' settings, just a stable baseline. Use the Llama-3-Instruct-Names template, but you will need to uncheck "System same as user".

The model was trained on a tiny synthetic dataset of 640k tokens, approximately 190 conversations. Data was generated by script and then manually reviewed / edited. The dataset is approximately 60% SFW and 40% NSFW, split as 90% multi-turn RP conversations, 5% creative writing and 5% miscellaneous.
It is an experiment to see how models perform when provided with small amounts of high quality synthetic data, as opposed to human data.
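For a sense of scale, the figures quoted above work out to roughly 3.4k tokens per conversation. A quick check (numbers taken from the card; applying the percentage split to conversation counts is an approximation):

```python
# Dataset scale from the figures quoted above.
total_tokens = 640_000
conversations = 190

avg_tokens_per_conversation = total_tokens / conversations
print(f"~{avg_tokens_per_conversation:.0f} tokens per conversation")  # ~3368

# Approximate composition by conversation count (90/5/5 split).
composition = {"multi-turn RP": 0.90, "creative writing": 0.05, "misc": 0.05}
for kind, share in composition.items():
    print(f"{kind}: ~{share * conversations:.0f} conversations")
```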


L3.3-GeneticLemonade-Unleashed-v3-70B_4.5bpw-hb6-exl2

An experimental release. An SFT+DPO QLoRA finetune of zerofata/GeneticLemonade-Unleashed.

This is a creative model intended to excel at character-driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing. It is designed to provide longer, narrative-heavy responses where characters are portrayed accurately and proactively.

Recommended samplers: play with these; they are not the 'best' settings, just a stable baseline. Something interesting to note: this model supports higher temps than would normally be recommended for other L3 models. Use the Llama-3-Instruct-Names template, but you will need to uncheck "System same as user".

The model first went through SFT on a small synthetic dataset of 2.9 million tokens, approximately 750 conversations: primarily RP data, with small amounts of random instruct / assistant data and creative writing.
The model then went through DPO training using approximately 1,100 chosen examples from the SFT dataset that were of exceptional quality or showed verifiable instruction following. Rejected samples were generated using another Llama 3.3 finetune known for poor instruction following.

Axolotl configs: neither is optimized for cost / performance efficiency, YMMV.

SFT (1x H200)
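The DPO preference data described above pairs a shared conversation prefix with a chosen and a rejected assistant reply. A hypothetical jsonl record illustrating that shape (the example content is invented; only the messages / chosen / rejected layout is implied by the description):

```python
import json

# Hypothetical DPO preference record; only the messages / chosen / rejected
# layout is taken from the description above, the content is invented.
record = {
    "messages": [  # shared conversation prefix
        {"role": "system", "content": "You are roleplaying as the character Mira."},
        {"role": "user", "content": "Mira glances at the storm outside. What does she do?"},
    ],
    "chosen": {
        "role": "assistant",
        "content": "Mira presses a palm to the cold glass, watching the rain...",
    },
    "rejected": {
        "role": "assistant",
        "content": "As an AI language model, I can't watch storms.",
    },
}

# Each line of the .jsonl training file is one such record.
print(json.dumps(record)[:72])
```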


L3.3-GeneticLemonade-Unleashed-v3-70B_6bpw-hb8-exl2



L3.3-GeneticLemonade-Final-v2-70B_4.5bpw-hb6-exl2


mithril-llama-70b-exl3-4.25bpw


L3.3-GeneticLemonade-Unleashed-v2.1-70B



Llama-3.3-70B-Vulpecula-r1-4.5bpw-hb6-exl2


L3.3-GeneticLemonade-Unleashed-v2-70B
