Qwen3.5-27B-Marvin-DPO-V2

by ToastyPigeon
Language Model
OTHER
27B params
Edge AI: Mobile, Laptop, Server
61GB+ RAM
Quick Summary

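A DPO fine-tune of ToastyPigeon/Qwen3.5-27B-Marvin-V2 (27B parameters), trained with a 4-bit LoRA setup for one epoch on 402 preference pairs that combine anti-repetition, style, and thinking objectives.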

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 26GB+ RAM

Code Examples

Training Config (YAML)
# Combined DPO V2: antirep + style + thinking — on Marvin V2 base
# 402 pairs, 1 epoch, beta=0.1, LR=5e-6
# V2: think masking enabled, 63% of pairs have think blocks

model_name_or_path: ToastyPigeon/Qwen3.5-27B-Marvin-V2
output_dir: runs/qwen35-27b-combined-dpo-v2

attn_implementation: flash_attention_2
bf16: true
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false

model_parallel: true
max_memory:
  0: "18GiB"
  1: "18GiB"

chunked_mlp: true
chunked_mlp_chunks: 8

max_length: 2048
max_prompt_length: 512
max_completion_length: 1536

use_chunked_dpo: true
chunked_dpo_size: 4096
precompute_ref_log_probs: true
mask_thinking: true

per_device_train_batch_size: 1
gradient_accumulation_steps: 4

use_peft: true
load_in_4bit: true
bnb_4bit_quant_type: nf4
lora_r: 32
lora_alpha: 16
lora_dropout: 0.0
use_rslora: true
lora_target_modules:
  - in_proj_qkv
  - in_proj_z
  - in_proj_a
  - in_proj_b
  - out_proj
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj

beta: 0.1
loss_type: sigmoid

learning_rate: 5.0e-6
lr_scheduler_type: cosine
warmup_ratio: 0.1
weight_decay: 0.0
max_grad_norm: 1.0
optim: paged_adamw_8bit
num_train_epochs: 1

logging_steps: 1
save_strategy: epoch
save_total_limit: 1
report_to: none
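
The config sets loss_type: sigmoid with beta: 0.1. As a rough illustration of what that objective computes, here is a minimal sketch of the standard DPO sigmoid loss for a single preference pair. The function name and arguments are hypothetical, the inputs are assumed to be summed sequence log-probabilities, and the training script behind this config may differ in details such as chunking or think-block masking.

```python
import math


def dpo_sigmoid_loss(policy_chosen_logp, policy_rejected_logp,
                     ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO sigmoid loss for one (chosen, rejected) pair.

    Hypothetical helper for illustration; inputs are summed log-probs
    of each completion under the policy and the frozen reference model.
    """
    # How much more the policy likes each completion than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # beta scales the margin between the two log-ratios.
    logits = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(x)) written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Policy identical to the reference: the margin is zero, so the loss is log(2).
print(dpo_sigmoid_loss(-10.0, -12.0, -10.0, -12.0))

# Policy shifted toward the chosen completion: the loss drops below log(2).
print(dpo_sigmoid_loss(-8.0, -14.0, -10.0, -12.0))
```

With precompute_ref_log_probs: true, the reference log-probabilities would be computed once up front, so only the policy terms change during training.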

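A few quantities follow from the config values by simple arithmetic. The sketch below derives the optimizer step count, the warmup length, and the LoRA scaling factor. It assumes a single data-parallel process (the two GPUs in max_memory are taken to split one model copy, per model_parallel: true) and assumes use_rslora follows the usual rank-stabilized rule of lora_alpha / sqrt(lora_r). Whether the last partial accumulation window counts as a step depends on the trainer, so both roundings are shown.

```python
import math

# Values copied from the training config.
num_pairs = 402
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
warmup_ratio = 0.1
lora_r, lora_alpha = 32, 16

# Assumption: one data-parallel process, model split across two GPUs.
num_processes = 1

# Preference pairs consumed per optimizer step.
pairs_per_step = (per_device_train_batch_size
                  * gradient_accumulation_steps
                  * num_processes)

# One epoch; the trailing partial window may or may not be counted.
steps_floor = num_pairs // pairs_per_step
steps_ceil = math.ceil(num_pairs / pairs_per_step)

# Warmup length for either step count.
warmup_low = math.ceil(warmup_ratio * steps_floor)
warmup_high = math.ceil(warmup_ratio * steps_ceil)

# LoRA update scaling: rank-stabilized vs. the plain alpha / r rule.
rslora_scale = lora_alpha / math.sqrt(lora_r)
plain_scale = lora_alpha / lora_r

print(f"pairs per optimizer step: {pairs_per_step}")
print(f"optimizer steps: {steps_floor} to {steps_ceil}")
print(f"warmup steps: {warmup_low} to {warmup_high}")
print(f"rsLoRA scale: {rslora_scale:.3f} (plain LoRA would be {plain_scale})")
```

The run is short: roughly a hundred optimizer steps, of which about the first ten are warmup before the cosine decay begins.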