medical-reasoning-gpt-oss-20b

by dousery · Language Model · 20B params · license: apache-2.0 · 1 language
97 downloads · Edge AI: Mobile / Laptop / Server (45GB+ RAM)
Quick Summary

This is a fine-tuned version of openai/gpt-oss-20b optimized for medical reasoning and clinical decision-making. It is distributed as a PEFT adapter that is applied on top of the base model.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 19GB+ RAM
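These figures can be sanity-checked with a back-of-the-envelope estimate: weight memory is roughly parameter count times bytes per parameter. This is a sketch that ignores activations, KV cache, and framework overhead, and the function name is illustrative:

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Weight-only memory estimate in GiB (ignores activations and KV cache)."""
    return n_params * bytes_per_param / (1024 ** 3)

# 20B parameters: full bf16 weights vs. a 4-bit quantized variant
print(f"bf16 : {weight_memory_gib(20e9, 2):.1f} GiB")    # ~37 GiB; near the 45GB+ server figure once overhead is added
print(f"4-bit: {weight_memory_gib(20e9, 0.5):.1f} GiB")  # ~9 GiB; consistent with the 19GB+ minimum after overhead
```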

Code Examples

🚀 Quick Start (Python, transformers)

# pip install torch --index-url https://download.pytorch.org/whl/cu128
# pip install "trl>=0.20.0" "peft>=0.17.0" "transformers>=4.55.0"

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import re

base_model_name = "openai/gpt-oss-20b"
adapter_name = "dousery/medical-reasoning-gpt-oss-20b"

tokenizer = AutoTokenizer.from_pretrained(base_model_name)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, adapter_name)
model = model.merge_and_unload()

messages = [
    {"role": "system", "content": "You are a medical reasoning assistant."},
    {"role": "user", "content": (
        """A 55-year-old man has chest pain and elevated troponin I without ST elevation.
         What is the diagnosis and what additional test would you order next?"""
    )}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=False  # greedy decoding; temperature only takes effect with do_sample=True
)

raw_output = tokenizer.decode(outputs[0], skip_special_tokens=False)

# Parse the analysis ("thinking") and final channels from the harmony-formatted output
thinking_pattern = r"<\|end\|><\|start\|>assistant<\|channel\|>analysis<\|message\|>(.*?)<\|end\|>"
final_pattern = r"<\|start\|>assistant<\|channel\|>final<\|message\|>(.*?)<\|return\|>"

thinking_match = re.search(thinking_pattern, raw_output, re.DOTALL)
final_match = re.search(final_pattern, raw_output, re.DOTALL)

thinking_text = thinking_match.group(1).strip() if thinking_match else "N/A"
final_text = final_match.group(1).strip() if final_match else "N/A"

print("Thinking:", thinking_text)
print("\nFinal:", final_text)
