radiologist_llama
by Cosmobillian

mllama · Image Model · OTHER · New
28 downloads · Early-stage
Edge AI: Mobile, Laptop, Server
Quick Summary

radiologist_llama is an early-stage, mllama-based image model with specialized capabilities.

Training Data Analysis

🟡 Average (4.8/10)

Quality assessment of the training datasets identified for radiologist_llama.

Specialized For

general
science
multilingual
reasoning

Training Datasets (4)

common crawl
🔴 2.5/10
general
science
Key Strengths
  • Scale and Accessibility: At 9.5+ petabytes, Common Crawl provides unprecedented scale for training data.
  • Diversity: The dataset captures billions of web pages across multiple domains and content types, enabling broad topical coverage.
  • Comprehensive Coverage: Despite limitations, Common Crawl attempts to represent the broader web across regions and languages.
Considerations
  • Biased Coverage: The crawling process prioritizes frequently linked domains, making content from digitally underrepresented communities less visible.
  • Large-Scale Problematic Content: Contains significant amounts of hate speech, pornography, violent content, and other harmful material.
c4
🔵 6/10
general
multilingual
Key Strengths
  • Scale and Accessibility: 750GB of publicly available, filtered text
  • Systematic Filtering: Documented heuristics enable reproducibility
  • Stylistic Diversity: Despite being English-only, captures diverse writing styles and registers
Considerations
  • English-Only: Limits multilingual applications
  • Filtering Limitations: Offensive content and low-quality text remain despite filtering
wikipedia
🟡 5/10
science
multilingual
Key Strengths
  • High-Quality Content: Wikipedia articles are subject to community review, fact-checking, and citation requirements.
  • Multilingual Coverage: Available in 300+ languages, enabling training of models that understand and generate text in many languages.
  • Structured Knowledge: Articles follow consistent formatting with clear sections, allowing models to learn structured document patterns.
Considerations
  • Language Inequality: Low-resource language editions have significantly lower quality, fewer articles, and less active moderation.
  • Biased Coverage: Reflects biases in contributor demographics; topics related to Western culture and the English-speaking world are overrepresented.
arxiv
🟡 5.5/10
science
reasoning
Key Strengths
  • Scientific Authority: Moderated scholarly preprints from an established repository
  • Domain-Specific: Specialized vocabulary and concepts
  • Mathematical Content: Includes complex equations and notation
Considerations
  • Specialized: Primarily technical and mathematical content
  • English-Heavy: Predominantly English-language papers


Code Examples

👨‍💻 How to Use (Inference) — installation (Python, run in a Jupyter/Colab notebook cell)
%%capture
import os, re
# Outside Colab, a plain unsloth install pulls in its own dependencies.
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # On Colab, pin an xformers wheel matching the preinstalled torch version.
    import torch
    v = re.match(r"[0-9.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
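The Colab branch of the install cell picks an xformers wheel from the numeric prefix of torch's version string (which may carry a local suffix such as "+cu126"). That selection logic can be sketched as a standalone helper; `pick_xformers` is a name introduced here for illustration, not part of any library.

```python
import re

def pick_xformers(torch_version: str) -> str:
    """Return the xformers pin matching a torch version string.

    The regex keeps only the leading digits-and-dots prefix, so local
    build suffixes like "2.8.0+cu126" reduce to "2.8.0".
    """
    v = re.match(r"[0-9.]{3,}", torch_version).group(0)
    # torch 2.8.0 pairs with the newer wheel; everything else falls back.
    return "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
```

This mirrors the cell's behavior exactly: only a torch version whose numeric prefix is precisely "2.8.0" gets the 0.0.32.post2 wheel.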
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
👨‍💻 How to Use (Inference)bashpytorch
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    import torch; v = re.match(r"[0-9\\.]{3,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.32.post2" if v == "2.8.0" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.55.4
!pip install --no-deps trl==0.22.2
2. Run Inference with Python
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
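X-ray files are often single-channel (grayscale), while the vision pipeline above expects an RGB PIL image. Before passing your own scan to the model, it can help to normalize mode and size — a minimal sketch, assuming a PIL-readable file (the function name and the 512-pixel target are illustrative, not part of the model's API):

```python
from PIL import Image

def prepare_xray(path: str, size: int = 512) -> Image.Image:
    """Load an image, force 3-channel RGB, and resize its short side to `size`.

    Grayscale X-rays ("L" mode) are expanded to RGB so the processor
    receives the channel layout it expects; aspect ratio is preserved.
    """
    img = Image.open(path).convert("RGB")
    w, h = img.size
    scale = size / min(w, h)
    return img.resize((round(w * scale), round(h * scale)))

# Example: build a synthetic grayscale "scan", save it, and preprocess it.
Image.new("L", (1024, 768), 40).save("demo_xray.png")
img = prepare_xray("demo_xray.png")
print(img.mode, img.size)  # RGB (683, 512)
```

The resulting `img` can be passed in place of the `Image.open(...)` result in the inference snippet above.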
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
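The message payload above follows the single-turn chat format the model expects: one user turn containing an image placeholder followed by the text instruction. If you want to try different prompts, a small helper (hypothetical, not part of the original script) keeps that structure in one place:

```python
def build_vision_messages(instruction: str) -> list:
    """Build a single-turn chat payload with an image placeholder
    followed by a text instruction, matching the format used above."""
    return [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": instruction},
        ]}
    ]

# Reuse with the training-time instruction, or swap in your own prompt
messages = build_vision_messages(
    "You are an expert radiographer. Describe accurately what you see in this image."
)
```

Pass the result to `tokenizer.apply_chat_template(...)` exactly as in the script above.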
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
2. Run Inference with Pythonpythontransformers
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
import torch

# Load the model and tokenizer in 16-bit (float16)
# If you have less VRAM, you can use load_in_4bit=True
model, tokenizer = FastVisionModel.from_pretrained(
    "Cosmobillian/radiologist_llama",
    dtype=torch.float16,
    load_in_4bit=False, # False is ideal since the model was saved in 16-bit
)

# Prepare the model for inference
FastVisionModel.for_inference(model)

# Load your image (specify the path to your own X-ray image)
try:
    image = Image.open("path/to/your/xray.jpg")
except FileNotFoundError:
    print("Please provide a valid file path instead of 'path/to/your/xray.jpg'.")
    # Creating a blank image as a placeholder
    image = Image.new('RGB', (512, 512), 'black')


# The instruction format the model was trained on
instruction = "You are an expert radiographer. Describe accurately what you see in this image."

# Format the messages according to the chat template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

# Prepare the inputs with the tokenizer
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens=False, # Already present in the template
    return_tensors="pt",
).to("cuda")

# Use TextStreamer for real-time output
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

print("Model is generating the report...\n---")

# Run the model and stream the output
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=256 # Maximum number of tokens to generate
)
