OpAgent-32B

license: apache-2.0
by codefuse-ai
Image Model
32B params
New
0 downloads
Early-stage
Edge AI: Server (72GB+ RAM)
Quick Summary

OpAgent-32B is a 32B-parameter multimodal (image + text) agent model from codefuse-ai. Given a web-page screenshot and a natural-language task, it reasons inside <think> tags and then emits a JSON tool call describing the next browser action.

Device Compatibility

Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum recommended: 30GB+ RAM
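
The 30GB+ figure follows from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. A quick sketch for 32B parameters (the precision choices are illustrative, and activations plus KV cache add overhead on top, so treat these as lower bounds):

```python
# Rough weight-memory estimate for a 32B-parameter model.
PARAMS = 32e9

def weight_gb(bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return PARAMS * bytes_per_param / 1024**3

for name, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{name}: ~{weight_gb(bpp):.0f} GB")
```

At int8 the weights alone are about 30 GB, which matches the minimum-recommended figure; full bf16 needs roughly 60 GB, which is why the header lists 72GB+ RAM for server deployment.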

Code Examples

import base64
from vllm import LLM, SamplingParams
from PIL import Image
from io import BytesIO

# --- 1. Helper function to encode image ---
# (Base64 is the format OpenAI-compatible servers expect; the offline vLLM
# engine below takes a PIL image directly, so this helper is optional here.)
def encode_image_to_base64(image_path):
    with Image.open(image_path) as img:
        buffered = BytesIO()
        img.save(buffered, format="PNG")
        return base64.b64encode(buffered.getvalue()).decode('utf-8')

# --- 2. Initialize the vLLM engine ---
# A 32B model needs roughly 64GB of GPU memory for bf16 weights alone;
# raise tensor_parallel_size to shard across multiple GPUs if one is not enough.
model_id = "codefuse-ai/OpAgent-32B"
llm = LLM(
    model=model_id,
    trust_remote_code=True,
    tensor_parallel_size=1,  # Adjust based on your GPU setup
    gpu_memory_utilization=0.9
)

# --- 3. Prepare the prompt ---
# The prompt must include the system message, task description, and the screenshot.
task_description = "Search for wireless headphones under $50"
screenshot_path = "path/to/your/screenshot.png"  # Replace with your screenshot path
screenshot = Image.open(screenshot_path)

# This prompt format is crucial for the agent's performance.
# Note the doubled braces: literal { } inside an f-string must be escaped as {{ }}.
prompt = f"""system
You are a helpful web agent. Your goal is to perform tasks on a web page based on a screenshot and a user's instruction.
Output the thinking process in <think> </think> tags, and for each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:\n<think> ... </think><tool_call>{{"name": <function-name>, "arguments": <args-json-object>}}</tool_call>.
user
[SCREENSHOT]
Task: {task_description}
assistant
"""

# --- 4. Generate the action ---
sampling_params = SamplingParams(temperature=0.0, max_tokens=1024)

# vLLM's offline API takes the image via `multi_modal_data` (a PIL image),
# not a separate `images` argument.
outputs = llm.generate(
    {
        "prompt": prompt,
        "multi_modal_data": {"image": screenshot},
    },
    sampling_params=sampling_params,
)

# --- 5. Print the result ---
for output in outputs:
    generated_text = output.outputs[0].text
    print("--- Generated Action ---")
    print(generated_text)
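
Because the model wraps its reasoning in <think> tags and its action in <tool_call> tags, downstream code usually needs to split the two and decode the JSON. A minimal parser sketch (the tag format comes from the system prompt above; the sample output string and its tool name are hypothetical):

```python
import json
import re

def parse_agent_output(text: str):
    """Split a generation into (thinking, tool_call_dict)."""
    think_match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    call_match = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    thinking = think_match.group(1).strip() if think_match else ""
    tool_call = json.loads(call_match.group(1)) if call_match else None
    return thinking, tool_call

# Hypothetical model output, shaped like the format the system prompt requests:
sample = ('<think>The search box is at the top.</think>'
          '<tool_call>{"name": "type_text", "arguments": '
          '{"selector": "#search", "text": "wireless headphones under $50"}}</tool_call>')
thinking, call = parse_agent_output(sample)
print(call["name"])  # type_text
```

Parsing the tool call into a dict is what lets an agent loop dispatch the action (click, type, scroll) back to a browser driver.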

Deploy This Model

Production-ready deployment in minutes

Together.ai

Instant API access to this model

Fastest API

Production-ready inference API. Start free, scale to millions.


Replicate

One-click model deployment

Easiest Setup

Run models in the cloud with simple API. No DevOps required.

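
If you serve the model behind an OpenAI-compatible endpoint instead (for example via `vllm serve codefuse-ai/OpAgent-32B`, or a hosted provider), the screenshot travels as a base64 data URL inside the chat messages. A sketch of building that request payload; the local endpoint URL in the comment is an assumption, and whether a given host exposes this model is not guaranteed:

```python
import base64

def png_bytes_to_data_url(png_bytes: bytes) -> str:
    """Wrap raw PNG bytes in the data-URL form OpenAI-compatible APIs accept."""
    b64 = base64.b64encode(png_bytes).decode("utf-8")
    return f"data:image/png;base64,{b64}"

def build_request(task: str, png_bytes: bytes) -> dict:
    """Assemble a chat-completions payload pairing the screenshot with the task."""
    return {
        "model": "codefuse-ai/OpAgent-32B",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": png_bytes_to_data_url(png_bytes)}},
                {"type": "text", "text": f"Task: {task}"},
            ],
        }],
        "temperature": 0.0,
    }

# Send with any OpenAI-compatible client, e.g.:
# client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
# resp = client.chat.completions.create(**build_request(task, screenshot_bytes))

payload = build_request("Search for wireless headphones under $50", b"<png bytes>")
print(payload["messages"][0]["content"][0]["image_url"]["url"][:22])
# data:image/png;base64,
```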

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.