molmoact-7b-d-awq
2 languages
license:apache-2.0
by ronantakizawa
Image Model
OTHER
7B params
New
19 downloads
Early-stage
Edge AI:
Mobile
Laptop
Server
16GB+ RAM
Quick Summary
This is a 4-bit AWQ-quantized version of allenai/MolmoAct-7B-D-0812 (W4A16: 4-bit weights, 16-bit activations), produced with LLM Compressor.
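To make "4-bit" concrete, here is a minimal sketch of group-wise round-to-nearest quantization in plain Python. It is an illustration only: it is not AWQ itself (AWQ additionally rescales salient weight channels using activation statistics before quantizing), and it is not how LLM Compressor implements it. The function names, the group of eight sample weights, and the asymmetric min/max scheme are all assumptions chosen for clarity.

```python
def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats.

    Hypothetical helper for illustration; returns integer codes plus the
    scale and offset needed to map them back to floats.
    """
    qmax = (1 << bits) - 1                      # 15 levels above zero for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]  # integers in [0, qmax]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


# A made-up group of weights (not taken from the model).
group = [-0.42, -0.10, 0.00, 0.07, 0.31, 0.55, -0.23, 0.18]

codes, scale, offset = quantize_group(group)
restored = dequantize_group(codes, scale, offset)
max_err = max(abs(a - b) for a, b in zip(group, restored))

print("codes:   ", codes)
print("restored:", [round(r, 3) for r in restored])
print("max abs error:", max_err, "(bounded by scale / 2 =", scale / 2, ")")
```

Each weight is stored as one of 16 integer codes, and the rounding error per weight is at most half the step size. Real quantizers store the scale and offset per group alongside the codes, which is why the on-disk size is somewhat larger than a strict 4 bits per weight.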
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Minimum Recommended: 7GB+ RAM
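The RAM figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes exactly 7 billion parameters, all stored at the stated bit width, and ignores quantization scales, any layers left unquantized, activations, and the KV cache. Treat the result as a rough lower bound on weight storage, not a measured footprint; the helper name is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate weight storage in GiB (hypothetical helper)."""
    return num_params * bits_per_weight / 8 / 2**30


params = 7e9  # assumed from the "7B params" label

fp16 = weight_memory_gib(params, 16)  # about 13.0 GiB
int4 = weight_memory_gib(params, 4)   # about 3.3 GiB

print(f"16-bit weights: {fp16:.1f} GiB")
print(f" 4-bit weights: {int4:.1f} GiB")
```

Under these assumptions the 4-bit weights alone need a little over 3 GiB, which is in line with the 4-6GB mobile figure once runtime overhead is added.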
Code Examples
Usage (Python, transformers)
from transformers import AutoModelForImageTextToText, AutoProcessor, GenerationConfig
from PIL import Image
import requests

# Load model and processor
processor = AutoProcessor.from_pretrained(
    "ronantakizawa/molmoact-7b-d-awq-w4a16",
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)
model = AutoModelForImageTextToText.from_pretrained(
    "ronantakizawa/molmoact-7b-d-awq-w4a16",
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

# Process the image and text.
# process() and generate_from_batch() are not standard transformers methods;
# this assumes the repository's custom code (trust_remote_code=True) provides them.
inputs = processor.process(
    images=[Image.open(requests.get("https://picsum.photos/id/237/536/354", stream=True).raw)],
    text="What actions can be performed with the objects in this image?"
)

# Move inputs to the correct device and make a batch of size 1
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# Generate output
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer
)

# Decode only the newly generated tokens (everything after the prompt)
generated_tokens = output[0, inputs['input_ids'].size(1):]
generated_text = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(generated_text)
License (BibTeX)
@misc{molmoact-7b-d-awq,
title={MolmoAct-7B-D AWQ 4-bit},
author={Quantized by ronantakizawa},
year={2025},
url={https://huggingface.co/ronantakizawa/molmoact-7b-d-awq-w4a16}
}