PairJudge-RM

License: MIT
Author: THU-KEG
Paper: arXiv:2501.13007
Quick Summary

PairJudge RM is a pairwise judge reward model that enhances Best-of-N sampling for mathematical reasoning: rather than scoring each candidate solution independently, it compares candidate solutions head-to-head and selects a winner.
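With a pairwise judge, the N sampled solutions can be reduced to a single winner via a knockout tournament: candidates are paired, the judge picks a winner per pair, and winners advance until one remains. The sketch below shows that control flow only; the `judge` callable is a placeholder for the model call, and the function name and pairing scheme are illustrative rather than the repository's exact implementation.

```python
from typing import Callable, List

def knockout_best_of_n(candidates: List[str], judge: Callable[[str, str], str]) -> str:
    """Run a single-elimination tournament over candidate solutions.

    `judge(a, b)` should return whichever of the two responses it prefers
    (e.g. by prompting the pairwise reward model and parsing its verdict).
    """
    pool = list(candidates)
    while len(pool) > 1:
        next_round = []
        # Pair off candidates; an odd one out gets a bye to the next round.
        for i in range(0, len(pool) - 1, 2):
            next_round.append(judge(pool[i], pool[i + 1]))
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]

# Toy judge that prefers the shorter answer, for demonstration only.
winner = knockout_best_of_n(["aaaa", "bb", "ccc"], judge=lambda a, b: min(a, b, key=len))
print(winner)  # bb
```

Each round halves the pool, so selecting from N candidates costs about N - 1 judge calls.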

Device Compatibility

Mobile: 4-6 GB RAM
Laptop: 16 GB RAM
Server: GPU recommended

Code Examples

Usage (Python, transformers)
from transformers import AutoTokenizer, AutoModelForCausalLM

# The prompt template is available at
# https://github.com/THU-KEG/PairwiseRM/blob/main/prompt/compare_0_ex.md
# (adjust the path below to wherever you saved a local copy)
with open("prompts/compare_0_ex.md", "r") as f:
    template = f.read()

# Load the tokenizer and model from Hugging Face
tokenizer = AutoTokenizer.from_pretrained("THU-KEG/PairJudgeRM")
model = AutoModelForCausalLM.from_pretrained("THU-KEG/PairJudgeRM")

# Example math problem and candidate solutions
question = "If one equilateral triangle in a regular hexagon has a perimeter of 21 inches, what is the hexagon’s perimeter?"
response_a = "Each side is 7 inches; hexagon perimeter is 42 inches."
response_b = "The triangle's perimeter is 21 inches; hexagon perimeter is 126 inches."

# Construct the input prompt for pairwise judgment
input_text = template.format(question=question, response_a=response_a, response_b=response_b)
inputs = tokenizer(input_text, return_tensors="pt")

# Generate the judgment with a chain-of-thought explanation
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
