PairJudge-RM
by THU-KEG
License: MIT
Paper: arXiv:2501.13007
Quick Summary
PairJudge RM is a pairwise judge reward model designed to enhance Best-of-N sampling for mathematical reasoning tasks.
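To make the Best-of-N idea concrete, here is a minimal sketch of how pairwise judgments can select a single winner from N candidates via a knockout tournament, using only N-1 comparisons. The `judge` callable is a placeholder standing in for a call to PairJudge RM; the toy judge below is purely illustrative and not the model's actual behavior.

```python
# Sketch: Best-of-N selection via a pairwise knockout tournament.
# `judge(question, a, b)` is a stand-in for PairJudge RM: it returns
# whichever of the two candidate responses it prefers.

def knockout_best_of_n(question, candidates, judge):
    """Reduce N candidates to one winner using N-1 pairwise judgments."""
    pool = list(candidates)
    while len(pool) > 1:
        next_round = []
        # Pair up candidates; an odd one out advances automatically.
        for i in range(0, len(pool) - 1, 2):
            next_round.append(judge(question, pool[i], pool[i + 1]))
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]

# Toy judge for illustration only: prefer the longer answer.
toy_judge = lambda q, a, b: a if len(a) >= len(b) else b
print(knockout_best_of_n("2+2?", ["4", "four", "it is four"], toy_judge))
# → it is four
```

In practice `judge` would format the comparison prompt, run the model, and parse the verdict from its chain-of-thought output, as in the usage example below.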
Device Compatibility
Mobile: 4-6GB RAM
Laptop: 16GB RAM
Server: GPU
Code Examples
Usage (Python, transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM

# The prompt template is available at
# https://github.com/THU-KEG/PairwiseRM/blob/main/prompt/compare_0_ex.md
TEMPLATE = open("prompts/compare_0_ex.md", "r").read()

# Load the tokenizer and model from Hugging Face
tokenizer = AutoTokenizer.from_pretrained("THU-KEG/PairJudgeRM")
model = AutoModelForCausalLM.from_pretrained("THU-KEG/PairJudgeRM")

# Example math problem and candidate solutions
question = "If one equilateral triangle in a regular hexagon has a perimeter of 21 inches, what is the hexagon's perimeter?"
response_a = "Each side is 7 inches; hexagon perimeter is 42 inches."
response_b = "The triangle's perimeter is 21 inches; hexagon perimeter is 126 inches."

# Construct the input prompt for pairwise judgment
input_text = TEMPLATE.format(question=question, response_a=response_a, response_b=response_b)
inputs = tokenizer(input_text, return_tensors="pt")

# Generate the judgment with a chain-of-thought explanation
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))