dfurman
Llama-3-8B-Orpo-v0.1
This is an ORPO fine-tune of meta-llama/Meta-Llama-3-8B on 4k samples of mlabonne/orpo-dpo-mix-40k. The model was trained with the ChatML template and uses a context window of 8k tokens.

| Model ID | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| meta-llama/Meta-Llama-3-8B-Instruct 📄 | 66.87 | 60.75 | 78.55 | 67.07 | 51.65 | 74.51 | 68.69 |
| dfurman/Llama-3-8B-Orpo-v0.1 📄 | 64.67 | 60.67 | 82.56 | 66.59 | 50.47 | 79.01 | 48.75 |
| meta-llama/Meta-Llama-3-8B 📄 | 62.35 | 59.22 | 82.02 | 66.49 | 43.95 | 77.11 | 45.34 |

You can find the experiment on W&B at this address.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                | 11.01 |
| IFEval (0-Shot)     | 30.00 |
| BBH (3-Shot)        | 13.77 |
| MATH Lvl 5 (4-Shot) |  3.78 |
| GPQA (0-shot)       |  1.57 |
| MuSR (0-shot)       |  2.73 |
| MMLU-PRO (5-shot)   | 14.23 |
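Because the model follows the ChatML template, each conversation turn is wrapped in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of that formatting (in practice the tokenizer's `apply_chat_template` handles this; the `to_chatml` helper here is illustrative only):

```python
def to_chatml(messages):
    """Format a list of {role, content} dicts as a ChatML prompt.

    Illustrative helper: with transformers you would normally call
    tokenizer.apply_chat_template(messages, add_generation_prompt=True).
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    # Open the assistant turn so the model completes it.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is ORPO fine-tuning?"},
]
print(to_chatml(messages))
```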
CalmeRys-78B-Orpo-v0.1
This model is a finetune of `MaziyarPanahi/calme-2.4-rys-78b` on 1.5k rows of the `mlabonne/orpo-dpo-mix-40k` dataset. It was trained as a generalist language model for a variety of text generation use cases, including support for agentic capabilities, roleplaying, reasoning, multi-turn conversations, long-context coherence, and more. As of Oct 2024, this is the top-ranking model on the Open LLM Leaderboard 🏆. Thanks go out to mlabonne, MaziyarPanahi, et al. for the source dataset and base model. You can find the experiment on W&B at this link.

| Metric              | Value |
|---------------------|------:|
| Avg.                | 50.78 |
| IFEval (0-Shot)     | 81.63 |
| BBH (3-Shot)        | 61.92 |
| MATH Lvl 5 (4-Shot) | 37.92 |
| GPQA (0-shot)       | 20.02 |
| MuSR (0-shot)       | 36.37 |
| MMLU-PRO (5-shot)   | 66.80 |
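The leaderboard `Avg.` in these tables appears to be the plain unweighted mean of the six reported benchmark scores, which is easy to verify from the numbers above:

```python
# Open LLM Leaderboard scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 81.63,
    "BBH (3-Shot)": 61.92,
    "MATH Lvl 5 (4-Shot)": 37.92,
    "GPQA (0-shot)": 20.02,
    "MuSR (0-shot)": 36.37,
    "MMLU-PRO (5-shot)": 66.80,
}

# Unweighted mean, rounded to two decimals like the table.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 50.78, matching the reported Avg.
```

The same check holds for the other model cards on this page.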
Falcon-40B-Chat-v0.1
Qwen2-72B-Orpo-v0.1
This model is a finetune of `Qwen/Qwen2-72B-Instruct` on 1.5k rows of `mlabonne/orpo-dpo-mix-40k`. It was trained as a generalist language model for a variety of text generation use cases, including support for agentic capabilities, roleplaying, reasoning, multi-turn conversations, long-context coherence, and more. Thanks go out to mlabonne, Qwen, et al. for the source dataset and base model. You can find the experiment on W&B at this address.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                | 43.32 |
| IFEval (0-Shot)     | 78.80 |
| BBH (3-Shot)        | 57.41 |
| MATH Lvl 5 (4-Shot) | 35.42 |
| GPQA (0-shot)       | 17.90 |
| MuSR (0-shot)       | 20.87 |
| MMLU-PRO (5-shot)   | 49.50 |
Mixtral-8x7B-Instruct-v0.1
Mistral-7B-Instruct-v0.2
deberta-v3-base-imdb
Llama-2-70B-Instruct-v0.1
LLaMA-13B
Llama-2-13B-Instruct-v0.2
Llama-3-70B-Orpo-v0.1
This is an ORPO fine-tune of meta-llama/Meta-Llama-3-70B on 2k samples of mlabonne/orpo-dpo-mix-40k. The model was trained with the ChatML template and uses a context window of 8k tokens.

| Model ID | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| meta-llama/Meta-Llama-3-70B-Instruct 📄 | 77.88 | 71.42 | 85.69 | 80.06 | 61.81 | 82.87 | 85.44 |
| dfurman/Llama-3-70B-Orpo-v0.1 📄 | 74.67 | 68.69 | 88.01 | 79.39 | 49.62 | 85.48 | 76.80 |
| meta-llama/Meta-Llama-3-70B 📄 | 73.96 | 68.77 | 87.98 | 79.23 | 45.56 | 85.32 | 76.88 |

You can find the experiment on W&B at this address.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                | 17.92 |
| IFEval (0-Shot)     | 20.49 |
| BBH (3-Shot)        | 24.09 |
| MATH Lvl 5 (4-Shot) | 13.52 |
| GPQA (0-shot)       |  1.01 |
| MuSR (0-shot)       | 16.28 |
| MMLU-PRO (5-shot)   | 32.14 |
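Several of the models above are ORPO fine-tunes. As a rough illustration of the preference term ORPO optimizes, here is a simplified numeric sketch: ORPO penalizes the model when the odds of generating the rejected response exceed the odds of the chosen one. This is not the exact training code (the full objective also includes the standard SFT loss and a weighting coefficient, and operates on sequence log-probabilities from the model):

```python
import math

def odds(p):
    """Odds of an outcome with probability p: p / (1 - p)."""
    return p / (1 - p)

def orpo_or_loss(p_chosen, p_rejected):
    """Odds-ratio term of ORPO: -log sigmoid(log(odds_chosen / odds_rejected)).

    Simplified sketch using scalar probabilities in place of the model's
    sequence likelihoods.
    """
    log_or = math.log(odds(p_chosen) / odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_or)))  # -log sigmoid

# If the model already prefers the chosen response, the penalty is small;
# if it prefers the rejected one, the penalty is large.
loss_good = orpo_or_loss(0.9, 0.1)
loss_bad = orpo_or_loss(0.1, 0.9)
print(loss_good < loss_bad)  # True
```

When chosen and rejected responses are equally likely, the term reduces to -log(1/2) = log 2, and it shrinks toward zero as the model's preference for the chosen response grows.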