sethuiyer
Medichat-Llama3-8B
SynthIQ_GGUF
MedleyMD-GGUF
Qwen2.5-7B-Anvita
Evaluation Results

| Metric                | Value |
|-----------------------|------:|
| Avg.                  | 29.18 |
| IFEval (0-Shot)       | 64.8  |
| BBH (3-Shot)          | 35.48 |
| MATH Level 5 (4-Shot) | 15.86 |
| GPQA (0-Shot)         | 10.29 |
| MuSR (0-Shot)         | 13.47 |
| MMLU-PRO (5-Shot)     | 35.17 |

Detailed results can be found here.

Personal Benchmarks - check PERSONALBENCHMARK.md
Llamazing-3.1-8B-Instruct-Q3_K_M-GGUF
Nandine-7b-GGUF
Chikuma_10.7B_v2
LlamaZero-3.1-8B-Experimental-1208
This is a merge of pre-trained language models created using mergekit.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric              | Value |
|---------------------|------:|
| Avg.                | 21.77 |
| IFEval (0-Shot)     | 60.51 |
| BBH (3-Shot)        | 28.61 |
| MATH Lvl 5 (4-Shot) |  9.67 |
| GPQA (0-Shot)       |  2.46 |
| MuSR (0-Shot)       |  7.15 |
| MMLU-PRO (5-Shot)   | 22.22 |
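For reference, a mergekit merge of this kind is normally produced by writing a YAML recipe and running the `mergekit-yaml` CLI. The sketch below drives that CLI from Python; the recipe contents (merge method, model list, weights) are placeholders chosen for illustration, not the actual configuration used for this model.

```python
# Illustrative sketch only: runs mergekit's mergekit-yaml CLI from Python.
# The recipe below is a hypothetical example, NOT this model's real configuration.
import pathlib
import subprocess
import textwrap

recipe = textwrap.dedent("""\
    # Hypothetical linear merge of two Llama-3.1-8B fine-tunes
    merge_method: linear
    models:
      - model: meta-llama/Llama-3.1-8B-Instruct
        parameters:
          weight: 0.5
      - model: NousResearch/Hermes-3-Llama-3.1-8B
        parameters:
          weight: 0.5
    dtype: bfloat16
""")

pathlib.Path("merge_config.yml").write_text(recipe)

# Requires `pip install mergekit`; writes the merged model to ./merged-model
subprocess.run(["mergekit-yaml", "merge_config.yml", "./merged-model"], check=True)
```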
CodeCalc-Mistral-7B
Dr_Samantha-7b
Dr_Samantha_7b_mistral
Nandine-7b
Diana-7B-GGUF
Llamaverse-3.1-8B-Instruct
A Unified Multidisciplinary Language Model

Llamaverse-3.1-8B-Instruct is a state-of-the-art language model built on the foundation of MathCoder2-Llama-3-8B, which was pretrained on MathCode-Pile. This dataset, which embeds mathematical reasoning steps in natural language and code, provides a rock-solid foundation for advanced logical reasoning. By merging MathCoder2 with 10 specialized models using the Model Stock merge method, Llamaverse-3.1-8B-Instruct becomes an unparalleled polymath, excelling in mathematics, biomedical diagnostics, storytelling, coding, and more.

This model was merged using the Model Stock merge method, with MathGenie/MathCoder2-Llama-3-8B as the base. It works amazingly well with the Divine Intellect preset!

Models Merged

The following models were included in the merge:

- ruslandev/llama-3-8b-samantha
- nvidia/OpenMath2-Llama3.1-8B
- NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
- TsinghuaC3I/Llama-3.1-8B-UltraMedical
- tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
- Skywork/Skywork-o1-Open-Llama-3.1-8B
- Undi95/Llama-3-LewdPlay-8B
- Locutusque/Hercules-6.1-Llama-3.1-8B
- RefuelAI/Llama-3-Refueled
- rombodawg/Llama-3-8B-Instruct-Coder

Model Stock Merge – Ensures balanced integration of diverse expertise without dominance by any single model.

Multidisciplinary Expertise

Llamaverse-3.1-8B-Instruct integrates the strengths of 10 specialized models:

1. ruslandev/llama-3-8b-samantha: Empathetic and human-like interaction capabilities.
2. nvidia/OpenMath2-Llama3.1-8B: Advanced mathematical problem-solving.
3. NeverSleep/Llama-3-Lumimaid-8B: Creative storytelling and roleplay.
4. TsinghuaC3I/Llama-3.1-8B-UltraMedical: Clinical-grade biomedical insights.
5. tohur/natsumura-storytelling-rp: Immersive narrative generation.
6. Skywork/Skywork-o1-Open-Llama-3.1-8B: Reflective reasoning and complex problem-solving.
7. Undi95/Llama-3-LewdPlay-8B: Unconventional creativity and boundary-pushing dialogue.
8. Locutusque/Hercules-6.1-Llama-3.1-8B: Expertise in physics, biology, chemistry, and engineering.
9. RefuelAI/Llama-3-Refueled: Robust NLP capabilities for classification and entity extraction.
10. rombodawg/Llama-3-8B-Instruct-Coder: Efficient coding and software development.

Ethical Considerations

Llamaverse-3.1-8B-Instruct must be used responsibly. Users should:

- Avoid deploying the model in high-stakes scenarios without human oversight.
- Be mindful of potential biases and ethical concerns, particularly in sensitive applications.
- Use the model’s creative and unconventional capabilities responsibly.

The following YAML configuration was used to produce this model:

Built by merging models from MathGenie, ruslandev, nvidia, NeverSleep, TsinghuaC3I, tohur, Skywork, Undi95, Locutusque, RefuelAI, and rombodawg. Released under the Llama 3.1 Community License.
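A minimal usage sketch with Hugging Face transformers follows. The repository id `sethuiyer/Llamaverse-3.1-8B-Instruct` and the sampling settings are assumptions for illustration (the card recommends the Divine Intellect preset, whose exact parameters are not reproduced here), so treat this as a starting point rather than the canonical invocation.

```python
# Minimal sketch: loading and querying the merged model with transformers.
# Assumptions: the repo id and sampling settings below are illustrative,
# not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sethuiyer/Llamaverse-3.1-8B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful, multidisciplinary assistant."},
    {"role": "user", "content": "Prove that the sum of two even integers is even, then give a Python one-liner that checks it."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```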
Chikuma_10.7B
SynthIQ-7b
Llama-3.1-8B-Experimental-1206-Instruct
1. Logical and Boolean Reasoning – Excels in tasks requiring clear, rule-based logic and manipulation of true/false statements.
2. Focused Domain Knowledge – Strong at certain specialized tasks (sports rules, ruin names, hyperbaton) that blend world knowledge with language comprehension.
3. Good Instruction Compliance – High prompt-level and instance-level accuracy (both strict and loose) indicates that it follows user instructions effectively, even in more complex or nuanced prompts.
4. Reasonable Multi-step Reasoning – While not the best in every logic category, it still shows solid performance (60%+) on tasks like disambiguation and causal reasoning.
5. Extended Context Window (128k) – The large 128k-token context allows the model to handle lengthy inputs and maintain coherence across extensive passages or multi-turn conversations. This is especially valuable for tasks like long-document question answering, summarization, or complex scenario analysis where context retention is crucial (see the sketch after the results table below).

Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric              | Value |
|---------------------|------:|
| Avg.                | 25.67 |
| IFEval (0-Shot)     | 69.67 |
| BBH (3-Shot)        | 30.06 |
| MATH Lvl 5 (4-Shot) | 11.10 |
| GPQA (0-Shot)       |  6.60 |
| MuSR (0-Shot)       |  8.50 |
| MMLU-PRO (5-Shot)   | 28.10 |
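The sketch below shows one way to exploit the long context window for long-document summarization. The repo id `sethuiyer/Llama-3.1-8B-Experimental-1206-Instruct`, the input file name, and the generation settings are assumptions, not values taken from this card.

```python
# Sketch of long-document summarization leaning on the large context window.
# Assumptions: repo id, input file, and generation settings are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sethuiyer/Llama-3.1-8B-Experimental-1206-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

with open("long_report.txt") as f:  # hypothetical long input document
    document = f.read()

messages = [
    {"role": "user", "content": f"Summarize the key findings of this report:\n\n{document}"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Guard against exceeding the ~128k-token context window of Llama 3.1.
assert inputs.shape[-1] < 128_000, "Input exceeds the model's context window"

output = model.generate(inputs, max_new_tokens=400, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```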