IntervitensInc
Wan2.1-I2V-14B-480P-FP16
Wan2.1-T2V-1.3B-FP16
Wan2.1-T2V-14B-FP16
Mistral-Nemo-Base-2407-chatml
Wan2.1-I2V-14B-720P-FP16
DeepSeek-V3.1-Terminus-Channel-int8
ScikitLLM-Model-GGUF-Imatrix
pangu-pro-moe-model
Reuploaded from https://gitcode.com/ascend-tribe/pangu-pro-moe-model

We propose Mixture of Grouped Experts (MoGE), a novel MoE variant that partitions the experts into groups during expert selection and constrains each token to activate an equal number of experts within every group, yielding natural load balancing across devices. On top of the MoGE architecture, we built the Pangu Pro MoE model with 72B total parameters and 16B activated parameters:

- Vocabulary size: 153,376
- Layers: 48
- MoGE configuration: 4 shared experts; 64 routed experts split into 8 groups, with 1 expert activated per group
- Training stages: pre-training and post-training
- Pre-training corpus: 15T tokens

For details, see the technical reports:
- Chinese technical report: Pangu Pro MoE: An Ascend-Native Mixture of Grouped Experts Model
- English technical report: Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity

The Ascend inference acceleration code and the matching MindIE and vLLM-Ascend software releases are now available. Quantized weights will be released soon.

Pangu Pro MoE is licensed under the Pangu Model License Agreement, which is intended to permit use of and foster further development of AI technology. For details, please refer to the `LICENSE` file in the root directory of the model repository.

Due to inherent limitations of the technology underlying Pangu Pro MoE (the "Model"), and because AI-generated content is produced automatically by Pangu, we cannot make any guarantees regarding the following:
1. The Model's output is generated automatically by AI algorithms, and the possibility that some information is flawed, unreasonable, or discomforting cannot be ruled out; generated content does not represent Huawei's attitude or position.
2. We cannot guarantee that the Model is 100% accurate, reliable, fully functional, timely, secure, error-free, uninterrupted, continuously stable, or free of any faults.
3. The Model's output does not constitute any advice or decision, nor does it guarantee the truthfulness, completeness, accuracy, timeliness, legality, functionality, or usefulness of the generated content. Generated content cannot replace professionals in fields such as medicine or law in answering your questions; it is for reference only and does not represent any attitude, position, or view of Huawei. You must make independent judgments based on your actual circumstances, and Huawei assumes no liability.
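The grouped routing idea described above — every token activates the same number of experts in each group — can be sketched as below. This is an illustrative NumPy sketch, not the actual Pangu Pro MoE routing code; the function name and shapes are assumptions, using the stated configuration of 64 routed experts in 8 groups with 1 expert activated per group.

```python
import numpy as np

def moge_route(router_logits, num_groups=8, topk_per_group=1):
    """Grouped expert routing sketch: split the routed experts into equal
    groups and pick the top-k experts *within each group*, so every token
    activates the same number of experts per group. When each device hosts
    one group, this gives natural load balance across devices."""
    num_experts = router_logits.shape[-1]
    group_size = num_experts // num_groups
    # View logits as (..., num_groups, group_size).
    grouped = router_logits.reshape(*router_logits.shape[:-1], num_groups, group_size)
    # Indices of the top-k experts inside each group (local to the group).
    topk_local = np.argsort(grouped, axis=-1)[..., -topk_per_group:]
    # Convert group-local indices back to global expert ids.
    offsets = (np.arange(num_groups) * group_size)[:, None]
    return (topk_local + offsets).reshape(*router_logits.shape[:-1], -1)

logits = np.random.randn(4, 64)   # 4 tokens, 64 routed experts
selected = moge_route(logits)     # shape (4, 8): exactly one expert per group
```

Because the top-k is taken per group rather than globally, no group (and hence no device hosting that group) can be starved or overloaded, regardless of how skewed the router logits are.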
internlm2_5-20b-llamafied
Text-generation model converted with a conversion script and an edited tokenizer to match the behavior of the original. In quick tests it matches the original model's outputs at temperature=0, but this has not been extensively verified.
Qwen3-235B-A22B-Thinking-2507-tt-ckpt
GLM-4.6-Channel-int8
📖 Check out the GLM-4.6 technical blog, the GLM-4.5 technical report, and the Zhipu AI technical documentation.

Compared with GLM-4.5, GLM-4.6 brings several key improvements:

- Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: higher scores on code benchmarks and better real-world performance in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including improvements in generating visually polished front-end pages.
- Advanced reasoning: a clear improvement in reasoning performance and support for tool use during inference, leading to stronger overall capability.
- More capable agents: stronger performance in tool use and search-based agents, and more effective integration within agent frameworks.
- Refined writing: better alignment with human preferences in style and readability, and more natural performance in role-playing scenarios.

We evaluated GLM-4.6 across eight public benchmarks covering agents, reasoning, and coding. Results show clear gains over GLM-4.5, and GLM-4.6 also holds competitive advantages over leading domestic and international models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.

Both GLM-4.5 and GLM-4.6 use the same inference method. For general evaluations, we recommend a sampling temperature of 1.0. For code-related evaluation tasks (such as LCB), it is further recommended to set:

- For tool-integrated reasoning, please refer to this doc.
- For the search benchmark, we designed a specific tool-call format in thinking mode to support search agents; please refer to this doc for the detailed template.
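The recommended general-evaluation setup can be expressed as a request payload for an OpenAI-compatible serving endpoint (a minimal sketch; the model identifier, serving stack, and `max_tokens` value are assumptions — only `temperature=1.0` comes from the recommendation above):

```python
# Hypothetical chat request for GLM-4.6 served behind an OpenAI-compatible
# API (e.g. a vLLM server); send with any OpenAI-compatible client.
payload = {
    "model": "zai-org/GLM-4.6",      # assumed repository id
    "messages": [
        {"role": "user", "content": "Write a binary search in Python."},
    ],
    "temperature": 1.0,              # recommended default for general evaluation
    "max_tokens": 4096,              # assumed generation budget
}
```

Code-specific benchmarks (such as LCB) use additional sampling settings per the model card, so this payload should be treated as the general-purpose baseline only.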