# Akicou

Models by Akicou:

- Qwen3-Omni-30B-A3B-Thinking-GGUF-Q4_K_S
- GLM-4.7-Flash-REAP-50
- INTELLECT-3-REAP-50-heretic-GGUF
- GLM-4.7-REAP-50-GGUF-Deprecated
- GLM-4.7-Flash-REAP-09
- ViSWE-GGUF
- GLM-4.7-Merged-Soup
- Qwen3-30B-A3B-Instruct-2507-REAM-GGUF
# ViSWE
## Model Details

- Name: Akicou/ViSWE
- Base Model: Skywork/Skywork-SWE-32B
- Merged Models: Skywork/Skywork-SWE-32B, TIGER-Lab/VisCoder2-32B
- Merge Method: Arcee Fusion (MergeKit), which selectively merges important parameters via dynamic thresholds
- Dtype: bfloat16
- Architecture: based on Qwen2.5-Coder-32B-Instruct, 33B params
- Tensor Type: BF16

## Description

Merged model combining Skywork-SWE-32B (a code agent for software-engineering tasks such as bug fixing on GitHub) and VisCoder2-32B (executable visualization code generation across 12 languages with self-debugging).

All GGUF quants can be found at Akicou/ViSWE-GGUF.

## Intended Uses

- Software-engineering tasks (e.g., bug fixing, feature implementation)
- Visualization code generation and rendering
- Multi-language support for executable visuals
- Iterative self-debugging

## Limitations

- Requires multiple GPUs for inference (32B params, 32K context)
- Performance varies by task complexity, repository, and language
- No explicit safety or bias mitigations

## Training Data

- Skywork-SWE: 8,209 SWE trajectories
- VisCoder2: VisCode-Multi-679K dataset for visualizations

## Citation

- Skywork-SWE: Zeng et al. (2025), arXiv:2506.19290
- VisCoder2: Ni et al. (2025), arXiv:2510.23642
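The Arcee Fusion method named above merges only parameters it judges important, using a dynamically computed threshold rather than blending everything uniformly. The exact criterion lives inside MergeKit; the following is a toy NumPy sketch of the idea only, where the threshold is taken to be the mean absolute difference between the two weight tensors (an assumption made for illustration, not MergeKit's actual rule):

```python
import numpy as np

def selective_fuse(primary, secondary):
    """Toy selective fusion: average only parameters where the two models
    diverge more than a dynamic threshold (here, the mean absolute
    difference); keep the primary model's weights everywhere else."""
    diff = np.abs(primary - secondary)
    threshold = diff.mean()  # data-dependent threshold, recomputed per tensor
    return np.where(diff > threshold, 0.5 * (primary + secondary), primary)

# Only the strongly diverging parameter (index 2) gets averaged;
# the rest stay as the primary model's values.
p = np.array([1.0, 0.0, 2.0, 0.0])
s = np.array([1.1, 0.0, 0.0, 0.0])
print(selective_fuse(p, s))
```

In a real merge this runs per weight tensor across the whole checkpoint; the point of the dynamic threshold is that small, noise-level differences leave the primary model untouched.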
- MiniMax-M2.1-REAP-40-GGUF
- Solar-Open-69B-REAP
- MiniMax-M2-5-REAP-19
- LLaDA2.1-mini-256k-dynamic-ntk
# DeepKAT-32B
## Model Overview

DeepKAT-32B is a state-of-the-art open-source coding agent merged using Arcee MergeKit's TIES method. It fuses the strengths of two leading RL-tuned models built on the Qwen3-32B base:

- agentica-org/DeepSWE-Preview (primary, 33B): Excels at complex codebase navigation, multi-file editing, and SWE-Bench resolution (59% verified with hybrid strategies). Anchors deep reasoning and tool use.
- Kwaipilot/KAT-Dev (secondary, 32B): Boosts multi-stage RL for trajectory pruning and agentic workflows, achieving 62.4% on SWE-Bench Verified (ranked #5 among open-source models).

The result is a cohesive 32B model for software-engineering tasks: bug fixing, code generation, refactoring, and autonomous dev agents. Expect roughly 60-65% on SWE-Bench, on par with or above the individual parents, due to synergistic RL blending.

## Key Features

- Architecture: Qwen3-32B dense (rotary embeddings, grouped-query attention).
- Training: Merged via density/weight gradients for progressive integration; no additional fine-tuning.
- Strengths: High-fidelity code synthesis, multi-turn tool chaining, sparse-reward handling.
- Limitations: May hallucinate on unseen languages; test on domain-specific repos.

## Benchmarks

| Benchmark | DeepKAT-32B (Est.) | DeepSWE-Preview | KAT-Dev |
|-----------|--------------------|-----------------|---------|
| SWE-Bench Verified | 62% | 59% | 62.4% |
| HumanEval (Pass@1) | 85% | 82% | 84% |
| MultiPL-E (Avg.) | 78% | 76% | 77% |

Estimates based on TIES merge trends; validate post-merge.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Akicou/DeepKAT-32B")
model = AutoModelForCausalLM.from_pretrained("Akicou/DeepKAT-32B", device_map="auto")

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Recommended system prompt:

> You are DeepKAT, an expert coding agent. Think step-by-step and use tools if needed.

## Citation

```bibtex
@misc{deepkat32b,
  title     = {DeepKAT-32B: Merged Coding Agent from DeepSWE and KAT-Dev},
  author    = {Akicou},
  year      = {2025},
  publisher = {Hugging Face}
}
```
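The TIES merge used above follows three steps: trim each model's task vector (its delta from the base) to the highest-magnitude entries, elect a sign per parameter, then average only the deltas that agree with the elected sign. A minimal NumPy sketch of those steps on raw arrays, not MergeKit's implementation (the `density` argument here stands in for MergeKit's density parameter):

```python
import numpy as np

def ties_merge(base, finetuned, density=0.2):
    """Toy TIES merge: trim, elect sign, then merge agreeing deltas."""
    # Step 1: task vectors, trimmed to the top-`density` fraction by magnitude.
    trimmed = []
    for ft in finetuned:
        d = ft - base
        k = max(1, int(density * d.size))
        thresh = np.sort(np.abs(d).ravel())[-k]  # k-th largest magnitude
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack(trimmed)

    # Step 2: elect a sign per parameter by total signed mass.
    sign = np.sign(stacked.sum(axis=0))
    sign[sign == 0] = 1.0

    # Step 3: average only the deltas whose sign matches the elected one.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / counts
    return base + merged_delta

# Conflicting signs at index 1 are resolved by sign election;
# tiny deltas (index 2) are trimmed away entirely.
base = np.zeros(4)
ft1 = np.array([1.0, -1.0, 0.1, 0.0])
ft2 = np.array([2.0, 1.0, 0.0, 0.0])
print(ties_merge(base, [ft1, ft2], density=0.5))
```

Trimming is what prevents the two parents' low-magnitude noise from interfering, which is why TIES merges tend to preserve each parent's specialized behavior better than plain weight averaging.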