grapevine-AI

77 models • 1 total models in database

Qwen3-Coder-30B-A3B-Instruct-GGUF

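What is this? A GGUF conversion of Qwen3-Coder-30B-A3B-Instruct, also known as Qwen3-Coder-Flash.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text.
Note: Only the Q4_K_M and Q5_K_M quants were produced by requantizing the Q8_0 quant, because quantizing from BF16 in the usual way fails with an error of unknown cause.
Environment: Quantization was performed with the Windows build of llama.cpp-b5902.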
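
The requantization workaround in the note above can be sketched as a command builder. This is a minimal illustration, not the author's actual procedure: the file names are hypothetical, and it assumes a llama.cpp build whose quantizer binary is named `llama-quantize`. The `--allow-requantize` flag is what lets the tool accept an input that is already quantized; if an importance matrix is used, an `--imatrix <file>` argument would be added as well.

```python
import shlex


def requantize_command(src_gguf: str, dst_gguf: str, quant_type: str) -> list[str]:
    """Build a llama-quantize call that requantizes an already-quantized GGUF."""
    return [
        "llama-quantize",
        "--allow-requantize",  # input is already quantized (Q8_0 in the note above)
        src_gguf,
        dst_gguf,
        quant_type,
    ]


# Hypothetical file names, for illustration only.
for quant in ("Q4_K_M", "Q5_K_M"):
    cmd = requantize_command("model-Q8_0.gguf", f"model-{quant}.gguf", quant)
    print(shlex.join(cmd))
```

Requantizing from Q8_0 generally loses a little quality compared with quantizing from the full-precision weights, which is presumably why the page limits it to the two quants that failed otherwise.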

license:apache-2.0

gemma-2-27b-it-gguf

—

EZO-Common-9B-gemma-2-it-gguf

—

phi-4-gguf

license:mit

Qwen3-30B-A3B-Instruct-2507-GGUF

license:apache-2.0

CALM3-22B-Chat-GGUF

license:apache-2.0

Llama-70B-DeepSeek-R1-Distill-GGUF

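What is this? A GGUF conversion of DeepSeek-R1-Distill-Llama-70B, the thinking model that DeepSeek (深度求索) itself officially distilled from DeepSeek-R1 into Llama-3.3-70B-Instruct.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. Because of limited compute resources, the imatrix was computed on the Q8_0-quantized model.
system \n\n(write the system prompt here) user \n\n(write the message here) assistant \n\n
Environment: Quantization was performed with the Windows build of llama.cpp-b4514 and the convert-hf-to-gguf.py released alongside llama.cpp-b4453.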
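
One way to picture the imatrix workflow described above is as three llama.cpp invocations: make a Q8_0 copy, compute the importance matrix on that cheaper copy, then quantize to the target type with the matrix. The sketch below only builds the command lines. The file names and the `stem` parameter are hypothetical, the binary names `llama-quantize` and `llama-imatrix` assume a recent llama.cpp build, and quantizing the final file from the full-precision GGUF is an assumption rather than something the page states.

```python
def imatrix_pipeline(full_gguf: str, calib_text: str, quant_type: str,
                     stem: str = "model") -> list[list[str]]:
    """Return the three commands of a Q8_0-based imatrix quantization run."""
    q8_gguf = f"{stem}-Q8_0.gguf"
    imatrix_file = f"{stem}.imatrix"
    return [
        # 1) Cheap intermediate copy used only for the imatrix computation.
        ["llama-quantize", full_gguf, q8_gguf, "Q8_0"],
        # 2) Importance matrix computed on the Q8_0 model over the calibration text.
        ["llama-imatrix", "-m", q8_gguf, "-f", calib_text, "-o", imatrix_file],
        # 3) Final quantization guided by the importance matrix.
        ["llama-quantize", "--imatrix", imatrix_file,
         full_gguf, f"{stem}-{quant_type}.gguf", quant_type],
    ]


# Hypothetical inputs, for illustration only.
for step in imatrix_pipeline("model-BF16.gguf", "calibration.txt", "Q4_K_M"):
    print(" ".join(step))
```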

license:llama3.3

EZO-Humanities-9B-gemma-2-it-gguf

—

aya-expanse-8b-gguf

license:cc-by-nc-4.0

c4ai-command-r-08-2024-gguf

license:cc-by-nc-4.0

Gemma-2-9B-It-SPPO-Iter3-GGUF

—

c4ai-command-r-plus-08-2024-gguf

license:cc-by-nc-4.0

DeepSeek-R1-0528-Qwen3-8B-GGUF

license:mit

Llama-3.2-3B-Instruct-GGUF

license:llama3.2

gemma-2-9b-it-gguf

—

aya-expanse-32b-gguf

license:cc-by-nc-4.0

gemma-2-9b-it-SimPO-GGUF

—

Llama-3.1-70B-Japanese-Instruct-2407-GGUF

license:llama3.1

Qwen2.5-Coder-32B-Instruct-GGUF

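What is this? Qwen2.5-Coder-32B-Instruct, the code-specialized derivative of Qwen2.5, quantized with a Japanese imatrix. The model's native context length is 131072, but positions beyond 32768 use a special positional encoding, so the model may not work correctly on inputs longer than 32768 tokens. It is therefore strongly recommended to cap the context length at a suitable value with the `-c` option, which also keeps memory usage down.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. Because of limited compute resources, the imatrix was computed on the Q8_0-quantized model.
Environment: Quantization was performed with the Windows build of llama.cpp-b3621 and the convert-hf-to-gguf.py released alongside llama.cpp-b3472.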
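
The context-length recommendation above can be expressed as a small guard that never passes more than 32768 to `-c`. This is a sketch under stated assumptions: the `llama-cli` binary name and the GGUF file name are illustrative, and `context_args` is a hypothetical helper, not part of llama.cpp.

```python
SAFE_CONTEXT = 32768  # limit named in the model description above


def context_args(requested: int, safe_limit: int = SAFE_CONTEXT) -> list[str]:
    """Return the `-c` argument pair, clamped to the safe context length."""
    if requested <= 0:
        raise ValueError("context length must be positive")
    return ["-c", str(min(requested, safe_limit))]


# Hypothetical model file name, for illustration only.
command = ["llama-cli", "-m", "Qwen2.5-Coder-32B-Instruct-Q4_K_M.gguf",
           *context_args(131072)]
print(" ".join(command))
```

Clamping rather than rejecting an oversized request keeps the call usable from scripts while still avoiding the region where the description warns the model may misbehave.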

license:apache-2.0

Qwen2-57B-A14B-Instruct-GGUF

license:apache-2.0

c4ai-command-r-v01-gguf

license:cc-by-nc-4.0

gemma-3-12B-it-gguf

—

stockmark-100b-instruct-v0.1-gguf

license:mit

aya-23-35B-gguf

license:cc-by-nc-4.0

Qwen2.5-Coder-0.5B-Instruct-GGUF

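What is this? Qwen2.5-Coder-0.5B-Instruct, an ultra-small Qwen2.5-Coder model that can be used for speculative decoding, quantized with a Japanese imatrix.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. The imatrix was computed on the f32-precision model, because the current CUDA build of llama.cpp does not support imatrix computation in bf16, the model's native precision.
Environment: Quantization was performed with the Windows build of llama.cpp-b4170 and the convert-hf-to-gguf.py released alongside llama.cpp-b3472.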
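
For readers unfamiliar with speculative decoding, the toy sketch below shows the greedy variant of the idea: a small draft model proposes a few tokens, the large target model checks them, and the matching prefix is accepted. The two "models" here are hypothetical integer functions, not real LLMs, and the verification loop calls the target once per position where a real engine would use a single batched forward pass. The result is always identical to decoding with the target alone.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft proposes up to k tokens, target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1) The draft model proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2) The target model checks each proposed token in order.
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guess != expected:
                # Keep the agreed prefix, then the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            tokens.extend(proposal)  # every proposed token was accepted
    return tokens


def target_model(tokens: List[int]) -> int:
    """Toy stand-in for the large model."""
    return (tokens[-1] * 3 + 1) % 7


def draft_model(tokens: List[int]) -> int:
    """Toy stand-in for the small model: agrees with the target except after token 5."""
    return 0 if tokens[-1] == 5 else target_model(tokens)


assert speculative_decode(target_model, draft_model, [1], 10) == \
    greedy_decode(target_model, [1], 10)
```

The speed-up in practice comes from the draft model being much cheaper than the target, which is why a 0.5B model is a natural companion to a larger model from the same family.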

license:apache-2.0

Llama-3.1-70B-EZO-1.1-it-GGUF

license:llama3.1

Mistral-Small-24B-Instruct-2501-reasoning-gguf

license:apache-2.0

Llama-3.2-1B-Instruct-GGUF

license:llama3.2

RakutenAI-2.0-8x7B-instruct-GGUF

license:apache-2.0

Qwen2.5-32B-Instruct-GGUF-Japanese-imatrix

license:apache-2.0

qwq-bakeneko-32b-gguf

license:apache-2.0

Llama3-Athene-70B-GGUF

license:llama3

Athene-V2-Chat-GGUF

—

Codestral-22B-v0.1-GGUF

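What is this? A GGUF conversion of Codestral-22B-v0.1, Mistral AI's coding-specialized model with fill-in-the-middle (FIM) support.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. The imatrix was computed on the f32-precision model, because the current CUDA build of llama.cpp does not support imatrix computation in bf16, the model's native precision.
Environment: Quantization was performed with the Windows (CUDA12) build of llama.cpp-b4178 and convert_hf_to_gguf.py as of llama.cpp's 4286th commit.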

—

Llama-3.1-70B-Instruct-GGUF

license:llama3.1

ABEJA-QwQ32b-Reasoning-Japanese-v1.0-GGUF

license:apache-2.0

Qwen2-72B-Instruct-GGUF

—

Qwen2.5-Coder-1.5B-Instruct-GGUF

license:apache-2.0

calm3-22b-chat-selfimprove-experimental-gguf

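caution! This GGUF is a provisional version that does not deliver the model's full performance. As of early 2025, llama.cpp does not support the pre-tokenization (roughly, preprocessing) specific to CALM3 models. As a compromise, the conversion was modified to use another model's pre-tokenization, and this workaround may degrade the model's performance.
What is this? A GGUF conversion of calm3-22b-chat-selfimprove-experimental, CyberAgent's Japanese-English bilingual language model.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text.
Environment: Quantization was performed with the Windows build of llama.cpp-b4514 and a modified convert-hf-to-gguf.py with altered pre-tokenization handling.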

license:apache-2.0

qwen2.5-bakeneko-32b-instruct-gguf

license:apache-2.0

sarashina2-70b-gguf

license:mit

EZO-Qwen2.5-72B-Instruct-GGUF

license:apache-2.0

EZO-Qwen2.5-32B-Instruct-GGUF

license:apache-2.0

llama-70b-r1-1776-distill-gguf

license:llama3.3

Arcee-Blitz-GGUF

license:apache-2.0

deepseek-r1-distill-qwen2.5-bakeneko-32b-gguf

license:apache-2.0

c4ai-command-a-03-2025-gguf

license:cc-by-nc-4.0

Qwen2.5-72B-Instruct-GGUF-Japanese-imatrix

—

Llama-3.1-Nemotron-70B-Instruct-GGUF

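What is this? A GGUF conversion of Llama-3.1-Nemotron-70B-Instruct, NVIDIA's fine-tuned version of Llama 3.1 70B.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. Because of limited compute resources, the imatrix was computed on the Q8_0-quantized model.
Environment: Quantization was performed with the Windows build of llama.cpp-b3621 and the convert-hf-to-gguf.py released alongside llama.cpp-b3472.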

license:llama3.1

c4ai-command-r7b-12-2024-GGUF

license:cc-by-nc-4.0

QwQ-32B-GGUF

license:apache-2.0

QwQ-32B-Preview-GGUF

license:apache-2.0

Qwen2.5-32B-Instruct-GGUF

license:apache-2.0

qwen2.5-bakeneko-32b-instruct-v2-gguf

license:apache-2.0

Llama-3.3-70B-Instruct-GGUF

license:llama3.3

Qwen2.5-72B-Instruct-GGUF

—

karakuri-lm-32b-thinking-2501-exp-gguf

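caution! To make the model think, its model-specific system prompt must be loaded with the `--jinja` option. Using this option requires updating to llama.cpp-b4524 or later.
What is this? A GGUF conversion of karakuri-lm-32b-thinking-2501-exp, KARAKURI Inc.'s Japanese fine-tune of QwQ-32B-Preview.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. Because the CUDA build of llama.cpp now supports bfloat16, the imatrix was computed on the BF16 model, the model's native precision.
Environment: Quantization was performed with the Windows build of llama.cpp-b4514 and the convert-hf-to-gguf.py released alongside llama.cpp-b4524.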

license:apache-2.0

gemma-3-27B-it-gguf

—

sarashina2.2-3b-instruct-v0.1-gguf

license:mit

Qwen3-32B-GGUF

license:apache-2.0

phi-4-open-R1-Distill-EZOv1-GGUF

license:mit

EXAONE-3.5-32B-Instruct-GGUF-Japanese-imatrix

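What is this? EXAONE-3.5-32B-Instruct, LG AI Research's Korean-English bilingual language model, quantized with a Japanese imatrix. Note that commercial use is not permitted.
imatrix dataset: To prioritize Japanese ability, the imatrix was computed with TFMC/imatrix-dataset-for-japanese-llm, a dataset that contains a large amount of Japanese text. The officially distributed BF16 GGUF was used as the starting point, but because of limited compute resources the imatrix itself was computed on the Q8_0-quantized model.
Environment: Quantization was performed with the Windows (CUDA12) build of llama.cpp-b4178.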

—

DeepSeek-R1-Distill-Qwen-32B-GGUF

license:apache-2.0

ABEJA-Qwen2.5-32b-Japanese-v0.1-GGUF

license:apache-2.0

Mistral-Small-24B-Instruct-2501-GGUF

license:apache-2.0

phi-4-deepseek-R1K-RL-EZO-GGUF

license:mit

Phi-3.5-MoE-instruct-GGUF

license:mit

Athene-V2-Agent-GGUF

—

Qwen3-30B-A3B-GGUF

license:apache-2.0

plamo-2-translate-gguf

Meta-Llama-3-70B-Instruct-GGUF

c4ai-command-r-plus-gguf

gemma-3n-E4B-it-gguf

DeepSeek-R1-Distill-Qwen-32B-Japanese-GGUF

Qwen3-30B-A3B-Thinking-2507-GGUF

Mistral-Nemo-Instruct-2407-GGUF

Gemma 2 2b Jpn It Gguf
