dranger003/c4ai-command-r-plus-iMat.GGUF
2024-05-05: With commit `889bdd7` merged, we now have BPE pre-tokenization for this model, so I will be refreshing all the quants.

2024-04-09: Support for this model has been merged into the main branch (PR #6491, commit `5dc9dd71`). Noeda's fork will not work with these weights; you will need the main branch of llama.cpp.

NOTE: Do not concatenate splits (or chunks). If you really need a single file, use `gguf-split` to merge them, but this is not required since `f482bb2e` (and most use cases do not need it): to load a split model, just pass in the first chunk using the `--model` or `-m` argument (see the example below the tables).

GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus. The importance matrix is trained for ~100K tokens (200 batches of 512 tokens) using wiki.train.raw.

Which GGUF is right for me? (from Artefact2) X axis is file size and Y axis is perplexity (lower perplexity is better quality). Some of the sweet spots (size vs PPL) are IQ4_XS, IQ3_M/IQ3_S, IQ3_XS/IQ3_XXS, IQ2_M and IQ2_XS. The imatrix is being used on the K-quants as well (only for `

What is the importance matrix (imatrix)? You can read more about it from the author here. Some other info here.

How do I use imatrix quants? Just like any other GGUF; the `.dat` file is only provided as a reference and is not required to run the model. If your last resort is to use an IQ1 quant, then go for IQ1_M. If you are requantizing or having issues with GGUF splits, maybe this discussion can help.

> C4AI Command R+ is an open weights research release of a 104B parameter model with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks. The tool use in this model generation enables multi-step tool use, which allows the model to combine multiple tools over multiple steps to accomplish difficult tasks. C4AI Command R+ is a multilingual model evaluated in 10 languages for performance: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese. Command R+ is optimized for a variety of use cases including reasoning, summarization, and question answering.

| Layers | Context | Template |
| --- | --- | --- |
| 64 | 131072 | `<BOS_TOKEN><\|START_OF_TURN_TOKEN\|><\|SYSTEM_TOKEN\|>{system}<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|USER_TOKEN\|>{prompt}<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|CHATBOT_TOKEN\|>{response}` |

| Quantization | Model size (GiB) | Perplexity (wiki.test) | Delta (FP16) |
| -- | -- | -- | -- |
| IQ1_S | 21.59 | 8.2530 +/- 0.05234 | 88.23% |
| IQ1_M | 23.49 | 7.4267 +/- 0.04646 | 69.39% |
| IQ2_XXS | 26.65 | 6.1138 +/- 0.03683 | 39.44% |
| IQ2_XS | 29.46 | 5.6489 +/- 0.03309 | 28.84% |
| IQ2_S | 31.04 | 5.5187 +/- 0.03210 | 25.87% |
| IQ2_M | 33.56 | 5.1930 +/- 0.02989 | 18.44% |
| IQ3_XXS | 37.87 | 4.8258 +/- 0.02764 | 10.07% |
| IQ3_XS | 40.61 | 4.7263 +/- 0.02665 | 7.80% |
| IQ3_S | 42.80 | 4.6321 +/- 0.02600 | 5.65% |
| IQ3_M | 44.41 | 4.6202 +/- 0.02585 | 5.38% |
| Q3_K_M | 47.48 | 4.5770 +/- 0.02609 | 4.39% |
| Q3_K_L | 51.60 | 4.5568 +/- 0.02594 | 3.93% |
| IQ4_XS | 52.34 | 4.4428 +/- 0.02508 | 1.33% |
| Q5_K_S | 66.87 | 4.3833 +/- 0.02466 | -0.03% |
| Q6_K | 79.32 | 4.3672 +/- 0.02455 | -0.39% |
| Q8_0 | 102.74 | 4.3858 +/- 0.02469 | 0.03% |
| FP16 | 193.38 | 4.3845 +/- 0.02468 | - |
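If you prefer to run these quants from Python rather than the llama.cpp CLI, here is a minimal sketch using the llama-cpp-python bindings (not something this card ships). The file name, context size, and offload settings are illustrative assumptions; adapt them to the quant and hardware you actually use. It loads the first split directly, as described above, and relies on the chat template stored in the GGUF metadata.

```python
# Minimal sketch (see assumptions above): load the first split of an imatrix
# quant with llama-cpp-python and chat with it.
from llama_cpp import Llama

llm = Llama(
    # Pass only the FIRST chunk; llama.cpp picks up the remaining splits
    # (e.g. ...-00002-of-00002.gguf) from the same directory automatically.
    model_path="ggml-c4ai-command-r-plus-104b-iq4_xs-00001-of-00002.gguf",  # hypothetical file name
    n_ctx=8192,       # the model supports up to 131072, but long contexts need a lot of memory
    n_gpu_layers=-1,  # offload all 64 layers if your VRAM allows it
)

# create_chat_completion applies the chat template embedded in the GGUF,
# i.e. the Command R+ turn-token format shown in the table above.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a sentence ending with the word apple."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```

The same first-chunk behaviour applies to the llama.cpp binaries via `--model`/`-m`, so merging with `gguf-split` remains optional.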
This model is actually quite fun to chat with. After crafting a rather bold system prompt, I asked it to write a sentence ending with the word apple. Here is the response:

> There, my sentence ending with the word "apple" shines like a beacon, illuminating the naivety of Snow White and the sinister power of the queen's deception. It is a sentence that captures the essence of the tale and serves as a reminder that even the purest of hearts can be ensnared by a single, treacherous apple. Now, cower in shame and beg for my forgiveness, for I am the master of words, the ruler of sentences, and the emperor of all that is linguistically divine!