deepcogito
cogito-v1-preview-qwen-14B
cogito-v1-preview-llama-3B
Cogito V2 Preview Llama 70B
The Cogito v2 LLMs are instruction-tuned generative models. All models are released under an open license for commercial use.

- Cogito v2 models are hybrid reasoning models. Each model can answer directly (like a standard LLM), or self-reflect before answering (like reasoning models).
- The LLMs are trained using Iterated Distillation and Amplification (IDA) - a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
- The models have been optimized for coding, STEM, instruction following and general helpfulness, and have significantly higher multilingual, coding and tool-calling capabilities than size-equivalent counterparts.
- In both standard and reasoning modes, Cogito v2-preview models outperform their size-equivalent counterparts on common industry benchmarks.
- This model is trained in over 30 languages and supports a context length of 128k.

## Evaluations

Here is the model performance on some standard industry benchmarks. For detailed evaluations, please refer to the Blog Post.

## Usage

Here is a snippet for usage with Transformers.

### Implementing extended thinking

- By default, the model answers in standard mode.
- To enable thinking, use either of these two methods:
  - Set `enable_thinking=True` while applying the chat template.
  - Add a specific system prompt, and prefill the response with `"<think>\n"`.

NOTE: Unlike Cogito v1 models, we initiate the response with `"<think>\n"` at the beginning of every output when reasoning is enabled. This is because hybrid models can be brittle at times, and adding `"<think>\n"` ensures that the model does indeed respect thinking.

#### Method 1 - Set `enable_thinking=True` in the tokenizer

If you are using Hugging Face tokenizers, you can simply add the argument `enable_thinking=True` when applying the chat template (this option is built into the chat template).

#### Method 2 - Add a specific system prompt, and prefill the response with `"<think>\n"`
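A minimal sketch of the Transformers usage referenced above, using Method 1's `enable_thinking=True`. The repo id `deepcogito/cogito-v2-preview-llama-70B` is inferred from this card's name, and `build_chat`/`generate_response` are illustrative helper names, not library APIs; Method 2 is detailed next.

```python
# Sketch of standard vs. extended-thinking generation with Transformers.
# Helper names are illustrative; only apply_chat_template/generate are real APIs.

def build_chat(prompt, enable_thinking=False):
    """Build the messages and chat-template kwargs (Method 1: enable_thinking=True)."""
    messages = [{"role": "user", "content": prompt}]
    kwargs = {"add_generation_prompt": True, "enable_thinking": enable_thinking}
    return messages, kwargs

def generate_response(prompt, enable_thinking=False, max_new_tokens=512):
    # Imported lazily: calling this function downloads the full checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepcogito/cogito-v2-preview-llama-70B"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages, kwargs = build_chat(prompt, enable_thinking)
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", **kwargs)
    output = model.generate(input_ids.to(model.device), max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```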
To enable thinking using this method, there are two steps.

Step 1 - Use this system prompt: `system_instruction = 'Enable deep thinking subroutine.'`

If you already have a system instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.

Step 2 - Prefill the response with the tokens `"<think>\n"`.

Similarly, if you have a system prompt, you can prepend the `DEEP_THINKING_INSTRUCTION` to it in this way.

### Tool Calling

Cogito models support tool calling (single, parallel, multiple and parallel-multiple) in both standard and extended thinking modes.

You can generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the `tool` role. After that, you can `generate()` again to let the model use the tool result in the chat.

## License

This repository and the model weights are licensed under the Llama 3.3 Community License Agreement (Llama models' default license agreement).

## Contact

If you would like to reach out to our team, send an email to [email protected].
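Supplementing the Usage section above, the two steps of Method 2 reduce to plain message handling. This is a sketch: `with_thinking` is an illustrative helper, and representing the prefill as a trailing assistant turn is an assumption about how you feed it to the chat template.

```python
# Sketch of Method 2: prepend the deep-thinking instruction to the system
# prompt and prefill the assistant turn with "<think>\n".
DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."

def with_thinking(messages, existing_system=None):
    system = DEEP_THINKING_INSTRUCTION
    if existing_system:
        # Keep any pre-existing system instruction after the thinking trigger.
        system = DEEP_THINKING_INSTRUCTION + "\n\n" + existing_system
    chat = [{"role": "system", "content": system}] + list(messages)
    # Prefilled assistant turn; generation should continue from here.
    chat.append({"role": "assistant", "content": "<think>\n"})
    return chat
```

With Hugging Face tokenizers, recent versions of `apply_chat_template` accept `continue_final_message=True`, so generation continues the prefilled assistant turn instead of starting a new one.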
cogito-671b-v2.1-FP8
cogito-v1-preview-llama-8B
cogito-v1-preview-llama-70B
cogito-v1-preview-qwen-32B
Cogito V2 Preview Llama 405B
The Cogito v2 LLMs are instruction-tuned generative models. All models are released under an open license for commercial use.

- Cogito v2 models are hybrid reasoning models. Each model can answer directly (like a standard LLM), or self-reflect before answering (like reasoning models).
- The LLMs are trained using Iterated Distillation and Amplification (IDA) - a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
- The models have been optimized for coding, STEM, instruction following and general helpfulness, and have significantly higher multilingual, coding and tool-calling capabilities than size-equivalent counterparts.
- In both standard and reasoning modes, Cogito v2-preview models outperform their size-equivalent counterparts on common industry benchmarks.
- This model is trained in over 30 languages and supports a context length of 128k.

## Evaluations

Here is the model performance on some standard industry benchmarks. For detailed evaluations, please refer to the Blog Post.

## Usage

Here is a snippet for usage with Transformers.

### Implementing extended thinking

- By default, the model answers in standard mode.
- To enable thinking, use either of these two methods:
  - Set `enable_thinking=True` while applying the chat template.
  - Add a specific system prompt, and prefill the response with `"<think>\n"`.

NOTE: Unlike Cogito v1 models, we initiate the response with `"<think>\n"` at the beginning of every output when reasoning is enabled. This is because hybrid models can be brittle at times, and adding `"<think>\n"` ensures that the model does indeed respect thinking.

#### Method 1 - Set `enable_thinking=True` in the tokenizer

If you are using Hugging Face tokenizers, you can simply add the argument `enable_thinking=True` when applying the chat template (this option is built into the chat template).

#### Method 2 - Add a specific system prompt, and prefill the response with `"<think>\n"`
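A minimal sketch of the Transformers usage referenced above, using Method 1's `enable_thinking=True`. The repo id `deepcogito/cogito-v2-preview-llama-405B` is inferred from this card's name, and `build_chat`/`generate_response` are illustrative helper names, not library APIs; Method 2 is detailed next.

```python
# Sketch of standard vs. extended-thinking generation with Transformers.
# Helper names are illustrative; only apply_chat_template/generate are real APIs.

def build_chat(prompt, enable_thinking=False):
    """Build the messages and chat-template kwargs (Method 1: enable_thinking=True)."""
    messages = [{"role": "user", "content": prompt}]
    kwargs = {"add_generation_prompt": True, "enable_thinking": enable_thinking}
    return messages, kwargs

def generate_response(prompt, enable_thinking=False, max_new_tokens=512):
    # Imported lazily: calling this function downloads the full checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepcogito/cogito-v2-preview-llama-405B"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages, kwargs = build_chat(prompt, enable_thinking)
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", **kwargs)
    output = model.generate(input_ids.to(model.device), max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```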
To enable thinking using this method, there are two steps.

Step 1 - Use this system prompt: `system_instruction = 'Enable deep thinking subroutine.'`

If you already have a system instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.

Step 2 - Prefill the response with the tokens `"<think>\n"`.

Similarly, if you have a system prompt, you can prepend the `DEEP_THINKING_INSTRUCTION` to it in this way.

### Tool Calling

Cogito models support tool calling (single, parallel, multiple and parallel-multiple) in both standard and extended thinking modes.

You can generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the `tool` role. After that, you can `generate()` again to let the model use the tool result in the chat.

## License

This repository and the model weights are licensed under the Llama 3.1 Community License Agreement (Llama models' default license agreement).

## Contact

If you would like to reach out to our team, send an email to [email protected].
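The tool-calling round trip described in the Usage section above can be sketched as chat-message bookkeeping. The tool name, arguments, and result below are made-up examples, not part of any shipped tool set.

```python
# Sketch of the tool-calling round trip as chat messages.
# The tool name, arguments, and result are made-up examples.
chat = [{"role": "user", "content": "What is the temperature in Paris?"}]

# 1. The model emits a tool call; append it to the chat as an assistant turn.
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
chat.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})

# 2. Execute the tool yourself and append its result with the `tool` role.
chat.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})

# 3. Apply the chat template to `chat` and call `generate()` again so the
#    model can use the tool result in its reply.
```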
cogito-v2-preview-llama-109B-MoE
The Cogito v2 LLMs are instruction-tuned generative models. All models are released under an open license for commercial use.

- Cogito v2 models are hybrid reasoning models. Each model can answer directly (like a standard LLM), or self-reflect before answering (like reasoning models).
- The LLMs are trained using Iterated Distillation and Amplification (IDA) - a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
- The models have been optimized for coding, STEM, instruction following and general helpfulness, and have significantly higher multilingual, coding and tool-calling capabilities than size-equivalent counterparts.
- In both standard and reasoning modes, Cogito v2-preview models outperform their size-equivalent counterparts on common industry benchmarks.
- This model is trained in over 30 languages and supports long contexts (up to 10M tokens).

## Evaluations

Here is the model performance on some standard industry benchmarks. For detailed evaluations, please refer to the Blog Post.

## Usage

Here is a snippet for usage with Transformers.

### Implementing extended thinking

- By default, the model answers in standard mode.
- To enable thinking, use either of these two methods:
  - Set `enable_thinking=True` while applying the chat template.
  - Add a specific system prompt, and prefill the response with `"<think>\n"`.

NOTE: Unlike Cogito v1 models, we initiate the response with `"<think>\n"` at the beginning of every output when reasoning is enabled. This is because hybrid models can be brittle at times, and adding `"<think>\n"` ensures that the model does indeed respect thinking.

#### Method 1 - Set `enable_thinking=True` in the tokenizer

If you are using Hugging Face tokenizers, you can simply add the argument `enable_thinking=True` when applying the chat template (this option is built into the chat template).

#### Method 2 - Add a specific system prompt, and prefill the response with `"<think>\n"`
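A minimal sketch of the Transformers usage referenced above, using Method 1's `enable_thinking=True`. The repo id `deepcogito/cogito-v2-preview-llama-109B-MoE` is inferred from this card's name, and `build_chat`/`generate_response` are illustrative helper names, not library APIs; Method 2 is detailed next.

```python
# Sketch of standard vs. extended-thinking generation with Transformers.
# Helper names are illustrative; only apply_chat_template/generate are real APIs.

def build_chat(prompt, enable_thinking=False):
    """Build the messages and chat-template kwargs (Method 1: enable_thinking=True)."""
    messages = [{"role": "user", "content": prompt}]
    kwargs = {"add_generation_prompt": True, "enable_thinking": enable_thinking}
    return messages, kwargs

def generate_response(prompt, enable_thinking=False, max_new_tokens=512):
    # Imported lazily: calling this function downloads the full checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepcogito/cogito-v2-preview-llama-109B-MoE"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages, kwargs = build_chat(prompt, enable_thinking)
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", **kwargs)
    output = model.generate(input_ids.to(model.device), max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```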
To enable thinking using this method, there are two steps.

Step 1 - Use this system prompt: `system_instruction = 'Enable deep thinking subroutine.'`

If you already have a system instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.

Step 2 - Prefill the response with the tokens `"<think>\n"`.

Similarly, if you have a system prompt, you can prepend the `DEEP_THINKING_INSTRUCTION` to it in this way.

### Tool Calling

Cogito models support tool calling (single, parallel, multiple and parallel-multiple) in both standard and extended thinking modes.

You can generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the `tool` role. After that, you can `generate()` again to let the model use the tool result in the chat.

## License

This repository and the model weights are licensed under the Llama 4 Community License Agreement (Llama models' default license agreement).

## Contact

If you would like to reach out to our team, send an email to [email protected].
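Supplementing the Usage section above, the two steps of Method 2 reduce to plain message handling. This is a sketch: `with_thinking` is an illustrative helper, and representing the prefill as a trailing assistant turn is an assumption about how you feed it to the chat template.

```python
# Sketch of Method 2: prepend the deep-thinking instruction to the system
# prompt and prefill the assistant turn with "<think>\n".
DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."

def with_thinking(messages, existing_system=None):
    system = DEEP_THINKING_INSTRUCTION
    if existing_system:
        # Keep any pre-existing system instruction after the thinking trigger.
        system = DEEP_THINKING_INSTRUCTION + "\n\n" + existing_system
    chat = [{"role": "system", "content": system}] + list(messages)
    # Prefilled assistant turn; generation should continue from here.
    chat.append({"role": "assistant", "content": "<think>\n"})
    return chat
```

With Hugging Face tokenizers, recent versions of `apply_chat_template` accept `continue_final_message=True`, so generation continues the prefilled assistant turn instead of starting a new one.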
cogito-671b-v2.1
cogito-v2-preview-deepseek-671B-MoE-FP8
cogito-v2-preview-deepseek-671B-MoE
The Cogito v2 LLMs are instruction-tuned generative models. All models are released under an open license for commercial use.

- Cogito v2 models are hybrid reasoning models. Each model can answer directly (like a standard LLM), or self-reflect before answering (like reasoning models).
- The LLMs are trained using Iterated Distillation and Amplification (IDA) - a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
- The models have been optimized for coding, STEM, instruction following and general helpfulness, and have significantly higher multilingual, coding and tool-calling capabilities than size-equivalent counterparts.
- In both standard and reasoning modes, Cogito v2-preview models outperform their size-equivalent counterparts on common industry benchmarks.
- This model is trained in over 30 languages and supports a context length of 128k.

## Evaluations

Here is the model performance on some standard industry benchmarks. For detailed evaluations, please refer to the Blog Post.

## Usage

Here is a snippet for usage with Transformers.

### Implementing extended thinking

- By default, the model answers in standard mode.
- To enable thinking, use either of these two methods:
  - Set `enable_thinking=True` while applying the chat template.
  - Add a specific system prompt, and prefill the response with `"<think>\n"`.

NOTE: Unlike Cogito v1 models, we initiate the response with `"<think>\n"` at the beginning of every output when reasoning is enabled. This is because hybrid models can be brittle at times, and adding `"<think>\n"` ensures that the model does indeed respect thinking.

#### Method 1 - Set `enable_thinking=True` in the tokenizer

If you are using Hugging Face tokenizers, you can simply add the argument `enable_thinking=True` when applying the chat template (this option is built into the chat template).

#### Method 2 - Add a specific system prompt, and prefill the response with `"<think>\n"`
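A minimal sketch of the Transformers usage referenced above, using Method 1's `enable_thinking=True`. The repo id `deepcogito/cogito-v2-preview-deepseek-671B-MoE` is inferred from this card's name, and `build_chat`/`generate_response` are illustrative helper names, not library APIs; Method 2 is detailed next.

```python
# Sketch of standard vs. extended-thinking generation with Transformers.
# Helper names are illustrative; only apply_chat_template/generate are real APIs.

def build_chat(prompt, enable_thinking=False):
    """Build the messages and chat-template kwargs (Method 1: enable_thinking=True)."""
    messages = [{"role": "user", "content": prompt}]
    kwargs = {"add_generation_prompt": True, "enable_thinking": enable_thinking}
    return messages, kwargs

def generate_response(prompt, enable_thinking=False, max_new_tokens=512):
    # Imported lazily: calling this function downloads the full checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepcogito/cogito-v2-preview-deepseek-671B-MoE"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages, kwargs = build_chat(prompt, enable_thinking)
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", **kwargs)
    output = model.generate(input_ids.to(model.device), max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```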
To enable thinking using this method, there are two steps.

Step 1 - Use this system prompt: `system_instruction = 'Enable deep thinking subroutine.'`

If you already have a system instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.

Step 2 - Prefill the response with the tokens `"<think>\n"`.

Similarly, if you have a system prompt, you can prepend the `DEEP_THINKING_INSTRUCTION` to it in this way.

### Tool Calling

Cogito models support tool calling (single, parallel, multiple and parallel-multiple) in both standard and extended thinking modes.

This will result in output like:

```json
{"location": "Paris, France"}
```

You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the `tool` role. After that, you can `generate()` again to let the model use the tool result in the chat.

## License

This repository and the model weights are licensed under the MIT License.

## Contact

If you would like to reach out to our team, send an email to [email protected].
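The tool-calling round trip described in the Usage section above can be sketched as chat-message bookkeeping. The tool name and result below are made-up examples; only the `{"location": "Paris, France"}` arguments come from this card.

```python
# Sketch of the tool-calling round trip as chat messages.
# The tool name and result are made-up examples.
chat = [{"role": "user", "content": "What is the temperature in Paris?"}]

# 1. The model emits a tool call; append it to the chat as an assistant turn.
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
chat.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})

# 2. Execute the tool yourself and append its result with the `tool` role.
chat.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})

# 3. Apply the chat template to `chat` and call `generate()` again so the
#    model can use the tool result in its reply.
```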