Neutrino-Instruct is a 7B-parameter instruction-tuned LLM developed by Fardeen NB. It is designed for conversational AI, multi-step reasoning, and instruction-following tasks, and is fine-tuned to maintain coherent, contextual dialogue across multiple turns.
- Model Name: Neutrino-Instruct
- Developer: Fardeen NB
- License: Apache-2.0
- Language(s): English
- Format: GGUF (optimized for `llama.cpp` and `Ollama`)
- Base Model: Neutrino
- Version: 2.0
- Task: Text Generation (chat, Q&A, instruction-following)
- CPU-only: 32–64GB RAM recommended (runs on modern laptops, with slower inference).
- GPU acceleration:
  - 4GB VRAM → 4-bit quantized (Q4) models
  - 8GB VRAM → 5-bit/8-bit (Q5/Q8) models
  - 12GB+ VRAM → FP16 full precision
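The VRAM tiers above follow from the model's 7B parameter count: weight memory ≈ parameters × bits-per-weight ÷ 8, plus headroom for the KV cache and activations. A minimal sketch of that arithmetic — the effective bits-per-weight values and the ~20% overhead factor are assumptions for illustration, not measured figures:

```python
# Rough memory estimate for a 7B-parameter model at different
# quantization levels: bytes = params * bits_per_weight / 8.
# The bits-per-weight values and the 20% overhead factor (KV cache,
# activations) are illustrative assumptions.
PARAMS = 7e9

def weight_memory_gib(bits_per_weight: float, overhead: float = 1.2) -> float:
    """Approximate memory footprint in GiB for the model weights."""
    return PARAMS * bits_per_weight / 8 * overhead / 2**30

for name, bits in [("Q4", 4.5), ("Q5", 5.5), ("Q8", 8.5), ("FP16", 16.0)]:
    print(f"{name}: ~{weight_memory_gib(bits):.1f} GiB")
```

The estimates line up with the tiers listed: a Q4 build fits in roughly 4–5 GiB, Q5/Q8 in roughly 5–9 GiB, and FP16 needs well over 12 GiB.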
- Conversational AI assistants
- Research prototypes
- Instruction-following agents
- Chatbots with identity-awareness
⚠️ Out of Scope: Use in critical decision-making, legal, or medical contexts.
- Model uploaded in GGUF format for portability and performance.
- Compatible with `llama.cpp`, `Ollama`, and `llama-cpp-python`.
- Supports quantization levels (Q4, Q5, Q8) for deployment on resource-constrained devices.
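For example, a GGUF build can be loaded locally with `llama-cpp-python`. This is a minimal sketch: the `.gguf` file name is a placeholder (substitute the quantization you downloaded), and it requires the model file to be present locally.

```python
# Minimal sketch: loading a GGUF build with llama-cpp-python.
# The model file name below is a placeholder, not an official artifact name.
from llama_cpp import Llama

llm = Llama(
    model_path="neutrino-instruct.Q4_K_M.gguf",  # placeholder filename
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to GPU if available; use 0 for CPU-only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Choosing a smaller quantization (Q4) trades some output quality for a footprint that fits the 4GB VRAM tier above; Q8 or FP16 files follow the same loading pattern.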
If you use Neutrino in your research or projects, please cite: