# ZombieLLM
## 🧟 Model Details

- Student: `gpt2-xl` (1.5B parameters), OpenAI (2019) [1]
- Teacher: `gpt-oss-20b` (OpenAI open-weight), OpenAI (2025) [2]
- Architecture: decoder-only Transformer (GPT-2 family).
- Context window: 1024 tokens (default).

## 🧪 Training & Adaptation

- SFT: Instruction fine-tuning on GPT-OSS-20B–distilled responses to Dolly-15k [3] and Alpaca [4] prompts, using TRL with DoRA (bf16); a minimal sketch follows the dataset list below.
- Representation-level KD: Cosine-similarity alignment via shared projection heads (student ↔ teacher); see the sketch after the dataset list.
- Domain tuning: Survival + persona blend for tone, including questions from CoTReasoningBushcraftSurvival [5].
- Persona booster: Short DoRA pass to stabilize style and voice.

ZombieLLM was trained on a blend of distilled instruction–response datasets and custom persona data:

- hardrave/alpacagptossdatadistilled – Alpaca-cleaned prompts (15k sample) with distilled GPT-OSS-20B answers [6]
- hardrave/dolly15kgptossdatadistilled – Dolly-15k prompts with distilled final-only answers from GPT-OSS-20B [7]
- hardrave/bushcraftsurvivalgptossdatadistilled – CoT Bushcraft/Survival dataset distilled into concise final answers [8]
- hardrave/zombiepersona – Custom MIT-licensed dataset injecting a consistent undead survivalist persona [9]

These datasets were used for SFT (instruction fine-tuning) and representation-level KD (knowledge distillation), forming the backbone of the ZombieLLM reanimation pipeline.
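As a rough guide to how the SFT stage fits together, here is a minimal sketch using TRL's `SFTTrainer` with a PEFT DoRA adapter in bf16. The prompt template, target modules, hyperparameters, and dataset column names (`instruction`, `response`) are illustrative assumptions, not the exact ZombieLLM training recipe.

```python
# Minimal SFT sketch (assumptions: column names, template, hyperparameters).
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("hardrave/dolly15kgptossdatadistilled", split="train")

# Build a single-turn "text" field: instruction in, distilled final answer out.
dataset = dataset.map(
    lambda ex: {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['response']}"}
)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_attn", "c_proj"],  # GPT-2 attention/projection layers
    use_dora=True,                        # DoRA: weight-decomposed low-rank adaptation
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="gpt2-xl",
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(output_dir="zombiellm-sft", bf16=True, dataset_text_field="text"),
)
trainer.train()
```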
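The representation-level KD step is only summarized above; a minimal sketch of cosine-similarity alignment through shared projection heads could look like the following. The hidden sizes, shared projection width, and masking scheme are assumptions, and in practice this loss would be combined with the usual language-modeling objective.

```python
# Sketch of representation-level KD: project student and teacher hidden states into a
# shared space and align them with a cosine-similarity loss.
# Widths and layer choice are illustrative assumptions.
import torch.nn as nn
import torch.nn.functional as F

STUDENT_DIM, TEACHER_DIM, SHARED_DIM = 1600, 2880, 1024  # gpt2-xl width; teacher width assumed

student_proj = nn.Linear(STUDENT_DIM, SHARED_DIM)
teacher_proj = nn.Linear(TEACHER_DIM, SHARED_DIM)

def representation_kd_loss(student_hidden, teacher_hidden, attention_mask):
    """Cosine-alignment loss between projected student and teacher hidden states.

    student_hidden: (batch, seq, STUDENT_DIM); teacher_hidden: (batch, seq, TEACHER_DIM)
    """
    s = student_proj(student_hidden)          # project into the shared space
    t = teacher_proj(teacher_hidden)
    cos = F.cosine_similarity(s, t, dim=-1)   # (batch, seq) per-token similarity
    mask = attention_mask.float()
    # 1 - cosine similarity, averaged over non-padding tokens.
    return ((1.0 - cos) * mask).sum() / mask.sum()
```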
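For quick experimentation, a single-turn generation sketch is shown below. The hub repo id (`hardrave/ZombieLLM`), the instruction/response template, and the sampling settings are assumptions; check the model repository for the exact prompt format.

```python
# Hypothetical usage sketch; repo id, template, and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hardrave/ZombieLLM"  # assumed hub id for the reanimated checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Single-turn template: no conversation history is carried over (see Limitations below).
prompt = "### Instruction:\nHow do I purify water with no equipment?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```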
## ⚠️ Limitations & Risks

> - Small-model trade-offs: As a 1.5B GPT-2 derivative, reasoning and factual accuracy are limited compared with modern mid/large LLMs.
> - Hallucinations: May assert plausible-sounding but incorrect facts. Verification is required for critical tasks.
> - English-centric: Performance is strongest in English (due to GPT-2 pretraining).
> - No memory by design: The template ignores history, which is good for privacy and reproducibility but not suited to long multi-turn dialogue.

## 📜 Disclaimer & Responsible Use

- RESEARCH USE ONLY - NO PRODUCTION, NO ADVICE
- Provided as-is for research and evaluation. Not approved for production or decision-making without human oversight.
- Outputs may be inaccurate, misleading, biased, or offensive. Do not use for medical, legal, financial, or safety-critical purposes.
- You are responsible for usage, compliance, filtering, and review of all inputs/outputs.

The ZombieLLM model weights are released under the CC BY-NC 4.0 license because they were trained on datasets that carry non-commercial terms.

This project is intended for research and experimentation. It is not production-ready and should be used for learning, exploration, and prototyping rather than deployment in critical systems.

## 📚 References

1. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 1(8):9 (2019).
2. OpenAI. gpt-oss-120b & gpt-oss-20b Model Card. arXiv:2508.10925 (2025). https://arxiv.org/abs/2508.10925
3. Conover, M., Hayes, M., Mathur, A., Xie, J., Wan, J., Shah, S., Ghodsi, A., Wendell, P., Zaharia, M., Xin, R. Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM. Databricks Blog (2023). https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm
4. Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., Hashimoto, T. Stanford Alpaca: An Instruction-following LLaMA model. GitHub repository (2023). https://github.com/tatsu-lab/stanford_alpaca
5. Wesney, M. R. CoTReasoningBushcraftSurvivalDataset. Hugging Face (2025). https://huggingface.co/datasets/moremilk/CoTReasoningBushcraftSurvival
6. hardrave. alpacagptossdatadistilled. Hugging Face dataset.
7. hardrave. dolly15kgptossdatadistilled. Hugging Face dataset.
8. hardrave. bushcraftsurvivalgptossdatadistilled. Hugging Face dataset.
9. hardrave. zombiepersona. Hugging Face dataset.