# fabric-llm-finetune

llama-cpp · by qvac · Language Model · OTHER · New · Early-stage

156 downloads

Edge AI: Mobile · Laptop · Server
Quick Summary
AI model with specialized capabilities.
## Training Data Analysis

🔵 Good (6.0/10)

Researched training datasets used by fabric-llm-finetune, with quality assessment.

Specialized for: general, multilingual

### Training Datasets (1)

c4 · 🔵 6/10 · general, multilingual
Key Strengths
- •Scale and Accessibility: 750GB of publicly available, filtered text
- •Systematic Filtering: Documented heuristics enable reproducibility
- •Language Diversity: Despite English-only, captures diverse writing styles
Considerations
- •English-Only: Limits multilingual applications
- •Filtering Limitations: Offensive content and low-quality text remain despite filtering
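C4's documented heuristics are simple line-level rules (for example, keeping lines that end in terminal punctuation). The sketch below imitates that style of filtering; the `clean_lines` helper and its thresholds are illustrative, not C4's actual pipeline.

```bash
# Toy C4-style line filter: keep lines with at least 5 words that end in
# terminal punctuation (illustrative thresholds, not C4's real ones).
clean_lines() {
  awk 'NF >= 5 && /[.!?"]$/'
}

printf '%s\n' \
  "This sentence is long enough and ends properly." \
  "too short." \
  "No terminal punctuation on this long line here" \
  | clean_lines
```

Only the first sample line survives: the second is too short, and the third lacks terminal punctuation.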
## Code Examples
### Step 2: Download Base Model & Adapter

```bash
# Create directories
mkdir -p models adapters

# === CHOOSE ONE MODEL ===
# Option 1: Qwen3-1.7B (recommended for most use cases)
wget https://huggingface.co/Qwen/Qwen3-1.7B-GGUF/resolve/main/qwen3-1_7b-q8_0.gguf -O models/base.gguf
wget https://huggingface.co/qvac/finetune/resolve/main/qwen3-1.7b-qkvo-ffn-lora-adapter.gguf -O adapters/adapter.gguf
```
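A quick way to confirm the downloads are real model files rather than HTML error pages: every GGUF file begins with the 4-byte ASCII magic "GGUF". The `check_gguf` helper below is a sketch, not part of llama.cpp.

```bash
# Sanity-check a download: valid GGUF files start with the ASCII magic "GGUF".
check_gguf() {
  if [ -f "$1" ] && [ "$(head -c 4 "$1")" = "GGUF" ]; then
    echo "$1: looks like a GGUF file"
  else
    echo "$1: missing or not a GGUF file"
  fi
}

check_gguf models/base.gguf
check_gguf adapters/adapter.gguf
```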
### Step 3: Run Inference with Adapter

```bash
# Interactive chat mode
# (ANSI-C $'...' quoting is used so \n becomes a real newline)
./bin/llama-cli \
  -m models/base.gguf \
  --lora adapters/adapter.gguf \
  -ngl 999 \
  -c 2048 \
  --temp 0.7 \
  -p $'Q: Does vitamin D supplementation prevent fractures?\nA:'

# Single prompt mode
./bin/llama-cli \
  -m models/base.gguf \
  --lora adapters/adapter.gguf \
  -ngl 999 \
  -p "Explain the mechanism of action for beta-blockers in treating hypertension."
```
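When embedding a newline before "A:" in a prompt, quoting matters: inside bash double quotes, `\n` stays two literal characters, while ANSI-C `$'...'` quoting produces an actual newline. A minimal demonstration:

```bash
# "\n" in double quotes is a backslash followed by n, not a newline.
literal="Q: question?\nA:"
# $'...' (ANSI-C quoting) expands \n into a real newline character.
real=$'Q: question?\nA:'

printf '%s\n' "$literal"
printf '%s\n' "$real"
```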
### Step 1-2: Same as Option 1

```bash
# Export LoRA adapter to base model format
./bin/llama-export-lora \
  -m models/base.gguf \
  --lora adapters/adapter.gguf \
  -o models/merged.gguf

# Verify merged model
ls -lh models/merged.gguf
```
```bash
# Use merged model directly (no --lora flag needed)
./bin/llama-cli \
  -m models/merged.gguf \
  -ngl 999 \
  -c 2048 \
  -p $'Q: What are the contraindications for aspirin therapy?\nA:'
```
### Custom Temperature & Sampling

```bash
# Flag notes (in bash, a comment cannot follow a trailing backslash,
# so they are collected here instead of inline):
#   --temp 0.3    lower = more focused (good for medical)
#   --top-p 0.9   nucleus sampling
#   --top-k 40    top-k sampling
#   -n 512        max tokens to generate
./bin/llama-cli \
  -m models/base.gguf \
  --lora adapters/adapter.gguf \
  -ngl 999 \
  --temp 0.3 \
  --top-p 0.9 \
  --top-k 40 \
  --repeat-penalty 1.1 \
  -n 512 \
  -p "Your prompt"
```
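Why a lower temperature gives more focused output: temperature divides the logits before softmax, so small T sharpens the distribution toward the top token. The toy logits in this awk sketch are illustrative, not model output.

```bash
# p_i = exp(l_i / T) / sum_j exp(l_j / T), over toy logits 2.0, 1.0, 0.5.
softmax_at() {
  awk -v T="$1" 'BEGIN {
    n = split("2.0 1.0 0.5", l, " ")
    s = 0
    for (i = 1; i <= n; i++) { e[i] = exp(l[i] / T); s += e[i] }
    for (i = 1; i <= n; i++) printf "%.3f ", e[i] / s
    print ""
  }'
}

softmax_at 1.0   # fairly flat distribution
softmax_at 0.3   # top logit dominates
```

At T=0.3 the highest-logit token takes almost all of the probability mass, which is why low temperatures suit factual medical Q&A.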
### Batch Processing

```bash
# Create prompts file
cat > prompts.txt << 'EOF'
Q: Does vitamin D supplementation prevent fractures?
Q: Is aspirin effective for primary prevention of cardiovascular disease?
Q: Do statins reduce mortality in patients with heart failure?
EOF

# Process all prompts ($'\nA:' appends a real newline plus "A:")
while IFS= read -r prompt; do
  echo "=== Processing: $prompt ==="
  ./bin/llama-cli \
    -m models/base.gguf \
    --lora adapters/adapter.gguf \
    -ngl 999 \
    --temp 0.4 \
    -p "${prompt}"$'\nA:'
  echo ""
done < prompts.txt
```
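The batch loop prints everything to stdout; a common variant collects each answer into one file instead. In this runnable sketch the llama-cli call is stubbed out as a comment, and `demo-prompts.txt` / `answers.txt` are illustrative names, not part of the original workflow.

```bash
# Variant: collect all answers into answers.txt (inference call stubbed out
# so the sketch runs anywhere).
printf '%s\n' "Q: one?" "Q: two?" > demo-prompts.txt

: > answers.txt
while IFS= read -r prompt; do
  echo "=== $prompt ===" >> answers.txt
  # Real use, appending model output:
  # ./bin/llama-cli -m models/base.gguf --lora adapters/adapter.gguf \
  #   -ngl 999 --temp 0.4 -p "${prompt}"$'\nA:' >> answers.txt
done < demo-prompts.txt

cat answers.txt
```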
### Mobile-Specific Flags

```bash
# Flag notes (comments cannot follow a trailing backslash in bash):
#   -ngl 99    partial GPU offload
#   -c 512     smaller context
#   -b 128     smaller batch
#   -fa off    disable flash attention (Vulkan)
#   -ub 128    uniform batch size
./bin/llama-cli \
  -m model.gguf \
  --lora adapter.gguf \
  -ngl 99 \
  -c 512 \
  -b 128 \
  -fa off \
  -ub 128
```
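Why a smaller context (`-c 512`) helps on mobile: KV-cache memory grows linearly with context length, roughly 2 (K and V) x layers x context x KV heads x head dim x bytes per element. The config numbers in this back-of-the-envelope sketch are illustrative assumptions, not Qwen3-1.7B's real hyperparameters.

```bash
# Rough KV-cache estimate at ctx=512 with an f16 cache (2 bytes/element).
kv_mib=$(awk 'BEGIN {
  layers = 28; ctx = 512; kv_heads = 8; head_dim = 128; bytes = 2
  printf "%.0f", 2 * layers * ctx * kv_heads * head_dim * bytes / (1024 * 1024)
}')
echo "approx KV cache at ctx=512: ${kv_mib} MiB"
```

Doubling the context doubles this figure, so halving `-c` is often the cheapest memory win on constrained devices.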
## 🔍 Troubleshooting

```bash
# Use smaller batch size and disable flash attention
./bin/llama-cli -m model.gguf --lora adapter.gguf -ngl 99 -c 512 -b 128 -ub 128 -fa off
```
```bash
# Reduce context size or use smaller model
./bin/llama-cli -m model.gguf --lora adapter.gguf -ngl 50 -c 512
```
```bash
# Offload fewer layers to GPU
./bin/llama-cli -m model.gguf --lora adapter.gguf -ngl 20
```
```bash
# Verify adapter file exists and matches model architecture
ls -lh adapters/
./bin/llama-cli -m model.gguf --lora adapter.gguf --verbose
```

## Deploy This Model
Production-ready deployment in minutes.

### Together.ai

Instant API access to this model. Production-ready inference API. Start free, scale to millions.

### Replicate

One-click model deployment. Run models in the cloud with a simple API. No DevOps required.

Disclosure: We may earn a commission from these partners. This helps keep LLMYourWay free.