Meta Llama 3.1 8B - Alpaca
This model card describes the Llama 3.1 8B Alpaca model, fine-tuned for instruction-following and chat tasks. It has been optimized for fast and accurate text generation using LoRA fine-tuning and bf16 precision.
Model Details
Model Description
This model is a fine-tuned version of Llama 3.1 8B trained on Alpaca-style instruction-following data. It is designed for natural language understanding and generation tasks, supporting conversational AI and other NLP applications.
- Developed by: Anezatra
- Shared by: Hugging Face Community
- Model type: Transformer-based Language Model (Instruction-Finetuned)
- Language(s) (NLP): English (with potential for multilingual input via translation)
- License: Apache 2.0
- Finetuned from model: Llama 3.1 8B
Model Sources
- Repository: https://huggingface.co/unsloth/meta-llama-3.1-8b-alpaca
- Paper: https://arxiv.org/abs/2407.21783 (The Llama 3 Herd of Models)
- Demo: example usage with llama-cli (llama.cpp); a Python inference sketch follows below
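The card points to llama-cli for demos; a minimal Python sketch using the llama-cpp-python bindings (the same llama.cpp backend) is shown below. The GGUF filename, sampling settings, and prompt are assumptions, not taken from the repo.

```python
# A minimal inference sketch, assuming llama-cpp-python and a locally
# downloaded Q4_K_M GGUF. The filename below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-8B-alpaca.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=4096,   # context window for the session
    n_threads=8,  # CPU threads; tune to your machine
)

# Alpaca-tuned checkpoints are usually prompted with the Alpaca template.
prompt = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
)
out = llm(prompt, max_tokens=128, temperature=0.7, stop=["### Instruction:"])
print(out["choices"][0]["text"])
```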
Training Details
Training Data
- Alpaca-style instruction-following datasets (cleaned)
Training Procedure
- Fine-tuned using LoRA (a configuration sketch follows this list)
- Epochs: 1
- Steps: 120
- Learning rate: 2e-4
- Batch size: 6 per GPU
- Gradient accumulation: 5 steps
- Precision: bf16 mixed precision
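For concreteness, here is a hedged sketch of how the hyperparameters above might map onto an Unsloth + TRL fine-tuning run. The LoRA rank, alpha, target modules, sequence length, and dataset name are assumptions (the card does not state them), and the SFTTrainer signature varies across TRL versions.

```python
# A training sketch, not the authors' script. Values marked "assumed" are
# not stated in the card; the rest mirror the hyperparameters listed above.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{output}"
)

def to_text(example):
    # Flatten instruction/output pairs into a single training string.
    return {"text": ALPACA_TEMPLATE.format(**example)}

# Assumed dataset: a cleaned Alpaca-style corpus.
dataset = load_dataset("yahma/alpaca-cleaned", split="train").map(to_text)

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.1-8B",
    max_seq_length=2048,   # assumed
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # assumed LoRA rank
    lora_alpha=16,         # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=6,   # per the card
        gradient_accumulation_steps=5,   # effective batch size of 30
        max_steps=120,                   # takes precedence over epochs
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```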
Preprocessing
- Text normalization: trimming, punctuation correction, lowercasing
- Tokenization using LLaMA tokenizer
- Merging conversational history for context-aware generation (a preprocessing sketch follows this list)
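A small sketch of the preprocessing steps listed above. The exact normalization rules and the role labels are assumptions, since the card does not publish its pipeline.

```python
import re

def normalize(text: str) -> str:
    # Trim, collapse whitespace, remove stray spaces before punctuation,
    # and lowercase (lowercasing is listed in the card, though it is
    # uncommon for Llama-family pipelines).
    text = re.sub(r"\s+", " ", text.strip())
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    return text.lower()

def merge_history(turns: list[dict[str, str]]) -> str:
    # Merge prior turns into one context string for context-aware generation.
    # The "role: content" layout is a hypothetical convention.
    return "\n".join(f"{t['role']}: {normalize(t['content'])}" for t in turns)

history = [
    {"role": "user", "content": "  What is LoRA ?"},
    {"role": "assistant", "content": "A parameter-efficient fine-tuning method."},
]
print(merge_history(history))
```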
Speeds, Sizes, Times
- Model size: ~16 GB in bf16 (8B parameters; see the arithmetic check after this list)
- Q4_K_M quantized GGUF: ~4.9 GB
- Training completed in under 24 hours on a single GPU with LoRA
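As a sanity check on the sizes above: bf16 stores 2 bytes per weight, and Q4_K_M averages roughly 4.85 bits per weight (an approximation; the exact figure depends on the tensor mix).

```python
# Back-of-the-envelope size check; the bits-per-weight figure is approximate.
params = 8.03e9                                     # Llama 3.1 8B, approx.
print(f"bf16:   {params * 2 / 1e9:.1f} GB")         # ~16.1 GB at 2 bytes/weight
print(f"Q4_K_M: {params * 4.85 / 8 / 1e9:.1f} GB")  # ~4.9 GB at ~4.85 bits/weight
```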
Evaluation
Testing Data, Factors & Metrics
- Evaluated on held-out instruction-following tasks
- Metrics: perplexity, accuracy on factual Q&A, and BLEU for sequence generation (a perplexity sketch follows this list)
- Human evaluation for conversational coherence
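Perplexity here is the standard exponential of the mean token-level negative log-likelihood. A minimal sketch of how it can be computed with transformers follows; the card does not publish its evaluation script, so the model ID and text are placeholders.

```python
# Perplexity = exp(mean next-token negative log-likelihood). Sketch only.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B"  # placeholder; gated on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean NLL.
        loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

print(perplexity("The quick brown fox jumps over the lazy dog."))
```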
Results
- Perplexity: ~1.35 on the validation set
- Maintains context across multiple turns in dialogue
- Generates coherent and instruction-following responses
Environmental Impact
- Hardware Type: NVIDIA L4 24GB GPU
- Hours used: -
Technical Specifications
Model Architecture and Objective
- Transformer decoder-only architecture
- 8B parameters, 32 layers, 32 attention heads (see the configuration check after this list)
- Optimized for instruction-following tasks and conversational AI
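These figures can be checked against the base model's transformers config; the field names below follow the LlamaConfig schema (the base repo is gated, so access requires accepting the license).

```python
# Verify the architecture figures against the base model's config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B")
print(cfg.num_hidden_layers)    # 32 decoder layers
print(cfg.num_attention_heads)  # 32 attention heads
print(cfg.num_key_value_heads)  # 8 KV heads (grouped-query attention)
print(cfg.hidden_size)          # 4096
```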
Compute Infrastructure
Hardware
- Single 24GB L4 GPU
Software
- PyTorch + Transformers + Unsloth LoRA integration
- llama.cpp for GGUF inference on CPU
Citation
BibTeX:
@misc{meta-llama-3.1-8b-alpaca,
  title={Llama 3.1 8B Alpaca Fine-Tuned Model},
  author={Anezatra},
  year={2025},
  howpublished={\url{https://huggingface.co/unsloth/meta-llama-3.1-8b-alpaca}}
}
Model tree for anezatra/Llama-3.1-8B-alpaca-GGUF
- Base model: meta-llama/Llama-3.1-8B