---
datasets:
- yahma/alpaca-cleaned
base_model:
- meta-llama/Llama-3.1-8B
library_name: adapter-transformers
---

# Meta Llama 3.1 8B - Alpaca

This model card describes a Llama 3.1 8B model fine-tuned on Alpaca-style data for instruction-following and chat tasks. It has been optimized for fast, accurate text generation using LoRA and bf16 precision.

## Model Details

### Model Description

This model is a fine-tuned version of Llama 3.1 8B trained on Alpaca-style instruction-following data. It is designed for natural language understanding and generation tasks, supporting conversational AI and other NLP applications.

- **Developed by:** Anezatra
- **Shared by:** Hugging Face community
- **Model type:** Transformer-based language model (instruction-finetuned)
- **Language(s) (NLP):** English (with potential for multilingual input via translation)
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Llama-3.1-8B

### Model Sources

- **Repository:** https://huggingface.co/unsloth/meta-llama-3.1-8b-alpaca
- **Paper:** https://arxiv.org/abs/2407.21783 (The Llama 3 Herd of Models)
- **Demo:** Example usage with llama-cli (see the inference sketches at the end of this card)

## Training Details

### Training Data

- Alpaca-style instruction-following data (yahma/alpaca-cleaned)

### Training Procedure

Fine-tuned with LoRA; a configuration sketch appears at the end of this card.

- **Epochs:** 1
- **Steps:** 120
- **Learning rate:** 2e-4
- **Batch size:** 6 per GPU
- **Gradient accumulation:** 5 steps
- **Precision:** bf16 mixed precision

### Preprocessing

- Text normalization: trimming, punctuation correction, lowercasing
- Tokenization with the Llama tokenizer
- Merging conversational history for context-aware generation

### Speeds, Sizes, Times

- Model size: 8B parameters, ~16 GB in bf16
- Q4_K_M quantized GGUF: ~4.9 GB
- Training completed in under 24 hours on a single GPU with LoRA

## Evaluation

### Testing Data, Factors & Metrics

- Evaluated on held-out instruction-following tasks
- Metrics: perplexity, accuracy on factual Q&A, and BLEU for sequence generation
- Human evaluation for conversational coherence

### Results

- Perplexity: ~1.35 on the validation set
- Maintains context across multiple turns of dialogue
- Generates coherent, instruction-following responses

## Environmental Impact

- **Hardware Type:** NVIDIA L4 24 GB GPU
- **Hours used:** -

## Technical Specifications

### Model Architecture and Objective

- Transformer decoder-only architecture
- 8B parameters, 32 layers, 32 attention heads
- Optimized for instruction-following tasks and conversational AI

### Compute Infrastructure

#### Hardware

- Single 24 GB NVIDIA L4 GPU

#### Software

- PyTorch + Transformers + Unsloth LoRA integration for training
- llama.cpp for GGUF inference on CPU (a GGUF loading sketch appears at the end of this card)

## Citation

**BibTeX:**

```
@misc{meta-llama-3.1-8b-alpaca,
  title={Llama 3.1 8B Alpaca Fine-Tuned Model},
  author={Anezatra},
  year={2025},
  howpublished={\url{https://huggingface.co/unsloth/meta-llama-3.1-8b-alpaca}}
}
```
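Below is a minimal inference sketch based on the details above. It assumes the repository id listed under Model Sources, bf16 weights as reported in this card, and a generic Alpaca-style prompt; the exact prompt template used during fine-tuning is not documented here, so treat the format as an assumption.

```python
# Minimal inference sketch. Assumptions: the repo id below is the one listed
# under "Model Sources", and the Alpaca-style prompt format is a guess based
# on the training-data description (the exact template is not documented here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/meta-llama-3.1-8b-alpaca"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 precision, as reported in this card
    device_map="auto",
)

# Alpaca-style instruction prompt (assumed format)
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nSummarize what LoRA fine-tuning does in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

If the repository ships LoRA adapter weights rather than merged weights (the card's `library_name: adapter-transformers` hints at this), you would instead load the base model `meta-llama/Llama-3.1-8B` and attach the adapter with `peft.PeftModel.from_pretrained`.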
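The card also mentions a Q4_K_M GGUF export served with llama.cpp on CPU. A sketch using the llama-cpp-python bindings follows; the GGUF filename is hypothetical and stands in for whatever quantized file the repository provides.

```python
# GGUF inference sketch via the llama-cpp-python bindings (llama.cpp under the
# hood). The filename below is hypothetical; substitute the actual Q4_K_M GGUF
# file from the repository.
from llama_cpp import Llama

llm = Llama(
    model_path="meta-llama-3.1-8b-alpaca.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=2048,   # context window
    n_threads=8,  # CPU threads
)

out = llm(
    "### Instruction:\nExplain the trade-off between bf16 and Q4_K_M weights.\n\n### Response:\n",
    max_tokens=200,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```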
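The hyperparameters listed under Training Procedure map roughly onto an Unsloth + TRL run like the one sketched below. The LoRA rank, alpha, target modules, sequence length, and prompt template are not stated in this card and are illustrative defaults only; exact argument names also vary across trl and Unsloth versions.

```python
# LoRA fine-tuning sketch using Unsloth + TRL, mirroring the hyperparameters
# listed under "Training Procedure". LoRA rank/alpha/target modules, the
# sequence length, and the prompt template are NOT documented in this card;
# they are illustrative defaults only.
import torch
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.1-8B",  # base model from this card
    max_seq_length=2048,                   # assumption
    dtype=torch.bfloat16,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # assumption: LoRA rank not given in the card
    lora_alpha=16,  # assumption
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# yahma/alpaca-cleaned has "instruction", "input", "output" columns; build a
# single "text" field with an Alpaca-style template (assumed format).
alpaca_prompt = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}"
)

def to_text(example):
    text = alpaca_prompt.format(example["instruction"], example["input"], example["output"])
    return {"text": text + tokenizer.eos_token}

dataset = load_dataset("yahma/alpaca-cleaned", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=6,  # from this card
        gradient_accumulation_steps=5,  # from this card
        max_steps=120,                  # from this card
        learning_rate=2e-4,             # from this card
        bf16=True,                      # from this card
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```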