
Meta Llama 3.1 8B - Alpaca

This model card describes the Meta Llama 3.1 8B Alpaca model, fine-tuned for instruction-following and chat tasks. It was optimized for fast, accurate text generation using LoRA adapters trained in bf16 precision.

Model Details

Model Description

This model is a fine-tuned version of Meta Llama 3.1 8B trained on Alpaca-style instruction-following data. It is designed for natural language understanding and generation tasks, supporting conversational AI and other NLP applications.

  • Developed by: Anezatra
  • Shared by: HuggingFace Community
  • Model type: Transformer-based Language Model (Instruction-Finetuned)
  • Language(s) (NLP): English (with potential for multilingual input via translation)
  • License: Apache 2.0
  • Finetuned from model: Meta Llama 3.1 8B

Training Details

Training Data

  • Instruction-following datasets in the cleaned Alpaca format
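Alpaca-style records pair an instruction (optionally with an input) with a target response, flattened into a single training prompt. A minimal sketch of that formatting, following the original Stanford Alpaca template (the helper name is illustrative):

```python
def format_alpaca(instruction: str, response: str, input_text: str = "") -> str:
    """Render one Alpaca-style record as a single training prompt.

    The template text mirrors the original Stanford Alpaca format;
    records without an `input` field use the shorter variant.
    """
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            f"### Response:\n{response}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )
```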

Training Procedure

  • Fine-tuned using LoRA
  • Epochs: 1
  • Steps: 120
  • Learning rate: 2e-4
  • Batch size: 6 per GPU
  • Gradient accumulation: 5 steps
  • Precision: bf16 mixed precision
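The hyperparameters above imply an effective batch size of 6 × 5 = 30 sequences per optimizer step, so the 120 steps cover roughly 3,600 training examples. A quick sketch of that arithmetic (variable names are illustrative):

```python
per_device_batch = 6   # batch size per GPU
grad_accum_steps = 5   # gradient accumulation steps
max_steps = 120        # optimizer steps

# Each optimizer step consumes per_device_batch * grad_accum_steps sequences
# (single-GPU run, so there is no data-parallel multiplier).
effective_batch = per_device_batch * grad_accum_steps
examples_seen = effective_batch * max_steps

print(effective_batch)  # 30
print(examples_seen)    # 3600
```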

Preprocessing

  • Text normalization: trimming, punctuation correction, lowercasing
  • Tokenization using LLaMA tokenizer
  • Merging conversational history for context-aware generation
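A minimal sketch of the normalization and history-merging steps above, in plain Python. The exact rules used during training are not published, so this only illustrates the general idea:

```python
import re

def normalize(text: str) -> str:
    """Trim whitespace, collapse repeated punctuation, and lowercase."""
    text = text.strip()
    text = re.sub(r"\s+", " ", text)            # collapse internal whitespace
    text = re.sub(r"([!?.,])\1+", r"\1", text)  # "!!!" -> "!"
    return text.lower()

def merge_history(turns: list[tuple[str, str]]) -> str:
    """Join (role, utterance) pairs into one context string for generation."""
    return "\n".join(f"{role}: {normalize(msg)}" for role, msg in turns)
```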

Speeds, Sizes, Times

  • Model size: 8B parameters, ~16 GB in bf16
  • Q4_K_M quantized GGUF: ~4.9 GB
  • Training completed in under 24 hours on a single GPU with LoRA
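These sizes follow directly from bits per parameter: bf16 stores 2 bytes per weight, while Q4_K_M averages roughly 4.85 bits per weight. A quick check of that arithmetic (parameter count rounded to 8.0B; the GGUF file also carries metadata and keeps some tensors at higher precision, which accounts for the small gap to the reported ~4.9 GB):

```python
params = 8.0e9

bf16_gb = params * 2 / 1e9          # 2 bytes per parameter -> ~16 GB
q4_km_gb = params * 4.85 / 8 / 1e9  # ~4.85 bits per parameter on average

print(round(bf16_gb, 1))  # 16.0
print(q4_km_gb)           # close to the reported ~4.9 GB
```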

Evaluation

Testing Data, Factors & Metrics

  • Evaluated on held-out instruction-following tasks
  • Metrics: Perplexity, accuracy on factual Q&A, and BLEU for sequence generation
  • Human evaluation for conversational coherence
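Perplexity is the exponential of the average negative log-likelihood the model assigns to held-out tokens; a perplexity of ~1.35 corresponds to an average per-token probability of about 1/1.35 ≈ 0.74. A minimal sketch:

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """exp of the mean negative log-probability over the evaluated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# If every token were assigned probability 0.74, perplexity would be 1/0.74:
print(perplexity([math.log(0.74)] * 4))
```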

Results

  • Perplexity: ~1.35 on the validation set
  • Maintains context across multiple turns in dialogue
  • Generates coherent and instruction-following responses

Environmental Impact

  • Hardware Type: NVIDIA L4 24GB GPU
  • Hours used: -

Technical Specifications

Model Architecture and Objective

  • Transformer decoder-only architecture
  • 8B parameters, 32 layers, 32 attention heads
  • Optimized for instruction-following tasks and conversational AI

Compute Infrastructure

Hardware

  • Single 24GB L4 GPU

Software

  • PyTorch + Transformers + Unsloth LoRA integration
  • llama.cpp for GGUF inference on CPU
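As a usage sketch, CPU inference on the Q4_K_M GGUF with llama.cpp's CLI might look like the following (the model file name and sampling settings are illustrative; the binary is `llama-cli` in current llama.cpp builds, `main` in older ones):

```shell
# CPU inference with llama.cpp on the quantized GGUF
# (model file name is hypothetical; adjust to the downloaded file)
./llama-cli \
  -m Llama-3.1-8B-alpaca.Q4_K_M.gguf \
  -e \
  -p "### Instruction:\nSummarize the plot of Hamlet.\n\n### Response:\n" \
  -n 256 --temp 0.7
```

The `-e` flag makes llama.cpp interpret the `\n` escapes in the prompt, so the Alpaca template sections land on separate lines as during training.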

Citation

BibTeX:

@misc{meta-llama-3.1-8b-alpaca,
  title={Llama 3.1 8B Alpaca Fine-Tuned Model},
  author={Anezatra},
  year={2025},
  howpublished={\url{https://huggingface.co/unsloth/meta-llama-3.1-8b-alpaca}}
}