🦁 ResNet50 Fine-Tuned on Animals-10 Dataset

Note: This project was developed by Group 4 as part of the Youth AI Initiative. It demonstrates how to reach high accuracy (>98% on the test set) by applying modern fine-tuning techniques to a standard architecture.

📝 Overview

This project implements an image classification model that identifies 10 different animal species with 98.32% accuracy on the test set.

Rather than building a resource-heavy ensemble, we focused on optimizing a single robust backbone (ResNet50) with training strategies such as OneCycleLR, label smoothing, and mixed precision training. The result is a comparatively lightweight model that outperforms standard baselines.

🎯 Objectives

High Accuracy: Achieve >95% accuracy on the test set (Achieved: 98.32%).
Robustness: Prevent overfitting using regularization techniques (Label Smoothing, Weight Decay).
Efficiency: Use automatic mixed precision (AMP) on the GPU for faster training.
Explainability: Analyze errors using Confusion Matrices and Per-Class metrics.

🏆 Performance Metrics

The model was evaluated on an independent test set (10% split) and achieved exceptional results across all metrics.

Metric	Score	Description
Test Accuracy	98.32%	Overall correct predictions.
F1-Score (Weighted)	0.98	Harmonic mean of precision and recall.
Precision	0.98	Accuracy of positive predictions.
Recall	0.98	Ability to find all positive instances.
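
The scores above can be recomputed from raw labels and predictions. The sketch below is a minimal, dependency-free illustration and not the project's evaluation script: the function name `weighted_metrics` and the toy labels are our own assumptions, and it assumes precision and recall are averaged the same way as the weighted F1. Each class's score is weighted by its share of the true samples, which is why weighted recall always equals overall accuracy.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / support[label] if support[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / total   # class share of the test set
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not the project's test data).
y_true = ["cane", "cane", "gatto", "gatto", "mucca", "mucca"]
y_pred = ["cane", "gatto", "gatto", "gatto", "mucca", "cane"]
scores = weighted_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

With real predictions, `y_true` and `y_pred` would hold one class name per test image.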

📊 Confusion Matrix & Error Analysis

The confusion matrix below demonstrates the model's robustness. The dark diagonal line indicates near-perfect classification.
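
For readers who want to inspect the errors numerically, a confusion matrix can be built from the true and predicted labels, and its off-diagonal cells show which class pairs get mixed up most often. This is a minimal sketch with hypothetical helper names (`confusion_matrix`, `top_confusions`) and made-up toy labels, not the project's actual evaluation data.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def top_confusions(matrix, classes, k=3):
    """Return the k largest off-diagonal cells as (true, predicted, count)."""
    pairs = [
        (matrix[i][j], classes[i], classes[j])
        for i in range(len(classes))
        for j in range(len(classes))
        if i != j and matrix[i][j] > 0
    ]
    pairs.sort(key=lambda item: -item[0])  # largest error counts first
    return [(true, pred, count) for count, true, pred in pairs[:k]]


# Toy labels for illustration only.
classes = ["cane", "gatto", "mucca"]
y_true = ["cane", "cane", "gatto", "gatto", "mucca", "mucca", "mucca"]
y_pred = ["cane", "gatto", "gatto", "gatto", "cane", "cane", "mucca"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>6}: {row}")
print(top_confusions(matrix, classes))
```

A large value on the diagonal means the class is usually predicted correctly; a large off-diagonal value points at a specific pair worth reviewing.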

📈 Per-Class Performance

Every class reaches at least 0.96 on precision, recall, and F1. The lowest scores are on Gatto (Cat) and Mucca (Cow).

Class (IT/EN)	Precision	Recall	F1-Score
Cane (Dog)	0.99	0.98	0.99
Cavallo (Horse)	0.99	0.99	0.99
Elefante (Elephant)	0.98	0.99	0.99
Farfalla (Butterfly)	0.99	0.98	0.99
Gallina (Chicken)	0.97	0.98	0.98
Gatto (Cat)	0.96	0.97	0.97
Mucca (Cow)	0.97	0.96	0.97
Pecora (Sheep)	0.98	0.98	0.98
Ragno (Spider)	0.99	0.99	0.99
Scoiattolo (Squirrel)	0.98	0.97	0.98

⚙️ Methodology & Training Techniques

To reach 98.32% accuracy while keeping a healthy bias-variance tradeoff, we used the following techniques:

Technique	Purpose in this Project
Model Architecture	ResNet50 backbone for powerful feature extraction.
Optimization	AdamW optimizer, which decouples weight decay from the gradient update for cleaner regularization.
Learning Rate Schedule	OneCycleLR policy for faster and more stable convergence.
Regularization	Label Smoothing (0.1) to prevent overfitting and overconfidence.
Data Augmentation	`RandomErasing`, `ColorJitter`, and `Rotation` to force feature learning.
Mixed Precision	Native AMP (fp16) for efficient VRAM usage and speed.
Training Strategy	Fine-Tuning (Frozen early layers, trainable Layer 4 + FC).

🛠️ Installation & Requirements

To run this model, you need to install the following dependencies. We recommend using a GPU for faster inference.

pip install torch torchvision pillow

💻 Usage Code (GPU Supported)

You can use this model directly with PyTorch. The code below checks whether a CUDA GPU is available and falls back to the CPU otherwise.

import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# 1. Device Configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Define Architecture
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, 10)

# 3. Load Weights
# Ensure 'best_resnet50_animals.pt' is in your directory
model.load_state_dict(torch.load("best_resnet50_animals.pt", map_location=device))
model = model.to(device)
model.eval()

# 4. Preprocess Image
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

# 5. Predict
img_path = "test_image.jpg" 
try:
    img = Image.open(img_path).convert("RGB")
    input_tensor = transform(img).unsqueeze(0).to(device)

    with torch.no_grad():
        output = model(input_tensor)
        probabilities = torch.nn.functional.softmax(output[0], dim=0)
        confidence, pred = torch.max(probabilities, 0)

    classes = ['cane', 'cavallo', 'elefante', 'farfalla', 'gallina', 
               'gatto', 'mucca', 'pecora', 'ragno', 'scoiattolo']
               
    print(f"Prediction: {classes[pred.item()].upper()} ({confidence.item():.2%})")

except FileNotFoundError:
    print("Image not found.")
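
When the top prediction is uncertain, it can help to look at the runner-up classes as well. The sketch below is an optional extension, not part of the original usage code: the helper `top_k_predictions` is hypothetical, and the English names are taken from the per-class table. It expects the raw scores for one image, such as `output[0]` from the snippet above.

```python
import torch

# Italian class names in the same order as the usage code, mapped to English.
CLASS_NAMES_EN = {
    "cane": "dog", "cavallo": "horse", "elefante": "elephant",
    "farfalla": "butterfly", "gallina": "chicken", "gatto": "cat",
    "mucca": "cow", "pecora": "sheep", "ragno": "spider",
    "scoiattolo": "squirrel",
}
CLASSES = list(CLASS_NAMES_EN)  # dicts preserve insertion order


def top_k_predictions(logits, k=3):
    """Return the k most likely (english_name, probability) pairs.

    `logits` is a 1-D tensor with one raw score per class.
    """
    probabilities = torch.softmax(logits, dim=0)
    confidences, indices = torch.topk(probabilities, k)
    return [
        (CLASS_NAMES_EN[CLASSES[i]], c)
        for c, i in zip(confidences.tolist(), indices.tolist())
    ]


# Made-up scores for illustration; in practice pass the model output.
example_logits = torch.tensor([0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 3.0, 0.0])
for name, confidence in top_k_predictions(example_logits):
    print(f"{name}: {confidence:.2%}")
```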

👥 Team Members (Group 4)

[Kuzey KAYA, Maruf Salih ATALA, Umut ÇATAK, Kuzey ÇALIŞKAN, Mücahit YETER, Yusuf BATMACA, Göktüğ...]
