ResNet50 Fine-Tuned on Animals-10 Dataset
Note: This project was developed by Group 4 as part of the Youth AI Initiative. It demonstrates how fine-tuning a standard architecture with modern training techniques can reach over 98% test accuracy on Animals-10.
Overview
This project implements an image classification model that identifies 10 animal species with 98.32% test accuracy.
Rather than relying on resource-intensive ensembles, we optimized a single backbone (ResNet50) with training strategies such as OneCycleLR, Label Smoothing, and Mixed Precision Training. The result is a single-model classifier that stays comparatively lightweight.
Objectives
- High Accuracy: Achieve >95% accuracy on the test set (Achieved: 98.32%).
- Robustness: Prevent overfitting using regularization techniques (Label Smoothing, Weight Decay).
- Efficiency: Utilize GPU acceleration (AMP) for faster training.
- Explainability: Analyze errors using Confusion Matrices and Per-Class metrics.
Performance Metrics
The model was evaluated on an independent test set (10% split) and achieved exceptional results across all metrics.
| Metric | Score | Description |
|---|---|---|
| Test Accuracy | 98.32% | Overall correct predictions. |
| F1-Score (Weighted) | 0.98 | Support-weighted average of per-class F1 (the harmonic mean of precision and recall). |
| Precision | 0.98 | Accuracy of positive predictions. |
| Recall | 0.98 | Ability to find all positive instances. |
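As a rough illustration of how the figures in this table can be derived from predictions, here is a minimal pure-Python sketch. It assumes "weighted" means support-weighted averaging over classes (the same convention as scikit-learn's `average="weighted"`). The function name `weighted_metrics` is hypothetical and is not part of the project code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration; labels can be any hashable values.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    total = len(pairs)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in pairs) / total

    precision = recall = f1 = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        p_cls = tp / (tp + fp) if tp + fp else 0.0
        r_cls = tp / (tp + fn) if tp + fn else 0.0
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if p_cls + r_cls else 0.0
        weight = count / total  # classes with more samples count more
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

With this weighting, recall always equals overall accuracy, which is consistent with both values rounding to 0.98 in the table.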
Confusion Matrix & Error Analysis
The confusion matrix below demonstrates the model's robustness. The dark diagonal line indicates near-perfect classification.
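For readers who want to reproduce this kind of analysis, the sketch below builds a confusion matrix from integer labels and lists the most frequent off-diagonal confusions. It is an illustration only: the helper names `confusion_matrix` and `top_confusions` are hypothetical, and it assumes labels are class indices in the same order as the class list used at inference time.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def top_confusions(matrix, class_names, k=3):
    """Return the k largest off-diagonal cells as (count, true, predicted)."""
    pairs = [
        (matrix[i][j], class_names[i], class_names[j])
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j and matrix[i][j] > 0
    ]
    # Sort by count, largest first; ties keep their original order.
    pairs.sort(key=lambda item: -item[0])
    return pairs[:k]
```

A strong diagonal means most counts land in `matrix[i][i]`. The entries returned by `top_confusions` point to the class pairs worth inspecting by hand.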
Per-Class Performance
Every class scores at least 0.96 on precision, recall, and F1. Gatto (Cat) and Mucca (Cow) are the weakest classes, each with an F1 of 0.97.
| Class (IT/EN) | Precision | Recall | F1-Score |
|---|---|---|---|
| Cane (Dog) | 0.99 | 0.98 | 0.99 |
| Cavallo (Horse) | 0.99 | 0.99 | 0.99 |
| Elefante (Elephant) | 0.98 | 0.99 | 0.99 |
| Farfalla (Butterfly) | 0.99 | 0.98 | 0.99 |
| Gallina (Chicken) | 0.97 | 0.98 | 0.98 |
| Gatto (Cat) | 0.96 | 0.97 | 0.97 |
| Mucca (Cow) | 0.97 | 0.96 | 0.97 |
| Pecora (Sheep) | 0.98 | 0.98 | 0.98 |
| Ragno (Spider) | 0.99 | 0.99 | 0.99 |
| Scoiattolo (Squirrel) | 0.98 | 0.97 | 0.98 |
Methodology & Training Techniques
To achieve 98.32% accuracy while maintaining a healthy Bias-Variance Tradeoff, we employed the following advanced techniques:
| Technique | Purpose in this Project |
|---|---|
| Model Architecture | ResNet50 backbone for powerful feature extraction. |
| Optimization | AdamW optimizer, which decouples weight decay from the gradient update for more consistent regularization. |
| Learning Rate Schedule | OneCycleLR policy for faster and more stable convergence. |
| Regularization | Label Smoothing (0.1) to prevent overfitting and overconfidence. |
| Data Augmentation | RandomErasing, ColorJitter, and Rotation to force feature learning. |
| Mixed Precision | Native AMP (fp16) for efficient VRAM usage and speed. |
| Training Strategy | Fine-Tuning (Frozen early layers, trainable Layer 4 + FC). |
Installation & Requirements
To run this model, you need to install the following dependencies. We recommend using a GPU for faster inference.
```bash
pip install torch torchvision torchaudio pillow
```
Usage Code (GPU Supported)
You can use this model directly with PyTorch. The code below automatically detects if you have a GPU (CUDA).
```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# 1. Device Configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Define Architecture
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, 10)

# 3. Load Weights
# Ensure 'best_resnet50_animals.pt' is in your directory
model.load_state_dict(torch.load("best_resnet50_animals.pt", map_location=device))
model = model.to(device)
model.eval()

# 4. Preprocess Image
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

# 5. Predict
img_path = "test_image.jpg"
try:
    img = Image.open(img_path).convert("RGB")
    input_tensor = transform(img).unsqueeze(0).to(device)

    with torch.no_grad():
        output = model(input_tensor)
        probabilities = torch.nn.functional.softmax(output[0], dim=0)
        confidence, pred = torch.max(probabilities, 0)

    classes = ['cane', 'cavallo', 'elefante', 'farfalla', 'gallina',
               'gatto', 'mucca', 'pecora', 'ragno', 'scoiattolo']
    print(f"Prediction: {classes[pred.item()].upper()} ({confidence.item():.2%})")
except FileNotFoundError:
    print("Image not found.")
```
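The usage code reports only the single most likely class. If the runner-up classes are also of interest, a small helper along these lines could be used. `top_k_predictions` is a hypothetical name, not part of the released code, and it assumes the same class order as the list above.

```python
import torch

CLASSES = ['cane', 'cavallo', 'elefante', 'farfalla', 'gallina',
           'gatto', 'mucca', 'pecora', 'ragno', 'scoiattolo']


def top_k_predictions(logits, k=3):
    """Return the k most likely (class_name, probability) pairs for one image.

    `logits` is a 1-D tensor with one raw score per class.
    """
    probabilities = torch.softmax(logits, dim=-1)
    confidences, indices = torch.topk(probabilities, k)
    return [(CLASSES[i], c) for i, c in zip(indices.tolist(), confidences.tolist())]
```

With the variables from the usage code, this would be called as `top_k_predictions(output[0])`.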
Team Members (Group 4)
- [Kuzey KAYA, Maruf Salih ATALA, Umut ÇATAK, Kuzey ÇALIŞKAN, Mücahit YETER, Yusuf BATMACA, Göktüğ...]
Model tree for youth-ai-initiative/Animals10_Classifier_Group_4
Base model: microsoft/resnet-50
Evaluation results
- Accuracy on Animals-10 (self-reported): 98.32%
- F1 on Animals-10 (self-reported): 0.98
