Model Card: nllb-200-distilled-600M_236K

This model is a domain-adapted version of facebook/nllb-200-distilled-600M, fine-tuned on 236k English–French sentence pairs from the bioinformatics and biomedical domains.
It is designed for English → French Machine Translation.

โœ๏ธ Model Details

Model Description

  • Developed by: Jurgi Giraud
  • Model type: Multilingual language model
  • Language(s) (NLP): English to French
  • License: CC-BY-NC-4.0
  • Finetuned from model: facebook/nllb-200-distilled-600M

This model was fine-tuned as part of a PhD research project investigating domain adaptation for Machine Translation (MT) in a low-resource scenario within the bioinformatics domain (English → French). The project explores the performance of compact MT models and Large Language Models (LLMs), including architectures under 1B parameters as well as models in the 3B–8B range, with a strong emphasis on resource-efficient fine-tuning strategies. For the larger models, the fine-tuning process made use of Parameter-Efficient Fine-Tuning (PEFT) and quantization, in particular QLoRA (Quantized Low-Rank Adaptation) (Dettmers et al., 2023).

In total, 5 models were fine-tuned on in-domain data: t5_236k | nllb-200-distilled-600M_236K (👈 current model) | madlad400-3b-mt_236k | TowerInstruct-7B-v0.2_236k | and Llama-3.1-8B-Instruct_236K
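
For reference, the QLoRA recipe mentioned above for the larger models (3B–8B) typically combines 4-bit quantization (bitsandbytes) with LoRA adapters (peft). The sketch below only illustrates that general setup: the base model choice, rank, alpha, dropout, and target modules are assumptions rather than the configurations used in the thesis, and this 600M checkpoint itself was fine-tuned with the Seq2SeqTrainer setup described under Fine-tuning Details.

import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters; the values here are illustrative only
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # one of the larger models listed above
    quantization_config=bnb_config,
    device_map="auto",
)
model = get_peft_model(base, lora_config)  # only the adapter weights are trainable
model.print_trainable_parameters()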

🚀 Usage

This model is intended to be used for English → French Machine Translation in the bioinformatics domain.

Example (GPU)

Below is a basic GPU usage example with Hugging Face's Transformers library.

First, install the dependencies:

pip install torch transformers

Then run:

from transformers import pipeline

# Load the translation pipeline onto the first GPU (device=0)
translator = pipeline(
    task="translation_en_to_fr",
    model="jurgiraud/nllb-200-distilled-600M_236K",
    device=0,
)

output = translator("The deletion of a gene may result in death or in a block of cell division.")
print(output[0]["translation_text"])
# La suppression d'un gène peut entraîner la mort ou un blocage de la division cellulaire.
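
Since NLLB-200 tokenizers identify languages with FLORES-200 codes (eng_Latn for English, fra_Latn for French), the model can also be loaded explicitly and the French target token forced at generation time. The following is a minimal alternative sketch, not taken from the original card; it assumes a CUDA GPU (drop .to("cuda") to run on CPU).

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "jurgiraud/nllb-200-distilled-600M_236K"

# NLLB tokenizers use FLORES-200 language codes for source and target
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to("cuda")

text = "The deletion of a gene may result in death or in a block of cell division."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Force the decoder to start with the French language token
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_new_tokens=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])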

🔧 Fine-tuning Details

Fine-tuning Data

The model was fine-tuned on a set of 236k English–French parallel examples consisting of:

  • Natural parallel data (bioinformatics and biomedical data)
  • Synthetic data, including:
    • Back-translation of in-domain monolingual texts
    • Paraphrased data
    • Terminology-constrained synthetic generation

Fine-tuning dataset available 👉 here.
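
As an illustration of how the back-translated portion of such synthetic data can be produced, the sketch below runs in-domain French monolingual sentences through a French → English system and pairs the machine-translated English with the original French. The choice of back-translation model (the base NLLB checkpoint here) and the example sentences are assumptions; the card does not specify which system was actually used.

from transformers import pipeline

# Hypothetical back-translation system: the base NLLB model, run French -> English
backtranslator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="fra_Latn",
    tgt_lang="eng_Latn",
    device=0,
)

# In-domain French monolingual sentences (placeholder examples)
fr_monolingual = [
    "L'alignement de séquences permet d'identifier des régions homologues.",
    "Le séquençage à haut débit génère de grands volumes de données.",
]

# Each synthetic pair uses the machine-translated English as source and the original French as target
synthetic_pairs = [
    {"en": out["translation_text"], "fr": fr}
    for fr, out in zip(fr_monolingual, backtranslator(fr_monolingual))
]
print(synthetic_pairs[0])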

Fine-tuning Procedure

The model was fine-tuned using the Hugging Face transformers Seq2SeqTrainer.

Fine-tuning Hyperparameters

Key hyperparameters and training setup:

  • Approach: Seq2SeqTrainer
  • Training: 8 epochs, learning rate = 2e-5, batch size = 16 (per device)
  • Precision: bfloat16 (bf16)
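
A minimal Seq2SeqTrainer sketch matching the hyperparameters above is shown below. The tiny in-memory stand-in for the 236k-pair dataset, the preprocessing, the maximum sequence length, and the output directory are illustrative assumptions rather than settings reported in this card.

from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn", tgt_lang="fra_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Tiny in-memory stand-in for the 236k-pair fine-tuning set (placeholder sentences)
raw = Dataset.from_dict({
    "en": ["The deletion of a gene may result in death or in a block of cell division."],
    "fr": ["La suppression d'un gène peut entraîner la mort ou un blocage de la division cellulaire."],
})

def preprocess(batch):
    # Tokenize the English source and French target sides together
    return tokenizer(batch["en"], text_target=batch["fr"], truncation=True, max_length=256)

tokenized = raw.map(preprocess, batched=True, remove_columns=["en", "fr"])

# Hyperparameters reported above; the remaining arguments are illustrative
args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-distilled-600M_236K",
    num_train_epochs=8,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    bf16=True,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()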

📊 Evaluation

The model was evaluated on an in-domain bioinformatics test set using standard MT metrics.

Testing Data & Metrics

Testing Data

Test set available 👉 here.

Metrics

  • BLEU
  • chrF++ (chrF2)
  • TER
  • COMET
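
BLEU, chrF++ and TER can be computed with sacrebleu, and COMET with the unbabel-comet package. Below is a minimal sketch assuming the test-set sources, system outputs, and references are already loaded as parallel lists; the COMET checkpoint (wmt22-comet-da) and the placeholder sentences are assumptions, as the card does not state which implementations or checkpoints were used.

from sacrebleu.metrics import BLEU, CHRF, TER
from comet import download_model, load_from_checkpoint

# Placeholder data; in practice these come from the in-domain test set
sources    = ["The deletion of a gene may result in death."]
hypotheses = ["La suppression d'un gène peut entraîner la mort."]
references = ["La suppression d'un gène peut entraîner la mort."]

# Corpus-level BLEU, chrF++ (chrF with word_order=2), and TER
print(BLEU().corpus_score(hypotheses, [references]))
print(CHRF(word_order=2).corpus_score(hypotheses, [references]))
print(TER().corpus_score(hypotheses, [references]))

# Reference-based COMET; set gpus=0 to run on CPU
comet_model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
comet_data = [{"src": s, "mt": h, "ref": r} for s, h, r in zip(sources, hypotheses, references)]
print(comet_model.predict(comet_data, batch_size=8, gpus=1).system_score)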

Results

Results from automated metrics, comparing the baseline model with the domain-adapted model. The domain-adapted model obtains the best score on every metric.

Models                                              BLEU↑    chrF2↑   TER↓     COMET↑
Baseline model         Llama-3.2-1B-Instruct        41.00    68.48    49.26    84.54
Domain-adapted model   Llama-3.2-1B-Instruct_236K   44.77    71.64    46.24    85.84

🌱 Environmental Impact

The fine-tuning carbon footprint was estimated using the Green Algorithms framework (Lannelongue et al., 2021).

  • Carbon emissions: 960.23 gCO₂e
  • Energy consumption: 4.15 kWh
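
For reference, the two figures above imply a carbon intensity of roughly 231 gCO₂e per kWh; this value is derived from the reported numbers and is not stated in the card.

# Implied carbon intensity from the reported estimates
energy_kwh = 4.15      # reported energy consumption (kWh)
emissions_g = 960.23   # reported carbon footprint (gCO2e)
print(emissions_g / energy_kwh)  # ≈ 231.4 gCO2e per kWh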

📚 Citation

BibTeX:

@phdthesis{giraud2025bioinformaticsMT,
  title        = {Developing Machine Translation for Bioinformatics: An Exploration into Domain-Specific Terminology, Domain-Adaptation, and Evaluation},
  author       = {Giraud, Jurgi},
  school       = {The Open University},
  year         = {2025},
  note         = {Forthcoming. Expected publication date: December 2025.},
}