92750bf2de129aeb373ebbbb3b09c01e

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [en-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0092
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 238.5313
  • Bleu: 13.4589
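A minimal inference sketch, assuming the checkpoint is published under the hub id shown in this card and that `transformers` and `sentencepiece` are installed. Note that the expected input format (e.g. whether a T5-style task prefix such as "translate English to Russian: " is required) depends on how the fine-tuning preprocessed the data, which this card does not document:

```python
from transformers import pipeline

# Hub id taken from this model card; long-t5 checkpoints load as seq2seq models.
translator = pipeline(
    "translation",
    model="contemmcm/92750bf2de129aeb373ebbbb3b09c01e",
)

# A task prefix may or may not be needed, depending on the training preprocessing.
result = translator("The book lay open on the table.")
print(result[0]["translation_text"])
```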

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
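The total batch sizes listed above follow directly from the per-device sizes and the device count; a quick sanity check:

```python
# Effective (total) batch size = per-device batch size * number of devices.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 32
print(total_eval_batch_size)   # 32
```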

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 2.8029          | 0         | 18.0919       | 0.1281  |
| No log        | 1     | 437  | 2.3923          | 0.0078    | 21.2322       | 0.2975  |
| No log        | 2     | 874  | 2.2184          | 0.0156    | 24.8742       | 0.3557  |
| No log        | 3     | 1311 | 2.0426          | 0.0312    | 32.5722       | 0.7512  |
| No log        | 4     | 1748 | 1.8893          | 0.0625    | 41.3585       | 1.4739  |
| 2.0564        | 5     | 2185 | 1.7338          | 0.125     | 57.0915       | 2.7085  |
| 1.8861        | 6     | 2622 | 1.5660          | 0.25      | 80.9126       | 4.4450  |
| 1.6364        | 7     | 3059 | 1.3907          | 0.5       | 135.0631      | 5.7552  |
| 1.3652        | 8.0   | 3496 | 1.1953          | 1.0       | 245.6774      | 7.9893  |
| 1.1932        | 9.0   | 3933 | 1.0847          | 1.0       | 237.4980      | 9.5079  |
| 1.0438        | 10.0  | 4370 | 1.0209          | 1.0       | 240.7625      | 10.4328 |
| 0.9525        | 11.0  | 4807 | 0.9845          | 1.0       | 238.4553      | 11.1651 |
| 0.8678        | 12.0  | 5244 | 0.9473          | 1.0       | 238.6869      | 11.9281 |
| 0.787         | 13.0  | 5681 | 0.9370          | 1.0       | 238.1721      | 12.3069 |
| 0.7035        | 14.0  | 6118 | 0.9371          | 1.0       | 238.8203      | 12.7342 |
| 0.6576        | 15.0  | 6555 | 0.9317          | 1.0       | 237.5394      | 12.9370 |
| 0.5825        | 16.0  | 6992 | 0.9412          | 1.0       | 238.2312      | 13.3106 |
| 0.537         | 17.0  | 7429 | 0.9595          | 1.0       | 237.2018      | 13.3257 |
| 0.4899        | 18.0  | 7866 | 0.9831          | 1.0       | 236.6620      | 13.5000 |
| 0.4346        | 19.0  | 8303 | 1.0092          | 1.0       | 238.5313      | 13.4589 |
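Validation loss bottoms out at epoch 15 (0.9317) while BLEU continues to inch upward until epoch 18 (13.5000), then both degrade by epoch 19. A short sketch of picking a checkpoint from these logs, with the values transcribed from the table above:

```python
# (epoch, validation_loss, bleu) transcribed from the later rows of the training log.
log = [
    (13, 0.9370, 12.3069),
    (14, 0.9371, 12.7342),
    (15, 0.9317, 12.9370),
    (16, 0.9412, 13.3106),
    (17, 0.9595, 13.3257),
    (18, 0.9831, 13.5000),
    (19, 1.0092, 13.4589),
]

best_by_loss = min(log, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(log, key=lambda r: r[2])  # highest BLEU

print(best_by_loss[0])  # 15
print(best_by_bleu[0])  # 18
```

Which criterion to prefer depends on the downstream use; for translation quality, the BLEU-best epoch 18 checkpoint is the more natural choice despite its slightly higher loss.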

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1