2025-11-30 09:20:19 - train_codegen - INFO - Logging to: logs/codegen/train_codegen_20251130_092019.log
2025-11-30 09:20:19 - train_codegen - INFO - Monitor progress: tail -f logs/codegen/train_codegen_20251130_092019.log
2025-11-30 09:20:19 - train_codegen - INFO - ============================================================
2025-11-30 09:20:19 - train_codegen - INFO - CodeGen Training
2025-11-30 09:20:19 - train_codegen - INFO - ============================================================
2025-11-30 09:20:19 - train_codegen - INFO - Using CUDA device: 0
2025-11-30 09:20:19 - train_codegen - INFO - GPU: NVIDIA GeForce RTX 5090
2025-11-30 09:20:19 - train_codegen - INFO - Configuration:
2025-11-30 09:20:19 - train_codegen - INFO - model: Salesforce/codegen-350M-mono
2025-11-30 09:20:19 - train_codegen - INFO - data: datasets/csharp
2025-11-30 09:20:19 - train_codegen - INFO - output: model/checkpoints/run1-csharp-codegen
2025-11-30 09:20:19 - train_codegen - INFO - batch_size: 10
2025-11-30 09:20:19 - train_codegen - INFO - gradient_accumulation_steps: 4
2025-11-30 09:20:19 - train_codegen - INFO - effective_batch_size: 40
2025-11-30 09:20:19 - train_codegen - INFO - learning_rate: 5e-05
2025-11-30 09:20:19 - train_codegen - INFO - epochs: 5
2025-11-30 09:20:19 - train_codegen - INFO - max_length: 1024
2025-11-30 09:20:19 - train_codegen - INFO - max_steps: -1
2025-11-30 09:20:19 - train_codegen - INFO - fp16: True
2025-11-30 09:20:19 - train_codegen - INFO - gradient_checkpointing: True
2025-11-30 09:20:19 - train_codegen - INFO - seed: 42
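The effective_batch_size logged above is consistent with batch_size multiplied by gradient_accumulation_steps on a single device. A minimal sketch of that arithmetic, assuming one GPU (the log only mentions CUDA device 0); the function name is illustrative and not taken from the training script:

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


# Values copied from the Configuration block of this log.
print(effective_batch_size(per_device_batch=10, grad_accum_steps=4))  # 40
```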
2025-11-30 09:20:19 - train_codegen - INFO - Loading tokenizer and model: Salesforce/codegen-350M-mono
2025-11-30 09:20:33 - train_codegen - INFO - Loading model with gradient checkpointing enabled
2025-11-30 09:20:33 - train_codegen - INFO - Loading dataset...
2025-11-30 09:20:33 - train_codegen - INFO - Loading dataset from datasets/csharp
2025-11-30 09:20:35 - train_codegen - INFO - Train samples: 226616
2025-11-30 09:20:35 - train_codegen - INFO - Validation samples: 28327
2025-11-30 09:20:35 - train_codegen - INFO - ============================================================
2025-11-30 09:20:35 - train_codegen - INFO - Dataset Preprocessing
2025-11-30 09:20:35 - train_codegen - INFO - ============================================================
2025-11-30 09:20:35 - train_codegen - INFO - Preprocessing 226616 samples (optimized eager loading)...
2025-11-30 09:20:40 - train_codegen - INFO - Preprocessed 10000/226616 samples
2025-11-30 09:20:45 - train_codegen - INFO - Preprocessed 20000/226616 samples
2025-11-30 09:20:49 - train_codegen - INFO - Preprocessed 30000/226616 samples
2025-11-30 09:20:53 - train_codegen - INFO - Preprocessed 40000/226616 samples
2025-11-30 09:20:56 - train_codegen - INFO - Preprocessed 50000/226616 samples
2025-11-30 09:21:02 - train_codegen - INFO - Preprocessed 60000/226616 samples
2025-11-30 09:21:07 - train_codegen - INFO - Preprocessed 70000/226616 samples
2025-11-30 09:21:11 - train_codegen - INFO - Preprocessed 80000/226616 samples
2025-11-30 09:21:16 - train_codegen - INFO - Preprocessed 90000/226616 samples
2025-11-30 09:21:20 - train_codegen - INFO - Preprocessed 100000/226616 samples
2025-11-30 09:21:25 - train_codegen - INFO - Preprocessed 110000/226616 samples
2025-11-30 09:21:30 - train_codegen - INFO - Preprocessed 120000/226616 samples
2025-11-30 09:21:40 - train_codegen - INFO - Preprocessed 130000/226616 samples
2025-11-30 09:21:45 - train_codegen - INFO - Preprocessed 140000/226616 samples
2025-11-30 09:21:50 - train_codegen - INFO - Preprocessed 150000/226616 samples
2025-11-30 09:21:55 - train_codegen - INFO - Preprocessed 160000/226616 samples
2025-11-30 09:22:00 - train_codegen - INFO - Preprocessed 170000/226616 samples
2025-11-30 09:22:05 - train_codegen - INFO - Preprocessed 180000/226616 samples
2025-11-30 09:22:09 - train_codegen - INFO - Preprocessed 190000/226616 samples
2025-11-30 09:22:14 - train_codegen - INFO - Preprocessed 200000/226616 samples
2025-11-30 09:22:18 - train_codegen - INFO - Preprocessed 210000/226616 samples
2025-11-30 09:22:23 - train_codegen - INFO - Preprocessed 220000/226616 samples
2025-11-30 09:22:26 - train_codegen - INFO - Preprocessed 226616/226616 samples
2025-11-30 09:22:26 - train_codegen - INFO - Preprocessing complete: 226616 samples ready
2025-11-30 09:22:26 - train_codegen - INFO - Preprocessing 28327 samples (optimized eager loading)...
2025-11-30 09:22:33 - train_codegen - INFO - Preprocessed 10000/28327 samples
2025-11-30 09:22:38 - train_codegen - INFO - Preprocessed 20000/28327 samples
2025-11-30 09:22:42 - train_codegen - INFO - Preprocessed 28327/28327 samples
2025-11-30 09:22:42 - train_codegen - INFO - Preprocessing complete: 28327 samples ready
2025-11-30 09:22:42 - train_codegen - INFO - ============================================================
2025-11-30 09:22:42 - train_codegen - INFO - Training Arguments
2025-11-30 09:22:42 - train_codegen - INFO - ============================================================
2025-11-30 09:22:43 - train_codegen - INFO - Training log will be saved to: model/checkpoints/run1-csharp-codegen/training_log.csv
2025-11-30 09:22:43 - train_codegen - INFO - ============================================================
2025-11-30 09:22:43 - train_codegen - INFO - Training Strategy
2025-11-30 09:22:43 - train_codegen - INFO - ============================================================
2025-11-30 09:22:43 - train_codegen - INFO - Evaluation every 1000 steps (optimized for speed)
2025-11-30 09:22:43 - train_codegen - INFO - Eval batch size: 20 (2x train batch)
2025-11-30 09:22:43 - train_codegen - INFO - Eval accumulation steps: 4
2025-11-30 09:22:43 - train_codegen - INFO - Save checkpoint every 2000 steps
2025-11-30 09:22:43 - train_codegen - INFO - Gradient checkpointing: ENABLED (saves VRAM, slower training)
2025-11-30 09:22:43 - train_codegen - INFO - FP16 mixed precision enabled
2025-11-30 09:22:43 - train_codegen - INFO - Dynamic padding per batch (10-20x faster than max_length padding)
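The "dynamic padding per batch" entry above refers to padding each batch only to its own longest sequence instead of to max_length (1024). The log does not show how the script does this, so the following is only a sketch of the general idea in plain Python; `pad_batch` and `pad_id=0` are hypothetical placeholders, and the speed-up figure quoted in the log is not verified here.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int = 0, max_length: int = 1024
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token-id lists to the longest sequence in this batch, capped at max_length."""
    target = min(max(len(seq) for seq in sequences), max_length)
    input_ids, attention_mask = [], []
    for seq in sequences:
        seq = seq[:target]  # truncate anything longer than the cap
        pad = target - len(seq)
        input_ids.append(seq + [pad_id] * pad)
        attention_mask.append([1] * len(seq) + [0] * pad)  # 0 marks padding
    return input_ids, attention_mask


# Three short samples are padded to length 3, not to 1024.
ids, mask = pad_batch([[5, 6, 7], [8, 9], [10]])
print(ids)
print(mask)
```

With short samples, most of a max_length-padded batch would be padding tokens, which is why padding per batch can reduce the work done per step.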
2025-11-30 09:22:43 - train_codegen - INFO - ============================================================
2025-11-30 09:22:43 - train_codegen - INFO - Starting Training
2025-11-30 09:22:43 - train_codegen - INFO - ============================================================
2025-11-30 09:22:43 - train_codegen - INFO - Total training samples: 226616
2025-11-30 09:22:43 - train_codegen - INFO - Total validation samples: 28327
2025-11-30 09:22:43 - train_codegen - INFO - Starting training from scratch
2025-12-01 11:09:58 - train_codegen - INFO - Training completed successfully
2025-12-01 11:09:58 - train_codegen - INFO - ============================================================
2025-12-01 11:09:58 - train_codegen - INFO - Saving Final Model
2025-12-01 11:09:58 - train_codegen - INFO - ============================================================
2025-12-01 11:10:00 - train_codegen - INFO - Model and tokenizer saved to model/checkpoints/run1-csharp-codegen
2025-12-01 11:10:00 - train_codegen - INFO - ============================================================
2025-12-01 11:10:00 - train_codegen - INFO - Training Summary
2025-12-01 11:10:00 - train_codegen - INFO - ============================================================
2025-12-01 11:10:00 - train_codegen - INFO - Total steps: 28325
2025-12-01 11:10:00 - train_codegen - INFO - Best model checkpoint: model/checkpoints/run1-csharp-codegen/checkpoint-22000
2025-12-01 11:10:00 - train_codegen - INFO - Best eval loss: 0.6229148507118225
2025-12-01 11:10:00 - train_codegen - INFO - Done.
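The reported total of 28325 steps can be cross-checked against the configuration logged earlier. The sketch below assumes that a trailing, incomplete accumulation group in each epoch is not counted as a separate optimizer step (floor division); that assumption reproduces the logged total, but the script's actual step logic is not shown in this log. The elapsed time is measured between the "Starting training from scratch" and "Training completed successfully" timestamps, so it includes evaluation and checkpoint saving.

```python
import math
from datetime import datetime


def total_optimizer_steps(samples: int, batch_size: int, grad_accum: int, epochs: int) -> int:
    """Optimizer steps for a run, assuming floor division over accumulation groups."""
    batches_per_epoch = math.ceil(samples / batch_size)
    updates_per_epoch = batches_per_epoch // grad_accum
    return updates_per_epoch * epochs


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two timestamps in the log's 'YYYY-MM-DD HH:MM:SS' format."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


steps = total_optimizer_steps(samples=226616, batch_size=10, grad_accum=4, epochs=5)
seconds = elapsed_seconds("2025-11-30 09:22:43", "2025-12-01 11:09:58")
print(f"steps={steps}, wall_clock_s={seconds:.0f}, s_per_step={seconds / steps:.2f}")
```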