---
license: mit
datasets:
- allenai/c4
language:
- en
library_name: transformers
---
# Bingus-v0.1-60M-Base
A not-so-state-of-the-art 60M-parameter transformer language model, using the default OLMo architecture.
### Specs
- Heads: 8
- Layers: 8
- Model dimension (d_model): 512
- MLP dimension (d_mlp): 4096
- Perplexity (eval/v3-small-c4_en-validation): 40.33
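As a rough sanity check on the specs above, the sketch below estimates the parameter count and converts the reported perplexity to a per-token cross-entropy. The vocabulary size and the exact block layout (standard Q/K/V/O attention projections, a two-matrix MLP, untied embeddings) are assumptions not stated in this card; the actual OLMo config may differ.

```python
import math

d_model = 512
d_mlp = 4096
n_layers = 8
vocab_size = 32_000  # assumption; the card does not list the tokenizer

attn_per_layer = 4 * d_model * d_model  # Q, K, V, O projection matrices
mlp_per_layer = 2 * d_model * d_mlp     # up- and down-projection
block_params = n_layers * (attn_per_layer + mlp_per_layer)
embedding_params = vocab_size * d_model
total = block_params + embedding_params
print(f"~{total / 1e6:.1f}M parameters")  # lands near the 60M in the name

# The reported eval perplexity corresponds to a mean cross-entropy
# of ln(ppl) nats per token:
loss = math.log(40.33)
print(f"cross-entropy ~= {loss:.2f} nats/token")
```

Under these assumptions the estimate comes out around 58M, consistent with the 60M in the model name; a larger vocabulary (e.g. a ~50k GPT-style tokenizer) would push the total somewhat higher.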
### Training Data
Pretraining:
- 5B tokens of C4 (preprocessed, from olmo-data.org)