New Dataset Release: Overtone Singing (Preview) — Articulation-Level Overtone & Throat Singing Primitives

Admin-Harmonic-Front · December 4, 2025, 10:44am

Hi everyone,

I’m excited to share a new addition to the Harmonic Frontier Audio catalog:

Overtone Singing — Preview Dataset

This preview set provides isolated, rights-cleared overtone singing primitives recorded specifically for machine learning research, MIR tasks, and generative audio experiments (e.g., formant control, spectral modeling, harmonic expression systems).

What’s inside

This preview includes a curated set of high-fidelity articulations such as:

Harmonic arpeggios (e.g., h3–h5, h4–h6)
Harmonic intervals
Harmonic scales
Glissandi across multiple overtone indices
Sustained stable fundamentals for spectral analysis
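
For readers new to the notation, labels like h3–h5 refer to harmonic numbers, which are integer multiples of the sung fundamental. Here is a minimal sketch of that relationship, assuming ideal (perfectly integer) harmonics and assuming that h3–h5 means the run of harmonics 3 through 5. The function names and the 110 Hz fundamental are hypothetical and are not taken from the dataset.

```python
def harmonic_frequency(fundamental_hz: float, harmonic_number: int) -> float:
    """Frequency of the n-th harmonic, where h1 is the fundamental itself."""
    if harmonic_number < 1:
        raise ValueError("harmonic numbers start at 1")
    return fundamental_hz * harmonic_number


def arpeggio(fundamental_hz: float, first: int, last: int) -> list[float]:
    """Frequencies for a run of consecutive harmonics, e.g. h3 to h5."""
    return [harmonic_frequency(fundamental_hz, n) for n in range(first, last + 1)]


# Hypothetical fundamental (A2 = 110 Hz), chosen only for illustration.
print(arpeggio(110.0, 3, 5))  # [330.0, 440.0, 550.0]
```

Real overtone singing will deviate slightly from these ideal values, so treat the computed frequencies as search targets rather than exact peak locations.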

All recordings were captured at 96 kHz / 24-bit using a neutral recording chain (NT1-A microphone into a Zoom F8n Pro recorder) and exported without added coloration or processing.
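
If you want to confirm the stated format before training, a quick header check may help. The sketch below uses Python's standard `wave` module and assumes the files are distributed as PCM WAV, which this post does not state. The helper names are hypothetical, and an in-memory file stands in for a real dataset file.

```python
import io
import wave


def describe_wav(source) -> dict:
    """Return sample rate, bit depth, and channel count for a PCM WAV file."""
    with wave.open(source, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
        }


def matches_release_spec(info: dict) -> bool:
    """True when the file is 96 kHz / 24-bit, the format stated in the post."""
    return info["sample_rate_hz"] == 96_000 and info["bit_depth"] == 24


# Stand-in for a dataset file: 10 ms of mono silence written to memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(3)       # 3 bytes per sample = 24-bit
    wav.setframerate(96_000)
    wav.writeframes(b"\x00\x00\x00" * 960)
buffer.seek(0)

info = describe_wav(buffer)
print(info, matches_release_spec(info))
```

`describe_wav` also accepts a file path. Some 24-bit WAV files use an extensible header that older Python versions may reject, so a dedicated audio library could be a safer choice in practice.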

Dataset characteristics

Clean harmonic isolation for spectral analysis
Stable formant transitions
Minimal breath noise
No room coloration
Consistent mic distance and gain staging
Useful for both generative training and MIR feature extraction

Metadata

The dataset includes structured CSV metadata with:

fundamental pitch
harmonic numbers
harmonic frequencies
gesture type
mic chain
recording information
articulation descriptions

This mirrors the structure used across other HFA datasets.
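
As a rough illustration of how the metadata could be used to select clips, here is a sketch with Python's standard `csv` module. The column names, the semicolon-separated harmonic list, and the sample rows are all assumptions made for this example. Check the actual CSV header before adapting it.

```python
import csv
import io

# Hypothetical rows and column names; the real CSV may differ.
SAMPLE_CSV = """file,fundamental_hz,harmonic_numbers,gesture_type
example_arpeggio.wav,110.0,3;4;5,arpeggio
example_sustain.wav,98.0,6,sustain
"""


def load_rows(text: str) -> list[dict]:
    """Parse metadata rows, converting numeric fields to Python numbers."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "file": raw["file"],
            "fundamental_hz": float(raw["fundamental_hz"]),
            "harmonic_numbers": [int(n) for n in raw["harmonic_numbers"].split(";")],
            "gesture_type": raw["gesture_type"],
        })
    return rows


def by_gesture(rows: list[dict], gesture: str) -> list[dict]:
    """Keep only the rows whose gesture type matches exactly."""
    return [row for row in rows if row["gesture_type"] == gesture]


rows = load_rows(SAMPLE_CSV)
for row in by_gesture(rows, "arpeggio"):
    print(row["file"], row["harmonic_numbers"])
```

Filtering on gesture type first keeps a training or evaluation split limited to one articulation class, which is usually easier to reason about than mixing gestures.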

Part of a larger catalog

This dataset is one of many ongoing releases from the Harmonic Frontier Audio project — a catalog focused on high-quality, articulation-level acoustic primitives for AI audio research.

If you find this useful or have suggestions for what might help your research, feel free to reply here. I’m always open to collaboration, feedback, and ideas for making the datasets more useful to the community.

Thanks for taking a look!

— Blake
Harmonic Frontier Audio
