Hi everyone,
I’m excited to share a new addition to the Harmonic Frontier Audio catalog:
Overtone Singing — Preview Dataset
This preview set provides isolated, rights-cleared overtone singing primitives recorded specifically for machine learning research, MIR tasks, and generative audio experiments (e.g., formant control, spectral modeling, harmonic expression systems).
What’s inside
This preview includes a curated set of high-fidelity articulations such as:
- Harmonic arpeggios (e.g., h3–h5, h4–h6)
- Harmonic intervals
- Harmonic scales
- Glissandi across multiple overtone indices
- Sustained stable fundamentals for spectral analysis
All recordings were captured at 96 kHz / 24-bit using a neutral recording chain (NT1-A → Zoom F8n Pro) and exported without coloration or processing.
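For anyone new to the h3–h5 style notation: assuming hN means the Nth harmonic of the sung fundamental (the usual convention, with h1 as the fundamental itself), the target frequency of each overtone is simply N times the fundamental. Here is a minimal sketch of that arithmetic. The 110 Hz fundamental and the function names are illustrative assumptions, not values or code from the dataset.

```python
def harmonic_frequencies(f0_hz, indices):
    """Return the frequency in Hz of each harmonic index for fundamental f0_hz.

    Assumes ideal harmonics at integer multiples of the fundamental (h1 = f0).
    """
    return [n * f0_hz for n in indices]


def arpeggio(f0_hz, low, high):
    """Frequencies for an arpeggio spanning harmonics low..high, inclusive."""
    return harmonic_frequencies(f0_hz, range(low, high + 1))


if __name__ == "__main__":
    # Illustrative fundamental of 110 Hz (not taken from the dataset).
    print(arpeggio(110.0, 3, 5))  # h3-h5
    print(arpeggio(110.0, 4, 6))  # h4-h6
```

Real recordings will deviate slightly from these ideal values, so treat them as targets for analysis rather than ground truth.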
Dataset characteristics
- Clean harmonic isolation for spectral analysis
- Stable formant transitions
- Minimal breath noise
- No room coloration
- Consistent mic distance and gain staging
- Useful for both generative training and MIR feature extraction
Metadata
The dataset includes structured CSV metadata with:
- fundamental pitch
- harmonic numbers
- harmonic frequencies
- gesture type
- mic chain
- recording information
- articulation descriptions
This mirrors the structure used across other HFA datasets.
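As a rough idea of how the metadata fields listed above might be used together, here is a small sketch that reads CSV rows and checks that each listed harmonic frequency matches harmonic number times fundamental. The column names, the semicolon-separated encoding, and the sample rows are all hypothetical placeholders invented for this example. Please check the actual CSV header before adapting it.

```python
import csv
import io

# Hypothetical column names and made-up rows, for illustration only.
# They are NOT taken from the real HFA metadata file.
SAMPLE_CSV = """file,fundamental_hz,harmonic_numbers,harmonic_frequencies_hz,gesture_type
example_001.wav,110.0,3;4;5,330.0;440.0;550.0,arpeggio
example_002.wav,98.0,4;5;6,392.0;490.0;588.0,arpeggio
"""


def parse_row(row):
    """Convert one CSV row (a dict of strings) into numeric fields."""
    f0 = float(row["fundamental_hz"])
    numbers = [int(n) for n in row["harmonic_numbers"].split(";")]
    freqs = [float(f) for f in row["harmonic_frequencies_hz"].split(";")]
    return f0, numbers, freqs


def inconsistent_rows(csv_text, tolerance_hz=1.0):
    """Return file names whose harmonic frequencies differ from n * f0
    by more than tolerance_hz, or whose list lengths do not match."""
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        f0, numbers, freqs = parse_row(row)
        expected = [n * f0 for n in numbers]
        mismatch = len(expected) != len(freqs) or any(
            abs(e - f) > tolerance_hz for e, f in zip(expected, freqs)
        )
        if mismatch:
            bad.append(row["file"])
    return bad


if __name__ == "__main__":
    print(inconsistent_rows(SAMPLE_CSV))
```

To run this against a real file, you could pass in the file's text (for example from `open(path, newline="").read()`) once the column names are adjusted to match.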
Part of a larger catalog
This dataset is one of many ongoing releases from the Harmonic Frontier Audio project — a catalog focused on high-quality, articulation-level acoustic primitives for AI audio research.
If you find this useful or have suggestions for what might help your research, feel free to reply here. I’m always open to collaboration, feedback, and ideas for making the datasets more useful to the community.
Thanks for taking a look!
— Blake
Harmonic Frontier Audio