Skip to content

vocal-tools

High-precision vocal analysis pipeline built to solve a hearing disability

PythonPraat (parselmouth)librosaNumPy/SciPymatplotlib
Problem
Can't reliably hear yourself sing — 60% hearing loss, right ear
Solution
Python pipeline measuring 70+ vocal metrics at 44.1 kHz
Proof
91.5 raw score → 92.8 tuned, 7.3c intonation, 168 analysis runs
Stack
Python, Praat, librosa, NumPy/SciPy, matplotlib

The Problem

Audiology-confirmed significant hearing loss in my right ear (~60%). When you can't reliably hear yourself sing, you can't trust your own judgment about pitch, tone, or quality. Vocal coaching helps, but between sessions there's no objective feedback loop. The ear doesn't heal. The tool has to compensate.

The Solution

A Python pipeline that analyzes vocal recordings at 44.1 kHz and produces high-precision metrics: pitch accuracy (cents deviation from target), formant structure (F1/F2/singer's formant ratio), tone clarity, jitter, shimmer, HNR, vibrato characteristics, and 70+ total measurements. Every take gets a composite score, a suite of diagnostic plots, and exportable CSV data.

Key Metrics Achieved

  • 91.5 — best raw composite score (Take 19)
  • 92.8 — best tuned composite score (Take 19 tuned)
  • 7.3c — pitch intonation accuracy (tuned)
  • 97.9 — tone clarity score (Take 24)

Progression

Vocal quality progression over time showing improvement from initial recordings to current best scores

Architecture

The pipeline runs in three stages:

  1. Analysisvocal_take_analyzer.py processes WAV files through Praat (pitch tracking, formant extraction, jitter/shimmer/HNR) and librosa (onset detection, spectral analysis). Outputs CSV metrics and diagnostic plots.
  2. Compositionvocal_comp_builder.py selects the best segments across multiple takes and builds a composite. Section-level scoring ensures emotional continuity.
  3. Cleanupvocal_comp_cleanup.py handles gap management (comfort noise fill), leveling, dehum, declick, and edge smoothing. Produces an Auto-Tune-ready output.

50 Numbered Discoveries

Over months of daily recording and analysis, the system surfaced 50 numbered technique discoveries — from monitoring as the single largest factor (#1), to the B3 drone breakthrough for pitch anchoring (#2), to consonant flicking (#39), formant sustains (#40), and mix pitch (#43). Each discovery was validated by measurable score improvement.

What This Demonstrates

  • DSP engineering with real-time audio analysis
  • Python scientific computing (NumPy, SciPy, Praat bindings)
  • Data pipeline design with measurable outcomes
  • Iterative optimization over 168 analysis runs
  • Disability compensation through engineering

Get in Touch

This repo is private. Request a walkthrough or view all projects.