Voice Stress Markers Are Orthogonal to Speech Disfluency Labels: A Large-Scale Analysis on SEP-28K

Nazar Kozak

engrXiv Preprint, April 2026 · Planned submission to Journal of Fluency Disorders

Abstract

The relationship between voice stress markers and speech disfluency events has not been systematically quantified at scale, even though both are targets of clinical assessment in stuttering populations. We examine correlations between four acoustic stress features (jitter, shimmer, fundamental frequency (F0) standard deviation, and a composite stress score) and five disfluency types (prolongation, block, sound repetition, word repetition, interjection) across the 14,645 three-second clips in the SEP-28K dataset that have valid pitch estimates. Using Pearson and point-biserial correlations with Bonferroni correction for 20 comparisons (4 features × 5 disfluency types), we find that no absolute correlation exceeds 0.05, so every effect size is negligible by Cohen's convention (|r| < 0.10). The strongest observed association (composite stress × prolongation, r = −0.050) explains only 0.25% of variance (r² = 0.0025). These findings suggest that acoustic voice stress markers and disfluency labels carry largely non-overlapping information in this dataset.
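The statistics named in the abstract can be illustrated with a minimal sketch. It is not the paper's analysis code: the function names are hypothetical, the p-value uses a large-sample normal approximation in place of the exact t distribution, and the family-wise alpha of 0.05 is an assumed value that the abstract does not state. The only inputs taken from the abstract are r = −0.050, the 14,645 clips, and the 20 comparisons.

```python
import math

def pearson_r(x, y):
    """Pearson correlation. When x is a 0/1 disfluency label,
    this value is the point-biserial correlation."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for H0: rho = 0.
    Assumption: n is large enough that the t statistic is ~ normal."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))

# Bonferroni: divide the family-wise alpha by the number of tests.
N_TESTS = 20     # 4 stress features x 5 disfluency types (from the abstract)
ALPHA = 0.05     # assumed family-wise level, not stated in the abstract
bonferroni_alpha = ALPHA / N_TESTS

# Values reported in the abstract.
r_strongest = -0.050
n_clips = 14645

variance_explained = r_strongest ** 2   # proportion of shared variance
p_value = approx_two_sided_p(r_strongest, n_clips)

print(f"variance explained: {variance_explained:.2%}")
print(f"Bonferroni-adjusted threshold: {bonferroni_alpha}")
print(f"approximate p-value: {p_value:.3g}")
```

Squaring the strongest correlation reproduces the 0.25% variance figure quoted above. The p-value is shown only to make the Bonferroni threshold concrete; the abstract's conclusion rests on effect size.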

Keywords
Voice stress analysis, disfluency detection, stuttering, SEP-28K, speech assessment, jitter, shimmer, F0 variability, correlation analysis
Status
Preprint on engrXiv (April 2026) · Planned submission to Journal of Fluency Disorders
Key Finding
All 20 stress–disfluency correlations are negligible (|r| ≤ 0.05, Cohen's d < 0.04), suggesting separable signal dimensions
Dataset
SEP-28K (14,645 clips with valid pitch estimates)
Author
Nazar Kozak — Kozak Technologies Inc., Los Angeles, CA, USA
Contact
nzrkzk@gmail.com · ORCID
