Morgan Sonderegger

Recent work

Papers

Patterns of pre-nasal allophony across dialects of English: A multi-corpus study of the /IH/-/EH/ contrast. (submitted). Irene Smith and M. Sonderegger. (draft | code)
Imitation of F0 tone contours by Mandarin and English speakers is both categorical and continuous. (accepted). Wei Zhang, Meghan Clayards, and M. Sonderegger. To appear in Journal of Phonetics. (preprint | code)
A sociophonetic study of creaky voice across language, gender and age in Canadian English-French bilinguals. (2025). Jeanne Brown and M. Sonderegger. Journal of Phonetics 112, 101431. (paper | code)
A new perspective on the development of Quebec French rhotic vowels. (2025). Massimo Lipari and M. Sonderegger. Laboratory Phonology, 16(1). (paper | code)
Advancements of Phonetics in the 21st century: Quantitative data analysis. (2025). M. Sonderegger and Márton Sóskuthy. Journal of Phonetics 111, 101415. (paper | code)
Unzipping the causality of Zipf's Law and other lexical trade-offs. (2025). Amanda Doucette, Timothy O'Donnell, and M. Sonderegger. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL). (paper | code)
The cross-linguistic distribution of vowel and consonant intrinsic F0 effects. (2025). Connie Ting, M. Sonderegger, Meghan Clayards, and Michael McAuliffe. Language 101(1), 1–36. (paper | code)

Book

M. Sonderegger. (2023) Regression Modeling for Linguistic Data. Cambridge: MIT Press.
Repository (Github, OSF): contains code, datasets, preprint.

Workshops

PolyglotDB: a library for representing and analyzing speech data. Workshop at Montreal Open Tools Symposium, 9/2025. (tutorial | slides)
Quantitative analysis for corpus phonetics and phonology. Three-hour workshop in the series "UnLaboratory Phonology: Corpus Approaches", Association for Laboratory Phonology, 7/2023. Materials on OSF. (slides, code)

Lecture notes

Advanced Quantitative Methods (LING 683): a course on Bayesian regression modeling and nonlinear regression models, including GAMMs, taught at McGill in Fall 2024. Includes contributions by Márton Sóskuthy, Amanda Doucette, and Massimo Lipari. (e-book)

Software

Montreal Corpus Tools, including:
- PolyglotDB: a tool for representing, integrating, and querying speech corpora
- Montreal Forced Aligner: trainable forced alignment using Kaldi
AutoVOT: automatic measurement of voice onset time, using the algorithm described in Sonderegger & Keshet (2012).

Refereed publications

Statistics in phonetics. (2024). Shahin Tavakoli, Beatrice Matteo, Davide Pigoli, Eleanor Chodroff, John Coleman, Michele Gubian, Margaret Renwick, and M. Sonderegger. Annual Review of Statistics and Its Application 12, 133–156. (paper)
Investigating the universality of consonant and vowel co-occurrence restrictions. (2024). Amanda Doucette, Timothy O'Donnell, M. Sonderegger, Heather Goad. Glossa. 9(1), 1–39. (paper | code)
Actuation without production bias. (2024). James Kirby and M. Sonderegger. In Speech Dynamics: Synchronic Variation and Diachronic Change. (preprint | code)
Modelled Multivariate Overlap: A method for measuring vowel merger. (2024). Irene Smith, M. Sonderegger, and The SPADE Consortium. In Proceedings of Interspeech 2024. (paper | code)
Exploring the anatomy of articulation rate in spontaneous English speech: relationships between utterance length effects and social factors. (2024). James Tanner, M. Sonderegger, Tyler Kendall, Jane Stuart-Smith, Jeff Mielke, Erik Thomas, Robin Dodsworth, and The SPADE Consortium. In Proceedings of Interspeech 2024. (paper)
Correlation does not imply compensation: Complexity and irregularity in the lexicon. (2024). Amanda Doucette, Ryan Cotterell, M. Sonderegger, and Timothy O'Donnell. In Proceedings of the Society for Computation in Linguistics (SCiL). (paper | code)
Jacob Hoover, M. Sonderegger, Steven Piantadosi, and Timothy O'Donnell. (2023). The plausibility of sampling as an algorithmic theory of sentence processing. Open Mind 7, 350–391. (paper)
Esmail Moghiseh, M. Sonderegger, and Michael Wagner. (2023). The Iambic-Trochaic Law without Iambs or Trochees: Parsing speech for grouping and prominence. Journal of the Acoustical Society of America 153(2), 1108–1129. (paper, preprint + code + data)
M. Sonderegger, Jane Stuart-Smith, Michael McAuliffe, Rachel Macdonald, and Tyler Kendall. (2022). Managing data for integrated speech corpus analysis in SPeech Across Dialects of English (SPADE). In A.L. Berez-Kroeker, B. McDonnell, E. Koller, and L.B. Collister (Eds.), Open Handbook of Linguistic Data Management (pp. 195–207). https://doi.org/10.7551/mitpress/12200.003.0020.
James Tanner, M. Sonderegger, Jane Stuart-Smith, & Josef Fruehwald. (2020). Towards 'English' phonetics: variability in the pre-consonantal voicing effect across English dialects and speakers. Frontiers in Artificial Intelligence. https://doi.org/10.3389/frai.2020.00038.
James Tanner, M. Sonderegger, & Jane Stuart-Smith. (2020). Structured speaker variability in Japanese stops: relationships within and across cues to stop voicing. Journal of the Acoustical Society of America 148(2), 793–804. (paper, code + data)
M. Sonderegger, Jane Stuart-Smith, Rachel Macdonald, Thea Knowles, and Tamara Rathcke. (2020). Structured heterogeneity in Scottish stops over the twentieth century. Language 96(1), 94–125. (paper, code+data)
James Tanner, Morgan Sonderegger, and Francisco Torreira. (2019). Durational evidence that Tokyo Japanese vowel devoicing is not gradient reduction. Frontiers in Psychology 10, article 821. https://doi.org/10.3389/fpsyg.2019.00821.
Martha Schwarz, M. Sonderegger, and Heather Goad. (2019) Realization and representation of Nepali laryngeal contrasts. Journal of Phonetics 73, 113–127. (preprint)
James Kirby and M. Sonderegger. (2018) Mixed-effects design analysis for experimental phonetics. Journal of Phonetics 70, 70–85. (preprint, code+data)
Thea Knowles, Meghan Clayards, and M. Sonderegger. (2018) Examining factors influencing the viability of automatic acoustic analysis of child speech. Journal of Speech, Language, and Hearing Research 61, 2487–2501. (paper)
James Kirby and M. Sonderegger. (2018) Model selection and phonological argumentation. In D. Brentari and J.S. Lee (Eds.), Shaping phonology (pp. 234–252). (preprint)
Hye-Young Bang, M. Sonderegger, Yoonjung Kang, Meghan Clayards, and Tae-jin Yoon. (2018) The emergence, progress, and impact of sound change in progress in Seoul Korean: implications for mechanisms of tonogenesis. Journal of Phonetics 66, 120–144. (preprint)
M. Sonderegger, Max Bane, and Peter Graff. (2017) The medium-term dynamics of accents on reality television. Language 93(3), 598–640. (paper, supplemental material)
James Tanner, M. Sonderegger, and Michael Wagner. (2017) Production planning and coronal stop deletion in spontaneous speech. Laboratory Phonology 8(1), 15: 1–39. (paper)
Michael McAuliffe, Michaela Socolof, Sarah Mihuc, Michael Wagner, and M. Sonderegger. (2017) Montreal Forced Aligner: trainable text-speech alignment using Kaldi. Proceedings of Interspeech 2017. (paper)
Michael McAuliffe, Elias Stengel-Eskin, Michaela Socolof, and M. Sonderegger. (2017) Polyglot and Speech Corpus Tools: a system for representing, integrating, and querying speech corpora. Proceedings of Interspeech 2017. (paper)
Oriana Kilbourn-Ceron and M. Sonderegger. (2017) Boundary phenomena and variability in Japanese high vowel devoicing. Natural Language and Linguistic Theory. doi:10.1007/s11049-017-9368-x. (preprint | paper)
Jane Stuart-Smith, M. Sonderegger, Tamara Rathcke, and Rachel Macdonald. (2015) The private life of stops: VOT in a real-time corpus of spontaneous Glaswegian. Laboratory Phonology 6(3-4): 505–549. (paper)
Jane Stuart-Smith, Tamara Rathcke, M. Sonderegger, and Rachel Macdonald. (2015) A real-time study of plosives in Glaswegian using an automatic measurement algorithm: change or age-grading? In E.N. Torgersen et al. (Eds.), Language Variation – European Perspectives V (pp. 225–237). Benjamins. (preprint)
Matthew Carlson, M. Sonderegger, and Max Bane. (2014) How children explore the phonological network in child-directed speech: A survival analysis of children's first word productions. Journal of Memory and Language 75: 159–180. (preprint | paper)
Alan Yu, M. Sonderegger, and Carissa Abrego-Collier. (2013) Phonetic imitation from an individual-difference perspective: subjective attitude, personality and "autistic" traits. PLOS ONE 8 (9): e74746. (paper)
M. Sonderegger and Partha Niyogi. (2013) Variation and change in English noun/verb pair stress: Data, dynamical systems models, and their interaction. In A.C.L. Yu (Ed.), Origins of Sound Patterns: Approaches to Phonologization (pp. 262–284). Oxford: OUP. (preprint)
M. Sonderegger and Joseph Keshet. (2012) Automatic measurement of voice onset time using discriminative structured prediction. Journal of the Acoustical Society of America 132: 3965–3979. (paper | code)
M. Sonderegger. (2011) Applications of graph theory to an English rhyming corpus. Computer Speech and Language 25: 655–678. (preprint | paper)
Max Bane, Jason Riggle, and M. Sonderegger. (2010) The VC dimension of constraint based grammars. Lingua 120: 1194–1208. (preprint | paper)
Xuemin Chi and M. Sonderegger. (2007) Subglottal coupling and its influence on vowel formants. Journal of the Acoustical Society of America 122: 1735–1745. (preprint | paper)

E-book

M. Sonderegger, Michael Wagner, and Francisco Torreira. (2018) Quantitative Methods for Linguistic Data. v. 1.0 (Oct. 2018). (book, source) Note: this book is superseded by Regression Modeling for Linguistic Data.

Conference proceedings

Morgan Sonderegger, Jane Stuart-Smith, Jeff Mielke, and The SPADE Consortium. (2023) How variable are English sibilants? Proceedings of the 20th International Congress of Phonetic Sciences. (paper | code)
James Tanner, M. Sonderegger, and Jane Stuart-Smith. (2022). Multidimensional acoustic variation in vowels across English dialects. In Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. (paper)
Bing'er Jiang, Ewan Dunbar, M. Sonderegger, Meghan Clayards, & Emmanuel Dupoux. Modelling Perceptual Effects of Phonology with ASR Systems. (2020). Proceedings of the 42nd Annual Meeting of the Cognitive Science Society (pp. 2735–2741). Austin, TX: Cognitive Science Society. (paper)
James Tanner, M. Sonderegger, Jane Stuart-Smith, and The SPADE Consortium. (2019). Vowel duration and the voicing effect across English dialects. Toronto Working Papers in Linguistics, Volume 41. (paper)
Michael McAuliffe, Arlie Coles, Michael Goodale, Sarah Mihuc, Michael Wagner, Jane Stuart-Smith, and M. Sonderegger. (2019) ISCAN: a system for integrated phonetic analyses across speech corpora. Proceedings of the 19th International Congress of Phonetic Sciences. (paper)
James Tanner, M. Sonderegger, and Jane Stuart-Smith. (2019) Structured speaker variability in spontaneous Japanese stop contrast production. Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS). (paper)
Jeff Mielke, Erik Thomas, Joe Fruehwald, Jane Stuart-Smith, M. Sonderegger, Robin Dodsworth, and Michael McAuliffe. (2019) Age vectors vs. axes of intraspeaker variation in vowel formants measured automatically from several English speech corpora. Proc. 19th ICPhS. (paper)
Jane Stuart-Smith, Morgan Sonderegger, Rachel Macdonald, Jeff Mielke, Michael McAuliffe, and Erik Thomas. (2019) Large-scale acoustic analysis of dialectal and social factors in English /s/-retraction. Proc. 19th ICPhS. (paper, code+data)
Thea Knowles, Meghan Clayards, M. Sonderegger, Michael Wagner, Aparna Nadig, and Kris Onishi. (2015) Automatic Forced Alignment on Child Speech: Directions for Improvement. Proceedings of Meetings on Acoustics 25:060001. (paper)
M. Sonderegger. (2015) Trajectories of voice onset time in spontaneous speech on reality TV. Proceedings of the 18th International Congress of Phonetic Sciences. (paper)
James Tanner, M. Sonderegger, and Michael Wagner. (2015) Production planning and coronal stop deletion in spontaneous speech. Proceedings of the 18th International Congress of Phonetic Sciences. (paper)
Hye-Young Bang, M. Sonderegger, Yoonjung Kang, Meghan Clayards, and Tae-Jin Yoon. (2015) The effect of word frequency on the timecourse of tonogenesis in Seoul Korean. Proceedings of the 18th International Congress of Phonetic Sciences. (paper)
Morgane Ciot, M. Sonderegger, and Derek Ruths. (2013) Gender inference of Twitter users in non-English contexts. Proceedings of EMNLP 2013. (paper)
James Kirby and M. Sonderegger. (2013) A model of population dynamics applied to phonetic change. Proceedings of the 35th Annual Conference of the Cognitive Science Society (pp. 776–781). Austin, TX: Cognitive Science Society. (paper | errata)
Katie Henry, M. Sonderegger and Joseph Keshet. (2012) Automatic measurement of positive and negative voice onset time. Proceedings of Interspeech 2012. (paper)
Carissa Abrego-Collier, Julian Grove, M. Sonderegger, and Alan Yu. (2011) Effects of speaker evaluation on phonetic convergence. Proceedings of the 17th International Congress of Phonetic Sciences (pp. 192–195). (paper) Note: This paper is superseded by Yu et al. (2013)!
Alan Yu, Julian Grove, Martina Martinović, and M. Sonderegger. (2011) Effects of working memory capacity and "autistic traits" on phonotactic effects in speech perception. Proceedings of the 17th International Congress of Phonetic Sciences (pp. 2236–2239). (paper)
Matthew Carlson, Max Bane, and M. Sonderegger. (2011) Global properties of the phonological network in child-directed speech. In N. Danis, K. Mesh, and H. Sung (Eds.), BUCLD 35: Proceedings of the 35th Annual Boston University Conference on Language Development (pp. 97–109). (paper)
Max Bane, Peter Graff, and M. Sonderegger. (2010/2013) Longitudinal phonetic variation in a closed system. Proceedings of the 46th Annual Meeting of the Chicago Linguistics Society (pp. 43–58). (paper)
M. Sonderegger. (2010/2016) Testing for frequency and structural effects in an English stress shift. Proceedings of the 36th Annual Meeting of the Berkeley Linguistics Society (pp. 411–425). (paper)
M. Sonderegger and Joseph Keshet. (2010) Automatic discriminative measurement of voice onset time. Proceedings of Interspeech 2010, pp. 2242–2245. (paper)
M. Sonderegger and Partha Niyogi. (2010) Combining data and mathematical models of language change. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1019–1029. (paper)
M. Sonderegger and Alan Yu. (2010) A rational account of perceptual compensation for coarticulation. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 375–380). Austin, TX: Cognitive Science Society. (paper)

Manuscripts

James Kirby and M. Sonderegger. (2015) Bias and population dynamics in the actuation of sound change. arXiv 1507.04420 [cs.CL]. (paper)

Theses

M. Sonderegger. (2012) Phonetic and phonological dynamics on reality television. Ph.D. thesis, U. Chicago. (thesis)
M. Sonderegger. (2009) Dynamical systems models of language variation and change: An application to an English stress shift. M.S. thesis, U. Chicago. (thesis)
M. Sonderegger. (2004) Subglottal coupling and vowel space. B.S. thesis, MIT. (thesis)

Data and code

Data and code from the SPADE project.
Data and code for several projects are on my OSF page, or various GitHub pages (Montreal Corpus Tools, my old lab, me). Feel free to contact me for code or data used for my work.
English stress shift:
- Stress vs. time trajectories for the 149 N/V pairs. (pdf)
- Stress data, stress vs. time trajectories. (zip)
Rhyme graphs: 1900 subcorpus graph (10M, components with 2 or 3 words omitted) (image)
Subglottal resonances: Wav files of microphone and accelerometer signals of English vowels for 14 speakers; F2 and second subglottal formant (SubF2) measurements in my B.S. thesis.