Polish title (translated): Difficulties in Auditory and Automatic Recognition of Persons by Voice: An Analysis of Cases of Very Similar Voices
Publication date: 10.12.2025
Problems of Forensic Sciences (Z Zagadnień Nauk Sądowych), 2025, 142–143, pp. 143–155
https://doi.org/10.4467/12307483PFS.25.007.22914
Challenges for Auditory and Automatic Speaker Recognition: Evaluating Cases of Highly Similar Voices
Identical twins present a difficult case for both auditory and machine speaker recognition. This paper addresses this challenge and presents the findings of two studies: an auditory speaker discrimination test and a machine-based task using a forensic automatic speaker recognition (ASR) system. The outcomes of the perceptual judgement task were compared with the log-likelihood ratios (LLRs) yielded by an x-vector-based speaker recognition system. Although the task was given to lay listeners rather than forensic phonetic experts, the results appear to be congruent with the scores yielded by a state-of-the-art automatic system. The human raters were more accurate in judging same-speaker pairs than different-speaker pairs. The automatic system outperformed the human listeners in both conditions tested. Overall, the voices that were difficult for human listeners differed from those that the ASR system struggled with.
The experimental data and code are available in the following OSF repository: https://osf.io/kxu6v/.
Article type: Original research article
Author affiliations: Universität Trier, Germany
Published: 10.12.2025
Received: 30.06.2025
Accepted: 23.10.2025
Article status: Open
License: CC BY-NC-ND
Publication languages: English, Polish