A note on Levenshtein distance versus human analysis

Publication date: 10.12.2011

Studia Linguistica Universitatis Iagellonicae Cracoviensis, 2011, Volume 128, Issue 1, pp. 155-160

https://doi.org/10.2478/v10148-011-0020-6

Authors

Kamil Stachowski
Jagiellonian University in Kraków, Gołębia 24, 31-007 Kraków, Poland
ORCID: https://orcid.org/0000-0002-5909-035X

Abstract

This paper argues that automatic phonetic comparison will only return true results if the languages in question have similar and comparably lenient phonologies. Where their phonologies are incompatible and/or restrictive, linguistic knowledge of both is necessary to obtain results matching human perception. Although the case is exemplified mainly by Levenshtein distance and Russian loanwords in Dolgan, the conclusion also applies to the approach as a whole.
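
For readers unfamiliar with the measure named in the abstract, the following is a minimal sketch of plain, unit-cost Levenshtein distance. It is not taken from the paper: the function names `levenshtein` and `normalized` are hypothetical, the example strings are ordinary English words rather than data from the article, and the paper may use a differently weighted or normalised variant.

```python
def levenshtein(source: str, target: str) -> int:
    """Unit-cost edit distance: every insertion, deletion or
    substitution of a single symbol costs 1 (an assumption of this sketch)."""
    # previous[j] = distance between the processed prefix of source
    # and the first j symbols of target
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # deleting the first i symbols of source
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def normalized(source: str, target: str) -> float:
    """Distance divided by the longer string's length (one common choice)."""
    longest = max(len(source), len(target))
    return levenshtein(source, target) / longest if longest else 0.0


print(levenshtein("kitten", "sitting"))  # 3
```

In this sketch every substitution costs the same, however similar or dissimilar the two sounds are to a speaker of either language. That is one way in which a raw score of this kind can diverge from the human judgement the abstract refers to.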

Information

Article type: Original article

Article status: Open

Licence: None

Percentage share of authors:

Kamil Stachowski (Author) - 100%

Article corrections:

-

Publication languages:

English
