TY  - JOUR
TI  - Native vs. non-native English: data-driven lexical analysis
AU  - Witalisz, Ewa
AU  - Leśniewska, Justyna
AB  - This article presents a preliminary, data-driven study of a corpus of texts written by advanced Polish learners of English, which were analysed with reference to a baseline corpus of native-speaker texts. The texts included in both corpora were produced in similar circumstances (classroom setting), with the same time and word limit, and in response to the same task. We conducted a comparative lexical analysis of the two corpora, using corpus methodology (word lists, cluster analysis, concordances, keyness) to identify the most significant differences. The most important conclusion from this study is that advanced foreign language use may differ from native-speaker language use in ways which only become visible in larger samples of language, and the differences, if analysed individually, would not be regarded as errors and would go unnoticed. There is some evidence in the study that some of these differences may be attributed to cross-linguistic influence.
VL  - 129
IS  - 2
PY  - 2012
SN  - 1897-1059
C1  - 2083-4624
SP  - 127
EP  - 137
DO  - 10.4467/20834624SL.12.009.0598
UR  - https://ejournals.eu/en/journal/studia-linguistica-uic/article/native-vs-non-native-english-data-driven-lexical-analysis
KW  - advanced EFL use
KW  - corpus analysis of learner language
KW  - lexical features of L2 writing
ER  - 