TY  - JOUR
TI  - Hilberg’s Conjecture – a Challenge for Machine Learning
AU  - Dębowski, Łukasz
AB  - We review three mathematical developments linked with Hilberg’s conjecture – a hypothesis about the power-law growth of entropy of texts in natural language, which sets up a challenge for machine learning. First, considerations concerning maximal repetition indicate that universal codes such as the Lempel-Ziv code may fail to efficiently compress sources that satisfy Hilberg’s conjecture. Second, Hilberg’s conjecture implies the empirically observed power-law growth of vocabulary in texts. Third, Hilberg’s conjecture can be explained by a hypothesis that texts describe consistently an infinite random object.
VL  - 2014
IS  - Volume 23
PY  - 2015
SN  - 1732-3916
C1  - 2083-8476
SP  - 33
EP  - 44
DO  - 10.4467/20838476SI.14.003.3020
UR  - https://ejournals.eu/en/journal/schedae-informaticae/article/hilbergs-conjecture-a-challenge-for-machine-learning
KW  - statistical language modeling
KW  - Hilberg’s conjecture
KW  - maximal repetition
KW  - grammar-based codes
KW  - Santa Fe processes
ER  - 