
Volume 23

2014

Publication date: 14.04.2015

Licence: None

Issue content

Jerzy Ombach

Schedae Informaticae, Volume 23, 2014, pp. 9-20

https://doi.org/10.4467/20838476SI.14.001.3018

We present some typical algorithms used for finding the global minimum/maximum of a function defined on a compact finite-dimensional set, discuss commonly used procedures for assessing and comparing the algorithms’ performance, and quote theoretical results on the convergence of a broad class of stochastic algorithms.
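As a flavor of the class of algorithms such surveys cover, below is a minimal sketch of pure random search, one of the simplest stochastic global optimizers on a compact box; the Rastrigin test function and the bounds are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pure_random_search(f, bounds, n_iter=10_000, rng=None):
    """Pure random search on a compact box: sample points uniformly and
    keep the best one seen. For continuous f this converges in
    probability to the global minimum as n_iter grows."""
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = np.asarray(bounds, dtype=float).T  # bounds: (low, high) per dimension
    best_x, best_f = None, np.inf
    for _ in range(n_iter):
        x = rng.uniform(lo, hi)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Illustrative run: minimize the Rastrigin function on [-5.12, 5.12]^2
# (an assumed test problem; its global minimum is 0 at the origin).
rastrigin = lambda x: 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))
x_best, f_best = pure_random_search(rastrigin, [(-5.12, 5.12)] * 2)
```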


Michał Woźniak, Andrzej Kasprzak

Schedae Informaticae, Volume 23, 2014, pp. 21-32

https://doi.org/10.4467/20838476SI.14.002.3019

For contemporary business, the crucial factor is making smart decisions on the basis of the knowledge hidden in stored data. Unfortunately, traditional simple methods of data analysis are not sufficient for the efficient management of modern enterprises, because they are not appropriate for the huge and growing amount of stored data. Additionally, data usually arrive continuously in the form of so-called data streams. The great disadvantage of traditional classification methods is that they assume the statistical properties of the discovered concept to remain unchanged, while in real situations we may observe so-called concept drift, caused by changes in the class probabilities and/or the class-conditional probability distributions. The ability to take new training data into account is an important feature of machine learning methods used in security applications (spam filtering or intrusion detection) or in decision support systems for marketing departments, which need to follow changing client behavior. Unfortunately, the occurrence of concept drift dramatically decreases classification accuracy. This work presents a comprehensive study of the ensemble classifier approach applied to the problem of drifted data streams. In particular, it reports research on modifications of the previously developed Weighted Aging Classifier Ensemble (WAE) algorithm, which is able to construct a valuable classifier ensemble for the classification of incrementally drifting stream data. We generalize the WAE method and propose a general framework for this approach. The framework can prune a classifier ensemble before or after assigning weights to individual classifiers. Additionally, we propose new classifier pruning criteria, weight calculation methods, and aging operators. We also propose a rejuvenating operator, which is able to soften the aging effect; this can be useful especially when fairly "old" classifiers are high-quality models, i.e., their presence increases ensemble accuracy, as may happen, e.g., in the case of recurring concept drift. The chosen characteristics of the proposed framework were evaluated in a wide range of computer experiments carried out on two benchmark data streams. The obtained results confirmed the usability of the proposed method for data stream classification in the presence of incremental concept drift.
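To make the weighting/aging/pruning vocabulary concrete, here is a minimal sketch of a chunk-based ensemble with accuracy-based weights, a multiplicative aging operator, and prune-after-weighting. It illustrates the general framework only, not the authors' exact WAE algorithm; the base learner, decay factor, and ensemble size are assumptions.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

class AgingEnsembleSketch:
    """Chunk-based ensemble sketch: weight members by chunk accuracy,
    age the weights multiplicatively, prune to the best max_size.
    Illustrative only; not the exact WAE algorithm from the paper."""

    def __init__(self, base=None, max_size=10, decay=0.9):
        self.base = base if base is not None else DecisionTreeClassifier()
        self.max_size, self.decay = max_size, decay
        self.models, self.weights = [], []

    def partial_fit(self, X, y):
        # Aging operator (assumed multiplicative): old members lose weight.
        self.weights = [w * self.decay for w in self.weights]
        # Train a new member on the incoming chunk; weight = chunk accuracy.
        model = clone(self.base).fit(X, y)
        self.models.append(model)
        self.weights.append(model.score(X, y))
        # Prune after weighting: keep the max_size highest-weight members.
        keep = np.argsort(self.weights)[-self.max_size:]
        self.models = [self.models[i] for i in keep]
        self.weights = [self.weights[i] for i in keep]
        return self

    def predict(self, X):
        # Weighted vote: each member adds its weight to its predicted class.
        preds = np.array([m.predict(X) for m in self.models])
        w = np.array(self.weights)[:, None]
        classes = np.unique(preds)
        scores = np.array([((preds == c) * w).sum(axis=0) for c in classes])
        return classes[scores.argmax(axis=0)]
```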


Łukasz Dębowski

Schedae Informaticae, Volume 23, 2014, pp. 33-44

https://doi.org/10.4467/20838476SI.14.003.3020

We review three mathematical developments linked with Hilberg’s conjecture – a hypothesis about the power-law growth of entropy of texts in natural language, which sets up a challenge for machine learning. First, considerations concerning maximal repetition indicate that universal codes such as the Lempel-Ziv code may fail to efficiently compress sources that satisfy Hilberg’s conjecture. Second, Hilberg’s conjecture implies the empirically observed power-law growth of vocabulary in texts. Third, Hilberg’s conjecture can be explained by the hypothesis that texts consistently describe an infinite random object.
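A hedged sketch of the second point, the empirically observed power-law growth of vocabulary: count the distinct words among the first n tokens and fit V(n) ≈ C·n^β in log-log coordinates. The whitespace tokenization and least-squares fit are illustrative assumptions.

```python
import numpy as np

def vocabulary_growth(tokens):
    """V(n): number of distinct tokens among the first n tokens."""
    seen, growth = set(), []
    for t in tokens:
        seen.add(t)
        growth.append(len(seen))
    return np.array(growth, dtype=float)

def fit_power_law(growth):
    """Fit V(n) ~ C * n**beta by least squares in log-log coordinates."""
    n = np.arange(1, len(growth) + 1, dtype=float)
    beta, log_c = np.polyfit(np.log(n), np.log(growth), 1)
    return np.exp(log_c), beta

# Illustrative usage on a whitespace-tokenized text (an assumed corpus file):
# C, beta = fit_power_law(vocabulary_growth(open("corpus.txt").read().split()))
```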


Bernhard C. Geiger

Schedae Informaticae, Volume 23, 2014, pp. 45-56

https://doi.org/10.4467/20838476SI.14.004.3021

Consider the problem of approximating a Markov chain by another Markov chain with a smaller state space that is obtained by partitioning the original state space. An information-theoretic cost function is proposed that is based on the relative entropy rate between the original Markov chain and a Markov chain defined by the partition. The state space aggregation problem can be sub-optimally solved by using the information bottleneck method.
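As one way to make such a cost function concrete, the sketch below computes a relative-entropy-rate-style penalty for a candidate partition: aggregate the chain over the blocks, lift it back to the original state space (here by redistributing within a block proportionally to the stationary distribution, an assumed lifting rather than necessarily the paper's construction), and measure the divergence rate against the original chain. An irreducible chain is assumed.

```python
import numpy as np

def stationary(P):
    """Stationary distribution of a row-stochastic matrix P."""
    vals, vecs = np.linalg.eig(P.T)
    mu = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return mu / mu.sum()

def aggregation_cost(P, part):
    """Relative-entropy-rate-style cost of the partition `part`
    (part[i] = block label of state i). Assumes an irreducible chain,
    so mu > 0 everywhere; the within-block lifting by the stationary
    distribution is an illustrative choice."""
    mu = stationary(P)
    blocks = list(np.unique(part))
    idx = {b: k for k, b in enumerate(blocks)}
    # Aggregated transition matrix Q over blocks.
    Q = np.zeros((len(blocks), len(blocks)))
    for a in blocks:
        wa = mu[part == a]
        for b in blocks:
            Q[idx[a], idx[b]] = wa @ P[np.ix_(part == a, part == b)].sum(axis=1) / wa.sum()
    block_mass = {b: mu[part == b].sum() for b in blocks}
    # D(P || P_hat), where P_hat lifts Q back onto the original states.
    cost = 0.0
    for i in range(len(P)):
        for j in range(len(P)):
            if P[i, j] > 0:
                p_hat = Q[idx[part[i]], idx[part[j]]] * mu[j] / block_mass[part[j]]
                cost += mu[i] * P[i, j] * np.log(P[i, j] / p_hat)
    return cost

# Illustrative usage: a 3-state chain aggregated into 2 blocks.
P = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
print(aggregation_cost(P, np.array([0, 0, 1])))
```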


Rafał Józefowicz, Wojciech Marian Czarnecki

Schedae Informaticae, Volume 23, 2014, pp. 57-67

https://doi.org/10.4467/20838476SI.14.005.3022

Multithreshold Entropy Linear Classifier (MELC) is a density-based model which searches for a linear projection maximizing the Cauchy-Schwarz divergence of the dataset's kernel density estimation. Despite its good empirical results, one of its drawbacks is its optimization speed. In this paper we analyze how one can speed it up by solving an approximate problem. We analyze two methods, both similar to approximate schemes for kernel density estimation querying, and provide adaptive schemes for selecting the crucial parameters based on a user-specified acceptable error. Furthermore, we show how one can exploit the well-known conjugate gradient and L-BFGS optimizers even though the original optimization problem should be solved on the sphere. All the above methods and modifications are tested on 10 real-life datasets from the UCI repository to confirm their practical usability.
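The sphere point in the abstract can be illustrated with a simple reparametrization: optimize an unconstrained vector w but evaluate the objective at w/||w||, which makes the objective scale-invariant and lets a stock optimizer such as SciPy's L-BFGS-B run unchanged. The toy objective below stands in for the actual MELC criterion and is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def spherical(objective):
    """Make a sphere-constrained objective usable by an unconstrained
    optimizer: evaluate at w / ||w||, so the value is scale-invariant
    and the constraint ||v|| = 1 can be dropped."""
    return lambda w: objective(w / np.linalg.norm(w))

def toy_objective(v, X0, X1):
    # Stand-in for the MELC criterion (an assumption): negative squared
    # separation of the class means along the unit projection v.
    d = X0.mean(axis=0) - X1.mean(axis=0)
    return -(d @ v) ** 2

rng = np.random.default_rng(0)
X0, X1 = rng.normal(0.0, 1.0, (50, 5)), rng.normal(1.0, 1.0, (50, 5))
res = minimize(spherical(lambda v: toy_objective(v, X0, X1)),
               x0=rng.normal(size=5), method="L-BFGS-B")
v_opt = res.x / np.linalg.norm(res.x)  # project the solution back onto the sphere
```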
