Przemysław Spurek
Schedae Informaticae, Volume 24, 2015, pp. 31-40
https://doi.org/10.4467/20838476SI.15.003.3025
This work presents a step-by-step tutorial on using cross-entropy clustering for iris segmentation. We present the detailed construction of a Gaussian model suited to iris images, which is the novelty of the proposed approach. The results are promising: both the pupil and the iris are extracted properly, and all the information necessary for human identification and verification can be extracted from the detected parts of the iris.
Przemysław Spurek
Schedae Informaticae, Volume 24, 2015, pp. 21-29
https://doi.org/10.4467/20838476SI.15.002.3024
This paper presents a novel global thresholding algorithm for the binarization of documents and gray-scale images using Cross-Entropy Clustering. In the first step, a gray-level histogram is constructed and Gaussian densities are fitted to it. The thresholds are then determined as the cross-points of the Gaussian densities. This approach automatically detects the number of components; only an upper limit on the number of Gaussian densities is required.
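The cross-point step can be illustrated with a minimal sketch. It assumes two Gaussian components have already been fitted to the histogram (the fitting by Cross-Entropy Clustering is not shown), and the function names `weighted_pdf`, `gaussian_cross_points` and `threshold_between` are hypothetical, chosen only for this example. Equating the two weighted densities and taking logarithms gives a quadratic equation in the gray level x.

```python
import math


def weighted_pdf(x, weight, mean, std):
    """Weighted one-dimensional Gaussian density at x."""
    z = (x - mean) / std
    return weight * math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gaussian_cross_points(w1, m1, s1, w2, m2, s2):
    """Solve w1*N(x; m1, s1) = w2*N(x; m2, s2) for x.

    Taking logs of both sides yields a*x^2 + b*x + c = 0.
    Returns a sorted list with zero, one or two real solutions.
    """
    a = 1.0 / (2.0 * s2 ** 2) - 1.0 / (2.0 * s1 ** 2)
    b = m1 / s1 ** 2 - m2 / s2 ** 2
    c = (m2 ** 2 / (2.0 * s2 ** 2) - m1 ** 2 / (2.0 * s1 ** 2)
         + math.log((w1 * s2) / (w2 * s1)))
    if abs(a) < 1e-12:  # equal variances: the equation is linear
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def threshold_between(w1, m1, s1, w2, m2, s2):
    """Pick the cross-point lying between the two means, if any."""
    low, high = sorted((m1, m2))
    inside = [x for x in gaussian_cross_points(w1, m1, s1, w2, m2, s2)
              if low <= x <= high]
    return inside[0] if inside else None


if __name__ == "__main__":
    # Illustrative parameters only: dark text around 50, light paper around 150.
    print(threshold_between(0.5, 50.0, 10.0, 0.5, 150.0, 20.0))
```

With more than two components, the same computation would be applied to each pair of neighbouring densities; that extension is an assumption of this sketch, not a detail taken from the abstract.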
Przemysław Spurek
Schedae Informaticae, Volume 28, 2019, pp. 25-47
https://doi.org/10.4467/20838476SI.19.002.14379
Independent Component Analysis (ICA) is a method for finding a linear transformation that minimizes the statistical dependence between its components. The most popular ICA methods, such as FastICA and JADE, maximize kurtosis as a measure of independence (non-Gaussianity). However, their assumptions about the fourth-order moment (kurtosis) may not always be satisfied in practice. One possible solution is to use the third-order moment (skewness) instead of kurtosis, as is done in ICA_SG and EcoICA. In this paper we present a competitive approach to ICA based on the Split Generalized Gaussian distribution (SGGD), which is well adapted to heavy-tailed as well as asymmetric data. Consequently, we obtain a method that works better than the classical approaches in both cases: heavy tails and non-symmetric data.
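For orientation, the two moment-based measures mentioned above can be computed as follows. This is only a sketch of the sample skewness (third standardized moment) and excess kurtosis (fourth standardized moment minus 3); it is not an implementation of FastICA, JADE, ICA_SG, EcoICA or the SGGD-based method, and the function names are hypothetical.

```python
def central_moment(values, order):
    """Sample central moment of the given order (biased, divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardized moment; zero for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; zero for a Gaussian."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


if __name__ == "__main__":
    symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
    asymmetric = [0.0, 0.0, 0.0, 0.0, 10.0]
    print(skewness(symmetric), excess_kurtosis(symmetric))
    print(skewness(asymmetric), excess_kurtosis(asymmetric))
```

Both quantities vanish for Gaussian data, which is why either can serve as a measure of non-Gaussianity; skewness reacts to asymmetry, kurtosis to tail weight.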
Przemysław Spurek
Schedae Informaticae, Volume 27, 2018, pp. 129-141
https://doi.org/10.4467/20838476SI.18.010.10415
In this paper we present a method for stitching aligned images that is given by a closed analytic formula.
It is obtained by choosing a statistically optimal global color change for one part of the image. Owing to its numerical efficiency, this approach is especially well suited to merging large numbers of satellite images into a single map.
Moreover, we present a solution to the general problem of finding an optimal shift v from V of the data Y, such that the dataset X, Y+v is maximally statistically consistent. We show that the solution is given in a closed analytic form.
Przemysław Spurek
Schedae Informaticae, Volume 27, 2018, pp. 69-79
https://doi.org/10.4467/20838476SI.18.006.10411
In this paper we discuss a class of autoencoder-based generative models that rely on a one-dimensional sliced approach. The idea is to reduce the discrimination between samples to the one-dimensional case.
Our experiments show that the methods can be divided into two groups. The first consists of methods that are modifications of standard normality tests, while the second is based on classical distances between samples.
It turns out that both groups yield correct generative models, but the second gives a slightly faster decrease of the Fréchet Inception Distance (FID).
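The sliced idea can be sketched with one well-known member of the distance-based group, the sliced Wasserstein distance. The choice of this particular distance is an assumption made for illustration; the abstract does not say which distances the paper evaluates, and the function names are hypothetical. Each sample is projected onto random unit directions, and the two projected samples are compared in one dimension, where the squared 2-Wasserstein distance between equal-size samples reduces to comparing sorted values.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sliced_wasserstein_sq(xs, ys, n_projections=50, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    xs and ys are equal-length lists of points (sequences of floats).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        d = random_unit_vector(dim, rng)
        # Reduce both samples to one dimension along direction d.
        px = sorted(sum(a * b for a, b in zip(x, d)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, d)) for y in ys)
        # In 1D the optimal coupling matches sorted values.
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / n_projections


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    shifted = [(x + 3.0, y) for x, y in sample]
    print(sliced_wasserstein_sq(sample, sample))   # identical samples
    print(sliced_wasserstein_sq(sample, shifted))  # shifted sample
```

In an autoencoder setting, one of the two samples would be the encoded data and the other a draw from the prior; that wiring is omitted here.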
Przemysław Spurek
Schedae Informaticae, Volume 24, 2015, pp. 133-142
https://doi.org/10.4467/20838476SI.15.013.3035
We present a new subspace clustering method called SuMC (Subspace Memory Clustering), which efficiently divides a dataset D ⊂ R^N into k ∈ N pairwise disjoint clusters of possibly different dimensions. Since our approach is based on memory compression, we do not need to specify the dimensions of the groups explicitly: we only need to specify the mean number of scalars used to describe a data point. In the case of one cluster our method reduces to the classical Karhunen-Loève (PCA) transform. We test our method on typical data from the UCI repository and on data from real-life experiments.
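The single-cluster case can be illustrated with a minimal sketch of the Karhunen-Loève (PCA) transform restricted to one dimension, so that each data point is described by a single scalar. This is not the SuMC algorithm; the use of power iteration and the function names are assumptions made to keep the example dependency-free.

```python
import math
import random


def leading_principal_axis(points, n_iter=200, seed=0):
    """Return (mean, unit axis) of the best-fitting line through the data.

    The axis is the leading eigenvector of the covariance matrix,
    approximated here by power iteration.
    """
    n, dim = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - mean[j] for j in range(dim)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # degenerate data: all points coincide
            break
        v = [c / norm for c in w]
    return mean, v


def compress(point, mean, axis):
    """Describe a point by one scalar: its coordinate along the axis."""
    return sum((p - m) * a for p, m, a in zip(point, mean, axis))


def decompress(coord, mean, axis):
    """Reconstruct the point from its single stored scalar."""
    return [m + coord * a for m, a in zip(mean, axis)]


if __name__ == "__main__":
    data = [(float(x), 2.0 * x + 1.0) for x in range(-5, 6)]
    mean, axis = leading_principal_axis(data)
    restored = decompress(compress(data[0], mean, axis), mean, axis)
    print(axis, restored)
```

Storing more scalars per point corresponds to keeping more principal axes; how that memory budget is shared between several clusters is the part that SuMC itself addresses and is not shown here.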
Przemysław Spurek
Schedae Informaticae, Volume 25, 2016, pp. 117-126
https://doi.org/10.4467/20838476SI.16.009.6190
Robust mixture model approaches that use non-normal distributions have recently been extended to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method that treats groups separately and estimates the parameters of each cluster individually. Consequently, we introduce an effective clustering algorithm that deals with non-normal data.