Maciej Brzeski
Schedae Informaticae, Volume 32, 2023, pp. 9 - 25
https://doi.org/10.4467/20838476SI.23.001.19323
Maintaining data warehouses and ETL processes is becoming increasingly difficult. For this reason, we introduce a similarity measure on ETL processes based on the edit distance of the graph that models the process. We show both how to calculate it exactly and heuristic approaches that estimate the similarity more quickly. We propose methods to improve the graph edit distance based on the assumption that the ETL process model is a directed acyclic graph.
Maciej Brzeski
Schedae Informaticae, Volume 27, 2018, pp. 19 - 30
https://doi.org/10.4467/20838476SI.18.002.10407
We investigate the performance of gradient descent optimization (GR) applied to the traffic signal setting problem and compare it to genetic algorithms. We used neural networks as metamodels evaluating the quality of signal settings and discovered that both optimization methods produce similar results. For instance, in both cases the accuracy of neural networks close to local optima depends on the activation function (e.g., TANH activation makes the optimization process converge to different minima than ReLU activation).
Maciej Brzeski
Schedae Informaticae, Volume 25, 2016, pp. 117 - 126
https://doi.org/10.4467/20838476SI.16.009.6190
Robust mixture model approaches, which use non-normal distributions, have recently been upgraded to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method that allows groups to be treated separately and parameters to be estimated in each cluster individually. Consequently, we introduce an effective clustering algorithm that deals with non-normal data.