Adam Roman
Schedae Informaticae, Volume 32, 2023, pp. 9 - 25
https://doi.org/10.4467/20838476SI.23.001.19323
Maintaining data warehouses and ETL processes is becoming increasingly difficult. For this reason, we introduce a similarity measure on ETL processes, based on the edit distance of the graph that models the process. We show both how to calculate it exactly and heuristic approaches that estimate the similarity more quickly. We propose methods that improve the graph edit distance by exploiting the assumption that the ETL process model is a directed acyclic graph.
Adam Roman
Schedae Informaticae, Volume 19, 2010, pp. 35 - 52
This work is motivated by the Černý Conjecture, an old unsolved problem in automata theory. We describe experiments on synchronizing automata that have led us to two interesting results. The first is that the size of an automaton's alphabet may play an important role in synchronization: we have found a 5-state automaton over a 3-letter alphabet which attains the upper bound from the Černý Conjecture, while there is no such automaton (except the Černý automaton C5) over a binary alphabet. The second result emerging from the experiments is a theorem describing the dependencies between the automaton structure S, expressed in terms of the so-called merging system, and the maximal length of all minimal synchronizing words for automata of type S.
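To make the notion of a minimal synchronizing word concrete, the following is a minimal sketch, not the experimental setup of the paper. It assumes the standard textbook definition of the Černý automaton C_n (letter a shifts states cyclically, letter b sends the last state to state 0 and fixes the others) and finds a shortest synchronizing word by breadth-first search over subsets of states. The function names are hypothetical. For C_n the shortest synchronizing word is known to have length (n - 1)^2, which is the bound stated in the conjecture.

```python
from collections import deque


def cerny_automaton(n):
    """Transition tables of the Cerny automaton C_n (assumed standard form)."""
    a = tuple((i + 1) % n for i in range(n))             # cyclic shift
    b = tuple(0 if i == n - 1 else i for i in range(n))  # merges n-1 into 0
    return {"a": a, "b": b}


def shortest_synchronizing_word(transitions, n):
    """Breadth-first search on the power-set automaton.

    Starts from the set of all states and applies letters until the image
    is a single state. BFS guarantees the returned word is a shortest one.
    Returns None if the automaton is not synchronizing.
    """
    start = frozenset(range(n))
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word
        for letter, delta in transitions.items():
            image = frozenset(delta[q] for q in subset)
            if image not in seen:
                seen.add(image)
                queue.append((image, word + letter))
    return None


def apply_word(transitions, n, word):
    """Image of the full state set under the given word."""
    states = set(range(n))
    for letter in word:
        states = {transitions[letter][q] for q in states}
    return states


if __name__ == "__main__":
    for n in range(2, 6):
        word = shortest_synchronizing_word(cerny_automaton(n), n)
        print(n, len(word), word)
```

The subset search is exponential in the number of states, so it is only practical for small automata such as the 5-state examples mentioned in the abstract.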
Adam Roman
Schedae Informaticae, Volume 25, 2016, pp. 227 - 236
https://doi.org/10.4467/20838476SI.16.017.6198
Mutation testing is considered one of the most effective quality improvement techniques; it assesses the strength of the actual test suite. If no test is able to kill a given mutant, the tests are not strong enough and we need to write an additional one that will be able to kill this mutant. However, mutation testing is very time-consuming. In this paper we investigate whether it is possible to reduce the scope of the mutation analysis by running it only on the new or changed part of the code. Using data from real open-source projects, we analyze whether there is a relation between mutation scope reduction and the effectiveness of the mutation analysis.
Adam Roman
Schedae Informaticae, Volume 20, 2011, pp. 137 - 159
https://doi.org/10.4467/20838476SI.11.007.0293
This paper presents a new combinatorial problem which emerged from studies on an artificial intelligence classification model, the hierarchical classifier. We introduce the notion of proper clustering and show how to count proper clusterings in the special case when 3 clusters are allowed. An algorithm that generates all clusterings is given. We also show that the proposed approach can be generalized to any number of clusters and can be automated. Finally, we show the relationship between the problem of counting clusterings and the Dedekind problem.