FAQ
logo of Jagiellonian University in Krakow

Extreme classification under limited space and time budget

Publication date: 24.03.2017

Schedae Informaticae, 2016, Volume 25, pp. 9 - 23

https://doi.org/10.4467/20838476SI.16.001.6182

Authors

,
Kalina Jasińska
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
All publications →
,
Krzysztof Dembczyński
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
All publications →
Nikos Karampatziakis
Microsoft Research, Redmond, USA
All publications →

Titles

Extreme classification under limited space and time budget

Abstract

We discuss a new framework for solving extreme classification (i.e., learning problems with an extremely large label space), in which we reduce the original problem to a structured prediction problem. Thanks to this we can obtain learning algorithms that work under a strict time and space budget. We mainly focus on a recently introduced algorithm, referred to as LTLS, which is to our best knowledge the first truly logarithmic time and space (in the number of labels) method for extreme classification. We compare this algorithm with two other approaches that also rely on transformation to structured prediction problems. The first algorithm encodes original labels as binary sequences. The second algorithm follows the label tree approach. The comparison shows the trade-off between computational complexity (in time and space) and predictive performance.

References

[1] Viterbi A.J., Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967.

[2] Seshadri N., Sundberg C.E.W., List viterbi decoding algorithms with applications.IEEE Transactions on Communications, 1994.

[3] Doppa J., Fern A., Tadepalli P., Hc-search: Learning heuristics and cost functions for structured prediction. In: Journal of Artificial Intelligence Research (JAIR), 2013.

[4] Jasinska K., Karampatziakis N., Log-time and log-space extreme classication. In: Workshop on Extreme Classification at Neural Information Processing Systems (NIPS), 2016. 21

[5] Dembczynski K., Kotłowski W., Waegeman W., Busa-Fekete R., Hüllermeier E.,Consistency of probabilistic classifier trees. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD). Springer-Verlag 2016.

[6] Jasinska K., Dembczynski K., Busa-Fekete R., Pfannschmidt K., Klerx T., Hüllermeier E., Extreme F-measure maximization using sparse probability estimates. In: International Confernece on Machine Learning (ICML), 2016.

[7] Yen I.E.H., Huang X., Ravikumar P., Zhong K., Dhillon I., Pd-sparse: A primal and dual sparse approach to extreme multiclass and multilabel classification. In: International Conference on Machine Learning (ICML), 2016.

[8] Bhatia K., Jain H., Kar P., Varma M., Jain P., Sparse local embeddings for extreme multi-label classification. In: Neural Information Processing Systems (NIPS), 2015.

[9] Yu H., Jain P., Kar P., Dhillon I.S., Large-scale multi-label learning with missing labels. In: International Conference on Machine Learning (ICML), 2014.

[10] Weston J., Bengio S., Usunier N., Wsabie: Scaling up to large vocabulary image annotation. In: International Joint Conference on Artificial Intelligence (IJCAI), 2011.

[11] Mineiro P., Karampatziakis N., Fast label embeddings via randomized linear algebra. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2015.

[12] Prabhu Y., Varma M., FastXML: A fast, accurate and stable tree-classifier for extreme multi-label learning. In: Knowledge Discovery and Data Mining (KDD), 2014.

[13] Choromanska A., Langford J., Logarithmic time online multiclass prediction. In: Neural Information Processing Systems (NIPS), 2015.

[14] Niculescu-Mizil A., Abbasnejad E., Label filters for large scale multilabel classification. In: Workshop on Extreme Classification at the International Confernece on Machine Learning (ICML), 2015.

[15] Weston J., Makadia A., Yee H., Label partitioning for sublinear ranking. In: International Conference on Machine Learning (ICML), 2013.

[16] Cissé M., Usunier N., Artières T., Gallinari P., Robust bloom filters for large multilabel classification tasks. In: Neural Information Processing Systems (NIPS), 2013.

[17] Jasinska,K., Dembczynski K., Consistent label tree classifiers for extreme multilabel classification. In: Workshop on Extreme Classification at the International Confernece on Machine Learning (ICML), 2015.

[18] Vijayanarasimhan S., Shlens J., Monga R., Yagnik J., Deep networks with largeb output spaces. In: Workshop contribution at International Conference on Learning Representation (ICLR), 2014. 22

[19] Shrivastava A., Li P., Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (mips). In: Uncertainty in Artificial Intelligence (UAI), 2015.

[20] Fagin R., Lotem A., Naor M., Optimal aggregation algorithms for middleware. In: Principles of Database Systems (PODS). ACM 2001.

[21] Stock M., Pahikkala T., Airola A., De Baets B., Waegeman, W., Efficient pairwise learning using kernel ridge regression: an exact two-step method. Computing Research Repository (CoRR), 2016.

[22] Indyk P., Motwani R., Approximate nearest neighbors: Towards removing the curse of dimensionality. In: ACM Symposium on Theory of Computing, 1998.

[23] Beygelzimer A., Langford J., Zadrozny B., Machine learning techniques-reductions between prediction quality metrics. In: Performance Modeling and Engineering. Springer-Verlag 2008.

[24] Beygelzimer A., Daumé H., Langford J., Mineiro P., Learning reductions that really work. In: Proceedings of the IEEE, 2016.

[25] Dietterich T., Bakiri G., Solving multiclass learning problems via error-correcting output codes. Journal of Machine Learning Research (JMLR), 1996.

[26] Huffman D., A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers (IRE), 1952.

[27] Weinberger K., Dasgupta A., Langford J., Smola A., Attenberg J., Feature hashing for large scale multitask learning. In: International Conference on Machine Learning (ICML), ACM, 2009.

[28] Dembczynski K., Waegeman W., Cheng W., Hüllermeier E., An analysis of chaining in multi-label classification. In: European Conference on Artificial Intelligence (ECAI), 2012.

[29] Mena D., Montanes E., Quevedo J.R., del Coz J.J., Using a* for inference in probabilistic classifier chains. In: International Joint Conference on Artificial Intelligence (IJCAI), 2015.

[30] Beygelzimer A., Langford J., Lifshits Y., Sorkin G.B., Strehl A.L., Conditional probability tree estimation analysis and algorithms. In: Uncertainty in Artificia lIntelligence (UAI), 2009.

[31] Fox J., Applied regression analysis, linear models, and related methods. Sage, 1997.

[32] Morin F., Bengio Y., Hierarchical probabilistic neural network language model. In: Artificial Intelligence and Statistics Conference (AISTATS), 2005, pp. 246–252.

[33] Lafferty J.D., McCallum A., Pereira F.C.N., Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: International Conference on Machine Learning (ICML), 2001. 23

[34] Tsochantaridis Y., Joachims T., Hofmann T., Altun Y., Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research (JMLR), 2005.

[35] Langford J., Strehl A., Li L., Vowpal wabbit, 2007 http://mloss.org/software/ view/53/.

Information

Information: Schedae Informaticae, 2016, Volume 25, pp. 9 - 23

Article type: Original article

Titles:

Polish:

Extreme classification under limited space and time budget

English:

Extreme classification under limited space and time budget

Authors

Institute of Computing Science, Poznan University of Technology, Poznan, Poland

Institute of Computing Science, Poznan University of Technology, Poznan, Poland

Microsoft Research, Redmond, USA

Published at: 24.03.2017

Article status: Open

Licence: None

Percentage share of authors:

Kalina Jasińska (Author) - 33%
Krzysztof Dembczyński (Author) - 33%
Nikos Karampatziakis (Author) - 34%

Article corrections:

-

Publication languages:

English

View count: 3190

Number of downloads: 1846

<p> Extreme classification under limited space and time budget</p>