Algorithms for Predicting Structured Data (Tutorial Proposal)

Thomas Gärtner¹ and Shankar Vembu²

¹ Fraunhofer IAIS, Sankt Augustin, Germany
² University of Illinois at Urbana-Champaign, USA
{thomas.gaertner,shankar.vembu}@gmail.com

1 Overview

Structured prediction is the problem of predicting multiple outputs with complex internal structure and dependencies among them. Algorithms (and models) for predicting structured data have existed since the mid-1980s. For example, recurrent neural networks and hidden Markov models have long been used for temporal pattern recognition problems. With the introduction of support vector machines in the 1990s, the machine learning community has witnessed a growing interest in discriminative learning models. In this tutorial, we plan to cover recent developments in discriminative learning algorithms for predicting structured data.
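To make the discriminative setting concrete, the sketch below shows a minimal structured perceptron in Python, in the spirit of the structured perceptron of Collins (listed in the references). The names phi, candidates, dim, and data are hypothetical placeholders for a user-supplied joint feature map, a feasible-output enumerator, the feature dimension, and the training pairs; inference here is naive enumeration, whereas practical implementations use dynamic programming or other combinatorial search.

    import numpy as np

    def predict(w, phi, candidates, x):
        # Inference: pick the candidate output with the highest joint score <w, phi(x, y)>.
        return max(candidates(x), key=lambda y: float(w @ phi(x, y)))

    def structured_perceptron(data, phi, candidates, dim, epochs=10):
        # data: iterable of (x, y) pairs with structured outputs y.
        # phi(x, y): joint feature map returning a NumPy vector of length dim.
        # candidates(x): enumerates the feasible outputs for input x.
        w = np.zeros(dim)
        for _ in range(epochs):
            for x, y in data:
                y_hat = predict(w, phi, candidates, x)
                if y_hat != y:
                    # Additive update: move toward the correct structure, away from the prediction.
                    w += phi(x, y) - phi(x, y_hat)
        return w

The same template of an argmax over a joint scoring function underlies the other discriminative methods covered in the tutorial.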

2 Target Audience

We believe this tutorial will be useful to machine learning researchers, including graduate students, who would like to gain a deeper understanding of structured prediction and of state-of-the-art approaches to solving this problem. Structured prediction has several applications in the areas of natural language processing, computer vision, and computational biology, to name a few. We firmly believe that the material presented in this tutorial will reach a broad spectrum of researchers working in the aforementioned application areas. The tutorial is mostly self-contained (see the outline below). Anybody who has taken a graduate-level course in machine learning should be able to follow the material without any difficulties.

3 Tutorial Outline

I. Basics: Generative versus discriminative learning – Loss functions – Perceptron – Logistic regression – Support vector machines

II. Algorithms: Problem setting – Loss functions revisited – Structured perceptron – Conditional random fields – Large margin methods (M3Ns, SVMs; see the formulation sketched after this outline) – Joint kernel maps – Search-based models

III. Advanced Topics: Exact and approximate inference – Predicting combinatorial structures – Generalization bounds
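As an illustration of the large margin methods in Part II, the following is a sketch of the standard margin-rescaling formulation of structured SVMs (cf. the Taskar et al. and Tsochantaridis et al. entries in the references); the joint feature map \phi, the structured loss \Delta, and the regularization constant C are generic symbols introduced here for exposition:

    \min_{w,\;\xi \ge 0} \ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
    \quad \text{subject to} \quad
    \langle w, \phi(x_i, y_i) \rangle - \langle w, \phi(x_i, y) \rangle \ \ge\ \Delta(y_i, y) - \xi_i
    \quad \text{for all } i \text{ and all } y \neq y_i,

with prediction \hat{y}(x) = \operatorname{argmax}_{y} \langle w, \phi(x, y) \rangle. The exponential number of constraints is one reason why exact and approximate inference (Part III) is central to training such models.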


4 Information on the Presenters

Thomas Gärtner is the head of the research group on “Computational Aspects of Mining and Learning” at the University of Bonn and lead scientist for machine learning at the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS. He was recently admitted to the Emmy Noether program of the German Research Foundation. He has published extensively in the areas of kernel methods and structured data in general. An adapted version of his thesis is now the first monograph on kernel methods for structured data. He co-chaired a workshop on Mining and Learning with Graphs in 2006 and has given tutorials as well as invited talks at premier venues such as the International Conference on Machine Learning. He also served as a guest editor for the special issue on Mining and Learning with Graphs of the Machine Learning Journal.

Shankar Vembu is a post-doctoral researcher in the Cognitive Computation Group at the University of Illinois at Urbana-Champaign. He recently finished his PhD studies with a thesis on structured prediction in the Knowledge Discovery and Machine Learning Lab at the University of Bonn, under the supervision of Prof. Dr. Stefan Wrobel and Dr. Thomas Gärtner. His research interests are in machine learning algorithms, particularly in the areas of structured prediction, graph-based learning, and kernel methods. His research on structured prediction was selected as one of the top papers at ECML PKDD 2009, leading to a fast-track publication in the Machine Learning Journal.

References

1. Yasemin Altun. Discriminative methods for label sequence learning. PhD thesis, Department of Computer Science, Brown University, 2005.
2. Gökhan H. Bakir, Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola, Ben Taskar, and S.V.N. Vishwanathan. Predicting structured data. MIT Press, Cambridge, Massachusetts, USA, 2007.
3. Michael Collins. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2002.
4. Hal Daumé III. Practical structured learning techniques for natural language processing. PhD thesis, University of Southern California, 2006.
5. Thomas Finley and Thorsten Joachims. Training structural SVMs when exact inference is intractable. In Proceedings of the 25th International Conference on Machine Learning, 2008.
6. Thomas Gärtner. Kernels for structured data. PhD thesis, Universität Bonn, 2005.
7. Thomas Gärtner and Shankar Vembu. On structured output training: Hard cases and an efficient alternative. Machine Learning Journal (Special Issue of ECML PKDD), 76(2):227–242, 2009.
8. Alex Kulesza and Fernando Pereira. Structured learning with approximate inference. In Advances in Neural Information Processing Systems 20, 2007.
9. John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning, 2001.


10. Ben Taskar, Carlos Guestrin, and Daphne Koller. Max-margin Markov networks. In Advances in Neural Information Processing Systems 16, 2003.
11. Ben Taskar. Learning structured prediction models: A large margin approach. PhD thesis, Stanford University, 2004.
12. Ben Taskar, Vassil Chatalbashev, Daphne Koller, and Carlos Guestrin. Learning structured prediction models: A large margin approach. In Proceedings of the 22nd International Conference on Machine Learning, 2005.
13. Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6:1453–1484, 2005.
14. Shankar Vembu, Thomas Gärtner, and Mario Boley. Probabilistic structured predictors. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, 2009.
15. Shankar Vembu. Learning to predict combinatorial structures. PhD thesis, Universität Bonn, 2009 (submitted).
16. Shankar Vembu and Thomas Gärtner. Label ranking algorithms: A survey. In Johannes Fürnkranz and Eyke Hüllermeier, editors, Preference Learning, Springer-Verlag, 2010. (to appear)
17. Jason Weston, Olivier Chapelle, André Elisseeff, Bernhard Schölkopf, and Vladimir Vapnik. Kernel dependency estimation. In Advances in Neural Information Processing Systems 15, 2002.