Vitae - Sebastian Peña Saldarriaga

Sebastián Peña Saldarriaga
LINA - TALN
University of Nantes - UFR Sciences
2 rue de la Houssinière - BP 92208
44322 Nantes Cedex 03 - France
Phone: +33 2.51.12.58.08

Born: 5 July 1982
Citizenship: Colombian
[email protected]
http://sebastianpena.info/

A. Research Activities

During my PhD, I worked on several aspects of information access in collections of online handwritten data created with digital pens and other pen-based input devices. Over the last three years, my research was part of the CIEL project, funded by the French National Research Agency under grant ANR-06-TLOG-009. The overall goal of the CIEL project was to provide means for processing and retrieving online handwritten documents (writer identification, automatic classification, equation recognition, etc.).

The main contribution of my work has been the exploration of text-based approaches to online handwritten document categorization and retrieval. In the first part of my work, I studied the problem of categorizing documents from the noisy texts produced by handwriting recognition, and analyzed the behaviour of several machine learning algorithms applied to noisy data. The second part dealt with query-based information retrieval in handwritten document collections. We proposed several strategies to improve existing methods that rely either on word spotting or on standard information retrieval methods applied to recognized texts.

Prior to my PhD, my research interests were in natural language processing. I worked on improving a rule-based natural language generation system in the context of a rational dialog agent, and on the implementation of rule-based information extraction methods for biomedical information processing. Since my PhD, my research interests have moved further into machine learning and its applications in pattern recognition and information retrieval.

B. Publications

B.1 Journal Papers

[1] S. Peña Saldarriaga, C. Viard-Gaudin, and E. Morin. “Impact of on-line handwriting recognition performance on text categorization.” International Journal on Document Analysis & Recognition. To appear, http://dx.doi.org/10.1007/s10032-009-0108-6.

B.2 Conference Papers

[2] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Ranking fusion methods applied to on-line handwriting information retrieval.” In proceedings of the 32nd Annual European Conference on Information Retrieval (ECIR 2010). LNCS, vol. 5993, pp. 253–264, 2010.

[3] S. Peña Saldarriaga, C. Viard-Gaudin, and E. Morin. “Combining approaches to on-line handwriting information retrieval.” In Document Recognition & Retrieval XVII (DRR 2010). Proceedings of the SPIE-IS&T Electronic Imaging, vol. 7534, p. 753403, 2010.

[4] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Using top n recognition candidates to categorize on-line handwritten documents.” In proceedings of the 10th International Conference on Document Analysis & Recognition (ICDAR 2009), pp. 881–885, 2009.

[5] S. Peña Saldarriaga, C. Viard-Gaudin, and E. Morin. “On-line handwritten text categorization.” In Document Recognition & Retrieval XVI (DRR 2009). Proceedings of the SPIE-IS&T Electronic Imaging, vol. 7247, p. 724709, 2009. (Best student paper award)

[6] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Categorization of on-line handwritten documents.” In proceedings of the 8th International Workshop on Document Analysis Systems (DAS 2008), pp. 95–102, 2008.

B.3 French Conference Papers

[7] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Fusion de résultats en recherche d’information : application aux documents manuscrits en-ligne.” In actes du 6e Colloque International Francophone sur l’Écrit et le Document (CIFED 2010), pp. 3–18, 2010.

[8] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Un nouveau schéma de pondération pour la catégorisation de documents manuscrits.” In actes de la 16e Conférence sur le Traitement Automatique des Langues (TALN 2009), http://www-lipn.univ-paris13.fr/taln09/paper/paper_TALN_69.html, 2009.

[9] S. Peña Saldarriaga, E. Morin, and C. Viard-Gaudin. “Impact de la reconnaissance de l’écriture en-ligne sur une tâche de catégorisation.” In actes de la 6e Conférence en Recherche d’Information et Applications (CORIA 2009), pp. 219–234, 2009.

C. Education

March 2010

Ph.D. in Computer Science, University of Nantes, France. Advisors: Christian Viard-Gaudin & Emmanuel Morin. Dissertation: “Text-based Approaches to On-Line Handwritten Document Categorization & Retrieval” (in French).

June 2007

M.Sc. in Computer Science with focus on Medical Information Processing, University Rennes 1, France. Advisors: Anita Burgun & Philippe Mabo. Dissertation: “Design and Implementation of an Information System for Cardiac Resynchronization Devices” (in French).

Fall 2006

M.Sc. in Computer Science with focus on NLP, University of Marne-la-Vallée and France Télécom R&D, France. Advisors: Franck Panaget & Eric Laporte. Dissertation: “Revision Planning in a Natural Language Generation System for Man-Machine Dialog” (in French).

June 2004

B.Sc. Computer Science, University of Marne-la-Vallée

D. Work & Teaching Experience

Current

Postdoctoral Position (from March to August 2010)
DEPART project, LINA
The main goal of the DEPART project is the development of models for the recognition of complex objects and the translation of multilingual documents based on the joint analysis of handwriting and speech modalities. With respect to the project’s goal, my work focuses on the design and supervision of the data collection process.

Fall 2008 - Spring 2010

Teaching fellow and assistant
Computer Science Department, Nantes Technological Institute (IUT)
Lectures on Advanced Topics in Object Oriented Programming in Java: reflection, annotations, internationalisation & localisation. Tutorial instructor on Object Oriented Programming, GUI Programming in Java, and Programmatic Access to Information Stored in Databases.

E. Languages

Spanish: Native
French: Fluent
English: Fluent