Automatic Inference of Indexing Rules for MEDLINE
Aurélie Névéol and Sonya E. Shooshan
National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
{neveola,sonya}@nlm.nih.gov
Vincent Claveau
IRISA - CNRS, Campus de Beaulieu, 35042 Rennes, France
[email protected]
MTI
- MTI has been used at NLM since 2002 (Aronson et al., 2004)
- 45% recall on MeSH main heading recommendations
- This experiment aims at enriching MTI’s Subheading Attachment module

Challenges of MeSH Indexing:
• Scale: MeSH 2008 includes 24,767 main headings (e.g. Hand, Aphasia) and 83 subheadings (e.g. Genetics, Pharmacology), which amounts to 581,560 indexing terms (e.g. Hand, Aphasia/Genetics)
• Multiclass: the number of indexing terms per article is not known in advance
• Complex cognitive task: there is no unique correct set of terms for a given article. Consistency among indexers is about 35% for main heading/subheading combinations

Methods
Inductive Logic Programming (ILP):
• Supervised machine learning technique used to infer rules that are expressed with logical clauses (Prolog clauses) based on a set of examples also represented using Prolog (Muggleton and De Raedt, 1994).
…
• Provides simple representations for relational problems
• Rules can be easily interpreted
• Caveat: complexity of rule inference from large sets of positive and negative examples.
• Solution: new definition of subsumption that allows us to go through the sets of examples efficiently by exploiting hierarchical relationships between main headings – based on work by Buntine (1988)
• Suitable for any rule inference problem involving structured knowledge encoded by ontologies.

Indexing Rule:
If a main heading from the "Anatomy" tree and a "Carboxylic Acids" term are recommended for indexing, then the pair "[Carboxylic Acids]/pharmacology" should also be recommended.

[Figure labels: Biomedical Literature; MEDLINE Citations; Medical Subject Headings (MeSH); MTI; ILP; Expert knowledge; MeSH main headings; Indexing Rules; MeSH pairs]
Fully Automatic:
- ILP
- ILP-filtered
- Baseline
Fully Manual:
- Expert rules
Semi Automatic:
- ILP-reviewed
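The indexing rule quoted above can be read as a trigger/target pattern over the MeSH hierarchy: one rule stated at the level of a tree node covers every main heading beneath it. The sketch below illustrates that reading only. It is not the authors' ILP system and does not implement Buntine's generalized subsumption; the `PARENT` map is a toy hierarchy invented for illustration (not real MeSH data), and the names `IndexingRule`, `is_under`, and `apply_rule` are hypothetical.

```python
from dataclasses import dataclass

# Toy child -> parent links, for illustration only (NOT real MeSH data).
PARENT = {
    "Hand": "Extremities",
    "Extremities": "Anatomy",
    "Acetic Acid": "Carboxylic Acids",
    "Carboxylic Acids": "Organic Chemicals",
    "Aphasia": "Language Disorders",
}


def is_under(heading, ancestor):
    """Return True if `heading` is `ancestor` or one of its descendants."""
    node = heading
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT.get(node)  # None once the top of the toy tree is reached
    return False


@dataclass(frozen=True)
class IndexingRule:
    trigger: str     # some recommended heading must fall under this node
    target: str      # recommended headings under this node get the subheading
    subheading: str


def apply_rule(rule, recommended):
    """Return the (main heading, subheading) pairs the rule proposes."""
    if not any(is_under(h, rule.trigger) for h in recommended):
        return []
    return [(h, rule.subheading) for h in recommended if is_under(h, rule.target)]


# The rule from the text: Anatomy trigger, Carboxylic Acids target.
RULE = IndexingRule(trigger="Anatomy", target="Carboxylic Acids",
                    subheading="pharmacology")

print(apply_rule(RULE, ["Hand", "Acetic Acid"]))     # trigger present
print(apply_rule(RULE, ["Aphasia", "Acetic Acid"]))  # no Anatomy heading
```

In this toy setting the first call proposes "Acetic Acid/pharmacology" and the second proposes nothing, because no recommended heading falls under "Anatomy". Walking up parent links is one simple way to exploit hierarchical relationships between main headings; the subsumption test actually used during rule inference may differ.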
Results

Subheading               |E+|   |E-|    Computing time  Precision (%)  Recall (%)  F-measure (%)
Administration & dosage  5,300  40,000  75 minutes      41             53          47
Genetics                 5,700  30,500  51 minutes      50             59          55
Metabolism               4,500  21,000  37 minutes      42             60          50
Pharmacology             5,000  22,000  45 minutes      49             54          51
Physiology               5,200  34,000  46 minutes      41             42          42
Performance of ILP rule inference on MEDLINE citations.

Subheading               Method        Nb. rules  Precision (%)  Recall (%)  F-measure (%)
Administration & dosage  ILP           166        38             29          33
Administration & dosage  Manual        1          54             1           1
Administration & dosage  ILP-filtered  95         45             25          32
Administration & dosage  ILP-reviewed  124        37             29          33
Administration & dosage  Baseline      -          26             9           13
Genetics                 ILP           200        55             39          46
Genetics                 Manual        226        65             28          39
Genetics                 ILP-filtered  181        55             39          46
Genetics                 ILP-reviewed  172        55             39          46
Genetics                 Baseline      -          33             10          15
Metabolism               ILP           134        49             38          43
Metabolism               Manual        61         58             20          30
Metabolism               ILP-filtered  123        49             38          43
Metabolism               ILP-reviewed  73         49             38          43
Metabolism               Baseline      -          37             12          18
Pharmacology             ILP           217        47             28          35
Pharmacology             Manual        7          67             3           5
Pharmacology             ILP-filtered  183        48             28          35
Pharmacology             ILP-reviewed  74         47             28          35
Pharmacology             Baseline      -          28             12          17
Physiology               ILP           70         46             24          32
Physiology               Manual        0          -              -           -
Physiology               ILP-filtered  64         46             24          32
Physiology               ILP-reviewed  70         46             24          32
Physiology               Baseline      -          28             10          15
Performance on the test corpus using MTI main heading recommendations

Discussion:
• Performance:
  – ILP is superior to the baseline and provides the best F-measure on the test corpus
  – ILP recall is higher when main headings come from MEDLINE vs. MTI
  – Some adjustments are needed to avoid trigger term/target term redundancy in ILP rules
  – Manual rules provide the best precision (when available)
• Filtering vs. Manual Review:
  – Biggest impact of filtering on Administration & dosage
  – Filtering generally improves ILP performance
  – Manual review is not always efficient
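The poster does not state how the F-measure is computed. A minimal check, assuming it is the balanced F-measure (the harmonic mean of precision and recall), is sketched below using the rounded precision and recall reported for ILP rule inference on MEDLINE citations. The function name `f_measure` and the `REPORTED` list are illustrative, not part of the original work.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean); inputs and output share the same unit."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (subheading, precision %, recall %, reported F-measure %) for ILP rule
# inference on MEDLINE citations, as reported in the Results.
REPORTED = [
    ("Administration & dosage", 41, 53, 47),
    ("Genetics", 50, 59, 55),
    ("Metabolism", 42, 60, 50),
    ("Pharmacology", 49, 54, 51),
    ("Physiology", 41, 42, 42),
]

for name, p, r, reported in REPORTED:
    print(f"{name}: recomputed {f_measure(p, r):.1f}, reported {reported}")
```

Under this assumption every recomputed value lands within one point of the reported figure. The remaining gap may simply reflect that precision and recall are shown rounded to whole percentages.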
References:
• Alan R. Aronson, James G. Mork, Clifford W. Gay, Susanne M. Humphrey, and Willie J. Rogers. 2004. The NLM Indexing Initiative’s Medical Text Indexer. In Proceedings of Medinfo 2004, San Francisco, California, USA.
• Wray L. Buntine. 1988. Generalized Subsumption and its Application to Induction and Redundancy. Artificial Intelligence, 36:375–399.
• Stephen Muggleton and Luc De Raedt. 1994. Inductive logic programming: Theory and methods. Journal of Logic Programming, 19/20:629–679.
Acknowledgements: This study was supported in part by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine. A. Névéol was supported by an appointment to the National Library of Medicine Research Participation Program administered by the Oak Ridge Institute for Science and Education through an inter-agency agreement between the U.S. Department of Energy and the National Library of Medicine. The authors would like to thank Alan R. Aronson and James G. Mork for their comments and feedback on this work.