Magn Reson Med Sci, Vol. 7, No. 3, pp. 141–155, 2008

REVIEW

Ontology for fMRI as a Biomedical Informatics Method

Toshiharu NAKAI1,*, Epifanio BAGARINAO2, Yoshio TANAKA2, Kayako MATSUO1, and Daniel RACOCEANU3,4

1Functional Brain Imaging Lab, Department of Gerontechnology, National Center for Geriatrics and Gerontology, 36–3 Gengo, Morioka-cho, Ohbu, Aichi 474–8522, Japan
2Grid Technology Research Center, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki, Japan
3IPAL-Image Perception, Access and Language, French National Center for Scientific Research (CNRS, NUS, I2R/A*STAR, UJF), Singapore
4Faculty of Sciences, University of Besançon, Besançon, France

(Received April 17, 2008; Accepted July 2, 2008)

Ontological engineering is one of the most challenging topics in biomedical informatics because of its key role in integrating the heterogeneous databases used by biomedical information services. Ontology can translate concepts and their real-world relationships into expressions that can be processed by computer programs or web services, providing a unique taxonomic frame to describe a pathway for extracting, processing, storing, and retrieving information. In developing clinical functional neuroimaging, which requires the integration of heterogeneous information derived from multimodal measurement of the brain, these features will be indispensable. Neuroimaging ontology is remarkable in that it requires detailed description of the hypothesis, the paradigm employed, and a scheme for data generation. Neuroimaging modalities, such as functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), electroencephalography (EEG), and near infrared spectroscopy (NIRS), share similar application purposes, imaging protocols, analysis methods, and data structures; semantic gaps that remain among the modalities will be bridged as ontology develops. High-performance GRID computing and applications organized as service-oriented computing (SOC) will support the heavy processing needed to integrate heterogeneous neuroimaging systems. We have been developing such a distributed intelligent neuroimaging system for real-time fMRI analysis, called BAXGRID, and a neuroimaging database. The fMRI ontology of this system will be integrated with established medical ontologies, such as the Unified Medical Language System (UMLS).
Keywords: BAXGRID, functional magnetic resonance imaging (fMRI), GRID, neuroimaging, ontology

Introduction

Since the discovery of the blood-oxygen-level dependent (BOLD) phenomenon,1 functional magnetic resonance imaging (fMRI) has contributed enormously to cognitive neuroscience, but standardization of the activation map and precision of individual analysis remain as technical issues for clinical applications of fMRI. We review the application of ontology in biomedicine and initiate a study of ontology for neuroimaging; we focus especially on fMRI because it is indispensable for comparing multi-center data as well as for integrating neuroimaging data to improve the reliability of functional brain maps.

In medicine, well-known ontologies include the Unified Medical Language System (UMLS), Systematized Nomenclature of Medicine (SNOMED), and Foundational Model of Anatomy (FMA) [a]. Besides these systematic frameworks that cover the entire medical field, ontology can be applied for specific biomedical research purposes, such as verifying the radiologic-pathologic correlation of brain tumor cases for multicenter study.2 Although this ontological approach was useful in evaluating the relationship between the sensitivity/specificity of pathology and the phraseology employed for neuroimaging classification, it remained a post-test probability estimation performed after dataset assembly. This retrospective approach is closer to indexing text contents than to controlling lexicons for coding schemes that frame the features of images and pathology.

Neuroimaging ontology will systematically organize the knowledge and techniques for cognitive task design, experimental procedure, neuroanatomy, brain signals, and behavioral data so that imaging sessions can be precisely reproduced and knowledge effectively shared for both research and clinical purposes. In particular, ontology will enhance multimodal neuroimaging, which reflects neuronal activation via different physical principles. We herein summarize the concept of ontology and its development in bio- and medical informatics and discuss the application of ontology for neuroimaging, especially for the fMRI database, and the direction for future development.

*Corresponding author. Phone: +81-562-44-5651, extension 5633; Fax: +81-562-46-7827; E-mail: toshi@nils.go.jp
[a] http://sig.biostr.washington.edu/projects/fm/AboutFM.html

Ontology as Technology for Formal and Logical Expressions

Ontology originated as a philosophical concept in Greece that explored the essence of things. In computer science, its meaning is different.3 We will not explore philosophical considerations, and unlike computer science, we do not believe that ontology is a mechanism for building queries by using a common ontological form mapped to each underlying resource. Rather, we consider ontologies as explicit formal specifications of the terms in the area of interest (domain) and the relations among them;4 ontology is a technology that represents formal and logical expressions of the concepts related to the domain. It aims to describe the standard hierarchical structure of the classes (concepts), subclasses, properties of each class (slots), i.e., its various features and attributes, and restrictions on slots (facets). A knowledge base consists of a group of individual instances of classes. Thus, ontology represents a model of the reality of the world, and the concepts in ontology must reflect this reality (Fig. 1). Building such a model is an iterative design process that will likely continue throughout the life cycle of the ontology (Fig. 2). Examples of published ontologies can be checked using the ontology look-up service (OLS) [b].

Ontologies are developed to share a common understanding of the structure of information among people or software agents, enable reuse of domain knowledge, make domain assumptions explicit, separate domain from operational knowledge, and structure and analyze domain knowledge. The most remarkable example of ontology is semantic web design to enhance visibility of knowledge on the web.5 Using several languages, such as Extensible Markup Language (XML), XML schema, Resource Description Framework (RDF), RDF schema (RDFS), and Web Ontology Language (OWL), the semantic web enables the description of information that is understandable by computers so that they can perform more of the tedious work involved in finding, sharing, and combining information supplied by the web. These languages form a stack in the architectural layer of a semantic web: RDF offers a simple graph reference model; RDFS, a simple vocabulary and axioms for object-oriented modeling; and OWL, an additional knowledge base oriented toward ontology constructs and axioms. These ontological languages are themselves meta-ontology, instances of which are semantic web ontology.

A single bit of knowledge consists of 3 elements: subject, property (predicate), and value (object). This RDF triple is stored in the RDF database (triple store) as an instance of ontology in the semantic web according to the definition of the ontological language. Query languages, such as RDQL (RDF Data Query Language), have been implemented in a number of RDF systems for extracting information from RDF graphs. The RDF graph is a directed labeled graph that describes RDF data models; it consists of a set of nodes connected by arcs that form a node-arc-node pattern.6 Protégé [c], the popular ontology editor, is also an RDFS editor and provides interfaces to create/edit ontologies and store them in RDF/XML or OWL format. Inclusion of these ontology resources enables a service to be semantic.
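The subject-property-value structure and the node-arc-node pattern described above can be illustrated with a minimal sketch. It is not RDQL and does not use any RDF library; it assumes a plain in-memory list as the triple store, and every `ex:` name is a hypothetical identifier invented for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF-style statement: subject, property (predicate), value (object)."""
    subject: str
    predicate: str
    value: str


def match(triples: Iterable[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          value: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that agree with every field that is not None.

    None acts as a wildcard, which mimics a node-arc-node graph pattern.
    """
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if value is not None and t.value != value:
            continue
        yield t


# Hypothetical instance data; the "ex:" names are illustrative only.
STORE = [
    Triple("ex:session01", "ex:usesModality", "ex:fMRI"),
    Triple("ex:session01", "ex:hasParadigm", "ex:fingerTapping"),
    Triple("ex:session02", "ex:usesModality", "ex:MEG"),
    Triple("ex:fMRI", "rdfs:subClassOf", "ex:NeuroimagingModality"),
]

# Pattern: (?session, ex:usesModality, ex:fMRI)
fmri_sessions = [t.subject
                 for t in match(STORE, predicate="ex:usesModality", value="ex:fMRI")]
print(fmri_sessions)
```

A real triple store would index the triples and resolve prefixed names to full URIs; the linear scan here is only meant to make the data model concrete.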

Bio-ontology

Ontologies support recent rapid and gross advances in biosciences and biotechnologies. Most concepts in biomedicine cannot be expressed as formulas, but the entities can be described in natural language. To merge heterogeneous knowledge components in biomedicine and enable computational query of the concepts of interest, the ontological "conceptualization of reality" is essential in deciding indexing performance and characterizing domain. Protein Ontology (PO)7 and Gene Ontology (GO) are 2 remarkable bio-ontological fields. Protein Ontology consists of various domains, such as a protein kinase resource, peptidase database, and transcription factor database,7 that assist research and diagnosis. Gene Ontology [d] has been developed collaboratively to support the biologically meaningful annotation of genes and their products.8 The 4 central domains of GO are molecular function, biological process, cellular component, and sequence features. The main part of the ontology file refers to these domains, and the annotation files are taxon-specific filters for database projects dedicated to species. These main parts of GO are correlated to external databases by mapping files, which attempt to regress the concepts of the external databases, although translation is not necessarily complete. To explore Gene Ontology data, search tools have been provided to answer questions of biological interest by using natural language.9 Expansion of Gene Ontology to include the design of a measurement protocol was built into the ontology for Microarray Gene Expression Data (MGED),10 which provides semantics to describe the treatment of the sample and microarray chip technology according to the concepts specified. Neuroimaging ontologies will employ this approach.

Open Biomedical Ontology (OBO) [e] is a wide-spectrum collaborative project of science-based ontology developers, including PO and GO. Its goal is to create a suite of orthogonal interoperable reference ontologies in the biomedical domain. OBO is open to any developers, has provided filters to inter-map the ontologies, and attempts to establish a common design philosophy and implementation in the biomedical domain.

Fig. 1. Ontology uses ontological languages to translate the concepts and instances and their relationships in the real world into knowledge base resources. This figure shows the scheme in functional magnetic resonance imaging (fMRI). In the real world, subjects perform tasks based on a hypothesis. Paradigm generators give the subjects stimuli designed to invoke hypothetical neuronal activities of interest. Neuronal firings are detected as a hemodynamic response to the blood oxygenation level-dependent (BOLD) phenomenon, and fluctuation of signal intensity of echo-planar imaging (EPI) is detected by statistical analysis. Ontology translates this series of biological phenomena in the real world into systematic descriptions for knowledge-based information resources. This knowledge base enables not only semantic indexing of the database contents or further data mining but also standardization of paradigm generation and imaging protocols. CBIR, content-based image retrieval

Fig. 2. The "real world" is inspection, neurological examination, or diagnosis in clinics, functional magnetic resonance imaging (fMRI) or other neuroimaging, and treatment. Neuroimaging ontology handles the acquisition of information from the real world, knowledge building in the database, query access for decision-making by users, and connection to other knowledge units. The results of medical evaluation and treatment are reflected in the real world. As=assessment of information in the real world; IF=(user) interface

[b] http://www.ebi.ac.uk/ontology-lookup/
[c] http://protege.stanford.edu/
[d] http://www.geneontology.org/
[e] http://obofoundry.org/

UMLS Usage in the Current Medical Informatic System and Ontology for Medical Images

In modern hospitals, digital images are produced in huge quantities and used primarily for immediate diagnosis and therapy. In medical informatics, despite the introduction of the Digital Imaging and Communications in Medicine (DICOM) [f] standard for medical image formats and of picture archiving and communication systems (PACS) [g] for medical information storage and management, much effort is needed to use these standards efficiently and effectively for diagnosis assistance. In the same way that PACS expands the possibilities of conventional hard-copy medical image storage by providing capabilities for off-site viewing and reporting and simultaneous access to information by practitioners at various physical locations, Content-based Medical Image Retrieval (CBMIR)11 opens the way to the next generation of medical procedures. For instance, CBMIR systems could provide advanced diagnosis assistance and set up semantic links between the related medical information to improve health care. Furthermore, data mining could be used for research purposes, medical query expansion, and evidence-based medicine (EBM) and image-based reasoning (IBR) applications generated by similarity-based image retrieval. In addition, as medical domain knowledge becomes more complex, the decision support systems in radiology and computer-aided diagnostics for radiological practice need more powerful data and metadata management and retrieval.

The role of ontologies will be increasingly important in all these medical image- and medical multimedia-based management, retrieval, and reasoning systems. Common components of ontologies (individuals, classes, attributes, relations, restrictions, rules, assertions, events, and others) offer useful support for formalizing equivalent medical knowledge models (diagnosis rules, radiologic clues, decision trees, contextual graphs, and others). Because of the critical responsibility of medical doctors and the sensitivity to false-negative responses in medicine, more effort should be made in medical computer-aided systems to validate existing knowledge models (truth maintenance systems, etc.).12 Nevertheless, inspiration from semantic web approaches can ensure a coherent approach, guiding the medical informatics community to the next generation of the Medical Multimedia Semantic web [h].

Among existing medical ontologies, one of the most complete and most used is the National Library of Medicine's (NLM) [i] UMLS. Its purpose is to facilitate the development of computer systems that behave as if they "understand" the meaning of the language of biomedicine and health. The UMLS Metathesaurus [j] is a large, multipurpose, and multilingual vocabulary database that contains information about biomedical and health-related concepts, their various names, and the relationships among them. All concepts in the Metathesaurus are assigned to at least one semantic type from the semantic network, providing consistent categorization of all UMLS concepts. The UMLS Metathesaurus knowledge source uses several tools (programs) to filter the UMLS concepts and relationships needed for a particular field or application; MetamorphoSys [k] creates useful UMLS Metathesaurus subsets by selecting appropriate sources (Table 1) and applying filters and options to refine selected source content in customized subsets. UMLS utilizes the structured medical Metathesaurus, which allows homogeneous fusion between UMLS-compliant concepts from different medical media (images, reports, and others)13,14 as well as automatic query expansion and rule extraction. The Metathesaurus is updated many times a year. Some of the source vocabularies of the UMLS Metathesaurus, such as the Neuronames Brain Hierarchy (NEU) [l], can be used as a basis for neuroimaging ontology. The ontology obtained after MetamorphoSys filters these sources should be completed and validated with the help of specialized ontology validation on-line services [m] assisted by neurospecialists.

Medical image analysis and conceptualization is an important use of medical ontology for medical image management. Each image (Fig. 3) or region of interest (ROI) from the image is associated with a semantic label that corresponds to a combination of UMLS concepts and visual percepts (visual vocabulary).15 At least 3 types of UMLS concepts (see Image Retrieval in Medical Applications (IRMA) code)16 can be defined17 that could be associated to one image or region: modality concepts belonging to the UMLS semantic type "Diagnostic Procedure"; anatomy concepts belonging to UMLS semantic types "Body Part, Organ, or Organ Component" or "Body Location or Region"; and pathology concepts belonging to the UMLS semantic types "Acquired Abnormality" or "Disease or Syndrome."

A structured learning framework based on Support Vector Machines (SVM) is often used17 to facilitate modular design and extract medical semantics from images. Complementary indexing approaches are developed within this statistical learning framework: global indexing to access image modality, and local indexing to access semantic local features, anatomy concepts, and pathology concepts. The global approach seems to be efficient, but the training is time consuming and training-set sensitive; the local approach seems to be efficient for classification but less efficient for medical image retrieval, at least in its regular grid form. The solution may pass through patches (visual vocabularies) assigned to particular ROIs in medical images. The size and position of the patches seem to be the main concern for such pathology-, modality-, and anatomy-related approaches using adaptive patches.

In most classic approaches, each classifier is trained in the "one-versus-all" (OVA) mode (the concept of interest versus everything else); we refer to this semantic labeling framework as supervised OVA. There has been an effort to solve the problem in greater generality by resorting to unsupervised learning,18 particularly by latent semantic analysis.19,20 An interesting initiative (applied to natural images)21 proposes combining the advantages of OVA and unsupervised formulation through a reformulation of the supervised formulation. This consists of defining a multiclass classification problem, in which each semantic concept of interest defines an image class. This Supervised Multiclass Labeling (SML) formulation retains the classification and retrieval optimality of supervised OVA as well as its ability to avoid restrictive independence assumptions. Although early retrieval architectures were based on the query-by-example paradigm, which formulates image retrieval as a search for the best database match to a user-provided query image, it was quickly realized that the design of fully functional retrieval systems would require support for semantic queries, i.e., use of ontologies,22 which opened the way to content-context-based EBM, able to give more pertinence and reliability to the classical EBM approach, which is based essentially on group patient statistics.

Table 1. Well-known Unified Medical Language System (UMLS) Metathesaurus sources
Source | Full name | Provider or URL
CPT | Current Procedural Terminology | http://www.ama-assn.org
ICD-9-CM | International Classification of Diseases, Ninth Revision, Clinical Modification | U.S. Department of Health and Human Services, Centers for Medicare & Medicaid Services, Baltimore, MD
LOINC | Logical Observation Identifier Names and Codes | The Regenstrief Institute, Indianapolis, IN
MeSH | Medical Subject Headings | National Library of Medicine, Bethesda, MD; http://www.nlm.nih.gov/pubs/factsheets/mesh.html
NLM-MED | National Library of Medicine Medline Data | http://www.nlm.nih.gov/
RxNorm | RxNorm Vocabulary | http://www.nlm.nih.gov/research/umls/rxnorm/index.html
SNOMED | SNOMED Clinical Terms | College of American Pathologists, Chicago, IL; http://www.snomed.org

Fig. 3. a) Medical image classification according to modality concept: example of use of low-level features (3-moment hue-saturation-value [HSV], gray-level histogram, HSV histogram, color statistical parameters, texture-Gabor filters, thumbnails, etc.) as input for a support vector machine (SVM; one-versus-all approach) to identify high-level semantic information associated with the modality concept in the neuroimage content. b) Modality tree showing the hierarchy between modality concepts retained for the classification, with links to the Unified Medical Language System (UMLS) concept unique identifier (CUI). c) Example of medical image classification based on SVM.

[f] DICOM: Digital Imaging and Communications in Medicine, http://medical.nema.org/
[g] PACS: Picture Archiving and Communication Systems
[h] MMedWeb project (A*STAR, NUS, IPAL, Singapore): http://ipal.i2r.a-star.edu.sg/Projects/index.html
[i] US National Library of Medicine: http://www.nlm.nih.gov/
[j] About 150 vocabulary sources that contribute strings or relationships to the 2007AC UMLS Metathesaurus are listed at: http://www.nlm.nih.gov/research/umls/metab4.html
[k] MetamorphoSys: a UMLS installation wizard and customization tool included in each UMLS release
[l] Neuronames Brain Hierarchy, Seattle (WA): University of Washington, Primate Information Center. http://rprcsgi.rprc.washington.edu/neuronames/
[m] National Center for Biomedical Ontology: http://www.bioontology.org/resources.html
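The semantic label described in this section, in which an image or ROI carries modality, anatomy, and pathology concepts drawn from specific UMLS semantic types, can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class names `Concept` and `SemanticLabel` are hypothetical, the semantic type names are those quoted in the text, and the identifiers are placeholders rather than real UMLS CUIs.

```python
from dataclasses import dataclass, field
from typing import List

# Semantic type names as quoted in the text, grouped by concept kind.
ALLOWED_TYPES = {
    "modality": {"Diagnostic Procedure"},
    "anatomy": {"Body Part, Organ, or Organ Component", "Body Location or Region"},
    "pathology": {"Acquired Abnormality", "Disease or Syndrome"},
}


@dataclass(frozen=True)
class Concept:
    cui: str            # placeholder identifier, NOT a real UMLS CUI
    name: str
    semantic_type: str


@dataclass
class SemanticLabel:
    """Concepts attached to one image or region of interest (hypothetical model)."""
    image_id: str
    modality: List[Concept] = field(default_factory=list)
    anatomy: List[Concept] = field(default_factory=list)
    pathology: List[Concept] = field(default_factory=list)

    def add(self, kind: str, concept: Concept) -> None:
        """Attach a concept, rejecting semantic types not allowed for this kind."""
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unknown concept kind: {kind}")
        if concept.semantic_type not in ALLOWED_TYPES[kind]:
            raise ValueError(
                f"{concept.semantic_type!r} is not a valid {kind} semantic type")
        getattr(self, kind).append(concept)


label = SemanticLabel("img-001")
label.add("modality",
          Concept("CUI-PLACEHOLDER-1", "example modality concept", "Diagnostic Procedure"))
label.add("anatomy",
          Concept("CUI-PLACEHOLDER-2", "example anatomy concept", "Body Location or Region"))
print(len(label.modality), len(label.anatomy), len(label.pathology))
```

Checking the semantic type at insertion time keeps the three concept kinds separate, which is what lets modality, anatomy, and pathology indexing be queried independently.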


Functional MR Imaging Database System for Global Access

Establishing a shared ontology for fMRI offers several advantages: users can transparently retrieve data from across different fMRI database systems; interoperability of the different applications for fMRI analysis is ensured, which is useful for computer-mediated meta-analysis of datasets; and together with GRID technology,23 a shared ontology can facilitate global access to distributed fMRI database systems, which until now have remained independent from each other. GRID technology can be used to facilitate controlled sharing and management of a large number of distributed datasets, such as digital medical images. Replication of medical images will be unnecessary when they can be shared over wide networks. GRID technology can also optimize the use of storage resources by pooling heterogeneous storage between distributed sites and enabling existing medical applications, such as PACS, to treat distributed images as if they are local.

Several initiatives have explored the use of GRID technology to enable database systems for secure global access. One, the MammoGrid project,24,25 used the GRID as its information infrastructure to develop a Europe-wide database of mammograms. Another Europe-based project, the MediGrid [n], aimed to explore the use of GRID technology to process medical image databases available in hospitals today. The Globus Medicus project extended the Globus Toolkit's capability to provide seamless GRID integration of the DICOM standard protocol used in most healthcare and medical research institutes. Some players in the information technology industry have also started offering an enterprise GRID-based business solution to medical data storage and management. Early last year, IBM started offering its new GRID Medical Archive Solution (GMAS), providing hospitals with a multitier, multiapplication, and multisite enterprise storage archive [o].

Similar projects have been undertaken in the field of neuroscience. One, the Medical GRID (MedGrid) project [p], aimed to study and demonstrate the use of GRID technology in analyzing and managing functional MR imaging datasets. The MedGrid testbed was formed in 2004 and involved researchers from 5 institutions in 3 countries [q]. To enable global access to shared data sources available within the testbed, a GRID-based fMRI data management and analysis tool, called BAXSQL,26 was developed (Fig. 4). BAXSQL facilitates the federation of fMRI datasets from the 3 data servers available in the MedGrid testbed. Its features include multi-database querying, dataset selection, and download capabilities to the local machine. Moreover, it has several built-in functions for common functional MR imaging analysis routines, such as realignment, smoothing, and standard general linear model (GLM)-based statistical analysis. The routines are implemented such that the testbed's analysis servers are used to perform the computations.

Security is important for systems allowing global access. BAXSQL implements a 2-level security mechanism using standard-based GRID technology. On the GRID level, each user needs a standard X.509 security certificate issued by the virtual organization's (VO) certificate authority. This certificate can be employed to authenticate the user when resources within the GRID are utilized. In addition, data server owners need to grant access privileges before GRID users can access stored fMRI datasets. This is controlled by the backend database's access control mechanism. The implication is that a user who gains access to the data server with an X.509 certificate may still be unable to manipulate the stored datasets unless the data server's owner has granted the proper privileges.

Overall, BAXSQL can be used to build fMRI database systems accessible globally over the public Internet and with appropriate security features. It can facilitate the sharing of fMRI datasets even across national borders. To attain maximum compatibility, BAXSQL uses only a limited number of metadata to describe fMRI datasets, but as more research groups share their respective datasets, a standard for fMRI data sharing is necessary. Thus, a shared ontology for fMRI is very important. Future releases of BAXSQL will support a common ontology for fMRI data management and analysis.

Fig. 4. The graphical user interface (GUI) of BAXSQL, showing several datasets from 2 remote data servers. Users interact directly with the GUI and manipulate remote datasets using commands accessible from the menu items provided. Access to remote data servers is transparent to the user as if datasets are located locally.

[n] http://www.creatis.insa-lyon.fr/MEDIGRID/
[o] Press release: http://www-03.ibm.com/press/us/en/pressrelease/21553.wss
[p] http://www.medgrid.org/
[q] Participating institutes include the National Institute of Advanced Industrial Science and Technology (AIST), the National Center for Geriatrics and Gerontology (NCGG), and Osaka University (OU), all in Japan; the Ateneo de Manila University (ADMU) in the Philippines; and the National Taiwan University (NTU) in Taiwan.
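The 2-level security mechanism described above can be sketched as two successive checks. This is not BAXSQL's implementation: `Certificate`, `DataServer`, and all identifiers are hypothetical, and comparing issuer names is only a stand-in for real X.509 validation, which verifies signatures, validity periods, and revocation status.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Certificate:
    subject: str   # user identity carried by the certificate
    issuer: str    # certificate authority that issued it


@dataclass
class DataServer:
    """Hypothetical data server with GRID-level and database-level checks."""
    name: str
    trusted_ca: str                                             # the VO's certificate authority
    grants: Dict[str, Set[str]] = field(default_factory=dict)   # user -> dataset ids

    def authenticate(self, cert: Certificate) -> bool:
        """Level 1 (GRID): accept only certificates issued by the VO's CA.

        Assumption: issuer-name comparison replaces cryptographic verification.
        """
        return cert.issuer == self.trusted_ca

    def grant(self, user: str, dataset: str) -> None:
        """Level 2 (database): the server owner grants a per-dataset privilege."""
        self.grants.setdefault(user, set()).add(dataset)

    def can_read(self, cert: Certificate, dataset: str) -> bool:
        """Both levels must pass before a dataset can be accessed."""
        if not self.authenticate(cert):
            return False
        return dataset in self.grants.get(cert.subject, set())


server = DataServer(name="data-server-1", trusted_ca="example-vo-ca")
alice = Certificate(subject="alice", issuer="example-vo-ca")
mallory = Certificate(subject="alice", issuer="unknown-ca")

before_grant = server.can_read(alice, "dataset-42")   # authenticated, no privilege yet
server.grant("alice", "dataset-42")
after_grant = server.can_read(alice, "dataset-42")
untrusted = server.can_read(mallory, "dataset-42")    # fails at the GRID level
print(before_grant, after_grant, untrusted)
```

Keeping the two checks separate mirrors the point made in the text: a valid certificate alone does not imply the right to manipulate a given dataset.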

Ontology for fMRI

In this section, we conceptualize fMRI ontology for clinical neuroimaging as a tutorial of ontology building. Bodenreider and colleagues noted 7 points to ensure successful ontology development: community involvement, clear goals, limited scope, simple structure, continuous evolution, active curation, and early use.3 From this perspective, the scientific neuroimaging ontology is intended to establish consistent annotation among the cognitive process, functional map overlaid to neuroanatomy, measurement methods, and analysis principles so that any examiner can reproduce brain mapping. Its scope is to explore the principles and structures of the neuronal system, clarifying the mechanisms of cognition and approaching the theory of mind by merging major neuroimaging modalities, such as fMRI, magnetoencephalography (MEG), electroencephalography (EEG), and near infrared spectroscopy (NIRS). The goal of clinical neuroimaging ontology, in contrast, is to establish an intelligent assistance system for early diagnosis of cognitive impairment, preoperative mapping, investigations of pathophysiological status, and monitoring of treatment. Its scope will be more focused on the query of similar brain images and activation maps for individual assessment. Although scientific and clinical neuroimaging will share the majority of resources, the annotation of concepts and instances will be slightly different, which can be explained by the contrast of "research flow" and "clinical flow" (Fig. 5).

The Neuroscience Database Gateway (NDG) lists several fMRI database projects [r]. The fMRI Data Center (fMRI-DC) [s], a representative project to support the fMRI research community, supplies common datasets for meta-analyses and future development of new methods. The data center (DC) tool is dedicated to registering the original datasets submitted by researchers, and it provides browsing and display functions, all of which are written in the Java programming language. The fMRI-DC developers also built an ontological template for data management based on the Protégé 2000 platform. Table 2 lists the main classes. Overall, the ontology of fMRI DC is dedicated to supporting neuroscience research communities so that registered researchers can interact and exchange knowledge. The Source for Neuroimaging Tools and Resources (NITRC) [t], a project initiated in 2007, is a consortium supported by the National Institutes of Health (NIH) to identify fMRI tools and resources for the neuroimaging community, provide information about these tools and access to them in a common format, facilitate community interaction to make them more usable by a broader research community, and facilitate neuroimaging science and neuroinformatics collaboration and education. The project includes ontology development.

Determining the domain and scope of the ontology

The database access and management style will be different for clinical neuroimaging database resources. Only trained and qualified experts can contribute contents to the database according to predefined guidelines. Research-based paradigms are not necessarily applicable to the clinical index. Clinical users are mostly interested in retrieving data as reference activation maps to compare with individual neuroimaging results; they are not necessarily supposed to expand the database. The domain of the ontology we model covers fMRI data indexing and retrieval to assist in clinical neuropsychological diagnosis.
The basic question that ontology should answer is how an individual activation map matches the standard data for each paradigm or condition. For clinical neuroimaging, the questions are: 1) What tasks should be chosen for the subject based on the symptoms and ˆndings in neuropsychological examinations? 2) How was the variance of the patient's activation from that of normal subjects? 3) What are the history, clinical r

http://ndg.sfn.org/ http://www.fmridc.org/ t http://www.nitrc.org/ s

Magnetic Resonance in Medical Sciences

Ontology for fMRI

149

Fig. 5. Comparison of ``clinical'' and ``scientiˆc'' neuroimaging. Imaging and analysis techniques are the same, but purpose and information of interests are diŠerent. This should be represented in the ontology of each type of functional magnetic resonance imaging (fMRI) usage. The science mode requires preliminary studies to develop or revise the paradigms, the protocol for fMRI experiments, and related recording methods.

Table 2. Major classes in the ontology of functional magnetic resonance imaging (fMRI) data center (DC) Analysis (overview of analysis) Analysis transform (data processing procedures) Date (days since epoch) Event (description of the examination) Experiment description (scan protocols, task design) Experimental data (personal, behavioral, and run data) Machine (types of magnetic resonance [MR] system and radiofrequency [RF] coils) Measurement (display measurement) Miscellaneous (analysis context, bulk data, keywords, data format, type of experiment) Person (researcher, subject) Text (format for the texts) Time of day (oŠset time) Timestamp (scale of timestamps) Unit of measure (time, length, magnetic ˆeld, etc.) URI (uri value) System-class (basic deˆnition of ontology structure)

status, pathology, behavior, brain functional data obtained by other modalities, and genetic information of subjects who have activation maps similar to that of the patient? 4) How can the patient's cognitive status or dysfunction be characterized and classified? Was cognitive function recovered by treatment? Was there any secondary or plastic change of the neuronal system? 5) For preoperative evaluation, what is the anatomical relationship between the lesion and eloquent areas (language, motor, or other important centers)? Can the neurological symptoms be explained by the lesion?

From the viewpoints of the domain and scope of functional neuroimaging ontology, the following details should be clarified in the ontology description: 1) complete task designs, instructions, and instrumentation, including behavioral or physiological data acquisition; 2) standard imaging parameters for anatomical and functional images; and 3) standard analysis protocols and parameters to generate an activation map. With these points included, an ontology for clinical functional neuroimaging can share concepts with scientific neuroimaging ontology; therefore, we discuss clinical neuroimaging ontology by modifying the ontology proposed by fMRI-DC, which has already defined many of the important terms for neuroimaging. The following steps track the procedure of "Ontology Development 101" from the developer group of Protégé 2000 [u].

[u] http://protege.stanford.edu/publications/ontology_development/ontology101.html

Defining the classes and the class hierarchy

After the essential terms for classes are chosen, they should be organized into a taxonomic (subclass-superclass) hierarchy. We combined the top-down and bottom-up approaches. Table 3 shows the list of major classes and the initial superclass-subclass design.

Table 3. Initial proposal of functional magnetic resonance imaging (fMRI) ontology

Feature description
  Study description: title, category, target diagnosis, synopsis, (references)
  Examination description: subject, study, session list, simultaneous recordings
  Session description: task, imaging parameters, data
  Task description: title, category, target cognition, synopsis, task control/resources, (instructions, references)
  Fusion description: modality, mathematical method
  Data format definition
Data
  Magnetic resonance (MR) imaging: functional, anatomical, diffusion
  Other functional modality: electroencephalography (EEG), magnetoencephalography (MEG), near infrared spectroscopy (NIRS), electromyogram (EMG), etc.
  Behavioral data: reaction time, accuracy, motion performance
  Map: activation, fusion, template
  Task resources: picture, sound, characters, text
Parameter description
  Task control: block/event related (ER), stimulus onset asynchrony (SOA), duration, jittering, epoch order
  Imaging: pulse sequence, repetition time (TR), echo time (TE), slice thickness, matrix, field of view (FOV)
  Data preprocessing: motion correction, normalization, smoothing
  Statistical analysis: statistical method, threshold, post-hoc, covariance
  Unit definition
Relational info
  ID definition: study, examination, session, task, data, subject, personal record
  Keywords: anatomy, cognition, physiology, pathology, imaging, data processing
  Individual record
  Clinical record

Top classes and their subclasses of the functional magnetic resonance imaging (fMRI) ontology for clinical application based on the medical global resource information database (GRID). The major classes are based on the similarity of the data type and structure. The subclasses reflect the procedure of neuroimaging or the role of the information.

The hierarchy for clinical functional neuroimaging is organized by the similarity of the computational structure of concepts. This point is closely related to the development of the actual application for image analysis, data storage, and retrieval to integrate multimodal measurements. Among the major classes, feature description explains the universal concepts for the database, experiments, and data processing, whereas parameter description defines the variable part of data acquisition and analysis. "Relational info" supports handling of the instances and knowledge for the user interface and semantic indexing. The current version is based on

an MR-centered design, but the subclasses can be expanded for each neuroimaging modality.

Defining the properties of classes and slots

Next, the internal structure of each concept, i.e., the property of the class, which comprises slots and facets of the slot, should be described. Slots characterize the class from different viewpoints, and facets explain a slot's actual values. Most remaining terms other than the classes in the term list of


Fig. 6. The class editor view of Protégé 2000 showing the class hierarchy and the slot template of clinical functional magnetic resonance imaging (fMRI) ontology. The facets of each slot are listed as column titles of the slot table; their properties are shown in each row.

the ontology are the properties. Here, 1) all subclasses of a class inherit the slots of that class, and 2) a slot should be attached to the most general class that can have that property. Accordingly, the class "Exam_description" can be organized as in Fig. 6. A class's internal structure can be briefly summarized as "the SLOT of the CLASS is the FACET." For example, "the study_id (slot) of the study_description (class) is an integer (facet)," or "the study_category (slot) of the study_description (class) is the class motor_behavior (facet)."

Defining the facets of the slots and describing their allowed values

The properties of a slot are its facets. A slot can have facets to describe the value type, cardinality, allowed values, classes to be called, and further features of the values the slot may take. The common value types are string; numeric types, such as integer and float; Boolean (true/false); an enumerated type listing the possible values; and the instance type. An instance-type slot indicates a set of instances listed in other classes. Slot cardinality defines how many values a slot can have. Figure 7 shows an example of a facet set of a slot. In practice, we frequently encounter the question, "Will it be a new class or a property value?" If a distinction is important in the domain and we think that objects with different values are recognized as different kinds of objects, then we should

create a new class for the distinction. In the example, the difference in "gradient-strength" is represented as the maximum extent of phase or frequency encoding, i.e., the difference in k-space trajectory and slice profile. "Gradient-strength" is related to other imaging parameters, such as the number of slices in a volume, slice thickness, or echo time (TE). It is a fixed property of the hardware related to quality issues; however, it is not the primary factor in answering the competency questions mentioned; therefore, "gradient-strength" is assigned to a slot in this ontology.

Filling in the values of slots for instances

The base of the class hierarchy consists of instances. Instances are the actual data in the knowledge base. Creating an individual instance of a class requires choosing the class, creating the individual instance, and filling in the slot values. We encountered a question similar to the previous one: should it be a class or an instance? The answer depends on the potential applications of the ontology. If the concept is the most specific representation in the hierarchy of the knowledge base, then it will be an instance. For example, "quadrature birdcage head coil," a subclass of the class "RF coil," represents a type of electric circuit for head imaging and can include several products from each vendor; therefore, "quadrature birdcage head coil" is a class, and the products specified by the product name or number


Fig. 7. The slot editor view of Protégé 2000. This example shows the properties of "gradient-strength" from the fMRI-DC ontology with the unit "mT/m." The domain of this slot is the "MRI Scanner" class. "Allowed classes" indicates a constraint on the values of an instance-type slot. The value of the slot "gradient-strength" can only be an instance of the class "Measurement" or any of its children.
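The "allowed classes" constraint described in the caption of Fig. 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Protégé 2000 implementation: `SlotFacets`, `violations`, and the child class `FieldGradient` are hypothetical names, the cardinality of one is assumed because the caption does not state it, and the value 40.0 is an arbitrary example. Python subclassing stands in for "any of its children."

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class Measurement:
    """Stand-in for the "Measurement" class named in Fig. 7."""

    def __init__(self, value: float, unit: str):
        self.value = value
        self.unit = unit


class FieldGradient(Measurement):
    """Hypothetical child of Measurement, used only to show that children pass."""


@dataclass(frozen=True)
class SlotFacets:
    """Facets of one slot (names are illustrative, not Protege identifiers)."""
    name: str
    domain: str                          # class the slot is attached to
    allowed_classes: Tuple[type, ...]    # constraint for an instance-type slot
    max_cardinality: Optional[int] = 1   # assumed here; None means unbounded


def violations(facets: SlotFacets, values: list) -> List[str]:
    """Return readable facet violations for the proposed slot values."""
    problems = []
    if facets.max_cardinality is not None and len(values) > facets.max_cardinality:
        problems.append(f"{facets.name}: more than {facets.max_cardinality} value(s)")
    for value in values:
        # isinstance accepts the allowed class and any of its subclasses
        if not isinstance(value, facets.allowed_classes):
            problems.append(f"{facets.name}: {type(value).__name__} not allowed")
    return problems


gradient_strength = SlotFacets("gradient-strength", "MRI Scanner", (Measurement,))

print(violations(gradient_strength, [Measurement(40.0, "mT/m")]))  # prints []
print(violations(gradient_strength, [40.0]))  # a bare float is rejected
```

Keeping the facets in a small immutable record separates the slot definition from the values that are later filled in for instances, which mirrors the split between the slot editor and the instance data.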

Medical informatics and bioinformatics have been independent of each other, with different backgrounds, research and development topics, and applications; however, recent advances in biomedical engineering have urged the merging of the 2 fields into biomedical informatics (BMI) [v]. In Europe, the BIOINFOMED study of this fusion was initiated in 2001, and many projects have been derived from this idea [27]. As infrastructure to support those projects, GRID computing has been employed as HealthGRID [w] at the levels of both data sharing and human collaboration [28,29]. In the United States, NIH launched the

Biomedical Informatics Research Network (BIRN) in 2001 [x]. BIRN is more oriented toward medical informatics, the major sub-domains of which are morphometry, fMRI, and animal imaging. Ontology takes part in organizing the resources of healthGRIDs [30], such as biomedical databases, computing power, medical experience, medical devices, and project management. In Asia, MedGRID, a project to apply GRID computing to neuroimaging, was initiated in 2004, as described above. In 2006, the ONCO-Media [y] project was started as a collaboration of 5 Asian countries and Europe. This project attempts to develop a novel GRID-distributed, contextual and semantic-based, intelligent information access framework for medical images and to explore new access applications for medical images in diagnosis assistance, teaching, and research using semantic, visual, and context-sensitive medical information with GRID computing. Ontology development is a primary activity in this project because the ontological approach effectively integrates heterogeneous database systems, which are common in medicine [27,31]. Service-oriented computing (SOC) [32] is the new


are instances. However, if usage of the ontology does not require the products to be specified, "quadrature birdcage head coil" may be an instance of the class "RF coil." When a new product belonging to the class "quadrature birdcage head coil" is a modification of a current product, it will be a new instance, but it cannot be a sub-instance of the current product because only classes can be arranged in a hierarchy.
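The class-versus-instance decision and the two slot rules stated earlier (subclasses inherit slots; a slot is attached to the most general class that can have it) can be illustrated with a minimal frame-style sketch in Python. It is not the authors' ontology or the Protégé data model: `OntClass`, `Instance`, the slot names `vendor` and `channels`, and the product and vendor names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class OntClass:
    """A class in the hierarchy: name, optional superclass, and its own slots."""
    name: str
    parent: Optional["OntClass"] = None
    own_slots: Dict[str, type] = field(default_factory=dict)  # slot -> value type

    def slots(self) -> Dict[str, type]:
        """Own slots plus every slot inherited from the superclasses."""
        inherited = self.parent.slots() if self.parent is not None else {}
        return {**inherited, **self.own_slots}

    def is_a(self, other: "OntClass") -> bool:
        """True if this class is `other` or one of its descendants."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


@dataclass
class Instance:
    """Actual data: belongs to one class and has no parent of its own."""
    name: str
    cls: OntClass
    values: Dict[str, object] = field(default_factory=dict)

    def __post_init__(self) -> None:
        template = self.cls.slots()
        for slot, value in self.values.items():
            if slot not in template:
                raise KeyError(f"{self.cls.name} has no slot {slot!r}")
            if not isinstance(value, template[slot]):
                raise TypeError(f"{slot!r} must be {template[slot].__name__}")


# Slots are attached to the most general class ("RF coil"); names are made up.
rf_coil = OntClass("RF coil", own_slots={"vendor": str, "channels": int})

# Fine-grained design: the coil type is a class, the products are instances.
birdcage = OntClass("quadrature birdcage head coil", parent=rf_coil)
product = Instance("hypothetical product A", birdcage,
                   {"vendor": "Vendor X", "channels": 1})

# Coarse design: products need not be specified, so the coil type is an instance.
coarse = Instance("quadrature birdcage head coil", rf_coil, {"channels": 1})

print(birdcage.is_a(rf_coil))    # True
print(sorted(birdcage.slots()))  # ['channels', 'vendor']
```

Only `OntClass` carries a `parent` link, so a modified product can only be added as another `Instance` of the same class, which matches the statement that only classes can be arranged in a hierarchy.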

Biomedical Informatics and Future Direction

[v] "BMI" is also used for biomedical imaging or brain machine interface.
[w] http://community.healthgrid.org/
[x] http://www.nbirn.net/
[y] ONtology and COntext related MEdical image Distributed Intelligent Access, http://www.onco-media.com/


technical trend in software development for this integration; SOC is intended to compose applications by discovering and invoking network-available services to accomplish various tasks in an environment of mass distribution. The representative SOC-based technologies, such as the Simple Object Access Protocol (SOAP), Web Services Choreography Description Language (WS-CDL), and Business Process Execution Language for Web Services (BPEL4WS), are Web service technologies based on open standards and are employed to assemble application components into a loosely coupled network of services that can enable dynamic processes for the task. XML-based languages, such as BPEL4WS, are used to "orchestrate" the whole system, i.e., to describe how services interact at the message level, including the business logic and execution order of interactions. WS-CDL describes the "choreography" of individual processes of public message exchanges, interaction rules, and agreements among multiple business processes.

Thus, GRID computing achieves high-performance fMRI for clinical applications and organizes the neuroimaging database as part of biomedical informatics, and ontology for fMRI plays a role in integrating them into e-health. Ontology serves application development as well as the indexing of neuroimaging data.
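The idea of orchestration mentioned above, in which one central process discovers services and fixes the order of their invocation, can be illustrated with a small Python sketch. It is not BPEL4WS and involves no real web services: local functions stand in for network-available services, and the registry, the service names (`preprocess`, `analyze`, `index`), and the message fields are all hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]

# Hypothetical registry standing in for service discovery; in a real SOC
# deployment the entries would be remote services, not local functions.
REGISTRY: Dict[str, Callable[[Message], Message]] = {}


def service(name: str):
    """Register a function under a service name."""
    def decorator(func: Callable[[Message], Message]):
        REGISTRY[name] = func
        return func
    return decorator


@service("preprocess")
def preprocess(msg: Message) -> Message:
    return {**msg, "preprocessed": True}


@service("analyze")
def analyze(msg: Message) -> Message:
    if not msg.get("preprocessed"):
        raise ValueError("analysis requires preprocessed data")
    return {**msg, "activation_map": "map-for-" + str(msg["session_id"])}


@service("index")
def index(msg: Message) -> Message:
    return {**msg, "indexed": "activation_map" in msg}


def orchestrate(workflow: List[str], msg: Message) -> Message:
    """Central coordinator: looks up each service by name and fixes the order."""
    for name in workflow:
        msg = REGISTRY[name](msg)
    return msg


result = orchestrate(["preprocess", "analyze", "index"], {"session_id": "S001"})
print(result["activation_map"])  # map-for-S001
```

In a choreography, by contrast, there would be no `orchestrate` function; each service would follow agreed rules about which messages it exchanges with its peers.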

Acknowledgements

This research was supported by a Grant-in-Aid for Scientific Research (KAKENHI) #18300179 from the Ministry of Education, Culture, Sports, Science, and Technology, Japan, and by the French program ICT-Asia, supported by the French Ministry of Foreign Affairs and the French National Research Center (CNRS), #AFD: 2006-GOE/CDE/AJ-no 376.

Appendix

List of Abbreviations

BIRN  Biomedical Informatics Research Network
BOLD  Blood Oxygen Level Dependency
BPEL4WS  Business Process Execution Language for Web Services
CBIR  Content Based Image Retrieval
CBMIR  Content-Based Medical Image Retrieval
CUI  Concept Unique Identifier
DC  Data Center
DICOM  Digital Imaging and Communications in Medicine
EBM  Evidence-Based Medicine
EEG  Electroencephalography
EMG  ElectroMyoGram
EPI  Echo Planar Imaging
FMA  Foundational Model of Anatomy ontology
fMRI  functional Magnetic Resonance Imaging
GLM  General Linear Model
GMAS  GRID Medical Archive Solution
GO  Gene Ontology
GRID  Global Resource Information Database
GUI  Graphic User Interface
IBR  Image-Based Reasoning
IRMA  Image Retrieval in Medical Applications
MedGrid  Medical GRID
MEG  MagnetoEncephaloGraphy
NEU  Neuronames Brain Hierarchy
NIRS  Near InfraRed Spectroscopy
NSD  Neuroscience Database Gateway
NTRC  Neuroimaging Tools and ResourCes
OBO  Open Biomedical Ontology
OLS  Ontology Look-up Service
OVA  One-Versus-All
OWL  Web Ontology Language
PACS  Picture Archiving and Communication System
PO  Protein Ontology
RDF  Resource Description Framework
RDFS  Resource Description Framework Schema
RDQL  Resource Description framework data Query Language
ROI  Region Of Interest
SML  Supervised Multiclass Labeling
SNOMED  Systematized Nomenclature of Medicine
SOAP  Simple Object Access Protocol
SOC  Service-Oriented Computing
SVM  Support Vector Machine
UMLS  Unified Medical Language System
WS-CDL  Web Services Choreography Description Language
XML  Extensible Markup Language

References

1. Ogawa S, Lee TM, Kay AR, Tank DW. Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proc Natl Acad Sci USA 1990; 87:9868–9872.
2. Julià-Sapé M, Acosta D, Majós C, et al. Comparison between neuroimaging classifications and histopathological diagnoses using an international multicenter brain tumor magnetic resonance imaging database. J Neurosurg 2006; 105:6–14.
3. Bodenreider O, Stevens R. Bio-ontologies: current trends and future directions. Brief Bioinform 2006; 7:256–274.
4. Gruber TR. A translation approach to portable ontology specification. Knowledge Acquisition 1993; 5:199–220.
5. Ding L, Kolari P, Ding Z, Avancha S. Using ontologies in the semantic web: a survey, In: Sharman R, Kishore R, Ramesh R, eds. Ontologies–A handbook of principles, concepts and applications in information systems. New York: Springer, 2007; 79–113.
6. Shelley P. Practical RDF. Sebastopol, CA: O'Reilly & Associates, 2003.
7. Wolstencroft K, McEntire R, Stevens R, Tabernero L, Brass A. Constructing ontology-driven protein family databases. Bioinformatics 2005; 21:1685–1692.
8. Gene Ontology Consortium. The Gene Ontology project in 2008. Nucleic Acids Res 2008; 36(database issue):D440–444.
9. Shoop E, Casaes P, Onsongo G, et al. Data exploration tools for the Gene Ontology database. Bioinformatics 2004; 20:3442–3454.
10. Whetzel PL, Parkinson H, Causton HC, et al. The MGED Ontology: a resource for semantics-based description of microarray experiments. Bioinformatics 2006; 22:866–873.
11. Müller H, Michoux N, Bandon D, Geissbuhler A. A review of content-based image retrieval systems in medical applications–clinical benefits and future directions. Int J Med Inform 2004; 73:1–23.
12. Tutac AE, Racoceanu D, Putti T, Xiong W, Leow W-K, Cretu V. Knowledge-guided semantic indexing of breast cancer histopathology images, biomedical engineering and informatics: new development and the future, In: Yonghong Peng, Yufeng Zhang, eds. Proc of the First International Conference on BioMedical Engineering and Informatics, IEEE Computer Society, Sanya, Hainan, China, 2008; 107–112.
13. Racoceanu D, Lacoste C, Teodorescu R, Vuillemenot N. A semantic fusion approach between medical images and reports using UMLS, In: Ng HT, et al. eds. Information Retrieval Technology, Lecture Notes in Computer Science 4182; Third Asia Information Retrieval Symposium, Singapore. Springer Berlin/Heidelberg, 2006; 460–475.
14. Teodorescu R, Racoceanu D, Leow W-K, Cretu V. Prospective study for semantic inter-media fusion in content-based medical image retrieval. Medical Imaging Technology 2008; 26:1–11.
15. Lim J-H, Chevallet J-P. Vismed: a visual vocabulary approach for medical image indexing and retrieval, In: Information Retrieval Technology, Lecture Notes in Computer Science; Second Asia Information Retrieval Symposium, Jeju Island, Korea. Springer Berlin/Heidelberg, 2005; 84–96.
16. Lehmann TM, Schubert H, Keysers D, Kohnen M, Wein BB. The IRMA code for unique classification of medical images, In: Proceedings SPIE, 2003; 109–117.
17. Lacoste C, Chevallet J-P, Lim J-H, et al. Intermedia concept-based medical image indexing and retrieval with UMLS at IPAL, In: Evaluation of Multilingual and Multi-modal Information Retrieval, Lecture Notes in Computer Science 4730, 7th Workshop of the Cross-Language Evaluation Forum, Alicante, Spain. Springer Berlin/Heidelberg, 2006; 694–701.
18. Barnard K, Forsyth D. Learning the semantics of words and pictures. The Eighth IEEE International Conference on Computer Vision, 2001; Vancouver, Canada; 408–415.
19. Hofmann T. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning 2001; 42:177–196.
20. Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R. Indexing by latent semantic analysis. J Am Soc Info Sci 1990; 41:391–407.
21. Carneiro G, Chan AB, Moreno PJ, Vasconcelos N. Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans PAMI 2007; 29:394–410.
22. Picard R. Digital libraries: meeting place for high-level and low-level vision, In: Li SZ, Mital DP, Teoh EK, Wang H, eds. Recent Developments in Computer Vision, Lecture Notes in Computer Science 1035; Asian Conference on Computer Vision, Singapore. Springer, 1995; 3–12.
23. Foster I, Kesselman C. The Grid: Blueprint for a Future Computing Infrastructure. Morgan Kaufmann Publishers, Inc., San Francisco; 1999.
24. Warren R, Thompson D, del Frate C, et al. A comparison of some anthropometric parameters between an Italian and a UK population: "proof of principle" of a European project using MammoGrid. Clin Radiol 2007; 62:1052–1060.
25. Estrella F, Hauer T, McClatchey R, Odeh M, Rogulin D, Solomonides T. Experiences of engineering GRID-based medical software. Int J Med Inform 2007; 76:621–632.
26. Bagarinao E, Tanaka Y. A functional MRI data management tool for the medical grid using NinfG; 6th IEEE International Conference on Computer and Information Technology, Seoul, Korea. IEEE Computer Society, 2006.
27. Maojo V, Tsiknakis M. Biomedical informatics and healthGRIDs: a European perspective. IEEE Eng Med Biol Mag 2007; 26:34–41.
28. Olive M, Rahmouni H, Solomonides T. From HealthGrid to SHARE: a selective review of projects. Stud Health Technol Inform 2007; 126:306–313.
29. Breton V, Blanquer I, Hernandez V, Legré Y, Solomonidés T. Proposing a roadmap for HealthGrids. Stud Health Technol Inform 2006; 120:319–329.
30. Smirnov A, Pashkin M, Chilov N, Levashova T. Ontology-based knowledge repository support for healthgrids. Stud Health Technol Inform 2005; 112:47–56.
31. Camarasu S, Benoit-Cattin H, Montagnat J, Racoceanu D. Content-Based Medical Image Indexing and Retrieval on Grids. First International Symposium on ICT for Health, Ateneo de Manila University, Manila, Philippines, Philippine J Info Tech, 2008; 1:46.
32. Papazoglou MP, Traverso P, Dustdar S, Leymann F. Service-oriented computing: state of the art and research challenges. Computer 2007; 40:38–45.