(Microsoft PowerPoint - pham_poster_SIGIR [Compatibility Mode])


SPATIAL RELATIONSHIPS IN VISUAL GRAPH MODELING FOR IMAGE CATEGORIZATION

Trong-Ton Pham (1,2), Philippe Mulhem (2), Loic Maisonnasse (3)

(1) Grenoble INP, (2) Laboratoire Informatique de Grenoble, (3) TecKnowMetrix

Acknowledgment: AVEIR (ANR France), Merlion Ph.D (Singapore), ACM SIGIR travel grant

Introduction

Goal: to provide a still-image representation that helps fill the semantic gap and allows fast retrieval.

Our proposal:

• a graph-based representation of image content

• a fast graph matching inspired by the Language Model of IR

Application: robot localization, scene identification

Bag-of-word Model: [Blei and Jordan 2003] [Barnard et al. 2003] [Fei-Fei and Fergus 2007]

[Figure: processing pipeline. Panel labels: Feature Extraction, Concept Construction, Graph Modeling, Graph Matching]

Visual Graph Modeling

The visual graph $G^i$ for an image $i$ is defined by

$$G^i = \langle S^i_{WC}, S^i_{WE} \rangle$$

• $S^i_{WC}$: set of weighted visual concept sets $WC^i$

$$WC^i = \{(c, \#(c, i)) \mid c \in C\}$$

• $S^i_{WE}$: set of weighted relation sets $WE^i$

$$WE^i = \{((c, c'), l, \#(c, c', l, i)) \mid (c, c') \in C \times C',\ l \in L\}$$

Here $\#(c, i)$ counts the occurrences of concept $c$ in image $i$, and $\#(c, c', l, i)$ counts how often the pair $(c, c')$ carries the relation label $l$ in image $i$.

Example: [Figure: example visual graph with concept nodes C1, C2 and C3]
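The weighted concept sets and weighted relation sets defined above are essentially count tables, which the following minimal sketch illustrates. The `VisualGraph` class, the helper names and the toy concept labels are hypothetical and chosen only for illustration; the poster does not describe an implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class VisualGraph:
    """Hypothetical container for the visual graph of one image."""
    # concept_sets[name] maps a concept c to its count #(c, i)
    concept_sets: dict = field(default_factory=dict)
    # relation_sets[name] maps a triple (c, c', l) to its count #(c, c', l, i)
    relation_sets: dict = field(default_factory=dict)


def weighted_concept_set(concepts):
    """Build WC^i from the list of concepts observed in the image."""
    return Counter(concepts)


def weighted_relation_set(labelled_pairs):
    """Build WE^i from a list of (c, c', l) triples observed in the image."""
    return Counter(labelled_pairs)


# Toy image: four detected concepts and three labelled concept pairs.
image_concepts = ["c1", "c2", "c1", "c3"]
image_relations = [
    ("c1", "c2", "left_of"),
    ("c1", "c3", "top_of"),
    ("c1", "c2", "left_of"),
]

graph = VisualGraph(
    concept_sets={"patch": weighted_concept_set(image_concepts)},
    relation_sets={"spatial": weighted_relation_set(image_relations)},
)
```

Storing each set as a `Counter` keeps the counts that the matching step needs directly accessible, and a missing concept or relation simply reads as zero.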

Language Model for Graph Matching

Given the query graph $G^q = \langle S^q_{WC}, S^q_{WE} \rangle$, the probability of it being generated by a trained graph $G^d$ is computed as

$$P(G^q \mid G^d) = P(S^q_{WC} \mid G^d) \times P(S^q_{WE} \mid S^q_{WC}, G^d)$$

This is an extension of standard Language Modeling [Song and Croft, 1999].

Concepts

• Concept sets independence hypothesis

$$P(S^q_{WC} \mid G^d) = \prod_{W^q_C \in S^q_{WC}} P(W^q_C \mid G^d)$$

• Multinomial distribution model

$$P(W^q_C \mid G^d) \propto \prod_{c \in C} P(c \mid G^d)^{\#(c, q)}$$

• Jelinek-Mercer smoothing

$$P(c \mid G^d) = (1 - \lambda_c) \frac{\#(c, d)}{\#(*, d)} + \lambda_c \frac{\#(c, D)}{\#(*, D)}$$

Relations

• Relation sets independence hypothesis

$$P(S^q_{WE} \mid S^q_{WC}, G^d) = \prod_{W^q_E \in S^q_{WE}} P(W^q_E \mid S^q_{WC}, G^d)$$

• Multinomial distribution model

$$P(W^q_E \mid S^q_{WC}, G^d) \propto \prod_{(c, c', l) \in C \times C' \times L} P(L(c, c') = l \mid W^q_C, W^q_{C'}, G^d)^{\#(c, c', l, q)}$$

• Jelinek-Mercer smoothing

$$P(L(c, c') = l \mid W^q_C, W^q_{C'}, G^d) = (1 - \lambda_l) \frac{\#(c, c', l, d)}{\#(c, c', *, d)} + \lambda_l \frac{\#(c, c', l, D)}{\#(c, c', *, D)}$$
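The matching formulas above can be sketched in a few lines of Python working in log space. This is a minimal illustration under stated assumptions, not the authors' implementation: each graph is reduced to a single concept set and a single relation set, `D` is taken to be the pooled counts of the whole training collection, the smoothing weights default to 0.5, and all function names and toy data are hypothetical.

```python
import math
from collections import Counter


def jm_prob(count_d, total_d, count_D, total_D, lam):
    """Jelinek-Mercer mix of a document estimate and a collection estimate."""
    p_d = count_d / total_d if total_d else 0.0
    p_D = count_D / total_D if total_D else 0.0
    return (1.0 - lam) * p_d + lam * p_D


def safe_log(p):
    """log(p), with -inf for events unseen in both document and collection."""
    return math.log(p) if p > 0 else -math.inf


def log_p_concepts(wc_q, wc_d, wc_D, lam_c=0.5):
    """Sum over c of #(c, q) * log P(c | G^d): the multinomial up to a constant."""
    wc_d, wc_D = Counter(wc_d), Counter(wc_D)
    total_d, total_D = sum(wc_d.values()), sum(wc_D.values())
    return sum(
        n_q * safe_log(jm_prob(wc_d[c], total_d, wc_D[c], total_D, lam_c))
        for c, n_q in wc_q.items()
    )


def pair_totals(we):
    """#(c, c', *, .): counts of each concept pair summed over all labels."""
    totals = Counter()
    for (c, c2, _label), n in we.items():
        totals[(c, c2)] += n
    return totals


def log_p_relations(we_q, we_d, we_D, lam_l=0.5):
    """Sum over (c, c', l) of #(c, c', l, q) * log P(L(c, c') = l | ..., G^d)."""
    we_d, we_D = Counter(we_d), Counter(we_D)
    tot_d, tot_D = pair_totals(we_d), pair_totals(we_D)
    return sum(
        n_q * safe_log(jm_prob(we_d[(c, c2, l)], tot_d[(c, c2)],
                               we_D[(c, c2, l)], tot_D[(c, c2)], lam_l))
        for (c, c2, l), n_q in we_q.items()
    )


def log_p_graph(q, d, D, lam_c=0.5, lam_l=0.5):
    """q, d, D are (concept Counter, relation Counter) pairs (simplifying assumption)."""
    return (log_p_concepts(q[0], d[0], D[0], lam_c)
            + log_p_relations(q[1], d[1], D[1], lam_l))


# Toy collection of two trained graphs; D pools their counts.
d1 = (Counter(sky=3, sea=1), Counter({("sky", "sea", "top_of"): 2}))
d2 = (Counter(sky=1, building=3), Counter({("sky", "building", "top_of"): 1}))
D = (d1[0] + d2[0], d1[1] + d2[1])
q = (Counter(sky=2, sea=1), Counter({("sky", "sea", "top_of"): 1}))

scores = {name: log_p_graph(q, g, D) for name, g in {"d1": d1, "d2": d2}.items()}
best = max(scores, key=scores.get)
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow when a query contains many concepts, and ranking by the log score gives the same order as ranking by the probability itself.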

Case 1: Robot Localization

Objective: localization of a mobile robot within a known environment using visual information

Collection: RobotVision for ImageCLEF'09

• 3 image sets: training set of 1034 images, validation set of 909 images and test set of 1690 images

• 5 rooms and an unknown room in test set

• Training set: night condition

• Validation set: sunny condition (after 6 months)

• Test set: unknown condition (after 20 months)

Proposed models:

• Graph without relation: LM = < {WCpatch, WCsift}, ∅ >

• Graph with inter-relation set: VGM = < {WCpatch, WCsift}, {WEinside} >

Case 2: Scene Identification

Objective: a mobile image search platform to enhance tourist experiences (Snap2Tell)

Collection: Singapore Tourist Object Identification Collection (STOIC)

• Training set of 3189 images, test set of 660 images

• 101 Singapore popular landmarks (101 classes)

Proposed models:

• Graph without relation: LM = < {WCpatch}, ∅ >

• Graph with intra-relation sets: VGM = < {WCpatch}, {WEleft_of, WEtop_of} >
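As an illustration of how intra-relation sets such as WEleft_of and WEtop_of could be populated, the sketch below counts relations between neighbouring patches. It rests on assumptions not stated in the poster: patches lie on a regular rectangular grid, each patch already carries one concept label, left_of links a patch to its right-hand neighbour, and top_of links a patch to the one directly below it. The function name and the toy grid are hypothetical.

```python
from collections import Counter


def spatial_relation_sets(grid):
    """Count left_of and top_of relations between adjacent patches.

    grid is a rectangular list of rows, each row a list of concept labels.
    Adjacency-based extraction is an assumption made for this sketch.
    """
    left_of, top_of = Counter(), Counter()
    for r, row in enumerate(grid):
        for k, concept in enumerate(row):
            if k + 1 < len(row):
                # Horizontal neighbour: this patch is left of the next one.
                left_of[(concept, row[k + 1], "left_of")] += 1
            if r + 1 < len(grid):
                # Vertical neighbour: this patch is on top of the one below.
                top_of[(concept, grid[r + 1][k], "top_of")] += 1
    return left_of, top_of


# Toy 2 x 2 grid of patch concepts.
grid = [["c1", "c2"],
        ["c1", "c3"]]
we_left_of, we_top_of = spatial_relation_sets(grid)
```

Each returned `Counter` plays the role of one weighted relation set, so the pair can be placed side by side in the relation component of a graph with intra-relation sets.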

[Figure: example visual graphs. Patch concepts (WCpatch: p1, p2, p4) and SIFT concepts (WCsift: s1, ..., si) with occurrence counts (#n); concept nodes c1 to c4 linked by left_of and top_of edges with counts; WEinside relations between the two concept sets]

Result & Discussion

Collection               #class   LM      VGM              SVM
RobotVision Validation   5        0.579   0.675 (+16.6%)   0.535
RobotVision Test         6        0.416   0.449 (+7.9%)    0.439
STOIC-101                101      0.789   0.809 (+2.5%)    0.744

Categorization of the STOIC-101 and RobotVision image collections
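The bracketed percentages are consistent with the relative improvement of VGM over LM, which a quick check confirms. The short sketch below recomputes them from the scores reported above; the function and variable names are hypothetical.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# (LM, VGM) scores copied from the results reported above.
results = {
    "RobotVision Validation": (0.579, 0.675),
    "RobotVision Test": (0.416, 0.449),
    "STOIC-101": (0.789, 0.809),
}

gains = {name: round(relative_gain(lm, vgm), 1) for name, (lm, vgm) in results.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```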

These results show:

• stability of the visual graph induction process across different types of visual concepts

• benefits of using spatial relationships among different visual concepts

• good matching performance of visual graphs (~ 5 graphs/sec)

Future work:

• adding more visual concepts and integrating new types of relations

• completing the general graph theory and framework for image search
