A Rough Set Formalization of Quantitative Evaluation with Ambiguity
Patrick Paroubek and Xavier Tannier, LIMSI-CNRS, {pap,xtannier}@limsi.fr

CONTEXT
Evaluation: of Technology, Objective, Quantitative, Black-Box.
[Figure: Reference, Systems and Evaluation blocks, with the labels Overlap, Ambiguity, Graded Relevance and Decision.]

OBJECTIVES
– No formal framework exists for studying the evaluation paradigm; we propose to lay the foundation for such a model, based on the mathematical notion of "rough sets".
– We propose the notion of potential performance space to describe the performance variations that correspond to the ambiguity present in the hypothesis data.
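
A ROUGH SET MODEL OF AMBIGUITY
Let $\mathcal{A} = (U; A)$ be an information system ($\mathcal{A} \subseteq U \times A$), and let $B \subseteq A$ and $X \subseteq U$. The lower and upper approximations of $X$ are:
$\underline{B}X = \{x \mid [x]_B \subseteq X\}$
$\overline{B}X = \{x \mid [x]_B \cap X \neq \emptyset\}$
$\alpha_B(X) = \frac{|\underline{B}X|}{|\overline{B}X|}$ is the accuracy approximation coefficient.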
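The lower and upper approximations can be illustrated with a minimal Python sketch. The function names, the dictionary-based attribute table and the toy data are assumptions made for illustration only; they are not taken from the poster. The value 1.0 returned for an empty upper approximation is likewise a convention assumed here.

```python
from collections import defaultdict

def equivalence_classes(universe, table, attrs):
    """Partition the universe: objects with equal values on attrs are indiscernible."""
    classes = defaultdict(set)
    for x in universe:
        classes[tuple(table[x][a] for a in attrs)].add(x)
    return list(classes.values())

def approximations(universe, table, attrs, target):
    """Return the lower and upper approximations of target w.r.t. attrs."""
    lower, upper = set(), set()
    for cls in equivalence_classes(universe, table, attrs):
        if cls <= target:      # class lies entirely inside X
            lower |= cls
        if cls & target:       # class overlaps X
            upper |= cls
    return lower, upper

def accuracy(lower, upper):
    """alpha_B(X) = |lower| / |upper|; 1.0 if the upper approximation is empty (assumed convention)."""
    return len(lower) / len(upper) if upper else 1.0

# Hypothetical toy data: six objects described by a single attribute "tag".
U = {1, 2, 3, 4, 5, 6}
table = {1: {"tag": "N"}, 2: {"tag": "N"}, 3: {"tag": "V"},
         4: {"tag": "V"}, 5: {"tag": "A"}, 6: {"tag": "A"}}
X = {1, 2, 3}
lower, upper = approximations(U, table, ["tag"], X)
alpha = accuracy(lower, upper)
print(sorted(lower), sorted(upper), alpha)  # [1, 2] [1, 2, 3, 4] 0.5
```

Object 3 cannot be told apart from object 4 on the chosen attribute, so the class {3, 4} belongs to the upper approximation only, which is what lowers the accuracy coefficient to 0.5.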
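
EVALUATION
[Figure: hypothesis H and reference R as overlapping subsets of X, with regions labelled tp, fp, fn and tn; a second diagram shows R with several copies of H.]
$acc = \frac{|(H \cap R) \cup (X \setminus (H \cup R))|}{|X|}$
$err = \frac{|(H \cup R) \setminus (H \cap R)|}{|X|}$
$j = \frac{|H \cap R|}{|H \cup R|}$
$f = \frac{1}{\frac{\alpha}{p} + \frac{1 - \alpha}{r}}, \quad 0 < \alpha < 1$
$p = \frac{|H \cap R|}{|H|}, \quad r = \frac{|H \cap R|}{|R|}$
$\forall$ protocols, $performance = \varphi(|TP|, |FP|, |FN|, |TN|)$
If we consider an equivalence relation $\approx$ instead of $=$, $\alpha$ can quantify the amount of change, e.g. for precision:
$p \cdot (1 - \alpha_\approx(H)) \le p_\approx \le p \cdot (1 + \alpha_\approx(H))$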
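As a worked illustration of the set-based measures and of the precision interval, here is a small Python sketch. The function names, the parameter names `weight` (the α of the f formula) and `alpha_h` (standing for α≈(H)), and the example sets are hypothetical; exact fractions are used only to keep the arithmetic easy to check.

```python
from fractions import Fraction

def measures(H, R, X, weight=Fraction(1, 2)):
    """Set-based measures; assumes H and R are non-empty subsets of X."""
    tp = H & R
    tn = X - (H | R)
    p = Fraction(len(tp), len(H))
    r = Fraction(len(tp), len(R))
    # Weighted harmonic mean of p and r; defined as 0 here when there is no true positive.
    f = 1 / (weight / p + (1 - weight) / r) if tp else Fraction(0)
    return {
        "acc": Fraction(len(tp | tn), len(X)),
        "err": Fraction(len((H | R) - (H & R)), len(X)),
        "j": Fraction(len(tp), len(H | R)),
        "p": p, "r": r, "f": f,
    }

def precision_interval(p, alpha_h):
    """Interval p(1 - alpha) <= p_approx <= p(1 + alpha), as stated above."""
    return p * (1 - alpha_h), p * (1 + alpha_h)

# Hypothetical example: 10 items, 4 hypothesised, 5 in the reference, 2 shared.
X = set(range(10))
H = {0, 1, 2, 3}
R = {2, 3, 4, 5, 6}
m = measures(H, R, X)
bounds = precision_interval(m["p"], Fraction(1, 4))
print(m["p"], m["r"], m["f"], m["j"], m["acc"], m["err"])  # 1/2 2/5 4/9 2/7 1/2 1/2
print(bounds)  # (Fraction(3, 8), Fraction(5, 8))
```

With `weight` set to 1/2, f reduces to the usual balanced harmonic mean 2pr/(p + r), which gives 4/9 for these sets.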

POTENTIAL PERFORMANCE SPACE
[Figure: decision tree starting at the First Decision and continuing through successive Decision nodes, each branching into OK and NO; the legend distinguishes the current precision w.r.t. decisions made from the remaining PPS w.r.t. decisions made, both shown on a Precision axis.]
If ambiguity is allowed in the hypothesis data, one can ask what limit performance range is defined by the failures or successes in disambiguating the remaining (partially) undecided annotations. This potential variation defines what we call the potential performance space (PPS).
The measure of decision gauges the level of annotation disambiguation:
$D = \frac{|\{x \mid [x]_\approx = \{x\}\}|}{|H/\approx|} = \frac{|\{x \mid |[x]_\approx| = 1\}|}{|H/\approx|}$
The amount of performance variability due to partially disambiguated hypothesis data can be quantified with:
$\alpha_R(tp) = \frac{|\underline{R}tp|}{|\overline{R}tp|}$

EXAMPLE: PASSAGE PARSING ANNOTATIONS
$\rho = \bigcup_{j=1}^{m} \rho_j, \quad m \in \mathbb{N}$
$\rho_1 = \bigcup_{k=1}^{q} \{r_l \mid l \in \mathbb{N},\ r_l \subset \mathcal{P}(S^k \times A)\}$
$\rho_i = \bigcup_{k=1}^{u} \{r \subset \mathcal{P}((S \cup \rho_{x_1}) \times (S \cup \rho_{x_2}) \times \ldots \times (S \cup \rho_{x_k}) \times A),\ 1 \le x_k < i\}$
$S$: segmentation into word units
$A$: chunk and relation labels
$\rho_1$: word-chunk label associations, word-word or word-chunk relations
$\rho_2$: relations between chunks only
$\rho$: $\rho_1 \cup \rho_2$
[Figure: the phrase "to other crops, such as cotton, soybean and rice" annotated with chunk labels GP and GN and with the relations comp 1/2, * comp 2/2, mod-n 1/3, * mod-n 2/3, * mod-n 3/3 and coord.]
H ∩ R = continuous line relations
H R = dashed line relations
Decision for relations $= \frac{2}{4}$
$\alpha_R(tp) = \frac{4}{7}$
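
The measure of decision and the idea of a potential performance space can be sketched in Python as follows. This is only one possible reading, under stated assumptions: each equivalence class of the hypothesis is assumed to resolve to exactly one annotation, precision is counted per class, and the relation identifiers and the reference set below are invented for illustration. They are not the annotations of the poster's figure; the data is merely chosen so that two of four classes are singletons.

```python
from fractions import Fraction

def measure_of_decision(classes):
    """D = number of singleton classes divided by the number of classes in H/≈."""
    decided = sum(1 for c in classes if len(c) == 1)
    return Fraction(decided, len(classes))

def precision_space(classes, reference):
    """Lowest and highest precision reachable when one candidate is kept per class.

    Assumption of this sketch: a class counts as correct in the worst case only if
    all its candidates are in the reference, and in the best case if at least one is.
    """
    n = len(classes)
    low = sum(1 for c in classes if c <= reference)
    high = sum(1 for c in classes if c & reference)
    return Fraction(low, n), Fraction(high, n)

# Hypothetical hypothesis data: four relation slots, two of them still ambiguous.
classes = [
    {"comp:a"},                          # decided
    {"coord:a"},                         # decided
    {"mod-n:a", "mod-n:b", "mod-n:c"},   # three competing candidates
    {"comp:b", "comp:c"},                # two competing candidates
]
reference = {"comp:a", "mod-n:b", "comp:c"}

D = measure_of_decision(classes)
space = precision_space(classes, reference)
print(D)      # 1/2
print(space)  # (Fraction(1, 4), Fraction(3, 4))
```

Each further disambiguation decision either confirms or rules out a candidate, so the interval returned by `precision_space` can only narrow as D approaches 1.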