Grammatical inference

Learning from Text
Colin de la Higuera
University of Nantes

Zadar, August 2010


Acknowledgements

Laurent Miclet, Jose Oncina, Tim Oates, Anne-Muriel Arigon, Leo Becerra-Bonache, Rafael Carrasco, Paco Casacuberta, Pierre Dupont, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Jean-Christophe Janodet, Satoshi Kobayashi, Thierry Murgue, Frédéric Tantini, Franck Thollard, Enrique Vidal, Menno van Zaanen,...

http://pagesperso.lina.univ-nantes.fr/~cdlh/
http://videolectures.net/colin_de_la_higuera/

Outline

1. Motivations, definition and difficulties
2. Some negative results
3. Learning k-testable languages from text
4. Learning k-reversible languages from text
5. Conclusions

http://pagesperso.lina.univ-nantes.fr/~cdlh/slides/
Chapters 8 and 11

1 Identification in the limit

[Diagram: the presentations Pres ⊆ ℕ→X, a class of languages 𝓛, a class of grammars 𝓖; "yields" maps Pres to 𝓛, a learner a maps Pres to 𝓖, the naming function L maps 𝓖 to 𝓛]

L(a(ϕ)) = yields(ϕ)

ϕ(ℕ) = ψ(ℕ) ⇒ yields(ϕ) = yields(ψ)

Learning from text

• Only positive examples are available
• Danger of over-generalization: why not return Σ*?
• The problem is “basic”:
  ○ Negative examples might not be available
  ○ Or they might be heavily biased: near-misses, absurd examples…
• Base line: all the rest is learning with help
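The over-generalization danger can be made concrete with a small sketch. Everything named below (`sigma_star`, `only_as`, `consistent`) is illustrative and not from the slides; languages are modelled as membership predicates over an assumed alphabet {a, b}.

```python
from typing import Callable, Iterable

# A language is modelled as a membership predicate (an illustrative choice).
Language = Callable[[str], bool]

def sigma_star(w: str) -> bool:
    """Sigma* over the assumed alphabet {a, b}: accepts every string."""
    return set(w) <= {"a", "b"}

def only_as(w: str) -> bool:
    """The language a*: a much smaller hypothesis."""
    return set(w) <= {"a"}

def consistent(hypothesis: Language, positives: Iterable[str]) -> bool:
    """A hypothesis is consistent with a text if it contains every example seen."""
    return all(hypothesis(w) for w in positives)

text = ["", "a", "aa", "aaaa"]
print(consistent(only_as, text))
print(consistent(sigma_star, text))
```

Both hypotheses pass the check on this text: with positive data alone, nothing ever contradicts Σ*, so consistency cannot be the only criterion.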

GI as a search problem

[Figure: the search space; labels: PTA, ?, Σ]
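PTA stands for prefix tree acceptor: the tree-shaped automaton that accepts exactly the strings of a positive sample, and therefore the most specific point of the search space. A minimal sketch of building one follows; the function names and the dictionary-based representation are assumptions made for illustration, not taken from the slides.

```python
from typing import Dict, List, Set, Tuple

Transitions = Dict[Tuple[int, str], int]

def build_pta(sample: List[str]) -> Tuple[Transitions, Set[int]]:
    """Return (transitions, final_states) of a prefix tree acceptor.

    State 0 is the initial state; every distinct prefix of the sample
    gets its own state.
    """
    transitions: Transitions = {}
    finals: Set[int] = set()
    next_state = 1
    for word in sample:
        state = 0
        for symbol in word:
            if (state, symbol) not in transitions:
                transitions[(state, symbol)] = next_state
                next_state += 1
            state = transitions[(state, symbol)]
        finals.add(state)  # the state reached by a sample string is accepting
    return transitions, finals

def accepts(transitions: Transitions, finals: Set[int], word: str) -> bool:
    """Run the (deterministic) PTA on a word."""
    state = 0
    for symbol in word:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in finals

trans, fin = build_pta(["a", "ab", "bb"])
print(accepts(trans, fin, "ab"), accepts(trans, fin, "b"))
```

The PTA accepts the sample and nothing else; generalization then amounts to moving away from it through the search space, for instance by merging states.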

Questions?

• Data is unlabelled…
• Is this a clustering problem?
• Is this a problem posed in other settings?

2 The theory

• Gold 67: No super-finite class can be identified from positive examples (or text) only
• Necessary and sufficient conditions for learning
• Literature: inductive inference, ALT series, …

Limit point

• A class 𝓛 of languages has a limit point if there exists an infinite sequence (L_n), n∈ℕ, of languages in 𝓛 such that L_0 ⊂ L_1 ⊂ … ⊂ L_n ⊂ …, and there exists another language L ∈ 𝓛 such that L = ∪n∈ℕ L_n
• L is called a limit point of 𝓛

L is a limit point

[Figure: nested languages L0, L1, L2, L3, …, Li, …, all contained in L]

Theorem

If 𝓛 admits a limit point, then 𝓛 is not learnable from text.

Proof: Let s^i be a presentation in length-lex order for L_i, and s be a presentation in length-lex order for L. Then ∀n∈ℕ ∃i such that ∀k≤n, s^i_k = s_k.

Note: having a limit point is a sufficient condition for non-learnability, not a necessary condition.
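The proof idea can be checked on a small example. The class used here is an assumption chosen for illustration, not taken from the slides: L_i = {a^j : j ≤ i} for each i, together with the limit point L = a*. Since a presentation is an infinite sequence, the sketch also assumes that the presentation of a finite L_i repeats its last string once L_i is exhausted.

```python
from itertools import count, islice
from typing import Iterator, List

def present_limit() -> Iterator[str]:
    """Length-lex presentation s of L = a*: '', 'a', 'aa', ..."""
    for k in count():
        yield "a" * k

def present_member(i: int) -> Iterator[str]:
    """Length-lex presentation s^i of L_i = {a^j : j <= i}.

    Once L_i is exhausted the last string is repeated (assumed convention).
    """
    for k in count():
        yield "a" * min(k, i)

def prefix(stream: Iterator[str], n: int) -> List[str]:
    """The first n+1 elements s_0 .. s_n of a presentation."""
    return list(islice(stream, n + 1))

# For every n, choosing i = n gives s^i_k = s_k for all k <= n.
for n in range(5):
    assert prefix(present_member(n), n) == prefix(present_limit(), n)
```

Whatever finite prefix of s a learner has seen, that prefix is also the start of a presentation of some L_i, which is why the learner cannot safely commit to L.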

Mincons classes

• A class is mincons if there is an algorithm which, given a sample S, builds a G ∈ 𝓖 such that S ⊆ L ⊆ L(G) ⇒ L = L(G)
• I.e. there is a unique minimum (for inclusion) consistent grammar

Accumulation point (Kapur 91)

A class 𝓛 of languages has an accumulation point if there exists an infinite sequence (S_n), n∈ℕ, of sets such that S_0 ⊆ S_1 ⊆ … ⊆ S_n ⊆ …, and L = ∪n∈ℕ S_n ∈ 𝓛, and for any n∈ℕ there exists a language L_n’ in 𝓛 such that S_n ⊆ L_n’ ⊂ L.

The language L is called an accumulation point of 𝓛

L is an accumulation point

[Figure: nested sets S0, S1, S2, S3, …, Sn, a language Ln’, and the language L]

Theorem (for mincons classes)

𝓛 admits an accumulation point iff 𝓛 is not learnable from text

Infinite Elasticity

• If a class of languages has a limit point there exists an infinite ascending chain of languages L_0 ⊂ L_1 ⊂ … ⊂ L_n ⊂ …
• This property is called infinite elasticity

Infinite Elasticity

[Figure: strings x0, x1, x2, x3, …, xi, xi+1, xi+2, xi+3, xi+4]

Finite elasticity

𝓛 has finite elasticity if it does not have infinite elasticity

Theorem (Wright)

If 𝓛(𝓖) has finite elasticity and is mincons, then 𝓖 is learnable.

Tell tale sets

[Figure: a tell-tale set T_G with elements x1, x2, x3, x4 inside L(G); a language L(G’) marked “Forbidden”]

Theorem (Angluin)

𝓖 is learnable iff there is a computable partial function ψ: 𝓖×ℕ→Σ* such that:

1) ∀n∈ℕ, ψ(G,n) is defined iff G∈𝓖 and L(G)≠∅
2) ∀G∈𝓖, T_G = {ψ(G,n): n∈ℕ} is a finite subset of L(G) called a tell-tale subset
3) ∀G,G’∈𝓖, if T_G ⊆ L(G’) then L(G’) ⊄ L(G)
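Condition 3 can be illustrated on a toy case. The sketch below assumes a finite family of finite, non-empty languages given as explicit sets, which is far simpler than the general setting of the theorem; the function names and the brute-force search are illustrative and not from the slides.

```python
from itertools import combinations
from typing import FrozenSet, List

Lang = FrozenSet[str]

def is_tell_tale(t: Lang, lang: Lang, family: List[Lang]) -> bool:
    """t is a subset of lang, and no language of the family
    contains t while being a proper subset of lang (condition 3)."""
    return t <= lang and not any(t <= other and other < lang for other in family)

def smallest_tell_tale(lang: Lang, family: List[Lang]) -> Lang:
    """Brute-force search for a smallest non-empty tell-tale subset of lang."""
    if not lang:
        raise ValueError("the sketch assumes a non-empty language")
    for size in range(1, len(lang) + 1):
        for candidate in combinations(sorted(lang), size):
            t = frozenset(candidate)
            if is_tell_tale(t, lang, family):
                return t
    return lang  # not reached: lang itself always satisfies the condition

family = [frozenset({"a"}), frozenset({"a", "aa"}), frozenset({"a", "aa", "aaa"})]
for lang in family:
    print(sorted(lang), "->", sorted(smallest_tell_tale(lang, family)))
```

In this chain each language is told apart from its proper sublanguages by its longest string. In an infinite chain with a limit point, no finite subset of the limit language can play that role.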

Proposition (Kapur 91)

A language L in 𝓛 has a tell-tale subset iff L is not an accumulation point.

(for mincons)

Summarizing

• Many alternative ways of proving that identification in the limit is not feasible
• Methodological-philosophical discussion
• We still need practical solutions

3 Learning k-testable languages

P. García and E. Vidal. Inference of K-testable languages in the strict sense and applications to syntactic pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(9):920–925, 1990

P. García, E. Vidal, and J. Oncina. Learning locally testable languages in the strict sense. In Workshop on Algorithmic Learning Theory (ALT 90), pages 325–338, 1990

Definition

Let k≥0, a k-testable language in the strict sense (k-TSS) is a 5-tuple Z_k = (Σ, I, F, T, C) with:

• Σ a finite alphabet
• I, F ⊆ Σ^(k-1) (allowed prefixes of length k-1 and suffixes of length k-1)
• T ⊆ Σ^k (allowed segments)
• C ⊆ Σ