In Proc. IEEE WHAR, Brussels, Belgium

A STEP TOWARDS THE USE OF WRITER'S PROPERTIES FOR TEXT RECOGNITION

Ali Nosary, Laurent Heutte, Thierry Paquet and Yves Lecourtier

Laboratoire Perception, Systèmes, Information (PSI), Université de Rouen, 76821 Mont Saint Aignan cedex, France

E-mail: [email protected] Tel: +33-2-35146588 Fax: +33-2-35146618

1. Introduction

Teaching machines to read omni-writer handwritten texts requires, on the one hand, sophisticated and highly adapted pattern recognition algorithms and, on the other hand, the management of the various knowledge sources needed to exploit the experience and cognitive structures of the human reader. Conventional systems, which treat the recognition of omni-writer handwritten texts as a problem of character/word recognition, have been pushed to their limits [1-2]. Indeed, numerous studies have investigated the problem of defining well-adapted features for handwritten character recognition. Sophisticated algorithms for cursive handwritten word recognition have also been proposed in the last few years to cope with the problem of word segmentation. Moreover, many approaches use multiple classifier combination techniques to improve word recognition. In practice, conventional systems recognise handwritten words independently of one another in a sequential scheme. Finally, the independent word hypotheses are often analysed in a post-treatment stage using syntactical constraints. These omni-writer systems must be trained on large databases in order to learn the inter-writer properties.

However, recent studies try to counteract the variability in handwriting styles by classifying these styles into families of writers. This makes it possible to select specific recognisers for each handwriting style [3], [4], [5], [6]. Figure 1 illustrates the general diagram of this kind of system. These systems still use only the inter-writer properties, either globally or within a particular group of writers. Furthermore, they still recognise isolated words independently. Put simply, these systems rely on highly adapted algorithms for word recognition but have poor capabilities to take the writer's properties into account during the recognition task. We will show, however, that for a given writer there exists a redundancy of patterns in his handwriting, and that a dependency between treatments can exploit this redundancy to help the recognition task.

Fig. 1 Conventional systems: generic diagram (blocks: Handwritten Document Image, Preprocessing, Handwriting Style Characterization, Holistic Approaches, Analytical Approaches, Classifier Combination, Word Recognition, Post-treatment, Recognized Text).



2. Writer's properties

The fundamental property of handwriting, which makes written communication possible, is that there exist inter-writer invariants: the morphological differences between patterns representing distinct letters are larger than those between different allographs of the same letter (inter-writer variability). We now postulate that each writer draws the same letters using the same patterns (his own handwriting references): we call these references "the writer's invariants". The writer's invariants, reflecting the morphological redundancy of his handwriting, can be defined as the sets of similar patterns or graphemes extracted from his handwriting using a particular segmentation technique. Therefore, when these writer's invariants are far from the inter-writer invariants, conventional approaches will fail to recognise that particular handwriting. In return, we can expect to obtain the writer's particularities only when large samples of his handwriting can be collected, e.g. when dealing with full text recognition.

The detection of the morphological invariants is performed using an automatic classification method which works on a database of graphemes issued from the segmentation of the handwritten text. The graphemes are first normalised to a fixed size. The proposed method then consists of three steps. In the first step, an initial partition of the set of graphemes is obtained using a sequential clustering algorithm [7]. This phase makes it possible to estimate the number of clusters in the grapheme set. For this algorithm two important choices are necessary: a proximity measure and an adequate threshold.

Fig. 2 The different steps in the determination of invariants (blocks: Grapheme Set, Sequential Clustering, k-means Algorithm, Nearest Neighbour Cluster Filtering, Morphological Invariants).
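The first step can be illustrated with a minimal sketch of a one-pass sequential clustering. The paper does not state which proximity measure or threshold it uses, so the Euclidean distance between feature vectors, the running-mean centroid update and the threshold value below are assumptions made for illustration only; the function name `sequential_clustering` is hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def sequential_clustering(graphemes, threshold):
    """One pass over the data: join the nearest cluster if it is close
    enough, otherwise open a new cluster (assumed proximity: Euclidean)."""
    centroids = []  # running mean of each cluster
    members = []    # indices of the graphemes assigned to each cluster
    for idx, g in enumerate(graphemes):
        if centroids:
            best = min(range(len(centroids)), key=lambda c: dist(g, centroids[c]))
            if dist(g, centroids[best]) <= threshold:
                members[best].append(idx)
                n = len(members[best])
                # incremental update of the cluster mean
                centroids[best] = [m + (x - m) / n for m, x in zip(centroids[best], g)]
                continue
        centroids.append(list(g))
        members.append([idx])
    return centroids, members


# Toy feature vectors standing in for size-normalised graphemes.
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
centroids, members = sequential_clustering(toy, threshold=1.0)
```

The number of clusters found, `len(centroids)`, is the estimate that this phase is meant to provide. With a one-pass scheme of this kind the result depends on the threshold and on the order in which the graphemes are presented.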

The second step refines the obtained clusters using a k-means-like algorithm, leading to homogeneous clusters. This algorithm requires prior knowledge of the number of classes in the database and an appropriate initialisation of the class centroids. These parameters are supplied by the first stage (sequential clustering). The third step aims at automatically verifying the clearness of the groups. Each cluster is thus analysed with respect to its nearest neighbour to detect inconsistent graphemes, e.g. graphemes whose nearest neighbours belong to another cluster. This phase is repeated on the invariants until stable clusters are obtained. This unsupervised analysis yields some very specific sets of graphemes with numerous occurrences, while the others contain singular patterns that rarely occur in the text. Figure 3 illustrates an example of invariant groups obtained by this method.
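The second and third steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the refinement is a plain Lloyd-style k-means started from given centroids, the distance is Euclidean, and an "inconsistent" grapheme is simply discarded (the paper says such graphemes are detected, and the figure calls the step a filtering). The names `kmeans_refine` and `filter_inconsistent` are hypothetical.

```python
from math import dist


def kmeans_refine(points, centroids, max_iter=100):
    """Lloyd-style refinement starting from the supplied centroids."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(len(centroids)):
            group = [p for p, lab in zip(points, labels) if lab == c]
            if group:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(group) for col in zip(*group)]
    return labels, centroids


def filter_inconsistent(points, labels):
    """Repeatedly drop points whose nearest neighbour lies in another
    cluster, until no point is dropped (assumed reading of step three)."""
    kept = list(range(len(points)))
    while True:
        consistent = []
        for i in kept:
            others = [j for j in kept if j != i]
            if not others:
                continue  # a lone point has no neighbour to confirm it
            nn = min(others, key=lambda j: dist(points[i], points[j]))
            if labels[nn] == labels[i]:
                consistent.append(i)
        if consistent == kept:
            return kept
        kept = consistent


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, cents = kmeans_refine(pts, [(0, 0), (10, 10)])
stable = filter_inconsistent(pts, labels)
```

In the intended use, the initial centroids and their number would come from the sequential clustering stage rather than being written by hand as in this toy call.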



Fig. 3 Samples of invariant clusters extracted from a handwritten page.

In the following section, we show how these invariant groups can be exploited in a recognition system.

3. Adapting the reading task to the writer

Let us recall that the writer's particularities, called "writer's invariants" in the previous section, are intrinsically used by mono-writer systems. In this case, the invariants are determined during a preliminary learning stage to adapt the recognition task to the writer. Of course, in the case of an omni-writer system, this preliminary stage is no longer possible. However, we believe that the writer's invariants could be helpful if they were used during the recognition task by an omni-writer system. This is why we propose a reading system architecture able to manage both the learning of the writer's invariants and the recognition of his handwriting. In this way, the reading system will be able to adapt itself to the handwriting of the writer. Therefore, as a human reader does, the system will collect and exploit all the knowledge that can be extracted from the handwritten text. In the following paragraph we review the various kinds of morphological and symbolic knowledge that can be exploited by the system. We will show in paragraph 3.3 how a reading system based on these various kinds of knowledge can adapt itself to the writer's handwriting during the recognition task.

3.1 Knowledge modelling

Let us remark that, because of the intrinsic sequentiality between the treatments used in conventional systems, these systems are usually described using functional modelling. As a consequence, knowledge modelling is restricted to modelling the data exchanged between functions. The proposed system relies on the interaction between learning and recognition processes. Since there can no longer be any sequentiality between treatments, functional modelling cannot describe the system architecture. We thus introduce a knowledge modelling that takes the structure of handwriting into account and highlights the data and their associated types of knowledge. Interaction levels between these kinds of knowledge will then be introduced using knowledge sources (functions) to model the desired interactive architecture. Considering that the whole text of the writer is segmented into graphemes using well-known techniques from the literature [8], each grapheme (whether or not it corresponds to a letter) is characterised by:

Intrinsic Morphological Knowledge (IMK): any knowledge that can be extracted from the grapheme pattern alone, e.g. a set of features detected on the grapheme image.

Contextual Morphological Knowledge (CMK): any knowledge about the grapheme pattern that can be extracted from its environment, e.g. the writer's invariant it belongs to and the location in the text of its morphological neighbours.

The following symbolic knowledge about each grapheme can then be provided by different treatments:

Intrinsic Symbolic Knowledge (ISK): any knowledge about the possible letter (label) that can be associated with the grapheme considered alone (e.g. obtained from IMK) using classical recognition schemes that exploit inter-writer invariants.

Contextual Symbolic Knowledge (CSK): any knowledge about the possible letter that can be associated with the grapheme by referring to its environment (e.g. obtained from CMK). Furthermore, the lexical constraints on each grapheme within a word are taken into account to provide additional CSK.
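One possible way to hold these four kinds of knowledge in a program is a small record per grapheme. The sketch below is only an illustration: the field names, the representation of symbolic knowledge as letter-to-score dictionaries, and the rule in `best_letter` that prefers contextual over intrinsic scores are all assumptions, not part of the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Grapheme:
    """One segmented grapheme with its four kinds of knowledge (hypothetical layout)."""
    imk: List[float]                    # IMK: features measured on the grapheme image alone
    invariant_id: Optional[int] = None  # CMK: writer's invariant (cluster) it belongs to
    neighbour_positions: List[int] = field(default_factory=list)  # CMK: text positions of similar graphemes
    isk: Dict[str, float] = field(default_factory=dict)  # ISK: letter scores from the grapheme alone
    csk: Dict[str, float] = field(default_factory=dict)  # CSK: letter scores derived from the context

    def best_letter(self) -> Optional[str]:
        """Assumed rule: use contextual scores when present, else intrinsic ones."""
        scores = self.csk or self.isk
        return max(scores, key=scores.get) if scores else None


g = Grapheme(imk=[0.2, 0.7], invariant_id=3, isk={"a": 0.6, "o": 0.4})
first_guess = g.best_letter()      # based on ISK only
g.csk = {"o": 0.9, "a": 0.1}       # context later contradicts the local decision
revised_guess = g.best_letter()    # based on CSK
```

Keeping ISK and CSK as separate fields reflects the distinction made in the text: the first never changes once the letter recogniser has run, while the second can be revised as more of the text is interpreted.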

3.2 The use of knowledge modelling in conventional approaches

The conventional word recognisers do not exploit any of the contextual knowledge listed in paragraph 3.1. Indeed, the recognition task is mainly based on a letter recognition module that provides letter hypotheses (ISK) for the grapheme image (IMK). Then, at the word level, the use of lexical constraints provides word hypotheses. As one can see in figure 4, the word recognition process is based on independent and local expertise made at the grapheme level.

Fig. 4 Knowledge in conventional systems (blocks: Grapheme Recognition, IMK, ISK, Letter Hypothesis, Lexical Analysis, Word Hypothesis).
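The conventional scheme described above can be sketched in a few lines: each grapheme receives letter hypotheses independently, and a lexicon then ranks the words compatible with them. The scoring rule (product of per-grapheme letter scores), the `floor` value for unseen letters, and the simplification of one grapheme per letter are assumptions chosen for illustration; `word_hypotheses` is a hypothetical name.

```python
from math import prod  # Python 3.8+


def word_hypotheses(isk_per_grapheme, lexicon, floor=1e-6):
    """Rank lexicon words by the product of independent letter scores (ISK)."""
    scored = []
    for word in lexicon:
        if len(word) != len(isk_per_grapheme):
            continue  # simplification: exactly one grapheme per letter
        score = prod(isk.get(ch, floor) for ch, isk in zip(word, isk_per_grapheme))
        scored.append((word, score))
    return sorted(scored, key=lambda ws: ws[1], reverse=True)


# Letter hypotheses produced independently for three graphemes.
isk = [{"c": 0.7, "e": 0.3}, {"a": 0.6, "o": 0.4}, {"t": 0.9, "l": 0.1}]
ranking = word_hypotheses(isk, ["cat", "cot", "eat", "dog", "cart"])
```

Nothing in this computation lets the decision taken for one grapheme influence another grapheme of similar shape elsewhere in the text, which is the limitation the text points out.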

3.3 Exploiting contextual knowledge

Faced with very distorted handwriting, a human reader is usually able to delay the reading of some words until enough symbolic and morphological information (contextual information) has been gathered to confirm the emitted hypotheses. This on-line learning mechanism allows the human reader to adapt himself to each specific handwriting. We state that human expertise is based on the following assumption: “Patterns having similar shapes must be associated with the same symbolic interpretation”. Applying this assumption in an automatic reading system requires two kinds of contextual knowledge: on the one hand, the Contextual Morphological Knowledge (CMK) provided by similar patterns and, on the other hand, the Contextual Symbolic Knowledge (CSK) provided by the interpretation of these similar patterns. Figure 5 illustrates how a recognition system can exploit this information. Assume that handwritten words have already been localised and that, for each of them, the segmentation into graphemes has been performed. Assume also that IMK, CMK and ISK have been extracted for each grapheme. The contextual knowledge can be exploited according to the following steps:

a) for each grapheme within a word, its lexical CSK is derived from the word hypotheses,
b) a global and coherent CSK can be attached to each morphological invariant thanks to the CSKs of similar graphemes,
c) the global CSK of each morphological invariant can be distributed to each grapheme belonging to this cluster,
d) this new symbolic knowledge about each grapheme can then be used to generate or delete word hypotheses.

These four operations are repeated until a global coherence is reached. Finally, thanks to the morphological invariants of the writer and by exploiting the various kinds of knowledge cited above, the reading system is able to take coherent decisions with respect to morphological and lexical constraints. Such a reading system can thus operate what we call a "context-driven recognition".

[Figure 5 diagram labels: Word ... Word n; Lexical Analysis; Grapheme Recognition; Grapheme ... Grapheme nm; IMK, CMK, ISK, CSK; CSK ... CSK nm; Morphological Invariants Ci; Global CSK; CSK Coherence Analysis]

Fig. 5 Illustration of the role of the morphological invariants in the recognition system.
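The loop formed by steps a) to d) can be made concrete with a toy sketch. Everything numerical here is an assumption made for illustration: letter and word scores are treated as normalised weights, the global CSK of an invariant is the sum of its members' lexical CSKs, it is combined with the ISK by multiplication, and "global coherence" is taken to mean that the best word for every position no longer changes. The paper does not specify any of these choices, and all names below are hypothetical.

```python
from math import prod


def normalise(scores):
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else dict(scores)


def context_driven_recognition(words, isk, cluster_of, lexicon, floor=1e-3, max_iter=10):
    """words: lists of grapheme ids; isk: id -> {letter: score};
    cluster_of: id -> invariant id. One grapheme per letter is assumed."""
    belief = {g: dict(s) for g, s in isk.items()}  # current letter scores per grapheme
    decisions = None
    for _ in range(max_iter):
        # d) (re)generate word hypotheses from the current symbolic knowledge
        hyps = [normalise({c: prod(belief[g].get(ch, floor) for g, ch in zip(w, c))
                           for c in lexicon if len(c) == len(w)}) for w in words]
        new_decisions = [max(h, key=h.get) if h else None for h in hyps]
        if new_decisions == decisions:
            break  # taken here as "global coherence"
        decisions = new_decisions
        # a) lexical CSK of each grapheme, derived from the word hypotheses
        lexical = {}
        for w, h in zip(words, hyps):
            for pos, g in enumerate(w):
                csk = {}
                for cand, p in h.items():
                    csk[cand[pos]] = csk.get(cand[pos], 0.0) + p
                lexical[g] = csk
        # b) global CSK of each invariant, pooled over its graphemes
        global_csk = {}
        for g, csk in lexical.items():
            acc = global_csk.setdefault(cluster_of[g], {})
            for letter, p in csk.items():
                acc[letter] = acc.get(letter, 0.0) + p
        # c) distribute the global CSK back to every grapheme of the invariant
        for g in belief:
            shared = normalise(global_csk.get(cluster_of[g], {}))
            belief[g] = normalise({l: isk[g].get(l, floor) * shared.get(l, floor)
                                   for l in set(isk[g]) | set(shared)})
    return decisions


# A writer whose "a" looks like an "o": graphemes 1 and 4 share invariant 7.
words = [[0, 1, 2], [3, 4, 5]]
isk = {0: {"c": 0.9}, 1: {"a": 0.45, "o": 0.55}, 2: {"t": 0.9},
       3: {"h": 0.9}, 4: {"a": 0.45, "o": 0.55}, 5: {"t": 0.9}}
cluster_of = {0: 0, 1: 7, 2: 2, 3: 3, 4: 7, 5: 2}
result = context_driven_recognition(words, isk, cluster_of, ["cat", "cot", "hat"])
```

In this toy case the second word can only be read as "hat", so the shared invariant carries the interpretation "a" back to the first word, where the intrinsic scores alone slightly favour "cot".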

4. Conclusion

The reading activity, considered as an Information Processing System, involves the integration of different processes, knowledge and modalities in order to transform a graphical representation into a symbolic representation. The modelling of this activity has been studied in depth in cognitive sciences [9]. Of particular interest is the model of McClelland and Rumelhart, who introduced the concept of interactivity in their human reading model, the “Interactive Activation Model” [10]. In this model, visual word recognition (perception) is realised by the interaction between three different abstraction levels (feature, letter, word). We have shown in this paper that this concept of interactivity can be extended by introducing an interaction within the grapheme level. This is possible only in the context of handwritten text recognition, where the morphological invariants of a handwriting can be extracted. Thanks to these invariants (the writer's properties), the interactions between and within different abstraction levels can be guaranteed and will allow the automatic reading system to adapt itself to each handwriting.


5. References

[1] V. Govindaraju, R. K. Srihari and S. N. Srihari, Handwritten Text Recognition, Proceedings of conference on Document Analysis Systems (DAS), Germany.
[2] T. Breuel, A system for the offline recognition of handwritten text, International Conference on Pattern Recognition (ICPR), Jerusalem.
[3] J. P. Crettez, A set of handwriting families: style recognition, ICDAR, Montreal.
[4] L. Schomaker, G. Abbink and S. Selen, Writer and writing-style Classification in the recognition of online handwriting, Proceedings of the European Workshop on Handwriting Analysis and Recognition: A European Perspective, London.
[5] A. Leroy, Correlation between handwriting characteristics, In Handwriting and Drawing Research: Basic and applied issues, M. L. Simner and C. G. Leedham.
[6] R. K. Powalka, N. Sherkat, R. J. Whitrow, Recognizer characterisation for combining handwritten recognition results at word level, ICDAR, Montreal.
[7] P. Gader, B. Forster, M. Ganzberger, A. Gillies, M. Wahlen and T[...]
[8] [...] G. Casey, E. Lecolinet, A survey of methods of segmentation, Pattern Analysis and Machine Intelligence.
[9] A. M. Jacobs and J. Grainger, Models of visual word recognition: sampling the state of the art, Journal of Experimental Psychology: Human Perception and Performance, Special Section: Modeling Visual Word Recognition.
[10] J. L. McClelland and D. E. Rumelhart, An interactive activation model of context effects in letter perception, Psychological Review.