JEP-TALN 2004, Arabic Language Processing, Fez, 19-22 April 2004

JEP-TALN 2004 - session on Arabic Language Processing

Arabic-English Machine Translation Systems: Discrepancies and Implications

Dr. Sameh Al-Ansary and Prof. Seham El-Kareh

Alexandria University, Phonetics Department, Faculty of Arts, Alexandria University, Al-Shatby, Alexandria, Egypt, salansary@link.net
Alexandria University, Phonetics Department, Faculty of Arts, Alexandria, Egypt, selkareh@yahoo.fr

Abstract

This paper analyzes certain translation software and focuses on linguistic phenomena that are often claimed to be the most problematic areas in machine translation. It will be shown that equipping the system with an adequate parser empowers the translation software. The adopted parsing tool concentrates on 4 important issues: constituent vs sentence recognition, coordination, preposition semantic function and Prepositional Phrase (PP) attachment. The parser presented is based on a sample corpus of Modern Standard Arabic (MSA) and a formal grammar that analyzes MSA structures automatically. The formal description is implemented using Affix Grammar over Finite Lattices (AGFL), representing a linguistic approach in terms of functions and categories to account for the sequences inside the structures together with the relations governing these sequences; accuracy exceeded [...]. It is necessary at this point to make two important issues clear. First, the present stage of the work is limited to Noun Phrase structures. Second, the adequacy of the grammar is limited to the extent covered by the sample corpus under study.

Keywords

Arabic and computers, Arabic Machine Translation, Arabic Language Processing, Parsing

Sameh Al-Ansary and Seham El-Kareh

1 Introduction

Machine translation research has often been criticized for ignoring developments in linguistic theory. There would appear to be a wide communication gap between theoretical linguistics and practical machine translation research. Some observers believe that there are good reasons for this situation: until recently, linguistic theories had not provided adequate accounts of all aspects of language use. A good linguistic theory may have given a convincing analysis of, say, quantifiers or coordination, but not explained all the peculiarities of actual usage in the coverage required for machine translation. However, recent theories such as Lexical Functional Grammar (Shihadah et al.) or Generalized Phrase Structure Grammar (Gazdar et al.) and their various derivatives have set out explicitly to cover as broad a range as possible, not only within one specific language but also for different types of languages.

In the past, and unfortunately it is generally true today, much of linguistic theory was based on phenomena observed in English, the language of the majority of theoretical linguists. This neglect of other languages is why linguistic theory has had less impact on Machine Translation than some observers might have expected. In other words, linguistic theories have rarely addressed questions of contrastive linguistics, i.e. the way in which different languages use different means to express similar meanings and intentions. Such questions are of course at the heart of Machine Translation.

2 Linguistic and Formal Framework

In this section, the linguistic and formal framework in which the NP category occurs is explained briefly. The NP represents the core of our parsing tool, which can be used efficiently in translating Arabic structures into English.

Analyzing the NP linguistically, a number of functions can be distinguished. These functions are: the head, the determiners and the postmodifier. The individual behavior of these functions ranges between determination and postmodification of the nucleus of the NP (the element that occurs in the head function).

The NP, in its simplest form, consists only of a head. Even in a more or less complicated NP, only one head function is to be distinguished. The head is specified and defined as the unit which is marked for its function at the next higher level of description and cannot be deleted without affecting the meaning of the constituent (Al-Ansary; Ditters). By this definition, the head function of an NP can only be realized by the category noun. Owens distinguished several subcategories able to realize this function (cf. Owens). According to our subclassification of nouns, a common noun, pronoun, proper noun, present participle, passive participle, adjectival noun, standard infinitive (verbal noun), noun of title, etc. are examples of heads of an NP (for more details cf. Al-Ansary, and El-Kareh and Al-Ansary).

In extension of the head, an element can function as a determiner to the head of the NP. The element occupying this function may occur before or after the head. This leads us to differentiate between what is called a “predeterminer” (PREDET) and a “postdeterminer” (POD). However, it has to be kept in mind that they are mutually exclusive in relation to the head, i.e. they cannot occur together. The category in the function of predeterminer is mainly the prefixed article “الـ”, while the category in the function of postdeterminer is a normal NP marked for genitive case.

The postmodifier function is always placed after the head of the NP, and is for this reason called “post modifier” (POM). In our approach, postmodification can, according to its categorial realization, be classified into PPOM, ADJPOM, NPOM or ADVPOM, realized by a prepositional phrase, adjective phrase, noun phrase and adverbial phrase respectively.

An additional element can be distinguished, functioning as a complement of the head of the NP (COMPL). Like postmodification, the complement function is always realized after the head. However, it is not recommended to treat both of them as postmodification, since the complement has a particular syntactic function in relation to the head. For example, a post modifier follows its head with respect to ‘definiteness’, ‘number’, ‘gender’ and ‘case’. There is no direct relation between the head and its complement as far as agreement is concerned. On the contrary, the head imposes specific values on its complement.

Formally, the linguistic description of the NP in MSA can be represented by means of context-free rules. To implement the formal grammar, a two-level approach to syntactic description was used by means of the AGFL (Affix Grammar over Finite Lattices) formalism. In this way, ROOT is the start symbol in our grammar and ROOT is rewritten as the phrasal category NP. The NP is in its turn rewritten as a sequence of optional and obligatory functional elements, which constitutes the first level of description (syntactic level). The description alternates between functions and categories until the description in lexical terms has been reached. Starting with our initial label ROOT, the first rule is:

ROOT: NP

A number of restrictions are applied via some linguistic features to determine the dependencies and relations between the elements of the NP, which constitutes the second level of description (affix level). Thus, our start label can be revised as:

ROOT: NP (definiteness, number, gender, person, case)

By means of the nonterminal affix variables, the elements of the first and second levels of description are dealt with. At the first level, nonterminal elements are arranged in phrase structure rules called syntax rules or hyper rules. These phrase structure rules are context-free rules describing syntactic structures. As has been seen with ‘definiteness’, number, gender, person and case, other affix variables can be attached to the nonterminals of the first level. These meta affixes constitute the second level of description. For more details about conventions for writing rules of AGFL, cf. Koster.
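The two levels of description above can be illustrated with a small sketch. It is written in Python rather than AGFL, and it is not the authors' grammar: the class names, the feature values and the `check_np` helper are hypothetical. The first level is a list of functional elements, and the second level is a set of affix values that must satisfy the constraints stated in this section (one head, PREDET and POD mutually exclusive, POD in the genitive). The sketch applies agreement with the head to ADJPOM only, which is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Affixes:
    """Second-level (affix) values attached to a first-level element."""
    definiteness: str  # "def" or "indef"
    number: str        # "sg", "dual" or "pl"
    gender: str        # "masc" or "fem"
    case: str          # "nom", "acc" or "gen"


@dataclass(frozen=True)
class Element:
    """First-level functional element of an NP."""
    function: str      # e.g. "NHEAD", "PREDET", "POD", "ADJPOM", "PPOM"
    form: str
    affixes: Optional[Affixes] = None


def check_np(elements: List[Element]) -> List[str]:
    """Return the violated constraints; an empty list means well-formed."""
    heads = [e for e in elements if e.function == "NHEAD"]
    if len(heads) != 1:
        return ["an NP has exactly one head"]
    head = heads[0]
    problems = []
    functions = {e.function for e in elements}
    if "PREDET" in functions and "POD" in functions:
        problems.append("PREDET and POD are mutually exclusive")
    for e in elements:
        if e.function == "POD" and (e.affixes is None or e.affixes.case != "gen"):
            problems.append("POD must be an NP marked for genitive case")
        # Assumption: only adjectival postmodifiers are checked for agreement.
        if e.function == "ADJPOM" and e.affixes != head.affixes:
            problems.append("ADJPOM must agree with the head")
    return problems


# Illustrative values for "الجبهة الإسلامية" (feature values are assumed).
front = Affixes("def", "sg", "fem", "gen")
np_elements = [
    Element("PREDET", "الـ"),
    Element("NHEAD", "جبهة", front),
    Element("ADJPOM", "إسلامية", front),
]
print(check_np(np_elements))  # []
```

In AGFL itself these checks would be expressed through affix variables shared between the nonterminals of a rule; the explicit comparisons here only make the agreement conditions visible.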

3 Important Issues

In this section, some linguistic phenomena that highlight important problematic issues in the field of Arabic-English Machine Translation will be presented. These linguistic phenomena are: constituent vs sentence recognition, coordination, preposition semantic function and PP attachment. It will be demonstrated how weak the existing translation systems are and how the parsing accuracy of the parser adopted can contribute to reaching an acceptable translation.

3.1 Constituent vs Sentence Recognition

One of the important issues that a Machine Translation system should take care of is the ability to differentiate between a phrase and a sentence. In Arabic, a nominal sentence can be composed of a topic and a comment. The comment can be realized by a PP, giving a complete meaning to the whole structure (the sentence). Consider the example in (1).

(1) محمد في المدرسة
    moħammadun fi ʔalmadrasati
    ‘Mohammad is at school’

In this example, “محمد” (NP) is the topic of the sentence, while “في المدرسة” (PP) is the comment of the sentence. Thus, the translation given above can be considered an acceptable translation of this sentence. This kind of sentence structure can cause a formal ambiguity with phrase structures. Consider the examples in (2a) and (3a) with their automatic translations (Program 1 is InterNet Translation Service from CIMOS Company, http://www.cimos.com; Program 2 is Ajeeb, http://tarjim.ajeeb.com/ajeeb).

(2) a. القبض على ٥ إرهابيين
       ʔalqabdu ʕala xamsati ʔirhabijjin
       Program 1: The arrest is on terrorists
       Program 2: taking possession of terrorists
    b. The arrest of 5 terrorists

(3) a. تمكن الفكر المتطرف من أصحابه
       tamakkunu ʔalfikri ʔalmotatarrifi min ʔasħabihi
       Program 1: A mastery the extreme thinking is from his owners
       Program 2: The extreme thinking take possession of his companions
    b. Extremist thinking has dominated them

Each of (1), (2a) and (3a) consists of NP + PP; however, the structure in (1) is a sentence while the structures in (2a) and (3a) are NPs. The false translations arise because the software is unable to recognize and differentiate between constituent structure and sentence structure. The contribution of our parsing tool resides in using our linguistic and formal strategies to give a reliable representation. This can be examined through the labelled tree representations in (4); consequently, reliable translations such as those given in (2b) and (3b) can be obtained.

(4) a. [labelled tree diagram]
    b. [labelled tree diagram]

3.2 Coordination

Coordination is a problematic issue in almost all languages. It has been assumed that coordination takes place at the categorial level and that it is dominated by a single function node. It has also been assumed that coordination concerns similar categories. In the present stage of the formal description of coordination, the work has been limited to detecting the boundaries of the coordinated NPs, without going into the details of revealing all the information that can result from coordinating one NP with another and how it can affect the identity of the whole constituent.

In writing a formal grammar for describing coordination, the grammar should consider the result of coordinating the features of the first NP with those of the second NP. For example, considering definiteness, when the first NP is definite and the second is indefinite, the coordination will result in a definite constituent. Thus, the formal description of coordination should deal with features like definiteness, number, gender and person to reveal how they can affect the whole constituent. For details about this kind of investigation, cf. Ditters.

To enable our description to detect the boundaries of two coordinated NPs automatically, all the information available at hand has been made use of. Since the coordinator separates two NPs, the grammar tries to get all possible alternative combinations that can be expressed before and after the coordinator. Thus, a more accurate description of the NP that precedes the conjunction has been needed, to enable a precise division on the one hand and to eliminate the rest of the alternatives on the other. For the NPs analyzed so far, the affixes ‘definiteness’, ‘subclass’ and ‘case’ have been used to control the limits of the NPs being coordinated.

In (5a), a structure can be seen that has a coordination at a certain linguistic level, i.e. at the postdetermination level of the Nominal Head ‘مجلسي’. A full parse of this structure as a labelled tree is presented in (6). An attempt was made to translate this structure automatically; the result can be seen in (5a).

(5) a. عدد من أعضاء مجلسي الشيوخ والنواب
       ʕadadun min ʔaʕdaʔi maglisajj ʔašijuxi wannowwabi
       Program 1: Number of the sheikhs councils members and the deputies
       Program 2: He wailed from the councilors of the whitebeards and the vicegerents
    b. A number of the members of sheikhs and the deputy councils

(6) [labelled tree diagram]

It is very clear from the acceptable translation in (5b) that the boundaries of the coordinated NPs are not precise. The brackets in (7) show the boundaries of the coordinated parts according to that translation.

(7) عدد من [[أعضاء مجلسي الشيوخ] و [النواب]]

However, the parsing tool adopted, relying on the subclass, definiteness and case of the coordinates and on the number of the Nominal Head ‘maglisajj’ to which the coordinates are postdetermining, could reliably detect the coordinated parts as shown in (8).

(8) عدد من أعضاء مجلسي [[الشيوخ] و [النواب]]

Consequently, relying on the parsing tool adopted, a reliable translation can be obtained such as that given in (5b).
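The boundary detection described in this section can be sketched as a search over the words that precede the coordinator. The sketch below is a toy heuristic in Python, not the AGFL grammar used by the parser. The `Word` class, the feature values assigned to each word and the `left_conjunct_start` helper are assumptions made for illustration. A candidate left conjunct starts at any word whose subclass, definiteness and case match the head of the right conjunct; a dual head directly before a candidate is taken as evidence that the head expects two coordinated postdeterminers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    form: str
    subclass: str       # e.g. "common", "prep"
    definiteness: str   # "def", "indef" or "-"
    case: str           # "nom", "acc", "gen" or "-"
    number: str = "sg"  # "sg", "dual" or "pl"


def left_conjunct_start(before: List[Word], right_head: Word) -> Optional[int]:
    """Index in `before` at which the left conjunct starts, or None."""
    key = (right_head.subclass, right_head.definiteness, right_head.case)
    matches = [i for i, w in enumerate(before)
               if (w.subclass, w.definiteness, w.case) == key]
    if not matches:
        return None
    # A dual head directly before a candidate expects two postdeterminers.
    for i in matches:
        if i > 0 and before[i - 1].number == "dual":
            return i
    # Fallback (an assumption): the candidate nearest the coordinator.
    return matches[-1]


# Words before the coordinator "و"; the feature values are illustrative.
before = [
    Word("عدد", "common", "indef", "nom"),
    Word("من", "prep", "-", "-"),
    Word("أعضاء", "common", "def", "gen", "pl"),
    Word("مجلسي", "common", "def", "gen", "dual"),
    Word("الشيوخ", "common", "def", "gen", "pl"),
]
right = Word("النواب", "common", "def", "gen", "pl")

start = left_conjunct_start(before, right)
print([w.form for w in before[start:]])  # ['الشيوخ']
```

With these assumed features, three words match the right conjunct, and the dual head “مجلسي” selects the narrow reading in which only “الشيوخ” and “النواب” are coordinated.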

3.3 Preposition Semantic Function

It is very important for a good translation system to detect automatically the meaning of a preposition in a structure. This issue is critical in translation from and to Arabic because the same preposition in Arabic can convey more than one meaning. Consider the following examples:

(9) a. رسالتان من مبارك لزايد وجابر الأحمد
       risalatan min mubarak li zajed wa gabir ʔalʔahmad
       ‘Two letters from Mubarak to Zayed and Gaber Al-Ahmed’
    b. أعلى نسبة مشاهدة لحوار تليفزيوني في العالم
       ʔaʕla nisbati mušahadatin li hiwarin tilifizjonijjin fi ʔalʕalami
       ‘A highest ratio of viewing for a TV talk in the world’

In (9a), the preposition “لـ” prefixed to “زايد” has a target meaning, since the structure has a preceding source “من”. In (9b), by contrast, the same preposition has an associative meaning connected to the superlative noun, the Head of the NP. However, it has been noticed that Arabic Machine Translation systems have fixed the semantic function of prepositions. Consider the following examples with their corresponding translations:

(10) a. قادة الجبهة الإسلامية للإنقاذ
        qadatu ʔalgabhati ʔalʔislamiyyati lilʔinqaði
        Program 1: The leaders of the Islam bloc to the Rescue
        Program 2: The commanders of the Islamic front of the deliverance
     b. محاولة لتوجيه المتحدث إلى إدانة هذا الفكر وهذه الجرائم
        muħawalatun litawgihi ʔalmutaħaddiθi ʔila ʔidanati haða ʔalfikri wa haðihi ʔalgaraʔimi
        Program 1: An attempt to the speaker directing to the conviction of this thinking and these crimes
        Program 2: An attempt to crown him the spokesman to condemnation of this thinking and this the crimes
     c. زيارة لأمريكا
        zijaratun li ʔamirika
        Program 1: A visit to America
        Program 2: A visit to America

As is clear from the examples listed in (10) together with their automated translations, the meaning of the preposition “لـ” is always fixed; thus it was incorrect in (10a, b) and acceptable only accidentally in (10c). The parsing system adopted can link the meaning of the preposition with the structure it occurs in, leading to a reliable translation in this respect. In (11), an example of a labelled tree representation of (10a) can be seen.

(11) [labelled tree diagram]

In this analysis, the parser could detect the associative meaning of the preposition (لـ), which directly leads to translating it as “for”, not “to”.
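The two structural cues named in this section for the preposition “لـ” can be written down as a small decision function. This is a minimal Python sketch under stated assumptions, not the parser's actual logic. The function names and the boolean inputs are hypothetical, and the English choices follow the text: “to” in the target reading, as in the gloss of (9a), and “for” in the associative reading. Cases that neither cue decides are deliberately left open.

```python
from typing import Optional

# English rendering per semantic function, as discussed in the text.
ENGLISH = {"TARGET": "to", "ASSOCIATIVE": "for"}


def li_function(source_precedes: bool, head_is_superlative: bool) -> Optional[str]:
    """Semantic function of prefixed "لـ" from two structural cues."""
    # Cue from (9a): a preceding source "من" gives the target reading.
    if source_precedes:
        return "TARGET"
    # Cue from (9b): connection to a superlative head noun is associative.
    if head_is_superlative:
        return "ASSOCIATIVE"
    # Assumption: other configurations need further cues not modelled here.
    return None


def translate_li(source_precedes: bool, head_is_superlative: bool) -> Optional[str]:
    """English preposition for "لـ", or None when the cues do not decide."""
    function = li_function(source_precedes, head_is_superlative)
    return ENGLISH.get(function) if function else None


print(translate_li(source_precedes=True, head_is_superlative=False))   # to
print(translate_li(source_precedes=False, head_is_superlative=True))   # for
```

A system that always returns the same preposition corresponds to ignoring both arguments, which is the behavior criticized above.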

3.4 PP Attachment

Attaching the PP to a given part of the structure can create a kind of ambiguity that affects translating that structure from Arabic to English. Consider the structure in (12). On closer inspection, it can support two translations depending on the attachment of the PP (في الشرق الأوسط). This PP can be considered as a locative constituent attached to (عقد), or as a prepositional postmodifier attached to (سلام). The parsing tool can flexibly give two parse trees that can be implemented in two different translations. The discourse will determine the exact translation. In (13a) and (14a), two parse trees (in labelled brackets format) are given, with their translations in (13b) and (14b) according to the structure.

Footnote to (11): In the tree diagram above, it is not clear how the parser could discover the semantic meaning of the preposition. The output of the parser in labelled bracketing format is as follows, in which the semantic function of the preposition is clear: NP NHEAD NOUN NDS قادة POD NP PREDET ARTICLE الـ NHEAD NOUN NNC جبهة ADJPOM ADJP PREDET ARTICLE الـ ADJHEAD NOUN NW إسلامية PPOM PP PHEADER(PREP(ASSOCIATIVE لـ)) PCOMPLEMENT NP PREDET ARTICLE لـ NHEAD NOUN NIS إنقاذ



(12) عقد مؤتمر للسلام في الشرق الأوسط
     ʕaqdu moʔtamarin lilsalami fi ʔalšarqi ʔalʔawsati

(13) a. [parse tree in labelled brackets format]
     b. Holding a peace conference in the Middle East

(14) a. [parse tree in labelled brackets format]
     b. Holding a conference in the Middle East about peace
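The two attachments of the PP “في الشرق الأوسط” can be made concrete with a small tree builder that prints labelled brackets. The sketch below is in Python and is only an illustration. The function labels (NP, NHEAD, POD, PPOM, PP, PHEADER, PCOMPLEMENT) come from the paper, but the exact shape of both trees is an assumption, the article inside “للسلام” is omitted for brevity, and `Node`, `leaves` and `parse` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # Node objects or word strings

    def brackets(self) -> str:
        """Render the subtree in LABEL(child child ...) format."""
        parts = [c.brackets() if isinstance(c, Node) else c for c in self.children]
        return f"{self.label}({' '.join(parts)})"


def leaves(node: Node) -> list:
    """Words of the subtree in surface order."""
    words = []
    for child in node.children:
        words.extend(leaves(child) if isinstance(child, Node) else [child])
    return words


def parse(attach_pp_to: str) -> Node:
    """Build one reading; `attach_pp_to` is "عقد" or "سلام"."""
    pp = Node("PP", ["في", "الشرق", "الأوسط"])
    peace = Node("NP", [Node("NHEAD", ["سلام"])])
    if attach_pp_to == "سلام":
        peace.children.append(Node("PPOM", [pp]))  # PP postmodifies "peace"
    li_pp = Node("PP", [Node("PHEADER", ["لـ"]), Node("PCOMPLEMENT", [peace])])
    conference = Node("NP", [Node("NHEAD", ["مؤتمر"]), Node("PPOM", [li_pp])])
    root = Node("NP", [Node("NHEAD", ["عقد"]), Node("POD", [conference])])
    if attach_pp_to == "عقد":
        root.children.append(Node("PPOM", [pp]))   # PP is locative on "holding"
    return root


for target in ("عقد", "سلام"):
    print(parse(target).brackets())
```

Both readings cover the same words in the same order; only the position of the PPOM node differs, and that difference is what a transfer component would need in order to choose between the two English renderings.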

The problem becomes more severe when the prepositional complement is realized by a coordinated NP. In (15), the nominal head “وكالة” of the NP is postmodified by an ADJP “الأمريكية”, then by a PP. According to our linguistic approach, this PP is composed of PHEADER and PCOMPLEMENT functions. The latter is realized by a coordinated NP. A full labelled brackets representation describing the whole structure can be seen in (15a). This structure was tested on well-known Arabic-English translation systems; the results are given in (15b).

(15) الوكالة الأمريكية للحد من التسلح ونزع السلاح
     ʔalwikalatu ʔalʔamrikijjatu lilħaddi mina ʔaltasaluħi wa nazʕi ʔalsilaħi
     a. [labelled brackets representation]
     b. Program 1: The American agency to the restriction is from the arming and the weapon pulling out
        Program 2: The American stewardship to be an atheist from the armament and the demilitarisation
     c. The American agency for limiting arms and disarming

The labelled tree representation above links the NHEAD with its modifiers in a syntactic harmony that can lead to an acceptable semantic interpretation of the whole structure, as it appears in the translation in (15c).

3.5 Conclusion

This paper has focused on Machine Translation of Arabic in which Arabic is the source language. What the paper has tried to show is that, in order to achieve good machine translation for Arabic, Arabic structures should in principle be understood by the machine. Using the parser adopted, it was possible to focus on building a formal module for describing Arabic structures and transferring them with the same meaning into the target language. Some problematic areas in Arabic-English machine translation have been surveyed. To a certain extent, the gaps in the translations have been shown, together with how the parsing system adopted could deal with these gaps. Work in this direction will help to form a new generation of Arabic studies using Information and Communication Technology for research and teaching purposes.

References

Al-Ansary, S. A Comparative Corpus-based Study of Spoken and Written Modern Standard Arabic (MSA). PhD thesis, Alexandria University, Egypt.

Ditters, W. E. A Formal Approach to Arabic syntax: The Noun Phrase and Verb Phrase. PhD, Nijmegen University.

El-Kareh, S., Al-Ansary, S. An Interactive Multi-Features POS Tagger. In the Proceedings of the International Conference on Artificial and Computational Intelligence for Decision Control and Automation in Engineering and Industrial Applications, Natural Language Processing Panel, March, Monastir, Tunisia.

McEnery, Tony, Andrew Wilson. Corpora and Translation: uses and future prospects. Lancaster: UCREL.

Nirenburg, Sergei et al. Machine Translation: A Knowledge-based Approach. San Mateo, Cal.: Morgan Kaufmann.

Newton, John (ed.). Computers in Translation: A practical Appraisal. London: Routledge.

Owens, Jonathan. The Foundations of Grammar: An Introduction to Medieval Arabic Grammatical Theory. John Benjamins Publishing Company, Amsterdam.

Shihadah, M., Paul Roohnik. Lexical Functional Grammar as a Computational Linguistic Underpinning to Arabic Machine Translation. The proceedings of the 6th International Conference and Exhibition on Multilingual Computing, University of Cambridge, London, April.

Schubert, Klaus. Contrastive Dependency Syntax for Machine Translation. Dordrecht: Foris Publications.

Sigurd, Bengt. Computerized Grammars for Analysis and Machine Translation. Lund: Lund University Press.

Trujillo, Arturo. Translation Engines: Techniques for Machine Translation. London.

Whitelock, Peter and Kieran Kilby. Linguistic and Computational Techniques in Machine Translation System Design. London: UCL Press.