JEP-TALN 2004, Arabic Language Processing, Fez, 19-22 April 2004
JEP-TALN 2004 - session on Arabic Language Processing
Arabic-English Machine Translation Systems: Discrepancies and Implications
Dr. Sameh Al-Ansary and Prof. Seham El-Kareh
Sameh Al-Ansary: Phonetics Department, Faculty of Arts, Alexandria University, Al-Shatby, Alexandria, Egypt, salansary@link.net
Seham El-Kareh: Phonetics Department, Faculty of Arts, Alexandria University, Alexandria, Egypt, selkareh@yahoo.fr
Abstract
This paper analyzes certain translation software and focuses on linguistic phenomena that are often claimed to be the most problematic areas in machine translation. It will be shown that equipping the system with an adequate parser will empower the translation software. The adopted parsing tool concentrates on four important issues: constituent vs. sentence recognition, coordination, preposition semantic function and Prepositional Phrase (PP) attachment. The parser presented is based on a sample corpus of Modern Standard Arabic (MSA) and a formal grammar that analyzes MSA structures automatically. The formal description is implemented using the Affix Grammar over Finite Lattices (AGFL) formalism, representing a linguistic approach in terms of functions and categories to account for the sequences inside the structures together with the relations governing these sequences; accuracy exceeded [...]. Two important points must be made clear. First, the present stage of the work is limited to Noun Phrase (NP) structures. Second, the adequacy of the grammar is limited to the extent covered by the sample corpus under study.
Keywords
Arabic and computers, Arabic Machine Translation, Arabic Language Processing, Parsing
Sameh Al-Ansary and Seham El-Kareh
1 Introduction
Machine translation research has often been criticized for ignoring developments in linguistic theory. There appears to be a wide communication gap between theoretical linguistics and practical machine translation research. Some observers believe that there are good reasons for this situation: until recently, linguistic theories had not provided adequate accounts of all aspects of language use. A good linguistic theory may have given a convincing analysis of, say, quantifiers or coordination, but not explained all the peculiarities of actual usage in the coverage required for machine translation. However, recent theories such as Lexical Functional Grammar (Shihadah et al.) or Generalized Phrase Structure Grammar (Gazdar et al.), and their various derivatives, have set out explicitly to cover as broad a range as possible, not only within one specific language but also for different types of languages. In the past, and unfortunately it is generally true today, much of linguistic theory was based on phenomena observed in English, the language of the majority of theoretical linguists. This neglect of other languages is why linguistic theory has had less impact on Machine Translation than some observers might have expected. In other words, linguistic theories have rarely addressed questions of contrastive linguistics, i.e. the way in which different languages use different means to express similar meanings and intentions. Such questions are of course at the heart of Machine Translation.
2 Linguistic and Formal Framework
This section briefly explains the linguistic and formal framework for the NP category, which is the core of our parsing tool and can be used efficiently in translating Arabic structures into English.

Analyzing the NP linguistically, a number of functions can be distinguished: the head, the determiners and the postmodifier. The individual behavior of these functions ranges between determination and postmodification of the nucleus of the NP (the element that occurs in the head function). The NP in its simplest form consists only of a head. Even in a more or less complicated NP, only one head function is to be distinguished. The head is defined as the unit which is marked for its function at the next higher level of description and cannot be deleted without affecting the meaning of the constituent (Al-Ansary; Ditters). By this definition, the head function of an NP can only be realized by the category noun. Owens distinguished several subcategories able to realize this function (cf. Owens). According to our subclassification of nouns, a common noun, pronoun, proper noun, present participle, passive participle, adjectival noun, standard infinitive (verbal noun), noun of title, etc. are examples of heads of an NP (for more details cf. Al-Ansary, and El-Kareh and Al-Ansary).

In extension of the head, an element can function as a determiner to the head of the NP. The element occupying this function may occur before or after the head. This leads us to differentiate between a "pre-determiner" (PREDET) and a "post-determiner" (POD). However, it has to be kept in mind that they are mutually exclusive in relation to the head, i.e. they cannot occur together. The category in the function of pre-determiner is mainly the prefixed article "الـ", while the category in the function of post-determiner is a normal NP marked for genitive case.

The postmodifier function is always placed after the head of the NP and is for this reason called "post modifier" (POM). In our approach, postmodification can, according to its categorial realization, be classified into PPOM, ADJPOM, NPOM or ADVPOM, realized by a prepositional phrase, adjective phrase, noun phrase and adverbial phrase respectively. An additional element can be distinguished, functioning as a complement of the head of the NP (COMPL). Like postmodification, the complement function is always realized after the head. However, it is not recommended to treat both of them as postmodification, since the complement has a particular syntactic function in relation to the head. For example, a postmodifier follows its head with respect to 'definiteness', 'number', 'gender' and 'case'. There is no direct relation between the head and its complement as far as agreement is concerned. On the contrary, the head imposes specific values on its complement.

Formally, the linguistic description of the NP in MSA can be represented by means of context-free rules. To implement the formal grammar, a two-level approach to syntactic description was used by means of the AGFL (Affix Grammar over Finite Lattices) formalism. In this way, ROOT is the start symbol in our grammar and ROOT is rewritten as the phrasal category NP. The NP is in turn rewritten as a sequence of optional and obligatory functional elements, which constitutes the first level of description (syntactic level). The description alternates between functions and categories until the description in lexical terms has been reached. Starting with our initial label ROOT, the first rule is: ROOT: NP. A number of restrictions are applied via linguistic features to determine the dependencies and relations between the elements of the NP, which constitutes the second level of description (affix level). Thus, our start label can be revised as: ROOT: NP (definiteness, number, gender, person, case). By means of the non-terminal affix variables, the elements of the first and second levels of description are dealt with. At the first level, non-terminal elements are arranged in phrase structure rules called syntax rules or hyper rules. These phrase structure rules are context-free rules describing syntactic structures. As has been seen with 'definiteness', 'number', 'gender', 'person' and 'case', other affix variables can be attached to the non-terminals of the first level. These meta affixes constitute the second level of description. For more details about the conventions for writing AGFL rules, see Koster.
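The two levels of description can be illustrated with a minimal Python sketch. It is not AGFL and not the authors' grammar: the class names, the `violations` function and the affix values are assumptions made for illustration. The first level is the NP as a sequence of functional elements; the second level checks two restrictions stated above, namely that PREDET and POD are mutually exclusive and that a postmodifier follows its head in definiteness, number, gender and case.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Affix names follow the text; the value sets used below are illustrative assumptions.
AGREEMENT_AFFIXES = ("definiteness", "number", "gender", "case")


@dataclass
class Element:
    """A functional element of the NP together with its affix values (second level)."""
    function: str   # e.g. "NHEAD", "PREDET", "POD", "ADJPOM", "COMPL"
    form: str       # surface form, used for display only
    affixes: dict = field(default_factory=dict)


@dataclass
class NP:
    """First level: one obligatory head plus optional functional elements."""
    head: Element
    predet: Optional[Element] = None
    pod: Optional[Element] = None
    postmodifiers: List[Element] = field(default_factory=list)
    complement: Optional[Element] = None


def violations(np: NP) -> List[str]:
    """Return the affix-level restrictions that the NP fails to satisfy."""
    problems = []
    # PREDET and POD are mutually exclusive in relation to the head.
    if np.predet is not None and np.pod is not None:
        problems.append("PREDET and POD cannot occur together")
    # A postmodifier follows its head in definiteness, number, gender and case.
    for pom in np.postmodifiers:
        for affix in AGREEMENT_AFFIXES:
            if pom.affixes.get(affix) != np.head.affixes.get(affix):
                problems.append(f"{pom.function} disagrees with head in {affix}")
    # No agreement check is applied to the complement.
    return problems


# Hypothetical affix values for the head and its adjectival postmodifier.
head = Element("NHEAD", "جبهة", {"definiteness": "def", "number": "sing",
                                 "gender": "fem", "case": "gen"})
adjective = Element("ADJPOM", "إسلامية", dict(head.affixes))
article = Element("PREDET", "الـ")
print(violations(NP(head=head, predet=article, postmodifiers=[adjective])))
```

Keeping the restrictions in a separate function mirrors the separation between context-free syntax rules and affix-level restrictions, although a real affix grammar states both in the same rule.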
3 Important Issues
This section presents linguistic phenomena that highlight important problematic issues in the field of Arabic-English Machine Translation. These phenomena are: constituent vs. sentence recognition, coordination, preposition semantic function and PP attachment. It will be demonstrated how weak the existing translation systems are and how the parsing accuracy of the adopted parser can contribute to reaching an acceptable translation.
3.1 Constituent vs Sentence Recognition
One of the important issues that a Machine Translation system should take care of is the ability to differentiate between a phrase and a sentence. In Arabic, a nominal sentence can be composed of a topic and a comment. The comment can be realized by a PP, giving a complete meaning to the whole structure (the sentence). Consider the following example:

محمد في المدرسة
moħammadun fi ʔalmadrasati
'Mohammad is at school'
In this example, "محمد" (NP) is the topic of the sentence, while "في المدرسة" (PP) is the comment. Thus, the translation given above can be considered an acceptable translation of this sentence. This kind of sentence structure can cause a formal ambiguity with phrase structures. Consider the two examples labelled (a) below with their automatic translations. Program 1 is the InterNet Translation Service from CIMOS Company (http://www.cimos.com); Program 2 is Ajeeb (http://tarjim.ajeeb.com/ajeeb).
(a) القبض على ٥ إرهابيين
    ʔalqabdu ʕala xamsati ʔirhabijjn
    Program 1: The arrest is on 5 terrorists.
    Program 2: taking possession of 5 terrorists
(b) The arrest of 5 terrorists

(a) تمكن الفكر المتطرف من أصحابه
    tamakkunu ʔalfikri ʔalmotatarrifi min ʔasħabihi
    Program 1: A mastery the extreme thinking is from his owners.
    Program 2: The extreme thinking take possession of his companions.
(b) Extremist thinking has dominated them

The Mohammad example and the two (a) structures each consist of NP + PP; however, the Mohammad example is a sentence, while the two (a) structures are NPs. The false translations arise because the software is unable to differentiate between constituent structure and sentence structure. The contribution of our parsing tool resides in using our linguistic and formal strategies to give a reliable representation. This can be examined through the labelled tree representations of the two structures; consequently, reliable translations such as those given under (b) can be obtained.
[Figure: labelled tree representations (a) and (b)]
3.2 Coordination
Coordination is a problematic issue in almost all languages. It has been assumed that coordination takes place at the categorial level and that it is dominated by a single function node. It has also been assumed that coordination concerns similar categories. In the present stage of the formal description of coordination, the work has been limited to detecting the boundaries of the coordinated NPs, without going into the details of all the information that can result from coordinating one NP with another and how it can affect the identity of the whole constituent. In writing a formal grammar for describing coordination, the grammar should consider the result of coordinating the features of the first NP with those of the second NP. For example, considering definiteness, when the first NP is definite and the second is indefinite, the coordination will result in a definite constituent. Thus, the formal description of coordination should deal with features like definiteness, number, gender and person to reveal how they can affect the whole constituent. For details about this kind of investigation cf. Ditters.

To enable our description to detect the boundaries of two coordinated NPs automatically, all available information has been made use of. Since the coordinator separates two NPs, the grammar tries all possible alternative combinations that can be expressed before and after the coordinator. Thus, a more accurate description of the NP that precedes the conjunction is needed to enable a precise division on the one hand and to eliminate the remaining alternatives on the other. For the NPs analyzed so far, the affixes 'definiteness', 'subclass' and 'case' have been used to control the limits of the NPs being coordinated. Example (a) below shows a structure with coordination at a certain linguistic level, namely the post-determination level of the Nominal Head 'مجلسي'. A full parse of this structure is given as a labelled tree. The structure was also translated automatically; the results are listed under (a).

(a) عدد من أعضاء مجلسي الشيوخ والنواب
    ʕadadun min ʔaʕdaʔi maglisajj ʔaʃijuxi wannowwabi
    Program 1: Number of the sheikhs councils members and the deputies.
    Program 2: He wailed from the councilors of the white beards and the vicegerents.
(b) A number of the members of sheikhs and the deputy councils
Comparison with the acceptable translation in (b) makes it very clear that the automatic translations do not identify the boundaries of the coordinated NPs precisely. The following bracketing shows the boundaries of the coordinated parts implied by the automatic translation:

عدد من [[أعضاء مجلسي الشيوخ] و [النواب]]

However, the adopted parsing tool, relying on the subclass, definiteness and case of the coordinates and on the number of the Nominal Head 'maglisajj' which the coordinates post-determine, could reliably detect the coordinated parts, as shown in the following bracketing:

عدد من أعضاء مجلسي [[الشيوخ] و [النواب]]

Consequently, relying on the adopted parsing tool, a reliable translation such as that given in (b) can be obtained.
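The boundary detection described above can be sketched in Python as follows. This is a simplified illustration, not the authors' grammar: the `Token` class, the affix value sets and the selection rule (take the shortest candidate whose first element agrees with the right conjunct in subclass, definiteness and case) are assumptions. It enumerates the alternative spans that end right before the coordinator and filters them with the three affixes named in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    form: str
    subclass: str       # e.g. "common", "prep", "coord" (assumed value set)
    definiteness: str   # "article", "construct", "indef" or "none" (assumed value set)
    case: str           # e.g. "nom", "gen", "none"


def left_conjunct_candidates(tokens: List[Token], coord_index: int) -> List[List[Token]]:
    """All contiguous spans that end right before the coordinator, shortest first."""
    spans = []
    start = coord_index - 1
    # Stop at a preposition: it cannot be part of the coordinated NP in this sketch.
    while start >= 0 and tokens[start].subclass != "prep":
        spans.append(tokens[start:coord_index])
        start -= 1
    return spans


def agrees(a: Token, b: Token) -> bool:
    """Compare the three affixes named in the text."""
    return (a.subclass, a.definiteness, a.case) == (b.subclass, b.definiteness, b.case)


def coordinated_parts(tokens: List[Token]) -> Tuple[List[Token], List[Token]]:
    """Pick the left conjunct whose first element (its head) agrees with the right conjunct."""
    coord_index = next(i for i, t in enumerate(tokens) if t.subclass == "coord")
    right = tokens[coord_index + 1:]
    for span in left_conjunct_candidates(tokens, coord_index):
        if agrees(span[0], right[0]):
            return span, right
    raise ValueError("no left conjunct agrees with the right conjunct")


# Hypothetical affix values for the example from the text.
sentence = [
    Token("عدد", "common", "indef", "nom"),
    Token("من", "prep", "none", "none"),
    Token("أعضاء", "common", "construct", "gen"),
    Token("مجلسي", "common", "construct", "gen"),
    Token("الشيوخ", "common", "article", "gen"),
    Token("و", "coord", "none", "none"),
    Token("النواب", "common", "article", "gen"),
]
left, right = coordinated_parts(sentence)
print([t.form for t in left], [t.form for t in right])
```

Under these assumed values, the longer candidate starting at 'أعضاء' is rejected because its head carries no article while the right conjunct does, which corresponds to the imprecise division seen in the automatic translations.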
3.3 Preposition Semantic Function
It is very important for a good translation system to detect the meaning of a preposition in a structure automatically. This issue is critical in translation from and to Arabic, because the same preposition in Arabic can convey more than one semantic meaning. Consider the following examples:

(a) رسالتان من مبارك لزايد وجابر الأحمد
    risalatan min mubarak li zajed wa gabir ʔal ʔahmad
    'Two letters from Mubarak to Zayed and Gaber Al-Ahmed'
(b) أعلى نسبة مشاهدة لحوار تليفزيوني في العالم
    ʔaʕla nisbati muʃahadatin li hiwarin tilifizjonijjin fi ʔalʕalami
    'A highest ratio of viewing for a TV talk in the world'

In (a), the preposition "لـ" prefixed to "زايد" has a target meaning, since the structure has a preceding source "من". In (b), the same preposition has an associative meaning connected to the superlative noun (the Head of the NP). However, it has been noticed that Arabic Machine Translation systems have fixed the semantic function of prepositions. Consider the following examples with their corresponding translations:

(a) قادة الجبهة الإسلامية للإنقاذ
    qadatu ʔalgabhati ʔalʔislamiyyati lilʔinqaði
    Program 1: The leaders of the Islam bloc to the Rescue
    Program 2: The commanders of the Islamic front of the deliverance
(b) محاولة لتوجيه المتحدث إلى إدانة هذا الفكر وهذه الجرائم
    muħawalatun li tawgihi ʔalmutaħaddiθi ʔila ʔidanati haða ʔalfikri wa haðihi ʔalgaraʔimi
    Program 1: An attempt to the speaker directing to the conviction of this thinking and these crimes
    Program 2: An attempt to crown him the spokesman to condemnation of this thinking and this the crimes
(c) زيارة لأمريكا
    zijaratun li ʔamirika
    Program 1: A visit to America
    Program 2: A visit to America
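The contrast between a fixed reading of "لـ" and a structure-dependent one can be sketched in Python. The sketch covers only the two readings described in this section (target when a source "من" precedes in the structure, associative otherwise); cases where the head noun itself selects a goal are outside it. The function names and the rule's simple form are assumptions, not the parser's actual mechanism.

```python
from typing import List

SOURCE_PREP = "من"   # "from"

# English renderings per semantic function; "to" and "for" are the renderings named in the text.
RENDERING = {"target": "to", "associative": "for"}


def li_function(preceding_preps: List[str]) -> str:
    """Semantic function of لـ given the prepositions that precede it in the NP."""
    return "target" if SOURCE_PREP in preceding_preps else "associative"


def render_li(preceding_preps: List[str]) -> str:
    """English preposition chosen for لـ under the rule above."""
    return RENDERING[li_function(preceding_preps)]


def render_li_fixed() -> str:
    """Baseline with a fixed meaning, independent of the structure."""
    return "to"


# "Two letters from Mubarak to Zayed": a source precedes, so لـ is a target.
print(render_li(["من"]))
# "ratio of viewing for a TV talk": no source precedes, so لـ is associative.
print(render_li([]))
```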
As the examples above and their automatic translations show, the meaning of the preposition "لـ" is always fixed; thus it was incorrect in (a) and (b) and acceptable only by accident in (c). The adopted parsing system could link the meaning of the preposition with the structure it occurs in, leading to a reliable translation in this respect. A labelled tree representation of (a) illustrates this. In this analysis, the parser could detect the associative meaning of the preposition "لـ", which leads directly to translating it as "for", not "to".

In the tree diagram, it is not clear how the parser could discover the semantic meaning of the preposition. The output of the parser in labelled bracketing format is as follows, in which the semantic function of the preposition is clear:

NP NHEAD NOUN NDS قادة
POD NP PREDET ARTICLE الـ NHEAD NOUN NNC جبهة
ADJPOM ADJP PREDET ARTICLE الـ ADJHEAD NOUN NW إسلامية
PPOM PP PHEADER(PREP(ASSOCIATIVE لـ)) PCOMPLEMENT NP PREDET ARTICLE الـ NHEAD NOUN NIS إنقاذ

3.4 PP Attachment
Attaching the PP to a given part of the structure can create an ambiguity that affects the translation of that structure from Arabic into English. Consider the structure below. On closer inspection, it supports two translations depending on the attachment of the PP "في الشرق الأوسط". This PP can be considered a locative constituent attached to "عقد", or a prepositional postmodifier attached to "سلام". The parsing tool can flexibly give two parse trees that can be realized as two different translations. The discourse will determine the exact translation. For each attachment, a parse tree in labelled brackets format is given under (a), with the corresponding translation under (b).
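عقد مؤتمر للسلام في الشرق الأوسط
ʕaqdu moʔtamarin lilsalami fi ʔalʃarqi ʔalʔawsati

(a) [Parse tree in labelled brackets format]
(b) Holding a peace conference in the Middle East

(a) [Parse tree in labelled brackets format]
(b) Holding a conference in the Middle East about peace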
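The two attachments can be made concrete with a short Python sketch that builds both readings and prints them as labelled brackets. The function labels (NHEAD, POD, PPOM, PHEADER, PCOMPLEMENT) come from the text, but the tree shapes, the helper names and the treatment of "الشرق الأوسط" as a single head are assumptions; the trees are not a reconstruction of the parser's actual output.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, Tuple]  # a leaf string or (label, child, child, ...)


def np(head: str, *modifiers: Tree) -> Tree:
    """An NP with its head and any following functional elements."""
    return ("NP", ("NHEAD", head), *modifiers)


def pp(prep: str, complement: Tree) -> Tree:
    """A PP composed of PHEADER and PCOMPLEMENT functions."""
    return ("PP", ("PHEADER", prep), ("PCOMPLEMENT", complement))


def brackets(tree: Tree) -> str:
    """Render a tree as LABEL(child child ...) labelled bracketing."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return f"{label}({' '.join(brackets(c) for c in children)})"


def leaves(tree: Tree) -> List[str]:
    """Surface forms of the tree in order, ignoring the labels."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


def readings() -> Dict[str, Tree]:
    """Both attachments of the PP 'في الشرق الأوسط' for the example structure."""
    # Simplification: 'الشرق الأوسط' is treated as one head.
    locative = ("PPOM", pp("في", np("الشرق الأوسط")))
    high = np("عقد", ("POD", np("مؤتمر", ("PPOM", pp("لـ", np("سلام"))))), locative)
    low = np("عقد", ("POD", np("مؤتمر", ("PPOM", pp("لـ", np("سلام", locative))))))
    return {"attached to عقد": high, "attached to سلام": low}


for name, tree in readings().items():
    print(name, brackets(tree))
```

Both trees cover the same word sequence, so the choice between them has to come from outside the sentence, which is the role the text assigns to the discourse.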
The problem becomes more severe when the prepositional complement is realized by a coordinated NP. In the structure below, the nominal head "وكالة" of the NP is postmodified by an ADJP "الأمريكية", then by a PP. According to our linguistic approach, this PP is composed of PHEADER and PCOMPLEMENT functions. The latter is realized by a coordinated NP. A full labelled brackets representation describing the whole structure is given under (a). This structure was tested on a well-known Arabic-English translation system; the results are given under (b).

الوكالة الأمريكية للحد من التسلح ونزع السلاح
ʔalwikalatu ʔalʔamrikijjatu lilħaddi mina ʔaltasaluħi wa nazʕi ʔalsilaħi

(a) [Labelled brackets representation of the whole structure]
(b) Program 1: The American agency to the restriction is from the arming and the weapon pulling out
    Program 2: The American stewardship to be an atheist from the armament and the demilitarisation
(c) The American agency for limiting arms and disarming

The labelled representation in (a) links the NHEAD with its modifiers in a syntactic harmony that can lead to an acceptable semantic interpretation of the whole structure, as it appears in the translation in (c).
3.5 Conclusion
This paper has focused on Machine Translation of Arabic in which Arabic is the source language. What the paper has tried to show is that, in order to achieve a good machine translation for Arabic, Arabic structures should in principle be understood by the machine. Using the adopted parser, it was possible to focus on building a formal module for describing Arabic structures and transferring them with the same meaning into the target language. Some problematic areas in Arabic-English machine translation have been surveyed. To a certain extent, the gaps in the translations have been shown, together with how the adopted parsing system could deal with these gaps. Work in this direction will help to form a new generation of Arabic studies using Information and Communication Technology for research and teaching purposes.
References
Al-Ansary, S. A Comparative Corpus-based Study of Spoken and Written Modern Standard Arabic (MSA). Ph.D. thesis, Alexandria University, Egypt.
Ditters, W. E. A Formal Approach to Arabic Syntax: The Noun Phrase and Verb Phrase. Ph.D., Nijmegen University.
El-Kareh, S., Al-Ansary, S. An Interactive Multi-Features POS Tagger. In the Proceedings of the International Conference on Artificial and Computational Intelligence for Decision Control and Automation in Engineering and Industrial Applications, Natural Language Processing Panel, March, Monastir, Tunisia.
McEnery, Tony, Andrew Wilson. Corpora and Translation: Uses and Future Prospects. Lancaster: UCREL.
Nirenburg, Sergei, et al. Machine Translation: A Knowledge-based Approach. San Mateo, Cal.: Morgan Kaufmann.
Newton, John (ed.). Computers in Translation: A Practical Appraisal. London: Routledge.
Owens, Jonathan. The Foundations of Grammar: An Introduction to Medieval Arabic Grammatical Theory. Amsterdam: John Benjamins Publishing Company.
Shihadah, M., Paul Roohnik. Lexical Functional Grammar as a Computational Linguistic Underpinning to Arabic Machine Translation. The Proceedings of the 6th International Conference and Exhibition on Multilingual Computing, University of Cambridge, London, April.
Schubert, Klaus. Contrastive Dependency Syntax for Machine Translation. Dordrecht: Foris Publications.
Sigurd, Bengt. Computerized Grammars for Analysis and Machine Translation. Lund: Lund University Press.
Trujillo, Arturo. Translation Engines: Techniques for Machine Translation. London.
Whitelock, Peter and Kieran Kilby. Linguistic and Computational Techniques in Machine Translation System Design. London: UCL Press.