Cohesive Constraints in a Beam Search Phrase-based Decoder Nguyen Bach, Stephan Vogel Carnegie Mellon University
Colin Cherry Microsoft Research
Overview
• Apply cohesive constraints during the decoding process to consider the source dependency structures
• Introduce extensions of the cohesive constraints
• Analyze the impact of cohesive constraints across language pairs with different reordering models
• Applied to English-Spanish, English-Iraqi and Chinese-English translation tasks
  – Significant improvements on English-Spanish
  – Stable improvements on other pairs
[Figure: French example sentence "élection présidentielle commence demain des États Unis"]
Two Questions
• How do we determine the largest subtree that must be completed before the translation process can move elsewhere in the tree?
  – Interruption Check: take the leftmost and rightmost tokens of the previously translated source phrase and climb up the tree
• If a violation happens, how do we constrain the decoder to penalize the translation hypothesis that violates cohesion?
  – Interruption Check: binary event
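The climb described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the decoder's actual implementation: the head indices in `HEADS` are a hypothetical encoding of the English example sentence, and the names `subtree` and `interruption_check` are invented for this sketch. From each end token of the previous phrase, the code climbs while the parent's subtree still excludes the next phrase; if the subtree it stops at has untranslated words, the binary event fires.

```python
from typing import List, Sequence, Set

# Hypothetical head indices (-1 marks the root) for the tokens
# 0:the 1:presidential 2:election 3:of 4:the 5:united 6:states 7:begins 8:tomorrow
HEADS = [2, 2, 7, 2, 6, 6, 3, -1, 7]


def subtree(heads: List[int], node: int) -> Set[int]:
    """Return every token index dominated by `node`, including `node` itself."""
    nodes = {node}
    grew = True
    while grew:
        grew = False
        for child, head in enumerate(heads):
            if head in nodes and child not in nodes:
                nodes.add(child)
                grew = True
    return nodes


def interruption_check(heads: List[int], covered: Set[int],
                       prev_phrase: Sequence[int],
                       next_phrase: Sequence[int]) -> bool:
    """Return True if moving to `next_phrase` leaves an unfinished subtree.

    `covered` holds the source tokens translated so far, including
    `prev_phrase` but not `next_phrase`.
    """
    nxt = set(next_phrase)
    for end in (min(prev_phrase), max(prev_phrase)):
        if subtree(heads, end) & nxt:
            continue  # the next phrase is still inside this token's subtree
        node = end
        # Climb while the parent's subtree still excludes the next phrase.
        while heads[node] != -1 and not (subtree(heads, heads[node]) & nxt):
            node = heads[node]
        # `node` now roots the largest subtree that should be complete.
        if subtree(heads, node) - set(covered):
            return True
    return False


covered = {0, 1, 2}  # "the presidential election" is already translated
print(interruption_check(HEADS, covered, [0, 1, 2], [7]))  # jump to "begins"
print(interruption_check(HEADS, covered, [0, 1, 2], [3]))  # continue with "of"
```

In the first call the subtree rooted at "election" still contains untranslated words, so the check reports a violation; in the second the decoder stays inside that subtree and no violation is reported.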
Exhaustive Interruption Check
• Interruption Check penalizes a cohesion violation only once
• Should penalties persist as long as violations remain unresolved?
• Exhaustive Interruption Check keeps penalizing a cohesion violation until it is fixed
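One possible reading of the exhaustive variant is sketched below, as an assumption rather than the authors' exact algorithm: instead of climbing only from the two end tokens of the previous phrase, the climb starts from every source token translated so far, so an abandoned subtree keeps triggering at each later step until its remaining words are covered. The tree in `HEADS` and the function names are hypothetical.

```python
from typing import Iterable, List, Set

# Hypothetical head indices (-1 marks the root) for the tokens
# 0:the 1:presidential 2:election 3:of 4:the 5:united 6:states 7:begins 8:tomorrow
HEADS = [2, 2, 7, 2, 6, 6, 3, -1, 7]


def subtree(heads: List[int], node: int) -> Set[int]:
    """Return every token index dominated by `node`, including `node` itself."""
    nodes = {node}
    grew = True
    while grew:
        grew = False
        for child, head in enumerate(heads):
            if head in nodes and child not in nodes:
                nodes.add(child)
                grew = True
    return nodes


def exhaustive_interruption_check(heads: List[int], covered: Iterable[int],
                                  next_phrase: Iterable[int]) -> bool:
    """Return True while any previously started subtree is still unfinished."""
    nxt = set(next_phrase)
    done = set(covered)
    for token in sorted(done):
        if subtree(heads, token) & nxt:
            continue  # the next phrase lies inside this token's subtree
        node = token
        # Climb while the parent's subtree still excludes the next phrase.
        while heads[node] != -1 and not (subtree(heads, heads[node]) & nxt):
            node = heads[node]
        if subtree(heads, node) - done:
            return True
    return False


# The decoder leaves the "election" subtree early, then keeps going elsewhere.
print(exhaustive_interruption_check(HEADS, {0, 1, 2}, [7]))              # to "begins"
print(exhaustive_interruption_check(HEADS, {0, 1, 2, 7}, [8]))           # to "tomorrow"
print(exhaustive_interruption_check(HEADS, {0, 1, 2, 7, 8}, [3, 4, 5, 6]))  # back to "of the united states"
```

Under this reading the first two steps are both flagged, because the "election" subtree is still incomplete at each of them, and the flag clears once the decoder returns to finish it.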
Cohesion Violation Penalties
• Interruption Check and Exhaustive Interruption Check: binary event
• Are some violations worse than others?
• Penalize a cohesion violation by the number of untranslated words under the largest subtree
  – Interruption Check -> Interruption Count
  – Exhaustive Interruption Check -> Exhaustive Interruption Count
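The count-based penalty above can be illustrated with a small sketch. It assumes the same climb as the binary Interruption Check and returns the number of untranslated words under the subtree it stops at; taking the maximum over the two end tokens is an assumption of this sketch, as are the tree in `HEADS` and the function names.

```python
from typing import Iterable, List, Sequence, Set

# Hypothetical head indices (-1 marks the root) for the tokens
# 0:the 1:presidential 2:election 3:of 4:the 5:united 6:states 7:begins 8:tomorrow
HEADS = [2, 2, 7, 2, 6, 6, 3, -1, 7]


def subtree(heads: List[int], node: int) -> Set[int]:
    """Return every token index dominated by `node`, including `node` itself."""
    nodes = {node}
    grew = True
    while grew:
        grew = False
        for child, head in enumerate(heads):
            if head in nodes and child not in nodes:
                nodes.add(child)
                grew = True
    return nodes


def interruption_count(heads: List[int], covered: Iterable[int],
                       prev_phrase: Sequence[int],
                       next_phrase: Iterable[int]) -> int:
    """Count untranslated words under the largest interrupted subtree (0 if none)."""
    nxt = set(next_phrase)
    done = set(covered)
    worst = 0
    for end in (min(prev_phrase), max(prev_phrase)):
        if subtree(heads, end) & nxt:
            continue  # the next phrase is still inside this token's subtree
        node = end
        # Climb while the parent's subtree still excludes the next phrase.
        while heads[node] != -1 and not (subtree(heads, heads[node]) & nxt):
            node = heads[node]
        # Assumption: report the larger of the two ends' counts.
        worst = max(worst, len(subtree(heads, node) - done))
    return worst


# Leaving after "the presidential election" strands four words;
# leaving after "... of" strands three.
print(interruption_count(HEADS, {0, 1, 2}, [0, 1, 2], [7]))
print(interruption_count(HEADS, {0, 1, 2, 3}, [3], [7]))
```

A graded value like this lets a feature weight distinguish a jump that strands one word from a jump that strands most of a subtree, which a binary event cannot do.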
Rich Interruption Constraints
[Figure: dependency tree over the tokens begins/VBZ, election/NN, tomorrow/NN, the/DT, presidential/JJ, of/IN, the/DT, united/VBN, states/NNS, with arc labels SBJ, OBJ, NMOD and PMOD]
• Penalize a cohesion violation by 4 constraints – – – –
Cohesive constraints obtained improvements over the standard phrase-based decoder.
How does the performance of the dependency parser affect cohesive constraints?
The Role of the Dependency Parser on English-Spanish
[Figure: BLEU (y-axis, 31.2 to 33.2) for parser models M1 and M2]
• Train 2 MALT dependency parser models: M1 on 10% of the treebank and M2 on the full treebank
• Performance on the CoNLL-07 dependency test set
  – M1: 19.41%
  – M2: 86.21%
• Apply to MT
  – M2 is better than M1
• Are the improvements subsumed by a strong reordering model and system scale?
• What if we translate from X->English?
Conclusions & Future Work
• Conclusions
  – Cohesive constraints are helpful
  – Their effectiveness was shown when used with a strong reordering model
  – Improvements were obtained on 3 language pairs, covering training corpus sizes from 500K up to 11M sentence pairs
• Future work
  – A source-side dependency reordering model: learn reordering events of phrases based on source subtree movements
  – A hierarchical source-side dependency reordering model: extend Galley & Manning (2008)