1. Neural Machine Translation with RNNs

G


Effect of the masks on attention computation:

Why it is necessary:
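
A minimal sketch of the masking step, not the assignment's actual code. It assumes the mask holds 1 at padded source positions and that masked scores are set to negative infinity before the softmax. The name masked_attention_weights and the example values are made up for illustration. Padded positions end up with exactly zero attention weight, so the context vector is built only from real source tokens; without the mask, padding would receive some of the probability mass.

```python
import math


def masked_attention_weights(scores, pad_mask):
    """Softmax over attention scores, ignoring padded source positions.

    scores   : list of floats, one raw attention score per source position
    pad_mask : list of 0/1 ints, 1 where the source token is padding
               (assumed convention for this sketch)
    Assumes at least one position is not padding.
    """
    # exp(-inf) is 0, so a masked position contributes nothing to the softmax.
    masked = [-math.inf if m == 1 else s for s, m in zip(scores, pad_mask)]
    # Subtract the largest real score for numerical stability.
    peak = max(masked)
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sentence of length 2, padded to length 4.
scores = [2.0, 1.0, 0.5, 0.5]
pad_mask = [0, 0, 1, 1]
weights = masked_attention_weights(scores, pad_mask)
print(weights)
```

The last two weights come out as zero and the first two sum to one, which is the behaviour the mask is meant to enforce.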

H


I


i. Dot product attention vs. multiplicative attention
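
For reference, a small sketch of the two scoring functions being compared, using the usual definitions: dot product attention scores a decoder state s_t against an encoder state h_i as s_t^T h_i, while multiplicative attention inserts a learned matrix W, giving s_t^T W h_i. The helper names and example vectors are assumptions for illustration. The sketch shows the main trade-off: the dot product has no parameters but needs both vectors to have the same dimension, whereas W lets the dimensions differ at the cost of extra parameters.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]


def dot_product_score(s_t, h_i):
    # e = s_t^T h_i ; only defined when both vectors have the same size
    if len(s_t) != len(h_i):
        raise ValueError("dot product attention needs equal dimensions")
    return dot(s_t, h_i)


def multiplicative_score(s_t, h_i, W):
    # e = s_t^T W h_i ; W has shape (len(s_t), len(h_i))
    return dot(s_t, matvec(W, h_i))


s_t = [1.0, 2.0]
h_i = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

# With W = I the two scores coincide.
same_dim_dot = dot_product_score(s_t, h_i)
same_dim_mul = multiplicative_score(s_t, h_i, identity)

# With a 2x3 matrix, multiplicative attention handles a 3-dimensional h_i.
h_wide = [1.0, 0.0, 2.0]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
mixed_dim_mul = multiplicative_score(s_t, h_wide, W)
```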


ii. Additive attention vs. multiplicative attention
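
A sketch of additive attention next to the multiplicative form, using the usual definitions: the additive score is v^T tanh(W1 h_i + W2 s_t), and the multiplicative score is s_t^T W h_i. All names, shapes, and values here are illustrative assumptions. The additive form needs two weight matrices, a vector v, and a tanh per score, while the multiplicative form needs one matrix and reduces to plain matrix products, which is typically cheaper to compute.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]


def additive_score(s_t, h_i, W1, W2, v):
    # e = v^T tanh(W1 h_i + W2 s_t)
    projected_h = matvec(W1, h_i)
    projected_s = matvec(W2, s_t)
    hidden = [math.tanh(a + b) for a, b in zip(projected_h, projected_s)]
    return dot(v, hidden)


def multiplicative_score(s_t, h_i, W):
    # e = s_t^T W h_i
    return dot(s_t, matvec(W, h_i))


s_t = [0.0, 1.0]
h_i = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

add_score = additive_score(s_t, h_i, identity, identity, v)
mul_score = multiplicative_score(s_t, h_i, identity)
```

Because tanh is bounded, the additive score can never exceed the sum of the absolute values of v, whereas the multiplicative score grows with the magnitude of the input vectors.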