
[Attention] 6. Self-Attention

玄曄 2022. 1. 18. 09:33

1. List of Attention Lectures

- What is Attention?

- Bottom-Up vs. Top-Down Attention

- 1st Encounter with Artificial Neural Networks

- Neural Machine Translation with Attention

- Show, Attend and Tell (Soft vs. Hard Attention)

2. Typical Way of Calculating Attention

(1) Take the query and each key

(2) Compute the similarity between the two to obtain a weight

   - Similarity functions (dot product, concatenation, perceptron, etc.)

(3) Apply a softmax function to normalize these weights

(4) Compute the weighted sum of the corresponding values using the normalized weights (a code sketch follows this list)
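
A minimal NumPy sketch of steps (1)-(4), assuming dot-product similarity is chosen in step (2); the function and variable names (`attention`, `query`, `keys`, `values`) and the dimensions are illustrative, not from the lectures.

```python
import numpy as np

def attention(query, keys, values):
    # (1)-(2) dot-product similarity between the query and each key
    scores = keys @ query                  # shape: (num_keys,)
    # (3) softmax normalization of the weights
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # (4) weighted sum of the corresponding values
    return weights @ values                # shape: (d_value,)

query = np.random.randn(4)       # one query vector, d_k = 4
keys = np.random.randn(5, 4)     # 5 keys
values = np.random.randn(5, 8)   # 5 values, d_v = 8
context = attention(query, keys, values)
print(context.shape)             # (8,)
```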

3. Self-Attention (Intra-Attention)

- An attention mechanism relating different positions of a single sequence in order to compute a representation of that same sequence (sketched in code below).

- It has been shown to be very useful in machine reading, abstractive summarization, and image description generation.
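
A minimal NumPy sketch of the idea, assuming the scaled dot-product form from "Attention Is All You Need": queries, keys, and values are all projections of one and the same input sequence X, so every position attends to every other position of that sequence. The projection matrices here are random stand-ins, not trained weights.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    # Q, K, V all come from the same sequence X (hence "intra-attention")
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # each position scored against every position
    # row-wise softmax: each position's weights over the sequence sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                # a new representation of the same sequence

seq_len, d_model, d_k = 6, 16, 8
X = np.random.randn(seq_len, d_model)            # one input sequence
W_q, W_k, W_v = (np.random.randn(d_model, d_k) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)                                 # (6, 8): one vector per position
```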