[Attention] 6. Self-Attention
1. List of Attention Lectures
- What is Attention?
- Bottom-Up vs. Top-Down Attention
- 1st Encounter with Artificial Neural Networks
- Neural Machine Translation with Attention
- Show, Attend and Tell (Soft vs. Hard Attention)
2. Typical Way of Calculating Attention
(1) Take the query and each key
(2) Compute the similarity between the two to obtain a weight
- Common similarity functions: dot product, concatenation, multilayer perceptron, etc.
(3) Apply a softmax function to normalize these weights
(4) Compute a weighted sum of the corresponding values using the normalized weights (see the sketch below)
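A minimal NumPy sketch of these four steps, assuming a single query and dot-product similarity; the function name and array shapes are illustrative, not from the original post.

```python
import numpy as np

def attention(query, keys, values):
    """Generic attention: one query against a set of key/value pairs.

    query:  (d,)      the vector we attend from
    keys:   (n, d)    one key per source position
    values: (n, d_v)  one value per source position
    """
    # (1)-(2) similarity between the query and each key (dot product here)
    scores = keys @ query                       # shape (n,)
    # (3) softmax to turn the scores into normalized weights
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # shape (n,), sums to 1
    # (4) weighted sum of the values
    context = weights @ values                  # shape (d_v,)
    return context, weights

# Toy usage: 4 source positions, 8-dimensional keys and values
rng = np.random.default_rng(0)
q = rng.normal(size=8)
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
context, weights = attention(q, K, V)
print(weights.round(3), context.shape)
```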
3. Self-Attention (Intra-Attention)
- An attention mechanism relating different positions of a single sequence in order to compute a representation of that same sequence.
- It has been shown to be very useful in machine reading, abstractive summarization, and image description generation.
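A minimal NumPy sketch of scaled dot-product self-attention, where the queries, keys, and values are all projections of the same input sequence; the projection matrices and dimensions below are assumptions for illustration (in practice they are learned).

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a single sequence X.

    X:              (n, d_model)   input sequence with n positions
    W_q, W_k, W_v:  (d_model, d_k) projection matrices
    """
    Q = X @ W_q                                 # queries come from the sequence itself
    K = X @ W_k                                 # keys come from the same sequence
    V = X @ W_v                                 # values come from the same sequence
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)             # (n, n): every position attends to every position
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                          # (n, d_k): new representation of the same sequence

# Toy usage: a sequence of 5 tokens with model dimension 16, projected to 8
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
W_q, W_k, W_v = (rng.normal(size=(16, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)   # (5, 8)
```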