maomao88 committed
Commit e13cd84 · 1 Parent(s): e97f691

update readme

Files changed (2)
  1. README.md +14 -1
  2. app.py +2 -0
README.md CHANGED
@@ -19,4 +19,17 @@ tags:
  - encoder-decoder
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ This app helps users understand the behavior of the attention layers in transformer models: it visualizes the cross-attention and self-attention weights of an encoder-decoder model, showing the alignment between and within the source and target tokens.
+
+ The app uses the `Helsinki-NLP/opus-mt-en-zh` model to translate English to Chinese. With `output_attentions=True`, the returned attention weights are shaped as follows:
+
+ | Attention Type | Shape | Role |
+ | --- | --- | --- |
+ | `encoder_attentions` | `(layers, B, heads, src_len, src_len)` | Encoder self-attention over source tokens |
+ | `decoder_attentions` | `(layers, B, heads, tgt_len, tgt_len)` | Decoder self-attention over generated tokens |
+ | `cross_attentions` | `(layers, B, heads, tgt_len, src_len)` | Decoder attention over source tokens (encoder outputs) |
+
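For reference, here is a minimal sketch, assuming the standard `transformers` API, of how these tensors can be obtained; the example sentence is illustrative and this is not necessarily the app's exact code:

```python
import torch
from transformers import MarianMTModel, MarianTokenizer

tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

inputs = tokenizer("Attention weights can be visualized.", return_tensors="pt")
generated = model.generate(**inputs)  # (B, tgt_len) token ids

# A second forward pass over the generated ids returns the full per-layer
# attention tuples described in the table above.
with torch.no_grad():
    out = model(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        decoder_input_ids=generated,
        output_attentions=True,
    )

# Each field is a tuple with one tensor per layer:
print(out.encoder_attentions[0].shape)  # (B, heads, src_len, src_len)
print(out.decoder_attentions[0].shape)  # (B, heads, tgt_len, tgt_len)
print(out.cross_attentions[0].shape)    # (B, heads, tgt_len, src_len)
```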
+ To build the attention visualizations, the app takes the weights from the last encoder and decoder layers and averages them over the 8 heads.
+
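Continuing the sketch above, the head-averaged, last-layer matrices could be computed like this (an assumption about the app's internals, not its verbatim code):

```python
# `out` is the output of the forward pass above with output_attentions=True.
# Take the last layer ([-1]), average over the head dimension (dim=1), and
# drop the batch dimension ([0]) to get one matrix per attention type.
cross = out.cross_attentions[-1].mean(dim=1)[0]       # (tgt_len, src_len)
enc_self = out.encoder_attentions[-1].mean(dim=1)[0]  # (src_len, src_len)
dec_self = out.decoder_attentions[-1].mean(dim=1)[0]  # (tgt_len, tgt_len)
```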
+ **Note:**
+ * `attn_weights = softmax(Q @ K.T / sqrt(d_k))`
+ * The last two dimensions are the (query, key) lengths; e.g. a `cross_attentions` tensor of shape `(layers, B, heads, tgt_len, src_len)` might be `(6, 1, 8, 24, 18)`: 6 layers, batch size 1, 8 heads, 24 target tokens, 18 source tokens.
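As a worked version of the formula in the note, a self-contained sketch (the tensor sizes are illustrative, chosen to match the example shape above):

```python
import torch
import torch.nn.functional as F

def attention_weights(Q: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """attn_weights = softmax(Q @ K.T / sqrt(d_k)); each row sums to 1."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5
    return F.softmax(scores, dim=-1)

# 24 target-token queries attending to 18 source-token keys, d_k = 64.
Q, K = torch.randn(24, 64), torch.randn(18, 64)
w = attention_weights(Q, K)
print(w.shape)            # torch.Size([24, 18])
print(w.sum(dim=-1)[:3])  # tensor([1., 1., 1.]) up to floating-point error
```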
app.py CHANGED
@@ -323,6 +323,8 @@ function showCrossAttFun(attn_scores, decoder_attn, encoder_attn) {
  with gr.Blocks(css=css) as demo:
      gr.Markdown("""
      ## 🕸️ Visualize Attentions in Translated Text (English to Chinese)
+     This app helps users understand the behavior of the attention layers in transformer models: it visualizes the cross-attention and self-attention weights of an encoder-decoder model, showing the alignment between and within the source and target tokens.
+
      After translating your English input to Chinese, you can check the cross attentions and self-attentions of the translation in the lower section of the page.
      """)