update readme
README.md CHANGED

@@ -19,4 +19,17 @@ tags:
 - encoder-decoder
 ---
 
-
+This app aims to help users better understand the behavior of the attention layers in transformer models by visualizing the cross-attention and self-attention weights of an encoder-decoder model, showing the alignment between and within the source and target tokens.
+
+The app uses the `Helsinki-NLP/opus-mt-en-zh` model to translate English to Chinese; with `output_attentions=True`, the attention weights are returned as follows:
+
+Attention Type | Shape | Role
+--- | --- | ---
+`encoder_attentions` | (layers, B, heads, src_len, src_len) | Encoder self-attention on source tokens
+`decoder_attentions` | (layers, B, heads, tgt_len, tgt_len) | Decoder self-attention on generated tokens
+`cross_attentions` | (layers, B, heads, tgt_len, src_len) | Decoder attention over source tokens (encoder outputs)
+
+The weights from the last encoder and decoder layers are averaged over the 8 attention heads, and these head-averaged weights are used to build the attention visualizations.
+
+**Note:**
+* `attn_weights = softmax(Q @ K.T / sqrt(d_k))`
+* `(layers, B, heads, src_len, src_len)` - e.g. `(6, 1, 8, 24, 18)`
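For reference, the snippet below is a minimal sketch, not taken from the app's code, of how the three attention tuples described in the README can be obtained from `Helsinki-NLP/opus-mt-en-zh` with `output_attentions=True` (the example sentence is arbitrary):

```python
# Minimal sketch (illustration only): obtain encoder, decoder, and cross attentions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Attention is all you need.", return_tensors="pt")  # arbitrary example
generated_ids = model.generate(**inputs)  # English -> Chinese translation ids

# Re-run a forward pass with the generated ids as decoder input so that all
# three attention types are returned in a single output object.
with torch.no_grad():
    outputs = model(**inputs, decoder_input_ids=generated_ids, output_attentions=True)

# Each field is a tuple with one tensor per layer, shaped (B, heads, query_len, key_len).
encoder_attn = outputs.encoder_attentions  # (B, heads, src_len, src_len) per layer
decoder_attn = outputs.decoder_attentions  # (B, heads, tgt_len, tgt_len) per layer
cross_attn = outputs.cross_attentions      # (B, heads, tgt_len, src_len) per layer
```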
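The head-averaging step described in the README might look like the following sketch (an assumption about the implementation, not the app's code):

```python
import torch

def head_averaged_last_layer(attentions) -> torch.Tensor:
    """Average the last layer of a per-layer attention tuple over its heads.

    Each element of `attentions` has shape (B, heads, query_len, key_len);
    the result is a (query_len, key_len) map for the first batch item.
    """
    last_layer = attentions[-1]        # (B, heads, query_len, key_len)
    return last_layer.mean(dim=1)[0]   # mean over heads, drop the batch dim

# Example (using the tuples from the sketch above):
#   head_averaged_last_layer(cross_attn)    -> (tgt_len, src_len)
#   head_averaged_last_layer(encoder_attn)  -> (src_len, src_len)
```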
app.py CHANGED

@@ -323,6 +323,8 @@ function showCrossAttFun(attn_scores, decoder_attn, encoder_attn) {
 with gr.Blocks(css=css) as demo:
     gr.Markdown("""
     ## 🕸️ Visualize Attentions in Translated Text (English to Chinese)
+    This app aims to help users better understand the behavior of the attention layers in transformer models by visualizing the cross-attention and self-attention weights of an encoder-decoder model, showing the alignment between and within the source and target tokens.
+
     After translating your English input to Chinese, you can check the cross attentions and self-attentions of the translation in the lower section of the page.
     """)
 