jedick committed
Commit 2890297 · 1 Parent(s): b00d122

Disable Flash Attention (build error)

Files changed (2)
  1. main.py +1 -1
  2. requirements.txt +1 -1
main.py CHANGED
@@ -157,7 +157,7 @@ def GetChatModel(compute_mode, ckpt_dir=None):
         # Enable FlashAttention (requires pip install flash-attn)
         # https://huggingface.co/docs/transformers/en/attention_interface
         # https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention
-        attn_implementation="flash_attention_2",
+        # attn_implementation="flash_attention_2",
     )
     # For Flash Attention version of Qwen3
     tokenizer.padding_side = "left"
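The commented-out argument lives inside the transformers model-loading call. For context, here is a minimal sketch of that pattern; the model id, dtype, and surrounding code are placeholders for illustration, not the repo's actual GetChatModel:

# Sketch only: model id and dtype are assumptions, not the repo's settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    # attn_implementation="flash_attention_2",  # disabled: flash-attn build error
)
# With the argument omitted, transformers picks its default attention backend
# (SDPA on recent versions), so no flash-attn build is required.
tokenizer.padding_side = "left"  # left padding for batched generation

Re-enabling Flash Attention later is a one-line change here plus restoring the flash-attn pin in requirements.txt.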
requirements.txt CHANGED
@@ -5,7 +5,7 @@ chromadb==0.6.3
 # ValueError('Could not connect to tenant default_tenant. Are you sure it exists?')
 
 # FlashAttention
-flash-attn==2.8.2
+#flash-attn==2.8.2
 
 # Stated requirements:
 # Gemma 3: transformers>=4.50
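Since the pin is commented out rather than deleted, one possible follow-up (a sketch, assuming the loading code can branch on the installed packages) is to probe for flash-attn at runtime and fall back to SDPA instead of hard-disabling the flag:

# Sketch of a runtime guard: pick Flash Attention only when the flash_attn
# package is actually importable in the current environment.
import importlib.util

if importlib.util.find_spec("flash_attn") is not None:
    attn_implementation = "flash_attention_2"
else:
    attn_implementation = "sdpa"  # PyTorch scaled-dot-product attention fallback

# The chosen value would then be passed as
# from_pretrained(..., attn_implementation=attn_implementation).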