| Topic | Replies | Views | Date |
| --- | ---: | ---: | --- |
| Issue: Discrepancy Between Layer-Wise Density Plots vs. Mean Trajectory Plots in LLaVA-1.5 Attention Analysis | 2 | 15 | January 25, 2026 |
| [Discussion] Validating Attention Map Visualization for Visual Fading in LLaVA-1.5 | 4 | 27 | January 23, 2026 |
| No fix for high vulnerabilities in the latest transformers package | 2 | 28 | January 22, 2026 |
| How to disable caching in .from_pretrained() | 4 | 1224 | January 18, 2026 |
| DetLLM – Deterministic Inference Checks | 0 | 19 | January 17, 2026 |
| Distributed LLaMA Inference Engine Built from Scratch (KV Cache, GQA, RoPE) | 0 | 25 | January 16, 2026 |
| Run name issue: different run name file on the webpage vs. local | 1 | 86 | January 16, 2026 |
| Transformers v5 timelines | 0 | 27 | January 15, 2026 |
| Whisper fine-tuned with custom tokens works with model.generate but not with a pipeline() | 3 | 34 | January 14, 2026 |
| GPT-2 fine-tuning peaks at 8 GiB of VRAM | 7 | 70 | January 12, 2026 |
| Model_accepts_loss_kwargs detection based on `**kwargs` is too permissive | 2 | 259 | January 5, 2026 |
| Seeking advice: strategy for embedding multiple subjective reviews in one-time event domain recommendations | 2 | 38 | January 23, 2026 |
| TurboTensors: Optimizing CPU LLM Performance | 0 | 22 | December 31, 2025 |
| Significant generation degradation and repetition loops when enabling KV-cache for Qwen3-VL | 2 | 81 | December 29, 2025 |
| Injecting multimodal embeddings into a language model breaks the `generate` function | 1 | 84 | December 28, 2025 |
| Transformers v4 or v5 for my new project? | 1 | 57 | December 27, 2025 |
| Assistant model is not passed on to the custom_generate method | 3 | 23 | December 25, 2025 |
| How can I get TRANSFORMERS_CACHE in transformers v5? | 2 | 40 | December 19, 2025 |
| CDM-CTM Fusion: A Rigorous Framework for Depth-Aware Autoregressive Control | 0 | 19 | December 13, 2025 |
| Tensor dimension mismatch when using the TRL GKDTrainer | 3 | 18 | December 12, 2025 |
| Transformers.js: need for token-to-char mapping | 3 | 31 | December 11, 2025 |
| [Pipelines] Mask Generation Parameters | 2 | 114 | December 10, 2025 |
| Having trouble configuring the trainer for T5 model evaluation | 1 | 40 | December 9, 2025 |
| How do I speed up my callbacks and reduce the stall before they start? | 1 | 37 | December 9, 2025 |
| Getting 429 Too Many Requests | 3 | 117 | December 8, 2025 |
| How to add a new language to the NLLB tokenizer in Hugging Face? | 3 | 2045 | December 6, 2025 |
| Is it possible to remove all languages other than English and German from NLLB-200? | 2 | 768 | December 6, 2025 |
| How to use the NLLB-1.3B model to fine-tune a bidirectional English-German translation task? | 2 | 119 | December 6, 2025 |
| SAE for CodeGemma | 3 | 23 | December 6, 2025 |
| Obtain raw logits before decoding scaling is applied | 1 | 41 | December 5, 2025 |