Pham Van Linh

phamvanlinh143

AI & ML interests

OCR, AI, DL

Organizations

None yet

upvoted an article 6 days ago

Article

Transformers Are Getting Old: Variants and Alternatives Exist!

Jul 5 • 44

upvoted an article 12 days ago

Article

Design choices for Vision Language Models in 2024

Apr 16, 2024 • 34

upvoted 2 collections 12 days ago

ByteDance Papers

ByteDance papers collection • 127 items • Updated 4 days ago • 19

Deepseek Papers

Deepseek papers collection • 26 items • Updated 3 days ago • 286

upvoted an article 15 days ago

Article

Fine-Tuning SigLIP2 for Image Classification

Mar 5 • 17

upvoted 5 articles 17 days ago

Article

Preference Optimization for Vision Language Models

+2

Jul 10, 2024 • 88

Article

Fine-Tuning Gemma Models in Hugging Face

+2

Feb 23, 2024 • 41

Article

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

+2

Jun 13, 2024 • 61

Article

The Large Language Model Course

Jan 16 • 212

Article

VLM Visual Arts Analysis with DeepSeek Janus-1.3B

Oct 30, 2024 • 1

upvoted a paper 17 days ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Paper • 2504.00595 • Published Apr 1 • 37

upvoted an article 17 days ago

Article

A Survey of Small Language Models in the Era of LLMs: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness

Jul 16 • 4

upvoted a paper 17 days ago

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 240

upvoted 5 articles 18 days ago

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

Apr 30 • 78

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

+2

Mar 12 • 473

Article

Visualizing How VLMs Work

Oct 7 • 46

Article

Welcome PaliGemma 2 – New vision language models by Google

+2

Dec 5, 2024 • 162

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5, 2024 • 303

upvoted 2 collections 18 days ago

Vision Language Models: 2025 Update

This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 67 items • Updated May 12 • 5

Qwen3-VL

37 items • Updated Nov 1 • 488