What is the best text embedding model for ecommerce product search (short, noisy user queries)?

sumit-raj-710 · December 3, 2025, 8:48am

I am integrating a vector-based semantic search system into a B2B ecommerce platform’s product search, and I want to select the right text embedding model.

Use Case

User queries are often:

Very short (1–4 words)
Ungrammatical
Misspelled
Spec-heavy or abbreviated (e.g., “m12 nut”, “2hp pump”, “ss tank 1000l”)
Full of domain-specific technical terms

Each product has:

Title
Attribute fields (e.g., Material=SS, Voltage=220V)
Description text

I need embeddings that capture semantic meaning across these fields and match them with noisy, spec-heavy queries.

Constraints / Setup

English-only
Running on GPU (model size not a constraint)
Throughput: ~100 queries per second
Retrieval backend not yet decided but most likely Vespa
Fine-tuning will come later; I first need a strong base embedding model

Questions

Which open-source embedding models work best out of the box for ecommerce/product search?
Are there any models that are trained or tuned specifically for ecommerce data?
Should I embed (title + attributes + description) concatenated as a single document, or embed fields separately and combine?
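
To make the third question concrete, the two layouts could be sketched roughly as below. This is only an illustration under stated assumptions: the product record is made up, the field weights are arbitrary placeholders, and `toy_embed` is a deterministic stand-in so the snippet runs without any model (a real embedding model would replace it).

```python
from typing import Callable, Dict, List

Vector = List[float]

# Made-up product record using the three fields described above.
PRODUCT = {
    "title": "2HP motor pump",
    "attributes": {"Material": "SS", "Voltage": "220V"},
    "description": "Stainless steel motor pump for water transfer.",
}


def attributes_text(product: Dict) -> str:
    """Render attribute fields as 'Key=Value' pairs separated by spaces."""
    return " ".join(f"{k}={v}" for k, v in product["attributes"].items())


def concat_document(product: Dict) -> str:
    """Option A: one string per product, embedded once."""
    return f"{product['title']} | {attributes_text(product)} | {product['description']}"


def field_texts(product: Dict) -> Dict[str, str]:
    """Option B: one string per field, each embedded separately."""
    return {
        "title": product["title"],
        "attributes": attributes_text(product),
        "description": product["description"],
    }


def combine(vectors: Dict[str, Vector], weights: Dict[str, float]) -> Vector:
    """Weighted average of per-field vectors (all vectors share one dimension)."""
    total = sum(weights[name] for name in vectors)
    dim = len(next(iter(vectors.values())))
    return [
        sum(weights[name] * vec[i] for name, vec in vectors.items()) / total
        for i in range(dim)
    ]


def toy_embed(text: str, dim: int = 8) -> Vector:
    """Stand-in for a real model: bucketed character-trigram counts. Not semantic."""
    vec = [0.0] * dim
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        bucket = sum(ord(c) for c in lowered[i:i + 3]) % dim
        vec[bucket] += 1.0
    return vec


def embed_fields(product: Dict, embed: Callable[[str], Vector],
                 weights: Dict[str, float]) -> Vector:
    """Option B end to end: embed each field, then merge into one vector."""
    vectors = {name: embed(text) for name, text in field_texts(product).items()}
    return combine(vectors, weights)


# Hypothetical weights favouring the title; real values would need tuning.
WEIGHTS = {"title": 0.6, "attributes": 0.3, "description": 0.1}

single_vector = toy_embed(concat_document(PRODUCT))          # Option A
merged_vector = embed_fields(PRODUCT, toy_embed, WEIGHTS)    # Option B
```

A third variant keeps the per-field vectors separate in the index and combines the similarity scores at ranking time instead of merging the vectors up front.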

Example queries

“2hp motor pump”
“ss nut m12”
“isi water tank 1000l”
“sewing macine” (misspelled)

Any guidance or practical experience with embedding models for ecommerce search would be appreciated.

John6666 · December 3, 2025, 2:44pm

I looked around a bit. There may not be a single definitive model.

sumit-raj-710 · December 5, 2025, 5:05am

Thanks a lot for such an informative reply!

