Update README.md
README.md
CHANGED
@@ -25,9 +25,10 @@ license: apache-2.0
 # bkai-foundation-models/vietnamese-bi-encoder
 
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
 We train the model on a merged training dataset that consists of:
-- MS Macro (translated
+- MS Macro (translated into Vietnamese)
-- SQuAD v2 (translated
+- SQuAD v2 (translated into Vietnamese)
 - 80% of the training set from the Legal Text Retrieval Zalo 2021 challenge
 
 We use [phobert-base-v2](https://github.com/VinAIResearch/PhoBERT) as the pre-trained backbone.
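The README text in this diff says the model maps sentences to 768-dimensional vectors for semantic search. A minimal sketch of that use follows. It is not part of the commit. It assumes the `sentence-transformers` package is installed and that the model loads under the ID from the README heading. The helper names `embed`, `cosine`, and `rank` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Model ID copied from the README heading in the diff above.
MODEL_ID = "bkai-foundation-models/vietnamese-bi-encoder"


def embed(sentences: Sequence[str]) -> List[List[float]]:
    """Encode sentences into dense vectors (hypothetical helper).

    Assumes `pip install sentence-transformers` and network access to
    download the model on first use.
    """
    # Imported lazily so the ranking helpers below work without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    # encode() returns one vector per input sentence.
    return [vec.tolist() for vec in model.encode(list(sentences))]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: Sequence[float],
         doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar document first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A search would then call `embed` on the query and on the candidate passages and pass the resulting vectors to `rank`. PhoBERT-based models generally expect word-segmented Vietnamese input; that preprocessing step is left out of this sketch.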