sslcheat committed on
Commit
18fdc68
·
verified ·
1 Parent(s): 5c8efce

Upload README.md

  1. README.md +638 -3
README.md CHANGED
@@ -1,3 +1,638 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ tags:
4
+ - medical
5
+ - code
6
+ - math
7
+ - reasoning
8
+ - general
9
+ datasets:
10
+ - Raderspace/MATH_qCoT_LLMquery_questionasquery_lexicalquery
11
+ - reasonir/reasonir-data
12
+ - truehealth/medqa
13
+ - AQ-MedAI/PRGB-ZH
14
+ metrics:
15
+ - accuracy
16
+ - recall
17
+ base_model:
18
+ - Qwen/Qwen3-Embedding-4B
19
+ pipeline_tag: text-ranking
20
+ language:
21
+ - zh
22
+ - en
23
+ library_name: transformers
24
+ ---
25
+ # Diver-Retriever-4B-1020
26
+
27
+
28
+ ## Highlights
29
+ Diver-Retriever-4B-1020 is an embedding model for reasoning-intensive retrieval, the setting targeted by ReasonIR and RaDeR.
30
+ Its training data combines samples from mathematics, coding, and healthcare.
31
+ We matched the samples by difficulty level and constructed
32
+ negative samples specific to each field. As a result, the model scores 31.9 on BRIGHT (see the table below)
33
+ and performs well on the MTEB-Medical benchmark.
34
+
35
+
36
+
37
+ ### Model Description
38
+
39
+ <!-- Provide a longer summary of what this model is. -->
40
+
41
+
42
+ - **Model type:** Text Embedding
43
+ - **Language(s) (NLP):** Bilingual (Chinese & English)
44
+ - **Context Length:** 40k
45
+ - **Number of Parameters:** 4B
46
+
47
+ For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our GitHub (https://github.com/AQ-MedAI/Diver).
48
+
49
+
50
+
51
+ | **Model** | **#Total Params** | **Context Length** | **Download** | **BRIGHT** |
52
+ | :------------------: | :---------------: | :----------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------: |
53
+ | DIVER-Retriever-4B-1020 | 4B | 40K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-4B-1020) <br>[🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-4B-1020) | **31.9** |
54
+ | DIVER-Retriever-4B | 4B | 40K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-4B) <br>[🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-4B) | **28.9** |
55
+ | DIVER-Retriever-1.7B | 1.7B | 40K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-1.7B) <br>[🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-1.7B) | **27.3** |
56
+ | DIVER-Retriever-0.6B | 0.6B | 32K | [🤗 HuggingFace](https://huggingface.co/AQ-MedAI/Diver-Retriever-0.6B) <br>[🤖 ModelScope](https://www.modelscope.cn/models/AQ-MedAI/Diver-Retriever-0.6B) | **25.2** |
57
+
58
+
59
+ ## Evaluation
60
+
61
+ <table>
62
+ <thead>
63
+ <tr>
64
+ <th>Method</th>
65
+ <th style="text-align:right">Avg.</th>
66
+ <th style="text-align:right">Bio.</th>
67
+ <th style="text-align:right">Earth.</th>
68
+ <th style="text-align:right">Econ.</th>
69
+ <th style="text-align:right">Psy.</th>
70
+ <th style="text-align:right">Rob.</th>
71
+ <th style="text-align:right">Stack.</th>
72
+ <th style="text-align:right">Sus.</th>
73
+ <th style="text-align:right">Leet.</th>
74
+ <th style="text-align:right">Pony</th>
75
+ <th style="text-align:right">AoPS</th>
76
+ <th style="text-align:right">TheoQ.</th>
77
+ <th style="text-align:right">TheoT.</th>
78
+ </tr>
79
+ </thead>
80
+ <tbody>
81
+ <tr>
82
+ <td colspan="13" style="text-align:center"><strong>Evaluate Retriever with Original Query</strong></td>
83
+ </tr>
84
+ <tr>
85
+ <td>BM25</td>
86
+ <td style="text-align:right">14.5</td>
87
+ <td style="text-align:right">18.9</td>
88
+ <td style="text-align:right">27.2</td>
89
+ <td style="text-align:right">14.9</td>
90
+ <td style="text-align:right">12.5</td>
91
+ <td style="text-align:right">13.6</td>
92
+ <td style="text-align:right">18.4</td>
93
+ <td style="text-align:right">15.0</td>
94
+ <td style="text-align:right">24.4</td>
95
+ <td style="text-align:right">7.9</td>
96
+ <td style="text-align:right">6.2</td>
97
+ <td style="text-align:right">10.4</td>
98
+ <td style="text-align:right">4.9</td>
99
+ </tr>
100
+ <tr>
101
+ <td>SBERT</td>
102
+ <td style="text-align:right">14.9</td>
103
+ <td style="text-align:right">15.1</td>
104
+ <td style="text-align:right">20.4</td>
105
+ <td style="text-align:right">16.6</td>
106
+ <td style="text-align:right">22.7</td>
107
+ <td style="text-align:right">8.2</td>
108
+ <td style="text-align:right">11.0</td>
109
+ <td style="text-align:right">15.3</td>
110
+ <td style="text-align:right">26.4</td>
111
+ <td style="text-align:right">7.0</td>
112
+ <td style="text-align:right">5.3</td>
113
+ <td style="text-align:right">20.0</td>
114
+ <td style="text-align:right">10.8</td>
115
+ </tr>
116
+ <tr>
117
+ <td>gte-Qwen1.5-7B</td>
118
+ <td style="text-align:right">22.5</td>
119
+ <td style="text-align:right">30.6</td>
120
+ <td style="text-align:right">36.4</td>
121
+ <td style="text-align:right">17.8</td>
122
+ <td style="text-align:right">24.6</td>
123
+ <td style="text-align:right">13.2</td>
124
+ <td style="text-align:right">22.2</td>
125
+ <td style="text-align:right">14.8</td>
126
+ <td style="text-align:right">25.5</td>
127
+ <td style="text-align:right">9.9</td>
128
+ <td style="text-align:right">14.4</td>
129
+ <td style="text-align:right">27.8</td>
130
+ <td style="text-align:right">32.9</td>
131
+ </tr>
132
+ <tr>
133
+ <td>Qwen3-4B</td>
134
+ <td style="text-align:right">5.6</td>
135
+ <td style="text-align:right">3.5</td>
136
+ <td style="text-align:right">8.0</td>
137
+ <td style="text-align:right">2.3</td>
138
+ <td style="text-align:right">2.0</td>
139
+ <td style="text-align:right">1.6</td>
140
+ <td style="text-align:right">1.0</td>
141
+ <td style="text-align:right">4.4</td>
142
+ <td style="text-align:right">2.1</td>
143
+ <td style="text-align:right">0.1</td>
144
+ <td style="text-align:right">4.9</td>
145
+ <td style="text-align:right">18.0</td>
146
+ <td style="text-align:right">19.2</td>
147
+ </tr>
148
+ <tr>
149
+ <td>OpenAI</td>
150
+ <td style="text-align:right">17.9</td>
151
+ <td style="text-align:right">23.3</td>
152
+ <td style="text-align:right">26.7</td>
153
+ <td style="text-align:right">19.5</td>
154
+ <td style="text-align:right">27.6</td>
155
+ <td style="text-align:right">12.8</td>
156
+ <td style="text-align:right">14.3</td>
157
+ <td style="text-align:right">20.5</td>
158
+ <td style="text-align:right">23.6</td>
159
+ <td style="text-align:right">2.4</td>
160
+ <td style="text-align:right">8.5</td>
161
+ <td style="text-align:right">23.5</td>
162
+ <td style="text-align:right">11.7</td>
163
+ </tr>
164
+ <tr>
165
+ <td>Google</td>
166
+ <td style="text-align:right">20.0</td>
167
+ <td style="text-align:right">22.7</td>
168
+ <td style="text-align:right">34.8</td>
169
+ <td style="text-align:right">19.6</td>
170
+ <td style="text-align:right">27.8</td>
171
+ <td style="text-align:right">15.7</td>
172
+ <td style="text-align:right">20.1</td>
173
+ <td style="text-align:right">17.1</td>
174
+ <td style="text-align:right">29.6</td>
175
+ <td style="text-align:right">3.6</td>
176
+ <td style="text-align:right">9.3</td>
177
+ <td style="text-align:right">23.8</td>
178
+ <td style="text-align:right">15.9</td>
179
+ </tr>
180
+ <tr>
181
+ <td>ReasonIR-8B</td>
182
+ <td style="text-align:right">24.4</td>
183
+ <td style="text-align:right">26.2</td>
184
+ <td style="text-align:right">31.4</td>
185
+ <td style="text-align:right">23.3</td>
186
+ <td style="text-align:right">30.0</td>
187
+ <td style="text-align:right">18.0</td>
188
+ <td style="text-align:right"><strong>23.9</strong></td>
189
+ <td style="text-align:right">20.5</td>
190
+ <td style="text-align:right">35.0</td>
191
+ <td style="text-align:right">10.5</td>
192
+ <td style="text-align:right"><strong>14.7</strong></td>
193
+ <td style="text-align:right">31.9</td>
194
+ <td style="text-align:right">27.2</td>
195
+ </tr>
196
+ <tr>
197
+ <td>RaDeR-7B</td>
198
+ <td style="text-align:right">25.5</td>
199
+ <td style="text-align:right">34.6</td>
200
+ <td style="text-align:right">38.9</td>
201
+ <td style="text-align:right">22.1</td>
202
+ <td style="text-align:right">33.0</td>
203
+ <td style="text-align:right">14.8</td>
204
+ <td style="text-align:right">22.5</td>
205
+ <td style="text-align:right">23.7</td>
206
+ <td style="text-align:right">37.3</td>
207
+ <td style="text-align:right">5.0</td>
208
+ <td style="text-align:right">10.2</td>
209
+ <td style="text-align:right">28.4</td>
210
+ <td style="text-align:right">35.1</td>
211
+ </tr>
212
+ <tr>
213
+ <td>Seed1.5-Embedding</td>
214
+ <td style="text-align:right">27.2</td>
215
+ <td style="text-align:right">34.8</td>
216
+ <td style="text-align:right"><strong>46.9</strong></td>
217
+ <td style="text-align:right"><strong>23.4</strong></td>
218
+ <td style="text-align:right">31.6</td>
219
+ <td style="text-align:right">19.1</td>
220
+ <td style="text-align:right">25.4</td>
221
+ <td style="text-align:right">21.0</td>
222
+ <td style="text-align:right"><strong>43.2</strong></td>
223
+ <td style="text-align:right">4.9</td>
224
+ <td style="text-align:right">12.2</td>
225
+ <td style="text-align:right">33.3</td>
226
+ <td style="text-align:right">30.5</td>
227
+ </tr>
228
+ <tr>
229
+ <td>DIVER-Retriever</td>
230
+ <td style="text-align:right"><strong>28.9</strong></td>
231
+ <td style="text-align:right"><strong>41.8</strong></td>
232
+ <td style="text-align:right">43.7</td>
233
+ <td style="text-align:right">21.7</td>
234
+ <td style="text-align:right"><strong>35.3</strong></td>
235
+ <td style="text-align:right"><strong>21.0</strong></td>
236
+ <td style="text-align:right">21.2</td>
237
+ <td style="text-align:right"><strong>25.1</strong></td>
238
+ <td style="text-align:right">37.6</td>
239
+ <td style="text-align:right"><strong>13.2</strong></td>
240
+ <td style="text-align:right">10.7</td>
241
+ <td style="text-align:right"><strong>38.4</strong></td>
242
+ <td style="text-align:right"><strong>37.3</strong></td>
243
+ </tr>
244
+ <tr>
245
+ <td colspan="13" style="text-align:center"><strong>Evaluate Retriever with GPT-4 REASON-query</strong></td>
246
+ </tr>
247
+ <tr>
248
+ <td>BM25</td>
249
+ <td style="text-align:right">27.0</td>
250
+ <td style="text-align:right"><strong>53.6</strong></td>
251
+ <td style="text-align:right"><strong>54.1</strong></td>
252
+ <td style="text-align:right">24.3</td>
253
+ <td style="text-align:right">38.7</td>
254
+ <td style="text-align:right">18.9</td>
255
+ <td style="text-align:right">27.7</td>
256
+ <td style="text-align:right">26.3</td>
257
+ <td style="text-align:right">19.3</td>
258
+ <td style="text-align:right">17.6</td>
259
+ <td style="text-align:right">3.9</td>
260
+ <td style="text-align:right">19.2</td>
261
+ <td style="text-align:right">20.8</td>
262
+ </tr>
263
+ <tr>
264
+ <td>SBERT</td>
265
+ <td style="text-align:right">17.8</td>
266
+ <td style="text-align:right">18.5</td>
267
+ <td style="text-align:right">26.3</td>
268
+ <td style="text-align:right">17.5</td>
269
+ <td style="text-align:right">27.2</td>
270
+ <td style="text-align:right">8.8</td>
271
+ <td style="text-align:right">11.8</td>
272
+ <td style="text-align:right">17.5</td>
273
+ <td style="text-align:right">24.3</td>
274
+ <td style="text-align:right">10.3</td>
275
+ <td style="text-align:right">5.0</td>
276
+ <td style="text-align:right">22.3</td>
277
+ <td style="text-align:right">23.5</td>
278
+ </tr>
279
+ <tr>
280
+ <td>gte-Qwen1.5-7B</td>
281
+ <td style="text-align:right">24.8</td>
282
+ <td style="text-align:right">35.5</td>
283
+ <td style="text-align:right">43.1</td>
284
+ <td style="text-align:right">24.3</td>
285
+ <td style="text-align:right">34.3</td>
286
+ <td style="text-align:right">15.4</td>
287
+ <td style="text-align:right">22.9</td>
288
+ <td style="text-align:right">23.9</td>
289
+ <td style="text-align:right">25.4</td>
290
+ <td style="text-align:right">5.2</td>
291
+ <td style="text-align:right">4.6</td>
292
+ <td style="text-align:right">28.7</td>
293
+ <td style="text-align:right">34.6</td>
294
+ </tr>
295
+ <tr>
296
+ <td>Qwen3-4B</td>
297
+ <td style="text-align:right">5.5</td>
298
+ <td style="text-align:right">1.3</td>
299
+ <td style="text-align:right">17.3</td>
300
+ <td style="text-align:right">2.5</td>
301
+ <td style="text-align:right">6.2</td>
302
+ <td style="text-align:right">1.0</td>
303
+ <td style="text-align:right">4.8</td>
304
+ <td style="text-align:right">4.5</td>
305
+ <td style="text-align:right">3.0</td>
306
+ <td style="text-align:right">5.9</td>
307
+ <td style="text-align:right">0.0</td>
308
+ <td style="text-align:right">7.2</td>
309
+ <td style="text-align:right">12.5</td>
310
+ </tr>
311
+ <tr>
312
+ <td>OpenAI</td>
313
+ <td style="text-align:right">23.3</td>
314
+ <td style="text-align:right">35.2</td>
315
+ <td style="text-align:right">40.1</td>
316
+ <td style="text-align:right">25.1</td>
317
+ <td style="text-align:right">38.0</td>
318
+ <td style="text-align:right">13.6</td>
319
+ <td style="text-align:right">18.2</td>
320
+ <td style="text-align:right">24.2</td>
321
+ <td style="text-align:right">24.5</td>
322
+ <td style="text-align:right">6.5</td>
323
+ <td style="text-align:right">7.7</td>
324
+ <td style="text-align:right">22.9</td>
325
+ <td style="text-align:right">23.8</td>
326
+ </tr>
327
+ <tr>
328
+ <td>Google</td>
329
+ <td style="text-align:right">26.2</td>
330
+ <td style="text-align:right">36.4</td>
331
+ <td style="text-align:right">45.6</td>
332
+ <td style="text-align:right">25.6</td>
333
+ <td style="text-align:right">38.2</td>
334
+ <td style="text-align:right">18.7</td>
335
+ <td style="text-align:right"><strong>29.5</strong></td>
336
+ <td style="text-align:right">17.9</td>
337
+ <td style="text-align:right">31.1</td>
338
+ <td style="text-align:right">3.7</td>
339
+ <td style="text-align:right">10.0</td>
340
+ <td style="text-align:right">27.8</td>
341
+ <td style="text-align:right">30.4</td>
342
+ </tr>
343
+ <tr>
344
+ <td>ReasonIR-8B</td>
345
+ <td style="text-align:right">29.9</td>
346
+ <td style="text-align:right">43.6</td>
347
+ <td style="text-align:right">42.9</td>
348
+ <td style="text-align:right"><strong>32.7</strong></td>
349
+ <td style="text-align:right">38.8</td>
350
+ <td style="text-align:right">20.9</td>
351
+ <td style="text-align:right">25.8</td>
352
+ <td style="text-align:right"><strong>27.5</strong></td>
353
+ <td style="text-align:right">31.5</td>
354
+ <td style="text-align:right"><strong>19.6</strong></td>
355
+ <td style="text-align:right">7.4</td>
356
+ <td style="text-align:right">33.1</td>
357
+ <td style="text-align:right">35.7</td>
358
+ </tr>
359
+ <tr>
360
+ <td>RaDeR-7B</td>
361
+ <td style="text-align:right">29.2</td>
362
+ <td style="text-align:right">36.1</td>
363
+ <td style="text-align:right">42.9</td>
364
+ <td style="text-align:right">25.2</td>
365
+ <td style="text-align:right">37.9</td>
366
+ <td style="text-align:right">16.6</td>
367
+ <td style="text-align:right">27.4</td>
368
+ <td style="text-align:right">25.0</td>
369
+ <td style="text-align:right"><strong>34.8</strong></td>
370
+ <td style="text-align:right">11.9</td>
371
+ <td style="text-align:right"><strong>12.0</strong></td>
372
+ <td style="text-align:right">37.7</td>
373
+ <td style="text-align:right"><strong>43.4</strong></td>
374
+ </tr>
375
+ <tr>
376
+ <td>DIVER-Retriever</td>
377
+ <td style="text-align:right"><strong>32.1</strong></td>
378
+ <td style="text-align:right">51.9</td>
379
+ <td style="text-align:right">53.5</td>
380
+ <td style="text-align:right">29.5</td>
381
+ <td style="text-align:right"><strong>41.2</strong></td>
382
+ <td style="text-align:right"><strong>21.4</strong></td>
383
+ <td style="text-align:right">27.5</td>
384
+ <td style="text-align:right">26.1</td>
385
+ <td style="text-align:right">33.5</td>
386
+ <td style="text-align:right">11.7</td>
387
+ <td style="text-align:right">9.5</td>
388
+ <td style="text-align:right"><strong>39.3</strong></td>
389
+ <td style="text-align:right">39.7</td>
390
+ </tr>
391
+ <tr>
392
+ <td colspan="13" style="text-align:center"><strong>Evaluate Retriever with DIVER-QExpand Query</strong></td>
393
+ </tr>
394
+ <tr>
395
+ <td>ReasonIR-8B</td>
396
+ <td style="text-align:right">32.6</td>
397
+ <td style="text-align:right">49.4</td>
398
+ <td style="text-align:right">44.7</td>
399
+ <td style="text-align:right">32.4</td>
400
+ <td style="text-align:right">44.0</td>
401
+ <td style="text-align:right">26.6</td>
402
+ <td style="text-align:right">31.8</td>
403
+ <td style="text-align:right">29.0</td>
404
+ <td style="text-align:right">32.3</td>
405
+ <td style="text-align:right">12.8</td>
406
+ <td style="text-align:right">9.1</td>
407
+ <td style="text-align:right"><strong>40.7</strong></td>
408
+ <td style="text-align:right">38.4</td>
409
+ </tr>
410
+ <tr>
411
+ <td>+BM25 (Hybrid)</td>
412
+ <td style="text-align:right">35.7</td>
413
+ <td style="text-align:right">56.8</td>
414
+ <td style="text-align:right">53.5</td>
415
+ <td style="text-align:right"><strong>33.0</strong></td>
416
+ <td style="text-align:right"><strong>48.5</strong></td>
417
+ <td style="text-align:right"><strong>29.4</strong></td>
418
+ <td style="text-align:right"><strong>34.2</strong></td>
419
+ <td style="text-align:right"><strong>32.0</strong></td>
420
+ <td style="text-align:right"><strong>35.2</strong></td>
421
+ <td style="text-align:right">16.8</td>
422
+ <td style="text-align:right">12.9</td>
423
+ <td style="text-align:right">39.3</td>
424
+ <td style="text-align:right">36.8</td>
425
+ </tr>
426
+ <tr>
427
+ <td>DIVER-Retriever</td>
428
+ <td style="text-align:right"><strong>33.9</strong></td>
429
+ <td style="text-align:right">54.5</td>
430
+ <td style="text-align:right">52.7</td>
431
+ <td style="text-align:right">28.8</td>
432
+ <td style="text-align:right">44.9</td>
433
+ <td style="text-align:right">25.1</td>
434
+ <td style="text-align:right">27.4</td>
435
+ <td style="text-align:right">29.5</td>
436
+ <td style="text-align:right">34.5</td>
437
+ <td style="text-align:right">10.0</td>
438
+ <td style="text-align:right">14.5</td>
439
+ <td style="text-align:right"><strong>40.7</strong></td>
440
+ <td style="text-align:right">44.7</td>
441
+ </tr>
442
+ <tr>
443
+ <td>+BM25 (Hybrid)</td>
444
+ <td style="text-align:right"><strong>37.2</strong></td>
445
+ <td style="text-align:right"><strong>60.0</strong></td>
446
+ <td style="text-align:right"><strong>55.9</strong></td>
447
+ <td style="text-align:right">31.8</td>
448
+ <td style="text-align:right">47.9</td>
449
+ <td style="text-align:right">27.1</td>
450
+ <td style="text-align:right">33.9</td>
451
+ <td style="text-align:right">31.9</td>
452
+ <td style="text-align:right">35.1</td>
453
+ <td style="text-align:right"><strong>23.1</strong></td>
454
+ <td style="text-align:right"><strong>16.8</strong></td>
455
+ <td style="text-align:right">36.9</td>
456
+ <td style="text-align:right"><strong>46.6</strong></td>
457
+ </tr>
458
+ </tbody>
459
+ </table>
460
+
461
+
462
+ ## Usage
463
+
464
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
465
+
466
+ ### Inference
467
+
468
+ #### Sentence Transformers Usage
469
+
470
+ ```python
471
+ # Requires transformers>=4.51.0
472
+ # Requires sentence-transformers>=2.7.0
473
+
474
+
475
+ from sentence_transformers import SentenceTransformer
476
+
477
+ # Load the model
478
+ model = SentenceTransformer("AQ-MedAI/Diver-Retriever-4B-1020")
479
+
480
+
481
+ # The queries and documents to embed
482
+ queries = [
483
+     "What is the capital of China?",
484
+     "Explain gravity",
485
+ ]
486
+ documents = [
487
+     "The capital of China is Beijing.",
488
+     "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
489
+ ]
490
+
491
+ # Encode the queries and documents. Note that queries benefit from using a prompt
492
+ # Here we use the prompt called "query" stored under `model.prompts`, but you can
493
+ # also pass your own prompt via the `prompt` argument
494
+ query_embeddings = model.encode(queries, prompt_name="query")
495
+ document_embeddings = model.encode(documents)
496
+
497
+ # Compute the (cosine) similarity between the query and document embeddings
498
+ similarity = model.similarity(query_embeddings, document_embeddings)
499
+ print(similarity)
500
+
501
+ ```
502
+ #### Transformers Usage
503
+
504
+ ```python
505
+ # Requires transformers>=4.51.0
506
+ import torch
507
+ import torch.nn.functional as F
508
+
509
+ from torch import Tensor
510
+ from transformers import AutoTokenizer, AutoModel
511
+
512
+
513
+ def last_token_pool(last_hidden_states: Tensor,
514
+                     attention_mask: Tensor) -> Tensor:
515
+     left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
516
+     if left_padding:
517
+         return last_hidden_states[:, -1]
518
+     else:
519
+         sequence_lengths = attention_mask.sum(dim=1) - 1
520
+         batch_size = last_hidden_states.shape[0]
521
+         return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
522
+
523
+
524
+ def get_detailed_instruct(task_description: str, query: str) -> str:
525
+     return f'Instruct: {task_description}\nQuery:{query}'
526
+
527
+ # Each query must come with a one-sentence instruction that describes the task
528
+ task = 'Given a web search query, retrieve relevant passages that answer the query'
529
+
530
+ queries = [
531
+     get_detailed_instruct(task, 'What is the capital of China?'),
532
+     get_detailed_instruct(task, 'Explain gravity')
533
+ ]
534
+ # No need to add instructions for retrieval documents
535
+ documents = [
536
+     "The capital of China is Beijing.",
537
+     "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
538
+ ]
539
+ input_texts = queries + documents
540
+
541
+ tokenizer = AutoTokenizer.from_pretrained('AQ-MedAI/Diver-Retriever-4B-1020', padding_side='left')
542
+ model = AutoModel.from_pretrained('AQ-MedAI/Diver-Retriever-4B-1020')
543
+
544
+
545
+ max_length = 8192
546
+
547
+ # Tokenize the input texts
548
+ batch_dict = tokenizer(
549
+     input_texts,
550
+     padding=True,
551
+     truncation=True,
552
+     max_length=max_length,
553
+     return_tensors="pt",
554
+ )
555
+ batch_dict.to(model.device)
556
+ outputs = model(**batch_dict)
557
+ embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
558
+
559
+ # normalize embeddings
560
+ embeddings = F.normalize(embeddings, p=2, dim=1)
561
+ scores = (embeddings[:2] @ embeddings[2:].T)
562
+ print(scores.tolist())
563
+ # [[0.9319270849227905, 0.5878604054450989], [0.639923095703125, 0.7950234413146973]]
564
+
565
+ ```
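
The `scores` matrix in the example above has one row per query and one column per document. A minimal sketch of turning such a matrix into a per-query ranking follows. `rank_documents` is a hypothetical helper, not part of `transformers` or `sentence-transformers`, and the input numbers are simply the example scores from the output comment above.

```python
from typing import List, Tuple


def rank_documents(scores: List[List[float]], top_k: int = 1) -> List[List[Tuple[int, float]]]:
    """Return, for each query row, the top_k (document_index, score) pairs, best first."""
    rankings = []
    for row in scores:
        # Sort document indices by similarity, highest score first.
        ordered = sorted(enumerate(row), key=lambda pair: pair[1], reverse=True)
        rankings.append(ordered[:top_k])
    return rankings


# Example scores copied from the output comment in the Transformers example.
example_scores = [
    [0.9319270849227905, 0.5878604054450989],
    [0.639923095703125, 0.7950234413146973],
]

for query_index, ranked in enumerate(rank_documents(example_scores, top_k=1)):
    doc_index, score = ranked[0]
    print(f"query {query_index} -> document {doc_index} (similarity {score:.3f})")
```

With these scores, each query is matched to the document at the same index, which is the expected pairing for the two example queries.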
566
+
567
+
568
+
569
+ ### Finetuning
570
+ We recommend using [swift](https://github.com/modelscope/ms-swift) to fine-tune DIVER-Retriever-4B-1020 with the InfoNCE loss.
571
+
572
+ Before starting training, please ensure your environment is properly configured.
573
+
574
+ ```bash
575
+ pip install ms-swift -U
576
+ # Alternatively, install from source
577
+ pip install git+https://github.com/modelscope/ms-swift.git
578
+
579
+ pip install transformers -U
580
+
581
+ # Optional packages
582
+ pip install deepspeed # multi-GPU training
583
+ pip install liger-kernel # save GPU memory resources
584
+ pip install flash-attn --no-build-isolation
585
+ ```
586
+
587
+ #### Training Command
588
+
589
+ Using the InfoNCE loss as an example, the complete training command is as follows:
590
+
591
+ ```bash
592
+ nproc_per_node=8
593
+ NPROC_PER_NODE=$nproc_per_node \
594
+ swift sft \
595
+ --model AQ-MedAI/Diver-Retriever-4B-1020 \
596
+ --task_type embedding \
597
+ --model_type qwen3_emb \
598
+ --train_type full \
599
+ --dataset your_dataset \
600
+ --split_dataset_ratio 0.05 \
601
+ --eval_strategy steps \
602
+ --output_dir output \
603
+ --eval_steps 20 \
604
+ --num_train_epochs 5 \
605
+ --save_steps 20 \
606
+ --per_device_train_batch_size 4 \
607
+ --per_device_eval_batch_size 4 \
608
+ --gradient_accumulation_steps 4 \
609
+ --learning_rate 6e-6 \
610
+ --loss_type infonce \
611
+ --label_names labels \
612
+ --dataloader_drop_last true \
613
+ --deepspeed zero3
614
+ ```
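
For intuition about what `--loss_type infonce` optimizes, here is a minimal pure-Python sketch of the InfoNCE objective for a single query: the negative log of the softmax probability assigned to the positive document among all candidates. The function name `info_nce_loss`, the temperature value, and the toy similarity scores are illustrative assumptions. The implementation in swift (batching, in-batch negatives, temperature handling) may differ.

```python
import math
from typing import Sequence


def info_nce_loss(positive_score: float,
                  negative_scores: Sequence[float],
                  temperature: float = 0.05) -> float:
    """InfoNCE for one query: -log softmax of the positive over all candidates.

    The temperature default is an assumed value chosen for illustration only.
    """
    logits = [positive_score / temperature] + [s / temperature for s in negative_scores]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(logit - max_logit) for logit in logits))
    return log_sum_exp - logits[0]


# Toy cosine similarities (assumed values, not model outputs).
easy = info_nce_loss(positive_score=0.9, negative_scores=[0.2, 0.1])
hard = info_nce_loss(positive_score=0.6, negative_scores=[0.55, 0.5])
print(f"well-separated positive: {easy:.4f}")
print(f"hard negatives:          {hard:.4f}")
```

Negatives that score close to the positive yield a larger loss, so they contribute more to training than negatives the model already separates easily.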
615
+
616
+
617
+
618
+
619
+
620
+
621
+
622
+
623
+ ## Citation
624
+
625
+ <!-- If a paper or blog post is introducing the model, the APA and BibTeX information for that should go in this section. -->
626
+ If you find our work helpful, feel free to cite it.
627
+
628
+ ```
629
+ @misc{long2025divermultistageapproachreasoningintensive,
630
+ title={DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval},
631
+ author={Meixiu Long and Duolin Sun and Dan Yang and Junjie Wang and Yue Shen and Jian Wang and Peng Wei and Jinjie Gu and Jiahai Wang},
632
+ year={2025},
633
+ eprint={2508.07995},
634
+ archivePrefix={arXiv},
635
+ primaryClass={cs.IR},
636
+ url={https://arxiv.org/abs/2508.07995},
637
+ }
638
+ ```