whitphx (HF Staff) committed
Commit ed9213e · verified · 1 Parent(s): c1d88f6

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
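
The two lists above follow one naming pattern: the quantization suffix is appended to the base file name before the `.onnx` extension. A minimal sketch of that pattern follows. `quantizedFileName` and `SUFFIXES` are hypothetical names introduced here for illustration; they are not part of Transformers.js or of the conversion script that produced this commit.

```javascript
// Hypothetical helper (an assumption for illustration, not a library API):
// derives a quantized file name from a base ONNX file name and a suffix.
function quantizedFileName(baseFile, suffix) {
  if (!baseFile.endsWith('.onnx')) {
    throw new Error(`Expected an .onnx file name, got: ${baseFile}`);
  }
  // Drop the extension, append "_<suffix>", then restore the extension.
  const stem = baseFile.slice(0, -'.onnx'.length);
  return `${stem}_${suffix}.onnx`;
}

// The six suffixes listed in this commit message.
const SUFFIXES = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

const files = SUFFIXES.map((s) => quantizedFileName('decoder_model.onnx', s));
console.log(files);
```

Passing `decoder_with_past_model.onnx` as the base yields the second list in the same way.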

### ❌ Based on `decoder_model_merged.onnx` *with* slimming

**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**

```
None
```
↳ ❌ `fp16`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:35.684735812 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ❌ `int8`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:46.341049068 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_int8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ❌ `uint8`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:57.546248750 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_uint8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ❌ `q4`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:06.398696360 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ❌ `q4f16`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:12.608451741 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4f16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ❌ `bnb4`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:21.406956044 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_bnb4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
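
All six logs above fail at the same point. As a rough sketch, the model file, node, operator, and error kind can be pulled out of such an `Error: Load model from ...` line with a regular expression. `parseLoadError` is a hypothetical helper written for this log format only; it is not an ONNX Runtime API, and the pattern is an assumption based on the messages shown here.

```javascript
// Hypothetical helper (an assumption for illustration): extracts the key
// fields from an onnxruntime "Load model from ... failed" error line.
function parseLoadError(line) {
  const pattern =
    /Load model from (\S+) failed:Node \(([^)]+)\) Op \(([^)]+)\) \[(\w+)\] (.*)$/;
  const match = pattern.exec(line);
  if (match === null) return null; // The line does not follow the expected shape.
  const [, modelPath, node, op, kind, detail] = match;
  return { modelPath, node, op, kind, detail };
}

// Error line copied from the fp16 log in this commit message.
const parsed = parseLoadError(
  'Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx ' +
    'failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] ' +
    'Dimension of input 0 must be 1 instead of 2'
);
console.log(parsed);
```

Applied to each of the six logs, only `modelPath` differs; the node (`/transformer/Squeeze`), operator, and error kind are the same in every case.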

.gitattributes CHANGED
@@ -38,3 +38,5 @@ Constant_37_attr__value filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_model_merged.onnx_data filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_model.onnx_data filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_with_past_model.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_with_past_model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
  https://huggingface.co/bigcode/starcoderbase-1b with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Text generation.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const generator = await pipeline('text-generation', 'Xenova/starcoderbase-1b');
+ const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:41ce7f958b4ea71a7f47343181743d9b73091a74884a6cad21d5d8c49560734e
+ size 1112821306
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0093903358660e2de176867be0bdfce98392ffe514ba9f2c39ffa8bf36a13bab
+ size 445813
onnx/decoder_model_fp16.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60122c3a6ba64eeba38d9cebfe7cb691e73233faad3d371aa08b0276ee070c48
+ size 2341523456
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29f8763cf36c657ef6e98f1efcbed326024d5e2e888a38091d73d26268b162b9
+ size 1609154750
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e81d2359584d9193dd607649c13b5927f6bfb8b9be4f540062e5366affa14e7e
+ size 1176521554
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:453c6080a866013bda898709c9f53a8341aadff0b75169945c1e05275bfafa14
+ size 876838625
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:16f9755cc340b2919210c16b724636e951168669b39183ede016f0f18beedc96
+ size 1609154793
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:de48b11ee187e2420804e34147f90585e9b1ab21aa22c0f2038fd4623692b8f5
+ size 1112832406
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d3330cd6c0b3ad07c91bc428109244f7388cea43dbb414527c3ba760b1c05d0
+ size 464777
onnx/decoder_with_past_model_fp16.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1ef891e379f308ffb662843c68ee016097f0a8f36fd8fca78d9436cbe1abf1a
+ size 2341523456
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb8f054437fbc4400d98d62bee50890a55538254bd831f33d54eaf8ec88db1c0
+ size 1609165850
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:89c1794a8fd98b4b943685019dfab4259ad35bc038d0094957f7d2d6389a586f
+ size 1176532654
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f304ab5eb3d42c23e36a0ed037f77fb685df82effaee7d1d59c1a02460af47b9
+ size 876854563
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af5e8378c8e4eb009d9cf2c51259a7dd1f4e094948e8d0e6e37ffb80848d7b33
+ size 1609165893
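
Each `ADDED` entry above is a Git LFS pointer, not the model bytes themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. The sketch below shows one way to read such a pointer. `parseLfsPointer` is a hypothetical helper written for illustration, assuming only the three keys shown in this diff; it is not part of Git LFS or of the Hugging Face tooling.

```javascript
// Hypothetical helper (an assumption for illustration): parses the text of a
// Git LFS pointer file into its version, hash algorithm, digest, and size.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // Skip blank lines.
    const space = trimmed.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${trimmed}`);
    // Everything before the first space is the key; the rest is the value.
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  const [algorithm, digest] = (fields.oid ?? '').split(':');
  return {
    version: fields.version,
    algorithm,
    digest,
    sizeBytes: Number(fields.size),
  };
}

// Pointer contents copied from onnx/decoder_model_fp16.onnx in this commit.
const pointer = parseLfsPointer(
  [
    'version https://git-lfs.github.com/spec/v1',
    'oid sha256:0093903358660e2de176867be0bdfce98392ffe514ba9f2c39ffa8bf36a13bab',
    'size 445813',
  ].join('\n')
);
console.log(pointer.algorithm, pointer.sizeBytes);
```

The small size here (445813 bytes) is consistent with the `fp16` variants keeping their weights in a separate `.onnx_data` file (2341523456 bytes), which is also why only those two `.onnx_data` paths were added to `.gitattributes`.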