Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### Based on `decoder_model.onnx` *with* slimming
↳ `fp16`: `decoder_model_fp16.onnx` (added)
↳ `int8`: `decoder_model_int8.onnx` (added)
↳ `uint8`: `decoder_model_uint8.onnx` (added)
↳ `q4`: `decoder_model_q4.onnx` (added)
↳ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ `bnb4`: `decoder_model_bnb4.onnx` (added)
### Based on `decoder_with_past_model.onnx` *with* slimming
↳ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
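
The file names listed above all follow a `<base>_<dtype>.onnx` pattern. A minimal sketch of that naming rule follows; `DTYPES` and `quantizedFileName` are hypothetical names introduced here for illustration and are not part of Transformers.js or of the migration tooling.

```javascript
// Quantization suffixes that appear in this PR description.
const DTYPES = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

// Hypothetical helper: derive the quantized file name from a base ONNX file,
// assuming the `<base>_<dtype>.onnx` pattern seen in the lists above.
function quantizedFileName(baseFile, dtype) {
  if (!DTYPES.includes(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  if (!baseFile.endsWith('.onnx')) {
    throw new Error(`Expected an .onnx file, got: ${baseFile}`);
  }
  const stem = baseFile.slice(0, -'.onnx'.length);
  return `${stem}_${dtype}.onnx`;
}

// List every variant expected for one base model.
for (const dtype of DTYPES) {
  console.log(quantizedFileName('decoder_model.onnx', dtype));
}
```

A helper like this can be used to check that every expected variant is present in the `onnx` folder before publishing.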
### Based on `decoder_model_merged.onnx` *with* slimming
**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**
↳ `fp16`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:35.684735812 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `int8`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:46.341049068 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_int8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `uint8`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:17:57.546248750 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_uint8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `q4`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:06.398696360 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `q4f16`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:12.608451741 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_q4f16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
↳ `bnb4`: `` (added but JS-based E2E test failed)
```
2025-08-29 07:18:21.406956044 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp8a26ml89/c1d88f6743334a1200e4948ef46f545acfa96568/onnx/model_bnb4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
```
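
All six logs above end in the same `ShapeInferenceError` at the `/transformer/Squeeze` node, which suggests one shared cause rather than six separate ones. The sketch below shows one way to extract the failing file, node, and operator from such a line so that repeated failures can be grouped. The regular expression is an assumption derived only from the log lines shown here, and `parseLoadError` and `groupByCause` are hypothetical helper names.

```javascript
// Pattern assumed from the "Load model from ... failed" lines in the logs above.
const LOAD_ERROR =
  /Load model from (\S+) failed:Node \(([^)]+)\) Op \(([^)]+)\) \[([^\]]+)\] (.*)/;

// Hypothetical helper: turn one log line into a structured record,
// or return null when the line is not a model-load failure.
function parseLoadError(line) {
  const match = LOAD_ERROR.exec(line);
  if (!match) return null;
  const [, path, node, op, kind, message] = match;
  return { file: path.split('/').pop(), node, op, kind, message };
}

// Hypothetical helper: group failing files by (node, message) so that
// failures with the same cause are reported together.
function groupByCause(lines) {
  const groups = new Map();
  for (const line of lines) {
    const err = parseLoadError(line);
    if (!err) continue;
    const key = `${err.node}: ${err.message}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(err.file);
  }
  return groups;
}
```

Feeding the six `Error: Load model from ...` lines to `groupByCause` would yield a single group, which makes the common cause easier to see than scanning six full stack traces.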
- .gitattributes +2 -0
- README.md +16 -0
- onnx/decoder_model_bnb4.onnx +3 -0
- onnx/decoder_model_fp16.onnx +3 -0
- onnx/decoder_model_fp16.onnx_data +3 -0
- onnx/decoder_model_int8.onnx +3 -0
- onnx/decoder_model_q4.onnx +3 -0
- onnx/decoder_model_q4f16.onnx +3 -0
- onnx/decoder_model_uint8.onnx +3 -0
- onnx/decoder_with_past_model_bnb4.onnx +3 -0
- onnx/decoder_with_past_model_fp16.onnx +3 -0
- onnx/decoder_with_past_model_fp16.onnx_data +3 -0
- onnx/decoder_with_past_model_int8.onnx +3 -0
- onnx/decoder_with_past_model_q4.onnx +3 -0
- onnx/decoder_with_past_model_q4f16.onnx +3 -0
- onnx/decoder_with_past_model_uint8.onnx +3 -0
@@ -38,3 +38,5 @@ Constant_37_attr__value filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_model_merged.onnx_data filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_model.onnx_data filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_with_past_model.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_with_past_model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/bigcode/starcoderbase-1b with ONNX weights to be compatible with Transformers.js.
 
+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Text generation.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const generator = await pipeline('text-generation', 'Xenova/starcoderbase-1b');
+const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:41ce7f958b4ea71a7f47343181743d9b73091a74884a6cad21d5d8c49560734e
+size 1112821306
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0093903358660e2de176867be0bdfce98392ffe514ba9f2c39ffa8bf36a13bab
+size 445813
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60122c3a6ba64eeba38d9cebfe7cb691e73233faad3d371aa08b0276ee070c48
+size 2341523456
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29f8763cf36c657ef6e98f1efcbed326024d5e2e888a38091d73d26268b162b9
+size 1609154750
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e81d2359584d9193dd607649c13b5927f6bfb8b9be4f540062e5366affa14e7e
+size 1176521554
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:453c6080a866013bda898709c9f53a8341aadff0b75169945c1e05275bfafa14
+size 876838625
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:16f9755cc340b2919210c16b724636e951168669b39183ede016f0f18beedc96
+size 1609154793
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:de48b11ee187e2420804e34147f90585e9b1ab21aa22c0f2038fd4623692b8f5
+size 1112832406
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d3330cd6c0b3ad07c91bc428109244f7388cea43dbb414527c3ba760b1c05d0
+size 464777
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b1ef891e379f308ffb662843c68ee016097f0a8f36fd8fca78d9436cbe1abf1a
+size 2341523456
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb8f054437fbc4400d98d62bee50890a55538254bd831f33d54eaf8ec88db1c0
+size 1609165850
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89c1794a8fd98b4b943685019dfab4259ad35bc038d0094957f7d2d6389a586f
+size 1176532654
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f304ab5eb3d42c23e36a0ed037f77fb685df82effaee7d1d59c1a02460af47b9
+size 876854563
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:af5e8378c8e4eb009d9cf2c51259a7dd1f4e094948e8d0e6e37ffb80848d7b33
+size 1609165893
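
Each added `.onnx` file is stored as a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. The sketch below reads one such pointer into an object, for example to compare download sizes across variants. `parseLfsPointer` is a hypothetical helper written for this illustration and is not part of Git LFS or Transformers.js.

```javascript
// Hypothetical helper: parse the text of a Git LFS pointer file
// (lines of the form "key value") into a plain object.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    const space = trimmed.indexOf(' ');
    if (space === -1) continue; // not a "key value" line
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  const [algo, hash] = (fields.oid ?? '').split(':');
  const size = Number(fields.size);
  if (!fields.version || !hash || !Number.isInteger(size)) {
    throw new Error('Not a valid Git LFS pointer');
  }
  return { version: fields.version, algo, hash, size };
}

// Example using the first pointer from the diff above.
const pointer = parseLfsPointer(
  [
    'version https://git-lfs.github.com/spec/v1',
    'oid sha256:41ce7f958b4ea71a7f47343181743d9b73091a74884a6cad21d5d8c49560734e',
    'size 1112821306',
  ].join('\n'),
);
console.log(`${pointer.algo} object, ${(pointer.size / 2 ** 20).toFixed(1)} MiB`);
```

The `size` field is the size of the real file, not of the pointer, so it can be used to estimate how much a browser would have to download for a given variant.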