Access Request Terms:
By requesting access to the SensAI model, you confirm that you:
- will use the materials solely for research and non-commercial purposes;
- will cite the SensAI project and respect the CC-BY-NC-4.0 License;
- will not attempt to extract, infer, or reconstruct data from the model or dataset;
- will ensure that your downstream use complies with applicable laws, regulations, and ethical AI principles.
Qwen3-0.6B QLoRA Adapter
This repository contains a QLoRA adapter for Qwen3-0.6B.
The model was fine-tuned for an instruction-following task ("extract metadata") on a synthetic dataset called synthetic-queries-and-ml-instructions.
Description
- Base model: Qwen3-0.6B
- Adapter: QLoRA
- Task: extract metadata
- Quantization: 4-bit
- GPU: NVIDIA RTX 3090
- Dataset: synthetic-queries-and-ml-instructions
LoRA Configuration
| Parameter | Value |
|---|---|
| r | 32 |
| lora_alpha | 16 |
| lora_dropout | 0.05 |
| bias | none |
| task_type | CAUSAL_LM |
| target_modules | q_proj, k_proj, v_proj, o_proj |
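In peft terms, the table above corresponds to a LoraConfig along the following lines (a minimal sketch reconstructed from the table; the actual training code is not part of this repository):

```python
from peft import LoraConfig

# Sketch of the adapter configuration matching the table above;
# the original training script is not included in this repository.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```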
Training Parameters
| Parameter | Value |
|---|---|
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 7e-4 |
| Train batch size per device | 4 |
| Eval batch size per device | 8 |
| Gradient accumulation steps | 4 |
| Optimizer | paged_adamw_8bit |
| Scheduler | cosine |
| Warmup ratio | 0.03 |
| FP16 | True |
| Save & Eval steps | 100 |
| Early stopping patience | 3 |
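For orientation, these hyperparameters map onto transformers TrainingArguments roughly as follows (a sketch assuming a standard Trainer/SFTTrainer setup; the output directory is a placeholder and the original training script is not included here):

```python
from transformers import TrainingArguments

# Approximate training arguments reconstructed from the table above;
# "qlora-out" is a placeholder output directory.
training_args = TrainingArguments(
    output_dir="qlora-out",
    num_train_epochs=2,
    learning_rate=7e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    fp16=True,
    save_steps=100,
    eval_steps=100,
    eval_strategy="steps",  # "evaluation_strategy" in older transformers versions
    save_strategy="steps",
    load_best_model_at_end=True,  # required when using EarlyStoppingCallback
)
```

Max sequence length and early stopping are handled outside TrainingArguments, e.g. via the trainer's tokenization settings and `EarlyStoppingCallback(early_stopping_patience=3)`.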
Evaluation Results
We evaluated the base Qwen3-0.6B model and our fine-tuned model on the test split (1.5K rows) of the dataset mentioned in the Description.
| Metric | Qwen3-0.6B | Qwen3-0.6B + QLoRA (Fine-tuned) |
|---|---|---|
| Invalidly parsed (%) | 47.8 | 0.27 |
| Complete accuracy (%) | 0.47 | 80.6 |
| Missing attributes (%) | 41.73 | 7.93 |
| Extra attributes (%) | 32.27 | 6.53 |
| Incorrect attributes (%) | 41.4 | 5.4 |
- Invalidly parsed: The percentage of examples where the model output had invalid or missing JSON formatting
- Complete accuracy: The percentage of examples where all attributes in the output matched the ground-truth attributes
- Missing attributes: The percentage of examples where the model output is missing at least one attribute that is present in the ground-truth example
- Extra attributes: The percentage of examples where the model output contains attributes that are not present in the ground-truth example
- Incorrect attributes: The percentage of examples where the model output has incorrect attribute values compared to the ground-truth example
The percentages for missing, extra and incorrect attributes may exceed 100% in total, since a single example can fall into multiple categories simultaneously. For instance, a model output could omit a required attribute (missing) while also adding an irrelevant one (extra).
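To make the definitions concrete, per-example flags could be computed roughly as follows (a sketch based on our reading of the definitions above; the evaluation script used for the table is not published here, and the attribute comparison is illustrative):

```python
import json

def evaluate_example(model_output: str, ground_truth: dict) -> dict:
    """Classify one model output against ground-truth attributes.
    Sketch of the metric definitions above, not the official script."""
    try:
        predicted = json.loads(model_output)
    except json.JSONDecodeError:
        return {"invalid": True, "complete": False,
                "missing": False, "extra": False, "incorrect": False}

    missing = any(k not in predicted for k in ground_truth)
    extra = any(k not in ground_truth for k in predicted)
    incorrect = any(
        k in predicted and predicted[k] != v for k, v in ground_truth.items()
    )
    return {
        "invalid": False,
        "complete": not (missing or extra or incorrect),
        "missing": missing,
        "extra": extra,
        "incorrect": incorrect,
    }
```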
How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_name = "Qwen/Qwen3-0.6B"
ft_model_name = "kinit/qwen3-0.6B-extract-ml-instructions"

tokenizer = AutoTokenizer.from_pretrained(ft_model_name)

# Load the base model with the 4-bit quantization setup first
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto"
)

# Load the LoRA adapter weights into the base model
model = PeftModel.from_pretrained(base_model, ft_model_name).eval()

# Preprocess the input. Example user query in Slovak: "I want to classify
# car license plate (ŠPZ) numbers with a CNN architecture using the
# 'LicencePlates_ImageDataset' dataset."
prompt = "Chcem realizovať klasifikáciu ŠPZ čísel áut pomocou CNN architektúry za pomoci datasetu 'LicencePlates_ImageDataset'."
messages = [
    {"role": "user", "content": f"User query: {prompt}"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Perform inference
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1_024,
    temperature=0.25
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Model response containing a tool call that needs to be parsed next;
# the tool call arguments represent the extracted metadata
model_response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(model_response)
```
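The response wraps the extracted metadata in the chat template's tool-call tags. A minimal way to pull out the arguments might look like this (a sketch assuming the output follows Qwen's `<tool_call>...</tool_call>` convention; the `arguments` field name depends on the actual model output):

```python
import json
import re

# Parse the tool call out of the model response (sketch: assumes a
# Qwen-style <tool_call>...</tool_call> wrapper around a JSON object)
match = re.search(r"<tool_call>\s*(\{.*\})\s*</tool_call>", model_response, re.DOTALL)
if match:
    tool_call = json.loads(match.group(1))
    # The "arguments" field is assumed to hold the extracted metadata
    metadata = tool_call.get("arguments", tool_call)
    print(metadata)
else:
    print("No tool call found in the model response")
```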
Additional Information
This work was supported by a Výskumná agentúra (Slovak Research Agency) grant within the project SensAI - Moral Sensitivity and Human Rights for Low-Resource Language Processing (Morálna citlivosť a ľudské práva pre spracovanie jazykov s obmedzenými zdrojmi; Grant No. 09I01-03-V04-00100/2025/VA).
License & Attribution
This model was created within the SensAI project and is released under the CC-BY-NC-4.0 License. It is a derivative of the Qwen3-0.6B model, which is released under the Apache License 2.0.