# Qwen3-8B SFT LMSYS (Baseline)
This is the SFT baseline model for comparison with the DPO version.
## Training Details
- Base Model: unsloth/Qwen3-8B-4bit
- Training Method: Supervised Fine-Tuning (SFT)
- Dataset: LMSYS Arena Human Preference 55k (chosen responses only; see the data-preparation sketch after this list)
- Training Steps: 60
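For reference, here is a minimal sketch of how an SFT split of "chosen responses only" can be derived from the preference data. It assumes the public `lmsys/lmsys-arena-human-preference-55k` dataset on the Hugging Face Hub with `prompt`, `response_a`/`response_b`, and `winner_*` columns; these field names and the filtering logic are assumptions, not the exact preprocessing script used for this run.

```python
# Hedged sketch: build an SFT set of chosen responses from the LMSYS preference data.
# Column names are assumptions about the lmsys/lmsys-arena-human-preference-55k schema.
import json
from datasets import load_dataset

raw = load_dataset("lmsys/lmsys-arena-human-preference-55k", split="train")

def to_sft(example):
    # Prompts and responses are stored as JSON-encoded lists of turns.
    prompts = json.loads(example["prompt"])
    resp_a = json.loads(example["response_a"])
    resp_b = json.loads(example["response_b"])
    # Keep only the winning (chosen) side.
    chosen = resp_a if example["winner_model_a"] == 1 else resp_b
    return {"prompt": prompts, "chosen": chosen}

sft_data = (
    raw.filter(lambda ex: ex["winner_tie"] == 0)  # drop tied comparisons
       .map(to_sft, remove_columns=raw.column_names)
)
```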
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("subbuc/qwen3-8b-sft-lmsys")
tokenizer = AutoTokenizer.from_pretrained("subbuc/qwen3-8b-sft-lmsys")

# Your inference code here (see the sketch below for a minimal example)
```
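A minimal generation sketch using the tokenizer's chat template; the prompt and decoding settings are illustrative, not the settings used for training or evaluation:

```python
# Minimal inference sketch; prompt and decoding settings are illustrative.
messages = [{"role": "user", "content": "Explain what DPO is in one paragraph."}]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```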