How can I best use Hugging Face’s transformer models to auto-generate quiz questions or reading comprehension tasks for students in an education app?

How can I use Hugging Face models such as BERT, GPT, or T5 to automatically create quiz questions or reading-comprehension exercises for students inside an education app?

For example, using a T5-based question-generation model (plus an extractive QA model to verify answers and an optional distractor generator), it would look something like this.

"""
Simple QG → QA-verify → MCQ demo using Hugging Face Transformers.

Models:
- QG (end-to-end): valhalla/t5-base-e2e-qg
  https://huggingface.co/valhalla/t5-base-e2e-qg
- QA verifier (extractive, SQuAD2): deepset/roberta-base-squad2
  https://huggingface.co/deepset/roberta-base-squad2
- Distractor generator (optional): voidful/bart-distractor-generation-both
  https://huggingface.co/voidful/bart-distractor-generation-both

Transformers pipelines docs:
https://huggingface.co/docs/transformers/en/main_classes/pipelines
"""

from typing import List, Dict
from transformers import pipeline

# 1) Load the pipelines. Model weights are downloaded and cached on first use.
qg = pipeline("text2text-generation",
              model="valhalla/t5-base-e2e-qg",            # e2e QG from paragraph
              max_new_tokens=64)

qa = pipeline("question-answering",
              model="deepset/roberta-base-squad2",       # verifies answerability
              handle_impossible_answer=True)             # allow SQuAD2-style "no answer"

# Optional distractor generator. Comment out if you prefer retrieval-based distractors.
try:
    dg = pipeline("text2text-generation",
                  model="voidful/bart-distractor-generation-both",
                  max_new_tokens=24)
except Exception:
    dg = None  # falls back to simple distractors later
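
# Note: all three pipelines run on the CPU by default. On a GPU machine, pass
# device=0 to pipeline(), or device_map="auto" if `accelerate` is installed.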


def _split_e2e_output(text: str) -> List[str]:
    """
    Many e2e QG checkpoints return multiple questions separated by tokens/newlines.
    This splitter is conservative. Tweak for your chosen model’s formatting.
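
    Example: "Who is Guido?<sep>When was Python created"
          -> ["Who is Guido?", "When was Python created?"]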
    """
    parts = []
    for sep in ["<sep>", "\n", " ? ", "? "]:
        if sep in text:
            parts = [p.strip() for p in text.split(sep)]
            break
    if not parts:
        parts = [text.strip()]
    # Re-append '?' where missing.
    return [p if p.endswith("?") else (p + "?") for p in parts if p]


def generate_items(context: str, max_q: int = 5) -> List[Dict]:
    """
    Input: passage string.
    Output: list of dicts with {question, answer, distractors}.
    """
    # Stage 1: propose questions. The valhalla e2e checkpoints were trained with
    # a "generate questions:" task prefix (see the model card), so prepend it.
    out = qg(f"generate questions: {context}", do_sample=False)
    raw = out[0]["generated_text"] if isinstance(out, list) else out["generated_text"]
    questions = _split_e2e_output(raw)[:max_q]

    items = []
    for q in questions:
        # Stage 2: verify with extractive reader
        pred = qa(question=q, context=context)
        ans = pred.get("answer", "").strip()
        conf = float(pred.get("score", 0.0))

        # Keep only confident, non-empty answers
        if not ans or conf < 0.35:  # tune threshold per your QA model
            continue

        # Stage 3: distractors via seq2seq, with a trivial fallback.
        distractors = []
        if dg is not None:
            # NOTE: this input format is a guess; check the model card for the
            # exact format the distractor checkpoint expects.
            d_in = f"question: {q}  answer: {ans}  context: {context}"
            try:
                d = dg(d_in, do_sample=False)[0]["generated_text"].strip()
                if d and d.lower() != ans.lower():
                    distractors.append(d)
            except Exception:
                pass

        if not distractors:
            # Naive fallback: distinct context words that don't appear in the
            # answer. Set ordering is arbitrary; replace with retrieval-based
            # candidates in production.
            tokens = [t.strip(",.;:()") for t in context.split() if len(t) > 3]
            distractors = list({t for t in tokens if t.lower() not in ans.lower()})[:3]

        items.append({
            "question": q,
            "answer": ans,
            "distractors": distractors[:3]
        })
    return items
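

# Optional helper (a small convenience sketch, not part of any HF API):
# format one generated item as a printable MCQ with shuffled, lettered options.
def render_mcq(item: Dict, seed: int = 0) -> str:
    import random  # local import so the demo's top-level imports stay unchanged
    options = [item["answer"]] + item["distractors"]
    random.Random(seed).shuffle(options)  # fixed seed -> reproducible option order
    lines = [item["question"]]
    lines += [f"  {letter}) {opt}" for letter, opt in zip("ABCD", options)]
    return "\n".join(lines)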


if __name__ == "__main__":
    sample = (
        "The Moon is Earth's only natural satellite. It formed about 4.5 billion years ago, "
        "likely after a giant impact. The average distance to the Moon is about 384,400 kilometers, "
        "and its gravitational influence causes ocean tides on Earth."
    )
    mcqs = generate_items(sample, max_q=4)
    for i, it in enumerate(mcqs, 1):
        print(f"\nQ{i}. {it['question']}")
        print(f"   A: {it['answer']}")
        for j, d in enumerate(it['distractors'], 1):
            print(f"   D{j}: {d}")
"""
Q1. What is Earth's only natural satellite?
   A: The Moon
   D1: C E n the E , , G G G

Q2. How long ago did the Moon form?
   A: 4.5 billion years ago
   D1: C E n n n , , G G G

Q3. What is the average distance to the Moon?
   A: 384,400 kilometers
   D1: The E n n n , , , G
"""