OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models • Paper • 2505.21347 • Published May 27, 2025
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers • Paper • 2506.10887 • Published Jun 12, 2025