5. Superposition Yields Robust Neural Scaling → https://neurips.cc/virtual/2025/loc/san-diego/poster/116346
By controlling superposition in toy models and checking the picture against real LLMs, the authors show that strong superposition naturally produces the familiar “bigger model = lower loss” power laws, explaining when scaling laws hold and when they might break down (see the toy-model sketch after this list)
6. Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training → https://neurips.cc/virtual/2025/loc/san-diego/poster/119372
Shows that diffusion models pass through an early “good samples” phase before a later memorization phase. Larger datasets widen this “generalization window,” delaying overfitting and revealing an implicit dynamical regularization
7. Does RL Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? → https://neurips.cc/virtual/2025/loc/san-diego/poster/119944
Explains that while RLVR makes models better at finding correct answers efficiently, it doesn’t create genuinely new reasoning abilities: RLVR-trained models mostly reuse patterns already present in the base model, highlighting the need for better RL methods to unlock real reasoning gains
8. 1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities → https://neurips.cc/virtual/2025/loc/san-diego/poster/115731
Simply making RL networks much deeper, up to 1024 layers, can massively improve self-supervised RL, letting agents learn far better behaviors from scratch and boosting performance by 2-50× on locomotion and manipulation tasks
9. Titans + MIRAS: Helping AI have long-term memory → https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/
Titans is a new architecture with a deep MLP memory that updates itself during inference using a “surprise” signal, letting the model keep important information, forget noise, and handle million-token contexts with RNN-like speed and Transformer-like accuracy (see the memory sketch after this list)
10. Generative Data Augmentation via Diffusion Distillation, Adversarial Alignment, and Importance Reweighting → https://neurips.cc/virtual/2025/loc/san-diego/poster/116854
Introduces DAR-GDA, which distills a diffusion model into a fast one-step generator, aligns it with real data via adversarial training, and reweights synthetic samples to remove bias
11. Slow Transition to Low-Dimensional Chaos in Heavy-Tailed RNNs → https://arxiv.org/abs/2505.09816
Shows that RNNs with brain-like heavy-tailed weights don’t behave like Gaussian ones: they shift and widen the edge-of-chaos transition while reducing the system’s effective dimensionality
12. Evaluating multiple models using labeled and unlabeled data → https://arxiv.org/abs/2501.11866
Introduces Semi-Supervised Model Evaluation (SSME), a way to evaluate classifiers using both labeled and unlabeled data by modeling how predictions relate to true labels, giving far more accurate performance estimates when labeled data is limited
13. Riemannian Consistency Model → https://arxiv.org/abs/2510.00983
Extends consistency models to curved spaces using exponential maps and covariant derivatives, enabling few-step generation that stays on the manifold and works well on spheres, tori, and 3D rotations
14. BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model → https://arxiv.org/abs/2505.23579
BioReason links a DNA model with an LLM so the LLM can reason over genomic data, yielding clear biological explanations and strong accuracy gains on pathway and variant prediction tasks
15. NFL-BA: Near-Field Light Bundle Adjustment for SLAM in Dynamic Lighting → https://asdunnbe.github.io/NFL-BA/NeurIPS2025_NFL_BA.pdf
Introduces NFL-BA, a SLAM loss that models near-field lighting so systems work better in settings like endoscopy or dark indoor scenes, yielding large improvements in camera tracking and mapping
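To make the superposition result (paper 5) concrete, here is a minimal sketch of the standard toy-model-of-superposition setup that this line of work builds on: many sparse features compressed into fewer dimensions through a tied linear map with a ReLU readout. The dimensions, sparsity level, and training loop are illustrative assumptions, not the paper's exact configuration.

```python
import torch

# Toy model of superposition (illustrative assumptions, not the paper's exact setup):
# n_features sparse features are squeezed into d_hidden < n_features dimensions.
n_features, d_hidden, batch = 256, 32, 1024
sparsity = 0.95  # probability that any given feature is inactive

W = torch.nn.Parameter(torch.randn(d_hidden, n_features) * 0.1)
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(2000):
    # Sparse feature vectors in [0, 1]
    x = torch.rand(batch, n_features) * (torch.rand(batch, n_features) > sparsity)
    x_hat = torch.relu(x @ W.T @ W + b)  # compress to d_hidden, then reconstruct
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sweeping d_hidden and retraining traces out how reconstruction loss falls with width,
# the kind of loss-vs-size curve the paper connects to neural scaling laws.
```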
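And for the Titans entry (paper 9), a minimal sketch of a surprise-driven test-time memory: an MLP that is updated at inference by gradient descent on a key-to-value reconstruction loss, so inputs the memory predicts poorly (high “surprise”) are written more strongly. The layer sizes, the plain-SGD update, and the write/read API are assumptions for illustration; the actual architecture also uses momentum, a forgetting mechanism, and integration with attention.

```python
import torch

class NeuralMemory(torch.nn.Module):
    """Sketch of a test-time-updated MLP memory (hypothetical simplification of Titans)."""
    def __init__(self, dim=64, hidden=256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden), torch.nn.SiLU(), torch.nn.Linear(hidden, dim)
        )

    def write(self, keys, values, lr=1e-2):
        # "Surprise" = gradient of how badly the memory currently maps keys to values;
        # surprising inputs produce larger gradients and therefore stronger writes.
        loss = torch.nn.functional.mse_loss(self.net(keys), values)
        grads = torch.autograd.grad(loss, list(self.net.parameters()))
        with torch.no_grad():
            for p, g in zip(self.net.parameters(), grads):
                p.sub_(lr * g)

    def read(self, queries):
        with torch.no_grad():
            return self.net(queries)

mem = NeuralMemory()
k, v = torch.randn(8, 64), torch.randn(8, 64)
mem.write(k, v)        # store at inference time, no optimizer state needed
recalled = mem.read(k)
```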
15 Outstanding Research Papers from NeurIPS 2025
NeurIPS 2025, a premier annual event in machine learning and computational neuroscience, tackles major topics such as the future of AI, current research directions, and the field’s hardest challenges. We’re not attending this year, but we’re closely following the updates, so today we’ve pulled together a quick, easy-to-digest roundup of a few standout papers so you can jump in without getting overwhelmed.
Here is a list of 15 papers from NeurIPS 2025: 8 top research papers that received awards, plus 7 others that caught our attention:
1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks → https://neurips.cc/virtual/2025/loc/san-diego/test-of-time/128328
Test of Time Award winner. Introduces the Region Proposal Network (RPN), a small convnet that predicts objectness scores and boxes on shared convolutional features, letting Faster R-CNN share computation and run at around 5 fps on a GPU
2. Artificial Hivemind: The Open-Ended Homogeneity of LMs (and Beyond) → https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
Releases a huge open-ended prompt dataset, showing that LLMs often fall into an “artificial hivemind,” generating surprisingly similar answers, and measuring this diversity collapse
3. Optimal Mistake Bounds for Transductive Online Learning → https://neurips.cc/virtual/2025/loc/san-diego/poster/119098
Settles a 30-year-old question by showing how much unlabeled data helps in online learning – it gives a precise quadratic advantage with tight matching bounds
4. Gated Attention for LLMs: Non-linearity, Sparsity, and Attention-Sink-Free → https://neurips.cc/virtual/2025/loc/san-diego/poster/120216
Demonstrates how gating actually affects attention: a simple sigmoid gate applied after Scaled Dot-Product Attention (SDPA) improves performance, stability, and long-context behavior by adding useful nonlinearity and sparse, input-dependent modulation
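As a quick illustration of the gating idea in paper 4, here is a minimal sketch: a sigmoid gate, computed from the layer input, applied elementwise to the SDPA output before the output projection. The dimensions, the single gate projection, and the one-layer setup are assumptions for illustration, not the paper’s exact architecture.

```python
import torch
import torch.nn.functional as F

# Toy dimensions (assumptions for illustration)
batch, heads, seq, head_dim = 2, 4, 16, 32
d_model = heads * head_dim

x = torch.randn(batch, seq, d_model)              # layer input
qkv_proj = torch.nn.Linear(d_model, 3 * d_model)
gate_proj = torch.nn.Linear(d_model, d_model)     # gate computed from the same input
out_proj = torch.nn.Linear(d_model, d_model)

q, k, v = qkv_proj(x).chunk(3, dim=-1)
q, k, v = (t.view(batch, seq, heads, head_dim).transpose(1, 2) for t in (q, k, v))

attn = F.scaled_dot_product_attention(q, k, v)    # standard SDPA
attn = attn.transpose(1, 2).reshape(batch, seq, d_model)

# The key move: an elementwise sigmoid gate on the SDPA output, adding nonlinearity
# and input-dependent sparsity before the output projection
y = out_proj(attn * torch.sigmoid(gate_proj(x)))
```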
Read further below ⬇️
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
9 Recent advances in Multi-Agent Systems (all open-source)
The idea of splitting tasks across multiple agents instead of relying on one universal agent is now seen as one of the most effective ways to build an AI stack. Concepts like “agent swarms” were highlighted at the AI Engineer Code Summit in NYC (Nov 20–21) as the winning architecture. And this trend isn’t limited to coding and software: it applies across all AI domains.
So here is some recent research that helps make multi-agent systems (MAS) better and keeps them up to date:
1. LatentMAS → https://huggingface.co/papers/2511.20639
AI agents share their hidden "thoughts" directly in latent space instead of talking through text, making collaboration and reasoning much faster and more accurate (no extra training needed)
2. Puppeteer → https://huggingface.co/papers/2505.19591
Uses a “puppeteer” LLM that dynamically decides which agents (“puppets”) to call and in what order. By learning this orchestration with reinforcement learning (RL), the system solves complex tasks more efficiently and at lower compute cost
3. MADD → https://huggingface.co/papers/2511.08217
A MAS with 4 agents for drug discovery: researchers describe a drug-discovery task in plain language, and MADD automatically builds and runs the full hit-identification pipeline, making AI-driven drug design a simple end-to-end workflow
4. Multi-Agent Tool-Integrated Policy Optimization (MATPO) → https://huggingface.co/papers/2510.04678
Lets one LLM act as multiple agents (such as a planner and a worker) by using different prompts and training the roles together with RL, so you get the benefits of a multi-agent system without needing multiple models
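To make the MATPO idea concrete, here is a minimal sketch of one model playing both roles via different system prompts. The `generate` callable, the prompts, and the one-subtask-per-line plan format are assumptions for illustration, and the joint RL training that MATPO adds on top is omitted.

```python
# One LLM, two roles: the same model acts as planner and worker via different
# system prompts. `generate(messages) -> str` is a hypothetical chat-completion helper.
PLANNER = "You are the planner. Break the task into short, tool-ready subtasks, one per line."
WORKER = "You are the worker. Solve exactly one subtask, calling tools if needed."

def solve(task: str, generate) -> str:
    # Planner role: decompose the task
    plan = generate([{"role": "system", "content": PLANNER},
                     {"role": "user", "content": task}])
    # Worker role: same model, different prompt, one subtask at a time
    results = []
    for step in filter(str.strip, plan.splitlines()):
        results.append(generate([{"role": "system", "content": WORKER},
                                 {"role": "user", "content": step}]))
    # Planner role again: aggregate worker outputs into a final answer
    return generate([{"role": "system", "content": PLANNER},
                     {"role": "user",
                      "content": f"Task: {task}\nSubtask results:\n" + "\n".join(results)}])
```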
If you're interested in where multi-agent systems are taking software development, explore my article on the emerging playbook. This is super interesting → https://www.turingpost.com/p/aisoftwarestack
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
Read further below ⬇️