5. Superposition Yields Robust Neural Scaling → https://neurips.cc/virtual/2025/loc/san-diego/poster/116346
By controlling superposition in toy models and checking the picture against real LLMs, the authors show that strong superposition naturally produces the familiar “bigger model = lower loss” power laws, explaining when scaling laws hold and when they might break down (see the toy-model sketch after this list)
6. Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training → https://neurips.cc/virtual/2025/loc/san-diego/poster/119372
Shows that diffusion models pass through an early “good samples” phase before a later memorization phase. Larger datasets widen this “generalization window,” delaying overfitting and revealing an implicit dynamical regularization
7. Does RL Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? → https://neurips.cc/virtual/2025/loc/san-diego/poster/119944
Explains that while RLVR makes models better at finding correct answers efficiently, it doesn’t create genuinely new reasoning abilities: RLVR-trained models mostly reuse patterns already present in the base model, highlighting the need for better RL methods to unlock real reasoning gains
8. 1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities → https://neurips.cc/virtual/2025/loc/san-diego/poster/115731
Simply making RL networks much deeper, up to 1024 layers, can massively improve self-supervised RL, letting agents learn far better behaviors from scratch and boosting performance by 2-50× on locomotion and manipulation tasks
9. Titans + MIRAS: Helping AI have long-term memory → https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/
Titans is a new architecture with a deep MLP memory that updates itself during inference using a “surprise” signal, letting the model keep important information, forget noise, and handle million-token contexts with RNN-like speed and Transformer-like accuracy (see the memory sketch after this list)
10. Generative Data Augmentation via Diffusion Distillation, Adversarial Alignment, and Importance Reweighting → https://neurips.cc/virtual/2025/loc/san-diego/poster/116854
Introduces DAR-GDA, which distills a diffusion model into a fast one-step generator, aligns it with real data via adversarial training, and reweights synthetic samples to remove bias
11. Slow Transition to Low-Dimensional Chaos in Heavy-Tailed RNNs → https://arxiv.org/abs/2505.09816
Shows that RNNs with brain-like heavy-tailed weights don’t behave like Gaussian ones: they shift and widen the edge-of-chaos transition while reducing the system’s effective dimensionality
12. Evaluating multiple models using labeled and unlabeled data → https://arxiv.org/abs/2501.11866
Introduces Semi-Supervised Model Evaluation (SSME), a way to evaluate classifiers using both labeled and unlabeled data by modeling how predictions relate to true labels, giving far more accurate performance estimates when labeled data is limited
13. Riemannian Consistency Model → https://arxiv.org/abs/2510.00983
Extends consistency models to curved spaces using exponential maps and covariant derivatives, enabling few-step generation that stays on the manifold and works well on spheres, tori, and 3D rotations
14. BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model → https://arxiv.org/abs/2505.23579
BioReason links a DNA model with an LLM so the LLM can reason over genomic data, yielding clear biological explanations and strong accuracy gains on pathway and variant prediction tasks
15. NFL-BA: Near-Field Light Bundle Adjustment for SLAM in Dynamic Lighting → https://asdunnbe.github.io/NFL-BA/NeurIPS2025_NFL_BA.pdf
Introduces NFL-BA, a SLAM loss that models near-field lighting so systems work better in settings like endoscopy or dark indoor scenes, yielding large improvements in camera tracking and mapping
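To make the superposition result (paper 5) concrete, here is a minimal sketch of the standard toy-model-of-superposition setup that this line of work builds on: many sparse features compressed into fewer dimensions through a tied linear map with a ReLU readout. The dimensions, sparsity level, and training loop are illustrative assumptions, not the paper's exact configuration.

```python
import torch

# Toy model of superposition (illustrative assumptions, not the paper's exact setup):
# n_features sparse features are squeezed into d_hidden < n_features dimensions.
n_features, d_hidden, batch = 256, 32, 1024
sparsity = 0.95  # probability that any given feature is inactive

W = torch.nn.Parameter(torch.randn(d_hidden, n_features) * 0.1)
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(2000):
    # Sparse feature vectors in [0, 1]
    x = torch.rand(batch, n_features) * (torch.rand(batch, n_features) > sparsity)
    x_hat = torch.relu(x @ W.T @ W + b)  # compress to d_hidden, then reconstruct
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sweeping d_hidden and retraining traces out how reconstruction loss falls with width,
# the kind of loss-vs-size curve the paper connects to neural scaling laws.
```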
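And for the Titans entry (paper 9), a minimal sketch of a surprise-driven test-time memory: an MLP that is updated at inference by gradient descent on a key-to-value reconstruction loss, so inputs the memory predicts poorly (high “surprise”) are written more strongly. The layer sizes, the plain-SGD update, and the write/read API are assumptions for illustration; the actual architecture also uses momentum, a forgetting mechanism, and integration with attention.

```python
import torch

class NeuralMemory(torch.nn.Module):
    """Sketch of a test-time-updated MLP memory (hypothetical simplification of Titans)."""
    def __init__(self, dim=64, hidden=256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden), torch.nn.SiLU(), torch.nn.Linear(hidden, dim)
        )

    def write(self, keys, values, lr=1e-2):
        # "Surprise" = gradient of how badly the memory currently maps keys to values;
        # surprising inputs produce larger gradients and therefore stronger writes.
        loss = torch.nn.functional.mse_loss(self.net(keys), values)
        grads = torch.autograd.grad(loss, list(self.net.parameters()))
        with torch.no_grad():
            for p, g in zip(self.net.parameters(), grads):
                p.sub_(lr * g)

    def read(self, queries):
        with torch.no_grad():
            return self.net(queries)

mem = NeuralMemory()
k, v = torch.randn(8, 64), torch.randn(8, 64)
mem.write(k, v)        # store at inference time, no optimizer state needed
recalled = mem.read(k)
```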
15 Outstanding Research Papers from NeurIPS 2025
NeurIPS 2025, a premier annual event in machine learning and computational neuroscience, tackles major topics such as the future of AI, current research directions, and the field’s hardest challenges. We’re not attending this year, but we’re closely following the updates, so today we’ve pulled together a quick, easy-to-digest roundup of a few standout papers so you can jump in without getting overwhelmed.
Here is a list of 15 papers from NeurIPS 2025: 8 top research papers that received awards, plus 7 others that caught our attention:
1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks → https://neurips.cc/virtual/2025/loc/san-diego/test-of-time/128328
Test of Time Award winner. Introduces the Region Proposal Network (RPN), a small convnet that predicts objectness scores and boxes on shared convolutional features, letting Faster R-CNN share computation and run at around 5 fps on a GPU
2. Artificial Hivemind: The Open-Ended Homogeneity of LMs (and Beyond) → https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
Releases a huge open-ended prompt dataset, showing that LLMs often fall into an “artificial hivemind,” generating surprisingly similar answers, and measuring this diversity collapse
3. Optimal Mistake Bounds for Transductive Online Learning → https://neurips.cc/virtual/2025/loc/san-diego/poster/119098
Settles a 30-year-old question by showing how much unlabeled data helps in online learning – it gives a precise quadratic advantage with tight matching bounds
4. Gated Attention for LLMs: Non-linearity, Sparsity, and Attention-Sink-Free → https://neurips.cc/virtual/2025/loc/san-diego/poster/120216
Demonstrates how gating actually affects attention: a simple sigmoid gate applied after Scaled Dot-Product Attention (SDPA) improves performance, stability, and long-context behavior by adding useful nonlinearity and sparse, input-dependent modulation
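As a quick illustration of the gating idea in paper 4, here is a minimal sketch: a sigmoid gate, computed from the layer input, applied elementwise to the SDPA output before the output projection. The dimensions, the single gate projection, and the one-layer setup are assumptions for illustration, not the paper’s exact architecture.

```python
import torch
import torch.nn.functional as F

# Toy dimensions (assumptions for illustration)
batch, heads, seq, head_dim = 2, 4, 16, 32
d_model = heads * head_dim

x = torch.randn(batch, seq, d_model)              # layer input
qkv_proj = torch.nn.Linear(d_model, 3 * d_model)
gate_proj = torch.nn.Linear(d_model, d_model)     # gate computed from the same input
out_proj = torch.nn.Linear(d_model, d_model)

q, k, v = qkv_proj(x).chunk(3, dim=-1)
q, k, v = (t.view(batch, seq, heads, head_dim).transpose(1, 2) for t in (q, k, v))

attn = F.scaled_dot_product_attention(q, k, v)    # standard SDPA
attn = attn.transpose(1, 2).reshape(batch, seq, d_model)

# The key move: an elementwise sigmoid gate on the SDPA output, adding nonlinearity
# and input-dependent sparsity before the output projection
y = out_proj(attn * torch.sigmoid(gate_proj(x)))
```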
Read further below ⬇️
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
9 Recent advances in Multi-Agent Systems (all open-source)
The idea of splitting tasks across multiple agents instead of relying on one universal agent is now seen as one of the most effective ways to build an AI stack. Concepts like “agent swarms” were highlighted at the AI Engineer Code Summit in NYC (Nov 20–21) as the winning architecture. And this trend isn’t limited to coding and software: it applies across all AI domains.
So here is some recent research that helps make multi-agent systems (MAS) better and keeps them up to date:
1. LatentMAS → https://huggingface.co/papers/2511.20639
AI agents share their hidden "thoughts" directly in latent space instead of talking through text, making collaboration and reasoning much faster and more accurate (no extra training needed)
2. Puppeteer → https://huggingface.co/papers/2505.19591
Uses a “puppeteer” LLM that dynamically decides which agents (“puppets”) to call and in what order. By learning this orchestration with reinforcement learning (RL), the system solves complex tasks more efficiently and at lower compute cost
3. MADD → https://huggingface.co/papers/2511.08217
A MAS with 4 agents for drug discovery: researchers describe a drug-discovery task in plain language, and MADD automatically builds and runs the full hit-identification pipeline, making AI-driven drug design a simple end-to-end workflow
4. Multi-Agent Tool-Integrated Policy Optimization (MATPO) → https://huggingface.co/papers/2510.04678
Lets one LLM act as multiple agents (such as a planner and a worker) by using different prompts and training the roles together with RL, so you get the benefits of a multi-agent system without needing multiple models
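To make the MATPO idea concrete, here is a minimal sketch of one model playing both roles via different system prompts. The `generate` callable, the prompts, and the one-subtask-per-line plan format are assumptions for illustration, and the joint RL training that MATPO adds on top is omitted.

```python
# One LLM, two roles: the same model acts as planner and worker via different
# system prompts. `generate(messages) -> str` is a hypothetical chat-completion helper.
PLANNER = "You are the planner. Break the task into short, tool-ready subtasks, one per line."
WORKER = "You are the worker. Solve exactly one subtask, calling tools if needed."

def solve(task: str, generate) -> str:
    # Planner role: decompose the task
    plan = generate([{"role": "system", "content": PLANNER},
                     {"role": "user", "content": task}])
    # Worker role: same model, different prompt, one subtask at a time
    results = []
    for step in filter(str.strip, plan.splitlines()):
        results.append(generate([{"role": "system", "content": WORKER},
                                 {"role": "user", "content": step}]))
    # Planner role again: aggregate worker outputs into a final answer
    return generate([{"role": "system", "content": PLANNER},
                     {"role": "user",
                      "content": f"Task: {task}\nSubtask results:\n" + "\n".join(results)}])
```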
If you're interested in where multi-agent systems are taking software development, explore my article on the emerging playbook. This is super interesting → https://www.turingpost.com/p/aisoftwarestack
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
Read further below ⬇️