AI for Scientific Discovery Won't Work Without Fixing How We Collaborate.
My co-author @cgeorgiaw and I just published a paper challenging a core assumption: that the main barriers to AI in science are technical. They're not. They're social.
Key findings:
🚨 The "AI Scientist" myth delays progress: Waiting for AGI devalues human expertise and obscures science's real purpose: cultivating understanding, not just outputs. 📊 Wrong incentives: Datasets have 100x longer impact than models, yet data curation is undervalued. ⚠️ Broken collaboration: Domain scientists want understanding. ML researchers optimize performance. Without shared language, projects fail. 🔍 Fragmentation costs years: Harmonizing just 9 cancer files took 329 hours.
Why this matters: Upstream bottlenecks like efficient PDE solvers could accelerate discovery across multiple sciences. CASP mobilized a community around protein structure, enabling AlphaFold. We need this for dozens of challenges.
Thus, we're launching Hugging Science! A global community addressing these barriers through collaborative challenges, open toolkits, education, and community-owned infrastructure. Please find all the links below!
One of the hardest challenges in AI safety is finding the right balance: how do we protect people from harm without undermining their agency? This tension is especially visible in conversational systems, where safeguards can sometimes feel more paternalistic than supportive.
In my latest piece for Hugging Face, I argue that open source and community-driven approaches offer a promising (though not exclusive) way forward.
✨ Transparency can make safety mechanisms into learning opportunities. ✨ Collaboration with diverse communities makes safeguards more relevant across contexts. ✨ Iteration in the open lets protections evolve rather than freeze into rigid, one-size-fits-all rules.
Of course, this isn’t a silver bullet. Top-down safety measures will still be necessary in some cases. But if we only rely on corporate control, we risk building systems that are safe at the expense of trust and autonomy.
We’re excited to share that Cosmos Reason has surpassed 1 million downloads on Hugging Face!
Cosmos Reason is an open, customizable, commercial-ready 7B-parameter reasoning vision language model (VLM) designed for physical AI. By combining physics understanding, prior knowledge, and common sense reasoning, Cosmos Reason empowers AI agents and robots to operate intelligently in real-world environments.
Key applications already unlocked include:
✅ Automating large-scale dataset curation and annotation
🤖 Powering robot planning and vision-language action (VLA) decision-making
📊 Driving advanced video analytics and actionable insight generation
We’re proud to see a global community of developers using Cosmos Reason to teach robots to think like humans—and we’re just getting started.
Many of 'em pinged me asking to make the nano-banana-aio to available on hf.co/spaces, so I’ve transferred the app’s tech stack to make it compatible for deployment on Spaces. (Can be accessed with your own Gemini API) 🤗⭐️
Nano Banana AIO (All-in-One) App, which offers seamless image manipulation features, including single/multiple image adaptation, a canvas for free-style drawing to creative image generation, and standard text-to-image generation.
💬 From Replika to everyday chatbots, millions of people are forming emotional bonds with AI, sometimes seeking comfort, sometimes seeking intimacy. But what happens when an AI tells you "I understand how you feel" and you actually believe it?
At Hugging Face, together with @frimelle and @yjernite, we dug into something we felt wasn't getting enough attention: the need to evaluate AI companionship behaviors. These are the subtle ways AI systems validate us, engage with us, and sometimes manipulate our emotional lives.
Here's what we found: 👉 Existing benchmarks (accuracy, helpfulness, safety) completely miss this emotional dimension. 👉 We mapped how leading AI systems actually respond to vulnerable prompts. 👉 We built the Interactions and Machine Attachment Benchmark (INTIMA): a first attempt at evaluating how models handle emotional dependency, boundaries, and attachment (with a full paper coming soon).
In case you missed it, Hugging Face expanded its collaboration with Azure a few weeks ago with a curated catalog of 10,000 models, accessible from Azure AI Foundry and Azure ML!
@alvarobartt cooked during these last days to prepare the one and only documentation you need, if you wanted to deploy Hugging Face models on Azure. It comes with an FAQ, great guides and examples on how to deploy VLMs, LLMs, smolagents and more to come very soon.
We need your feedback: come help us and let us know what else you want to see, which model we should add to the collection, which model task we should prioritize adding, what else we should build a tutorial for. You’re just an issue away on our GitHub repo!
AMD summer hackathons are here! A chance to get hands-on with MI300X GPUs and accelerate models. 🇫🇷 Paris - Station F - July 5-6 🇮🇳 Mumbai - July 12-13 🇮🇳 Bengaluru - July 19-20
Hugging Face and GPU Mode will be on site and on July 6 in Paris @ror will share lessons learned while building new kernels to accelerate Llama 3.1 405B on ROCm
Hugging Face just wrapped 4 months of deep work with AMD to push kernel-level optimization on their MI300X GPUs. Now, it's time to share everything we learned.
Join us in Paris at STATION F for a hands-on weekend of workshops and a hackathon focused on making open-source LLMs faster and more efficient on AMD.
Prizes, amazing host speakers, ... if you want more details, navigate to https://lu.ma/fmvdjmur!
Every language carries its own cultural values and worldviews. So, when we build AI systems, we're not just deciding how they speak but also whose perspectives they represent.
Even choosing which dialect to train on in Norway becomes a question of inclusion and power. In Kenya, will AI speak Swahili from Nairobi or coastal regions? What about indigenous languages with rich oral traditions but limited written text, like Quechua in Peru or Cherokee in North America?
The path forward? Building WITH communities, not just FOR them. Working with local partners (libraries, universities, civil society), testing for cultural alignment, and asking hard questions about representation.
Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.
A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.
Today, I'm thrilled to introduce our first step in this direction.
In a nutshell:
📁 Effortlessly run prompts and models over your data. 🌐 Agentic search for accuracy and real-time information. 🖼️ Familiar, minimalistic interface for interacting with data. 🎯 Human feedback 2.0: Your input directly improves generated data. 💯 Access hundreds of open models and leading inference providers.
We have been working on a project called kernels. kernels makes it possible to load compute kernels directly from the Hub! 🚀
We plan to give kernels a more proper introduction soon. But for those who have been following along, we are happy to announce a new release:
- New layer API with torch.compile support. - Experimental support for loading Apple Silicon Metal 🤘 Kernels. - Generate wheels from Hub kernels for legacy deployments.