The Large Model Safety Workshop 2025 took place in Singapore on April 23rd, 2025. I had the privilege of representing RAiD at the event, and it was pretty cool.
The workshop featured talks by nine distinguished researchers working in the AI Safety domain, including Prof Yoshua Bengio, Prof Christopher Manning, and Prof Dawn Song.
The talks covered a range of topics, from formal verification of AI Safety to jailbreaking attacks and automated red-teaming approaches.
Here are a few of the notes I took from the talks at the workshop:
- Towards Building Safe and Secure AI: Lessons & Open Challenges by Prof Dawn Song @ UC Berkeley
- Can We Provide Formal Guarantees for LLM Safety? by Prof Gagandeep Singh @ UIUC
- Large Model Safety: The Narrow Path between Cavalier Building and Paralyzing Fear by Prof Christopher Manning @ Stanford
- Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? by Prof Yoshua Bengio @ University of Montreal