The Large Model Safety Workshop 2025 took place in Singapore on April 23, 2025. I had the privilege of representing RAiD at the event, and it was pretty cool.

The workshop featured talks by nine distinguished researchers working in AI safety, including Prof Yoshua Bengio, Prof Christopher Manning and Prof Dawn Song.

The talks covered a range of topics, from formal verification of AI safety to jailbreaking attacks and automated red-teaming approaches.

Here are a few of the notes I took during the talks: