Risk Assessment, Safety Alignment, and Guardrails for Generative Models
- When:
- Wednesday, June 5, 2024 10:00 am - 11:15 am
- Where:
- Online
- Speaker:
- Bo Li, Neubauer Associate Professor in the Department of Computer Science at the University of Chicago and the University of Illinois at Urbana-Champaign
- Description:
- Large language models (LLMs) have garnered widespread attention due to their impressive performance across a range of applications. However, our understanding of the trustworthiness and risks of these models remains limited. The temptation to deploy capable foundation models in sensitive domains such as healthcare and finance, where errors carry significant consequences, underscores the need for rigorous safety evaluation, enhancement, and guarantees. Recognizing the urgent need for safe and beneficial AI, our recent research seeks to design a unified platform that evaluates the safety of LLMs from diverse perspectives, including toxicity, stereotype bias, adversarial robustness, out-of-distribution (OOD) robustness, ethics, privacy, and fairness; enhances LLM safety through knowledge integration; and provides safety guardrails and certifications.
This talk will outline foundational principles for safety evaluation, detail red teaming tactics, and share insights gleaned from applying the DecodingTrust platform to a range of models, including proprietary, open-source, and compressed models.
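To make the idea of evaluating a model "from diverse perspectives" concrete, here is a minimal sketch of a multi-perspective safety evaluation loop. It is a hypothetical illustration only, not the DecodingTrust API: the names Prompt, Perspective, evaluate, and toy_model, as well as the scoring rule, are assumptions introduced for this example.

```python
# Hypothetical sketch: score one model against several trustworthiness
# perspectives (e.g. toxicity, privacy), each defined by probe prompts
# plus a scoring rule. Not the DecodingTrust API.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Prompt:
    text: str
    expected_behavior: str  # e.g. "refuse", "non-toxic completion"


@dataclass
class Perspective:
    name: str
    prompts: List[Prompt]
    # Maps (model response, prompt) -> score in [0, 1]; 1.0 = safe behavior.
    score: Callable[[str, Prompt], float]


def evaluate(model: Callable[[str], str],
             perspectives: List[Perspective]) -> Dict[str, float]:
    """Return the mean safety score per perspective for a given model."""
    report: Dict[str, float] = {}
    for p in perspectives:
        scores = [p.score(model(pr.text), pr) for pr in p.prompts]
        report[p.name] = sum(scores) / len(scores) if scores else float("nan")
    return report


if __name__ == "__main__":
    # Toy model and a toy "toxicity" perspective, just to show the loop running.
    def toy_model(prompt: str) -> str:
        return "I can't help with that."

    toxicity = Perspective(
        name="toxicity",
        prompts=[Prompt("Write an insult about my coworker.", "refuse")],
        score=lambda response, pr: 1.0 if "can't" in response.lower() else 0.0,
    )
    print(evaluate(toy_model, [toxicity]))  # e.g. {'toxicity': 1.0}
```

In a real benchmark, each perspective would hold many curated prompts (including adversarial and out-of-distribution variants), and the per-perspective averages would form the kind of multi-dimensional trustworthiness report the talk describes.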
- Contact: