Exploring Trustworthy Foundation Models under Imperfect Data

Speaker:  Bo Han – Hong Kong, Hong Kong
Topic(s):  Artificial Intelligence, Machine Learning, Computer Vision, Natural Language Processing

Abstract

In the current landscape of machine learning, it is crucial to build trustworthy foundation models that can operate under imperfect conditions, since real-world data, such as unexpected inputs, image artifacts, and adversarial examples, are often noisy. These models need human-like capabilities to learn and reason under uncertainty. In this talk, I will focus on three recent research advances, each shedding light on the reliability, robustness, and safety of foundation models. Specifically, reliability will be explored through the enhancement of vision-language models with negative labels, which effectively detect out-of-distribution samples. Robustness will be examined through our investigation of image interpolation with diffusion models, addressing the challenge of information loss to ensure the consistency and quality of generated content. Finally, safety will be highlighted by our study on hypnotizing large language models, DeepInception, which constructs a novel nested scenario to induce adaptive jailbreak behaviours, revealing vulnerabilities during interactive model engagement.
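To give a flavour of the negative-label idea mentioned above: a vision-language model such as CLIP scores an image against text prompts for the in-distribution (ID) class names, and additionally against a pool of unrelated "negative" labels; if most of the probability mass lands on the negative labels, the input is likely out-of-distribution. The sketch below is a minimal, hypothetical illustration of this scoring scheme, not the speaker's actual implementation: the function name, the use of random unit vectors in place of real CLIP embeddings, and the temperature value are all assumptions for demonstration.

```python
import numpy as np

def neglabel_ood_score(image_emb, id_text_embs, neg_text_embs, temperature=0.01):
    """Return a score in [0, 1]; higher means more likely in-distribution.

    All embeddings are assumed L2-normalized, as in CLIP-style models,
    so a dot product is a cosine similarity. (Illustrative sketch only.)
    """
    sims_id = id_text_embs @ image_emb    # similarity to ID class prompts
    sims_neg = neg_text_embs @ image_emb  # similarity to negative labels
    logits = np.concatenate([sims_id, sims_neg]) / temperature
    logits -= logits.max()                # subtract max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs[: len(sims_id)].sum()    # probability mass on ID labels

# Toy usage with random unit vectors standing in for real embeddings.
rng = np.random.default_rng(0)
unit = lambda v: v / np.linalg.norm(v)
img = unit(rng.normal(size=8))
id_embs = np.stack([unit(rng.normal(size=8)) for _ in range(3)])
neg_embs = np.stack([unit(rng.normal(size=8)) for _ in range(10)])
score = neglabel_ood_score(img, id_embs, neg_embs)
```

Inputs whose score falls below a threshold chosen on a validation set would be flagged as out-of-distribution; the negative-label pool enlarges the softmax's competition so that ID inputs stand out more sharply.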

About this Lecture

Number of Slides:  65
Duration:  60 minutes
Languages Available:  Chinese (Simplified), English
Last Updated:  14/11/2025

Request this Lecture

To request this particular lecture, please complete this online form.

Request a Tour

To request a tour with this speaker, please complete this online form.

All requests will be sent to ACM headquarters for review.