Joint Statistics and Applied Mathematics Colloquium: Subho Majumdar (Vijil)
Trustworthy AI: safety, security, reliability
Location
Sondheim Hall : 105
Date & Time
February 10, 2025, 12:00 pm – 1:00 pm
Description
Title: Towards Statistical Foundations for Reliable and Defendable Large Language Models
Abstract: The emergence of Large Language Models (LLMs) has brought concomitant concerns about the security and reliability of generative AI systems. While LLMs promise powerful capabilities in diverse real-world applications, ensuring that their outputs are resilient to malicious attacks and consistent across similar inputs poses significant methodological and computational challenges. This situation calls for revisiting modern deep learning architectures through a statistical lens.
I will present two interconnected themes in this area. First, I will introduce Representation Noising (RepNoise), a defense mechanism that protects the weights of open-source LLMs against malicious uses. RepNoise achieves this through controlled noise injection into the knowledge representations inside a model, which makes it harder to recover harmful information later. Second, I will discuss my work on the consistency problem (the equivalent of robustness in LLMs), which is concerned with measuring and minimizing the sensitivity of LLM outputs to input variations through a combination of controlled synthetic data generation and fine-tuning.
I will conclude by discussing open problems at the intersection of AI security and statistics, including the development of statistical bounds on the strength of defense mechanisms like RepNoise and of robustness frameworks for ensuring AI system reliability in high-stakes applications.