Mert Yuksekgonul

I'm a fourth-year PhD student in Computer Science at Stanford University. I am lucky to be advised by James Zou and Carlos Guestrin.

I work on self-improving, controllable AI.

Supervision is the fundamental bottleneck to progress in deep learning and to solving complex problems, so models that improve themselves with synthetic data and reinforcement learning have been gaining traction.

To this end, I develop algorithms [e.g., TextGrad (Nature)] to improve AI systems using themselves, use their internal representations [e.g., mechanistic error detectors (ICLR '24), concept bottlenecks (ICLR '23), concept-based counterfactuals (ICML '22)] to make them more reliable, and study how training shapes their failure modes [e.g., the bag-of-wordness of VLMs (ICLR '23), atypicality and calibration (NeurIPS '23)].

Selected Publications

For a full list of publications, please see my Google Scholar.

Optimizing generative AI by backpropagating language model feedback
Nature
Mert Yuksekgonul*, Federico Bianchi*, Joseph Boen*, Sheng Liu*, Pan Lu*, Zhi Huang*, Carlos Guestrin, James Zou
When and why vision-language models behave like bags-of-words, and what to do about it?
Oral, ICLR '23 (Top 5% of all accepted papers)
Mert Yuksekgonul, Federico Bianchi, Pratyusha (Ria) Kalluri, Dan Jurafsky, James Zou
Beyond Confidence: Reliable Models Should Also Quantify Atypicality
NeurIPS '23; Contributed Talk, ICLR '23 Trustworthy ML Workshop
Mert Yuksekgonul, Linjun Zhang, James Zou, Carlos Guestrin
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
ICLR '24
Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi
Post-hoc Concept Bottleneck Models
Spotlight, ICLR '23 (Top 25% of all accepted papers)
Mert Yuksekgonul, Maggie Wang, James Zou
A visual–language foundation model for pathology image analysis using medical Twitter
Nature Medicine '23, Cover
Zhi Huang*, Federico Bianchi*, Mert Yuksekgonul, Thomas J Montine, James Zou
Meaningfully debugging model mistakes using conceptual counterfactual explanations
ICML '22
Abubakar Abid*, Mert Yuksekgonul*, James Zou
Holistic Evaluation of Language Models
TMLR '23
Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda