Will Cai

Hi, I’m currently a Research Fellow at Anthropic and am finishing my BA and MS at UC Berkeley, advised by Dawn Song.

I’m broadly interested in AI safety and security. Lately I’ve been thinking about automated interpretability research agents and defenses against adversarial distillation.