Sumeet Motwani
University of Oxford
Hi! I’m Sumeet, an ML Researcher studying Computer Science at UC Berkeley. I’m working on Multiagent Systems, Security, Reward Learning, and Foundation Model Safety.
My current research focuses on Language Agents at Berkeley Artificial Intelligence Research and on Agentic Collusion with research groups in the UK. Previously, I’ve done internships and residencies at Redwood Research, Solana Labs, Convergent Finance, and GEn1E Lifesciences.
This website is under construction.
updates
Date | Update |
---|---|
Oct 7, 2024 | Started a PhD in Machine Learning at the University of Oxford, advised by Philip Torr and Christian Schroeder de Witt |
Sep 20, 2024 | Two papers, Secret Collusion and Unelicitable Backdoors, accepted at NeurIPS 2024 |
Sep 10, 2024 | Spent my summer working on agents and post-training at MultiOn as a Research Scientist Intern |
May 18, 2024 | Graduated early from UC Berkeley; Distinction/Dean’s List/EECS Honors |
Jan 20, 2024 | STARC accepted at ICLR 2024 |
Aug 15, 2023 | Completed an internship working on Deep Learning for Drug Discovery at GEn1E, a YC/Khosla Ventures company |
Jul 22, 2023 | Accepted to UC Berkeley’s EECS Honors Program |
Feb 10, 2023 | Completed an ML Research Residency at Redwood Research |
Sep 10, 2021 | Joined Berkeley AI Research. Advised by Dan Hendrycks and Dawn Song |
selected publications
- Preprint / NeurIPS 2024. Secret Collusion among AI Agents: Multi-Agent Deception via Steganography. Thirty-Eighth Conference on Neural Information Processing Systems, Feb 2024
- Preprint / NeurIPS 2024. Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits. Thirty-Eighth Conference on Neural Information Processing Systems, Jun 2024
- TMLR. Foundational Challenges in Assuring Alignment and Safety of Large Language Models. Transactions on Machine Learning Research, Apr 2024
- ICLR 2024. STARC: A General Framework For Quantifying Differences Between Reward Functions. The Twelfth International Conference on Learning Representations, Sep 2023
- NeurIPS MASEC. A Perfect Collusion Benchmark: How can AI agents be prevented from colluding with information-theoretic undetectability? In Multi-Agent Security Workshop, NeurIPS’23, Oct 2023