Sumeet Motwani

University of Oxford

Hi! I’m Sumeet, an ML Researcher studying Computer Science at UC Berkeley. I’m working on Multiagent Systems, Security, Reward Learning, and Foundation Model Safety.

My current research focuses on Language Agents at Berkeley Artificial Intelligence Research and on Agentic Collusion with research groups in the UK. Previously, I’ve done internships/residencies at Redwood Research, Solana Labs, Convergent Finance, and GEn1E Lifesciences.

This website is under construction.

LinkedIn / Google Scholar / X / Email

updates

Oct 7, 2024 Started a PhD in Machine Learning at the University of Oxford, advised by Philip Torr and Christian Schroeder de Witt
Sep 20, 2024 Two papers, Secret Collusion and Unelicitable Backdoors, accepted at NeurIPS 2024
Sep 10, 2024 Spent my summer working on agents and post-training at MultiOn as a Research Scientist Intern
May 18, 2024 Graduated early from UC Berkeley; Distinction/Dean’s List/EECS Honors
Jan 20, 2024 STARC accepted at ICLR 2024
Aug 15, 2023 Completed an internship working on Deep Learning for Drug Discovery at GEn1E, a YC/Khosla Ventures company
Jul 22, 2023 Accepted to UC Berkeley’s EECS Honors Program
Feb 10, 2023 Completed an ML Research Residency at Redwood Research
Sep 10, 2021 Joined Berkeley AI Research. Advised by Dan Hendrycks and Dawn Song

selected publications

  1. Preprint
    MALT: Improving Reasoning with Multi-Agent LLM Training
    Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Markian Rybchuk, Philip H. S. Torr, Ivan Laptev, Fabio Pizzati, and 2 more authors
    arXiv preprint, Dec 2024
  2. NeurIPS 2024
    Secret Collusion among AI Agents: Multi-Agent Deception via Steganography
    Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, and Christian Schroeder de Witt
    Thirty-Eighth Conference on Neural Information Processing Systems, Feb 2024
  3. Preprint
    Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
    Pranav Putta, Edmund Mills, Naman Garg, Sumeet Motwani, Chelsea Finn, Divyansh Garg, and Rafael Rafailov
    arXiv preprint, Aug 2024
  4. NeurIPS 2024
    Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
    Andis Draguns, Andrew Gritsevskiy, Sumeet Ramesh Motwani, Charlie Rogers-Smith, Jeffrey Ladish, and Christian Schroeder de Witt
    Thirty-Eighth Conference on Neural Information Processing Systems, Jun 2024
  5. TMLR
    Foundational Challenges in Assuring Alignment and Safety of Large Language Models
    Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, and 35 more authors
    Transactions on Machine Learning Research, Apr 2024
  6. ICLR 2024
    STARC: A General Framework For Quantifying Differences Between Reward Functions
    J. Skalse, L. Farnik, Sumeet Ramesh Motwani, E. Jenner, A. Gleave, and A. Abate
    The Twelfth International Conference on Learning Representations, Sep 2023
  7. NeurIPS MASEC
    A Perfect Collusion Benchmark: How can AI agents be prevented from colluding with information-theoretic undetectability?
    Sumeet Ramesh Motwani, M. Baranchuk, L. Hammond, and C. Schroeder de Witt
    In Multi-Agent Security Workshop, NeurIPS’23, Oct 2023