Vatsal Baherwani

Hello, welcome to my website!

About Me

I am a PhD student at New York University, advised by Pavel Izmailov and Andrew Gordon Wilson. I am grateful to be supported by the NSF Graduate Research Fellowship.

I previously graduated with a bachelor's degree from the University of Maryland, where I was fortunate to work with Pete Kyle (who first introduced me to academic research), Abhinav Shrivastava, Ashwinee Panda, and Tom Goldstein.

This summer I will be at Q Labs as a research intern in San Francisco. In summer 2025, I was a visiting researcher at the Center for Human-Compatible AI at UC Berkeley working with Raj Movva and Emma Pierson on practical applications for ML interpretability.

Check out my blog to learn more about me.

Email | Google Scholar | Twitter | GitHub | LinkedIn

Publications

Emergent Capabilities Arise Randomly from Learning Sparse Attention Patterns
Vatsal Baherwani, Zixi Chen, Shikai Qiu, Andrew Gordon Wilson, Pavel Izmailov
arXiv
paper | blog

What happens when a language model acquires a practical capability such as in-context learning? We show these behaviors correspond to abrupt changes in the attention mechanisms of the model, and various architectural interventions (including width scaling) can accelerate the emergence of these capabilities.

DynaGuard: A Dynamic Guardrail Model With User-Defined Policies
Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Kazemi Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein
ICLR 2026
paper | code | models

Guardian models are useful for monitoring the safety and quality of deployed LLMs, but prior models fail to properly enforce domain-specific guardrails. We develop a dynamic guardian model that adapts to arbitrary user-specified rules and constraints at runtime.

Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda*, Vatsal Baherwani*, Zain Sarwar, Benjamin Therien, Supriyo Chakraborty, Tom Goldstein
NeurIPS 2025
paper | code

One of the main challenges in training a very sparse mixture-of-experts model is that you only get to update a small subset of your parameters in each optimization step. We develop a method to train inactive experts by estimating their gradients, which leads to significant improvement in training speed with negligible computational overhead.