|
DynaGuard: A Dynamic Guardrail Model With User-Defined Policies
Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Kazemi Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein
ICLR 2026
paper | code | models
Guardian models are useful for monitoring the safety and quality of deployed LLMs, but prior models fail to properly enforce domain-specific guardrails. We develop a dynamic guardian model that adapts to arbitrary user-specified rules and constraints at runtime.
|
|
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda*, Vatsal Baherwani*, Zain Sarwar, Benjamin Therien, Supriyo Chakraborty, Tom Goldstein
NeurIPS 2025
paper | code
One of the main challenges in training a very sparse mixture-of-experts model is that each optimization step updates only a small subset of the parameters. We develop a method that trains inactive experts by estimating their gradients, which significantly improves training speed with negligible computational overhead.
|
|