Nabla-Reasoner: Gradient Descent at Inference Time Makes LLMs Think Harder

A new ICLR 2026 method applies gradient-based optimization to token logits at inference time, improving math-reasoning accuracy by over 20% while cutting model calls by up to 40%.
artificial-intelligence
Author

Kabui, Charles

Published

2026-03-14

Keywords

llm-reasoning, test-time-compute, gradient-descent, differentiable-optimization, mathematical-reasoning