How We Evaluate Conversational AI Agents
Defining what 'good' looks like is the hardest part of building AI agents. Here's the evaluation framework we built, how it works, and why it changed everything about how we ship.
Craig Certo
Senior Engineer at Indemn