Everyone is building AI agents, but at the core of every agent is an LLM, and choosing the right one is critical. With new models launching every week, how can we make informed, data-driven choices? In this session, we dive into LLM selection. We share results from a study that tested 15 leading models on real-world code summarization tasks, measured against practical metrics: verbosity, latency, cost, accuracy, and information gain. Expect clear insights into how today's models actually perform, beyond benchmarks and hype, and what that means for building coding assistants, developer copilots, and multimodal agents.
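
To make the metrics concrete, here is a minimal sketch of how per-response measurements like those above might be collected. Everything here is illustrative: `summarize` is a stand-in for a real model call, the pricing constant is hypothetical, and whitespace-split word count is only a crude proxy for tokens.

```python
import time

def summarize(code: str) -> str:
    """Hypothetical stand-in for an LLM call; swap in your provider's SDK."""
    time.sleep(0.01)  # simulate network round-trip latency
    return "Parses a config file and returns a dict of settings."

def evaluate(code: str, price_per_1k_tokens: float = 0.002) -> dict:
    """Score one model response on verbosity, latency, and cost."""
    start = time.perf_counter()
    summary = summarize(code)
    latency = time.perf_counter() - start
    tokens = len(summary.split())  # crude proxy for a real tokenizer count
    return {
        "verbosity_tokens": tokens,
        "latency_s": round(latency, 3),
        "cost_usd": round(tokens / 1000 * price_per_1k_tokens, 6),
    }

metrics = evaluate("def load(path): ...")
print(metrics)
```

Accuracy and information gain require a reference summary to compare against, so they are omitted from this sketch; the pattern, though, is the same: one scoring function applied uniformly across every model under test.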