Ramya Narasimha

Senior Software Engineer at Microsoft

"Architecting Reliability: Moving Beyond the 'Perfect Answer' in AI Evaluation"

Wed May 13, 2026 - 1:50 PM to 2:10 PM EDT/New York

#WTGC2026

Session: Architecting Reliability: Moving Beyond the 'Perfect Answer' in AI Evaluation

In 2026, the challenge with AI isn’t just generating correct answers—it’s knowing when to trust them. Most AI evaluation today still centers on a narrow set of metrics: accuracy, latency, and fluency. While these are easy to measure, they say very little about how an AI system actually behaves when information is incomplete, signals conflict, or verification fails. This talk explores why current evaluators are poorly equipped to assess reasoning quality and user trust, and why evaluating internal chains of thought is neither stable nor sufficient. Instead, we’ll examine a shift toward evaluating what the system checked: which assumptions were verified, what evidence was available, and where uncertainty was explicitly surfaced. By focusing evaluation on transparency and verification—rather than polished outputs alone—we can design AI systems that users trust even when answers are imperfect. This session offers practical, engineering‑driven insights into how evaluation strategies shape user confidence and will define the next generation of trustworthy AI systems.
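As a rough illustration of "evaluating what the system checked", the sketch below scores a response on its verification record rather than on its final answer. It is only a minimal example under stated assumptions: the `Check` and `Response` schemas, the `transparency_report` function, and the specific ratios are hypothetical names invented for illustration, not the speaker's method or any existing evaluation framework.

```python
from dataclasses import dataclass, field


@dataclass
class Check:
    """One assumption the system reports having examined (hypothetical schema)."""
    assumption: str
    verified: bool                                      # did verification succeed?
    evidence: list[str] = field(default_factory=list)   # sources consulted, if any


@dataclass
class Response:
    """An answer plus its verification record (hypothetical schema)."""
    answer: str
    checks: list[Check]
    uncertainty_notes: list[str] = field(default_factory=list)


def transparency_report(resp: Response) -> dict:
    """Summarize what was checked, independent of whether the answer is correct."""
    total = len(resp.checks)
    verified = sum(c.verified for c in resp.checks)
    with_evidence = sum(bool(c.evidence) for c in resp.checks)
    unverified = [c.assumption for c in resp.checks if not c.verified]
    return {
        "verified_ratio": verified / total if total else 0.0,
        "evidence_ratio": with_evidence / total if total else 0.0,
        "unverified_assumptions": unverified,
        # Unverified assumptions count as handled only if uncertainty was surfaced.
        "uncertainty_surfaced": bool(resp.uncertainty_notes) or not unverified,
    }
```

A report like this can flag a fluent answer that rests on unverified assumptions and says nothing about them, which accuracy, latency, and fluency metrics would not reveal on their own.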

Key Takeaways

Users trust AI systems not because the answers are perfect, but because the system makes its verification and uncertainty visible.

Bio

Ramya Narasimha is a Senior Software Engineer at Microsoft, where she builds large-scale AI-powered systems focused on reliability, evaluation, and user trust. Her work centers on designing AI platforms that operate in complex, real-world environments, where correctness alone isn't enough and transparency becomes essential. With a background in large-scale cloud diagnostics and in integrating AI into workflows, Ramya is passionate about rethinking how we measure AI quality and how those measurements shape user confidence.
