Fri, 12/19/2025 - 23:58

Women in Tech Conference

12-15 May 2026
Virtual & In-Person*

WOMEN IN TECH GLOBAL CONFERENCE 2026

Vrinda Bhatia

Senior Software Engineer at Block



"Evaluating the Unpredictable: Observability for Production-grade LLM Agents"


Session: Evaluating the Unpredictable: Observability for Production-grade LLM Agents

As LLM-powered applications transition from demos to production, teams encounter failure modes that traditional observability tools were never designed to handle: non-deterministic reasoning, tool misuse, silent hallucinations, evaluation blind spots, and unpredictable cost explosions.

This talk presents a production-first evaluation and observability framework for LLM agents, grounded in real-world experience. We show how to define agent-specific evaluation criteria and wire them directly into tracing and feedback loops, turning evaluation from a one-time exercise into a continuous system.

Attendees will learn how to:

- Define evaluation criteria beyond accuracy, including tool correctness, reasoning validity, outcome relevancy, latency, and cost efficiency, and decide which criteria to use when

- Detect hidden failure modes such as silent hallucinations, incorrect tool selection, partial task completion, and cascading agent errors

- Combine automated evals with human-in-the-loop signals to validate edge cases and continuously recalibrate scoring thresholds

- Use tracing and structured telemetry to correlate eval failures with specific prompts, tools, or reasoning steps
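
To make the last point concrete, here is a minimal sketch of how per-step eval failures might be tied back to specific tools or prompts. It is not the speakers' framework: the `Step` schema, the criterion names, and the latency and cost limits are all hypothetical, chosen only to illustrate the idea of scoring each traced step and grouping failures by component.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One recorded step of an agent run (hypothetical trace schema)."""
    trace_id: str
    kind: str                            # "prompt", "tool", or "reasoning"
    name: str                            # tool or prompt-template name
    expected_tool: Optional[str] = None  # reference label, tool steps only
    latency_ms: float = 0.0
    cost_usd: float = 0.0


def evaluate_step(step: Step, max_latency_ms: float = 2000.0,
                  max_cost_usd: float = 0.05) -> List[str]:
    """Return the names of the criteria this step fails (assumed limits)."""
    failures = []
    # Tool correctness: the tool that ran differs from the labelled one.
    if step.kind == "tool" and step.expected_tool not in (None, step.name):
        failures.append("tool_correctness")
    if step.latency_ms > max_latency_ms:
        failures.append("latency")
    if step.cost_usd > max_cost_usd:
        failures.append("cost_efficiency")
    return failures


def failures_by_component(steps: List[Step]) -> Counter:
    """Count eval failures keyed by (step kind, step name, criterion)."""
    counts: Counter = Counter()
    for step in steps:
        for criterion in evaluate_step(step):
            counts[(step.kind, step.name, criterion)] += 1
    return counts
```

Grouping by `(kind, name, criterion)` rather than by trace is a deliberate choice in this sketch: it points at the tool or prompt that fails most often, which is usually the first question when an eval score drops.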

Additionally, we will walk through a real-world LLM agent use case, illustrating how evaluation signals interact with end-to-end traces to surface issues that would otherwise evade detection until user trust is lost.

By the end of the talk, attendees will have a clear blueprint for operationalizing evaluation in LLM agents, enabling teams to ship systems that are not only impressive in demos, but reliable in production.


Key Takeaways

  • A practical blueprint for operationalizing evaluation and observability, enabling teams to confidently ship reliable, production-grade LLM agents.
  • How to define evaluation criteria for LLM agents beyond accuracy, including reasoning correctness, tool usage, hallucinations, and cost efficiency.
  • Techniques to instrument LLM systems with end-to-end tracing and structured telemetry to surface hidden failures.
  • How to combine automated and human-in-the-loop evaluation to continuously improve agent behavior in production.
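
One way the combination of automated and human-in-the-loop evaluation could look in practice is sketched below. This is an assumption-laden illustration, not the method presented in the talk: `recalibrate_threshold` is a hypothetical helper that picks the cutoff on an automated eval score that agrees most often with human pass/fail labels on the same samples.

```python
from typing import List, Sequence, Tuple


def recalibrate_threshold(scores: Sequence[float],
                          human_pass: Sequence[bool]) -> Tuple[float, float]:
    """Return (threshold, agreement rate) for the cutoff on automated
    scores that best matches human pass/fail labels.

    A sample is predicted to pass when its score is >= the threshold.
    Ties are resolved in favour of the lowest candidate threshold.
    """
    if not scores or len(scores) != len(human_pass):
        raise ValueError("scores and human_pass must be non-empty and equal length")

    candidates: List[float] = sorted(set(scores))
    best_threshold, best_agreement = candidates[0], -1
    for threshold in candidates:
        # Count samples where the automated verdict matches the human one.
        agreement = sum((score >= threshold) == label
                        for score, label in zip(scores, human_pass))
        if agreement > best_agreement:
            best_threshold, best_agreement = threshold, agreement
    return best_threshold, best_agreement / len(scores)


# Example: automated scores for five responses, each reviewed by a human.
threshold, agreement = recalibrate_threshold(
    [0.2, 0.4, 0.55, 0.7, 0.9],
    [False, False, True, True, True],
)
```

Rerunning a routine like this whenever a new batch of human reviews arrives is one simple way to keep a scoring threshold from drifting away from human judgment.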


Bio

Vrinda Bhatia is a seasoned software engineer and AI builder with over a decade of experience at companies like AWS and Block. She currently works as a Senior Software Engineer at Block, developing infrastructure for ML inference, where her work helped prevent over $220M in fraud losses in 2024. Before that, she worked on AWS AppStream, a secure application streaming service. Her work was critical during the COVID-19 pandemic, empowering organizations like Washington State Pandemic Center and Los Angeles County to transition thousands of students and employees to secure, remote environments. She is also a key contributor to an open source library for model distillation (https://github.com/horus-ai-labs/DistillFlow/), which has over 150 stars on GitHub. She is passionate about solving real-world problems at scale. Beyond the technical work, she loves sharing knowledge with the developer community through talks, mentorship, and open collaboration.

Sujata Sridharan is a Senior Software Engineer at Bolt Financial, where she builds AI-driven commerce infrastructure that powers next-generation e-commerce experiences. With nearly a decade of experience spanning Microsoft, Amazon, and Bolt Financial, she specializes in architecting reliable, compliant, and human-centered AI systems, from large-scale identity and security platforms to production-grade LLM infrastructure powering over a billion dollars in transactions. Beyond her engineering work, Sujata is an active mentor and community builder, guiding emerging AI practitioners through workshops, hackathons, and speaking engagements such as DevFest DC. Her current focus is on developing practical frameworks that make trustworthy AI both measurable and scalable across organizations.

