Featured: Women in Tech Global Conference 2026 Virtual-first
Women in Tech Conference

18-21 May 2027
Virtual & In-Person*

WOMEN IN TECH GLOBAL CONFERENCE 2026

Sowmya Podila

Senior Applied AI Scientist at Fortune 50 Retail



"Cutting LLM Inference Costs by 50–90%: Introduction to Caching in AI systems"

Wed May 13, 2026 - 12:10 PM to 12:30 PM EDT/New York


Session: Cutting LLM Inference Costs by 50–90%: Introduction to Caching in AI systems

Enterprise AI systems are rapidly encountering scaling challenges: rising costs, slower responses, and increasing complexity from long-context and multimodal inputs. Caching has emerged as one of the most effective strategies to address these issues, improving both performance and cost efficiency.

In this session, we explore how modern caching techniques, ranging from model-level KV caching to prompt and semantic caching, are transforming LLM system design. Through real-world examples and vendor benchmarks, we demonstrate how organizations are achieving up to 90% cost savings and significant latency reductions.

We’ll also cover how these techniques can be implemented and integrated into existing AI pipelines, along with best practices for monitoring, evaluation, and production readiness.
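As a rough illustration of the semantic caching idea mentioned above, the sketch below reuses a stored response when a new prompt is close enough to an earlier one, and tracks a hit rate for monitoring. It is not taken from the session: the `SemanticCache` class, the `fake_llm` stub, the 0.8 threshold, and the bag-of-words `embed` function (a toy stand-in for a real embedding model) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical cache that serves a stored answer for similar prompts."""

    def __init__(self, llm, threshold=0.8):
        self.llm = llm              # callable: prompt -> response
        self.threshold = threshold  # assumed similarity cutoff
        self.entries = []           # list of (embedding, response)
        self.hits = 0
        self.misses = 0

    def ask(self, prompt):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        # Linear scan; a production system would use a vector index.
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            self.hits += 1
            return best_response
        # Cache miss: pay for the model call and store the result.
        self.misses += 1
        response = self.llm(prompt)
        self.entries.append((query, response))
        return response

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


calls = []


def fake_llm(prompt):
    """Stub model that records each call instead of contacting an API."""
    calls.append(prompt)
    return f"answer #{len(calls)}"


cache = SemanticCache(fake_llm, threshold=0.8)
first = cache.ask("What is the return policy for shoes?")
second = cache.ask("what is the return policy for shoes")  # served from cache
third = cache.ask("How do I track my order?")              # new model call
```

In this toy run, only two of the three prompts reach the stub model. The threshold trades cost savings against the risk of returning a cached answer to a question that only looks similar, so it would need tuning and evaluation on real traffic.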

Attendees will gain a clear understanding of where caching delivers the highest ROI and how to apply it effectively in enterprise AI environments.


Key Takeaways

  • Optimizing AI pipelines


Bio

Sowmya Podila is a Senior Applied AI Scientist currently with Target and has a decade of experience in AI/ML with organizations such as AWS and Gartner. She led the widely recognized TrendBrain initiative at Target, which was featured in RetailDive and CNBC for using AI-driven fashion trend analysis to elevate the style and design of Target's owned brands.

Beyond her industry work, Sowmya is an AI advisor to not-for-profits, Program Chair at RecSys 2026, an IEEE Senior Member, an IEEE Access reviewer, and an active voice in the AI community. She creates content to share practical insights and emerging trends in artificial intelligence, runs a LinkedIn-based mini podcast series exploring AI applications across sectors, and hosts AI events (she hosted the Generative AI Summit, DC 2026).

Outside of her professional pursuits, Sowmya is a new mom and an avid travel enthusiast.

Connect with her:
LinkedIn: https://linkedin.com/in/sowmyapodila
Instagram: @indigirl.ai
