Hina Gandhi

Software Engineering Technical Leader at Cisco

"Beyond Manual Tuning: How RL Agents Learned to Optimize Apache Spark"

Wed May 13, 2026 - 10:50 AM to 11:30 AM EDT/New York

Session: Beyond Manual Tuning: How RL Agents Learned to Optimize Apache Spark

Apache Spark's performance depends heavily on configuration parameters like shuffle partitions, memory allocation, and parallelism settings. Data engineering teams typically rely on static defaults (which rarely match workload reality) or time-consuming manual tuning that doesn't adapt as data patterns evolve. A configuration optimized for small daily reports fails catastrophically on massive end-of-month aggregations—yet tuning each workload variant manually is unsustainable as organizations process increasingly diverse datasets.
This talk demonstrates how reinforcement learning transforms Spark configuration from a manual bottleneck into an autonomous, adaptive system. We built a Q-learning agent that observes dataset characteristics (size, cardinality, skew), experiments with different configurations, measures performance, and learns optimal settings for varying workload patterns—developing expertise comparable to experienced engineers but with perfect memory and systematic exploration.
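The observe, experiment, measure, learn loop described above can be sketched as a small tabular Q-learning agent. This is an illustrative sketch only, not the speaker's implementation: the state buckets, the candidate partition counts, the `simulated_runtime` cost function, and the choice to treat each job as a one-step episode are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Hypothetical candidate values for spark.sql.shuffle.partitions.
ACTIONS = [50, 200, 800]

def simulated_runtime(state, partitions):
    """Toy stand-in for running a Spark job; NOT a real cost model."""
    size, skew = state
    ideal = {"small": 50, "medium": 200, "large": 800}[size]
    # Penalty grows with the distance from the (made-up) ideal setting.
    penalty = abs(ACTIONS.index(partitions) - ACTIONS.index(ideal))
    base = {"small": 10.0, "medium": 40.0, "large": 160.0}[size]
    skew_factor = 1.5 if skew == "high" else 1.0
    return base * skew_factor * (1.0 + penalty)

class QAgent:
    """Epsilon-greedy tabular learner over (state, action) pairs."""

    def __init__(self, actions, alpha=0.2, epsilon=0.2, seed=0):
        self.actions = actions
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Values start at 0.0; rewards are negative, so untried actions
        # look attractive and get explored early.
        self.q = defaultdict(float)

    def best(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(state)

    def update(self, state, action, reward):
        # Each job is treated as a one-step episode, so the Q-learning
        # target reduces to the observed reward (no bootstrapped term).
        key = (state, action)
        self.q[key] += self.alpha * (reward - self.q[key])

def train(episodes=3000, seed=0):
    agent = QAgent(ACTIONS, seed=seed)
    rng = random.Random(seed + 1)
    states = [(size, skew)
              for size in ("small", "medium", "large")
              for skew in ("low", "high")]
    for _ in range(episodes):
        state = rng.choice(states)                  # observe the dataset
        action = agent.choose(state)                # pick a configuration
        reward = -simulated_runtime(state, action)  # faster job, higher reward
        agent.update(state, action, reward)         # learn from feedback
    return agent
```

After training, `agent.best(("large", "high"))` returns the partition count the agent currently rates highest for large, skewed inputs. In a real deployment the simulated cost function would be replaced by measured job runtimes.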
Through comparative experiments, we show that combining our RL agent with Spark's Adaptive Query Execution (AQE) delivers 46-68% performance improvements over AQE alone. The RL agent provides pre-execution intelligence by selecting optimal initial configurations, while AQE handles runtime adaptations—addressing complementary optimization opportunities.
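One way to picture the division of labour is a configuration in which the agent's choices set the starting point and AQE stays switched on for runtime adjustments. The sketch below is an assumption about how such a hand-off might look, not the setup used in the talk's experiments; the helper names `build_spark_conf` and `to_submit_args` are hypothetical, while the configuration keys are standard Spark settings.

```python
from typing import Dict, List

def build_spark_conf(shuffle_partitions: int, executor_memory_gb: int) -> Dict[str, str]:
    """Merge agent-chosen initial settings with AQE left enabled."""
    return {
        # Pre-execution choices (in this sketch, supplied by the RL agent).
        "spark.sql.shuffle.partitions": str(shuffle_partitions),
        "spark.executor.memory": f"{executor_memory_gb}g",
        # Runtime adaptation is left to Adaptive Query Execution.
        "spark.sql.adaptive.enabled": "true",
        "spark.sql.adaptive.coalescePartitions.enabled": "true",
        "spark.sql.adaptive.skewJoin.enabled": "true",
    }

def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as spark-submit --conf arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, `to_submit_args(build_spark_conf(800, 8))` yields a flat argument list that could be appended to a `spark-submit` command line.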
We then extend this to Multi-Agent Reinforcement Learning (MARL), where specialized agents independently optimize different domains: partitions, memory allocation, CPU cores, and caching strategies. Each agent becomes an expert in its area while collectively achieving comprehensive workload optimization, demonstrating a practical path toward intelligent, self-tuning big data infrastructure.
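The multi-agent idea can be illustrated with independent learners, one per domain, that each pick a value for their own parameter and all receive the same job-level reward. This is a sketch under stated assumptions rather than the architecture presented in the talk: the action sets in `DOMAINS`, the `toy_runtime` cost, and the shared-reward scheme are all made up for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action sets, one per domain named in the abstract.
DOMAINS = {
    "partitions": [50, 200, 800],
    "memory_gb": [2, 4, 8],
    "cores": [1, 2, 4],
    "cache": [False, True],
}

class DomainAgent:
    """Independent tabular learner that only sees its own action."""

    def __init__(self, actions, rng, alpha=0.1, epsilon=0.1):
        self.actions = actions
        self.rng = rng
        self.alpha = alpha
        self.epsilon = epsilon
        self.q = defaultdict(float)

    def best(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(state)

    def update(self, state, action, reward):
        key = (state, action)
        self.q[key] += self.alpha * (reward - self.q[key])

def toy_runtime(state, config):
    """Made-up cost: large inputs prefer the last option in every domain."""
    want_big = state == "large"
    cost = 10.0
    for name, value in config.items():
        options = DOMAINS[name]
        target = options[-1] if want_big else options[0]
        cost += abs(options.index(value) - options.index(target))
    return cost

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    agents = {name: DomainAgent(actions, rng) for name, actions in DOMAINS.items()}
    for _ in range(episodes):
        state = rng.choice(["small", "large"])
        # Each agent picks a value for its own domain only.
        config = {name: agent.choose(state) for name, agent in agents.items()}
        # The same job-level reward is broadcast to every agent.
        reward = -toy_runtime(state, config)
        for name, agent in agents.items():
            agent.update(state, config[name], reward)
    return agents
```

Because every agent learns from a reward that also depends on the other agents' choices, each one sees a noisy, shifting signal. Independent learners are the simplest multi-agent design, and coordination schemes are a common refinement.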
Key Takeaways:
1. How Q-learning agents learn from job execution feedback to build configuration policies
2. Why hybrid RL+AQE optimization outperforms either approach alone
3. Multi-agent architecture for scaling autonomous optimization across all Spark parameters
4. Practical deployment strategies for production environments

Bio

Hina is a technical leader with extensive experience in designing and developing scalable, high-performance applications. She holds a Master’s degree in Information Systems and a Bachelor’s degree in Computer Science Engineering. Over the years, she has demonstrated her technical expertise through impactful roles at Cisco Systems, VMware, and CloudHealth Technologies, excelling in areas such as cloud-based microservices, big data platforms, and SaaS solution development. In recognition of her leadership and technical impact, Hina was named the 2025 Women in Tech – Software Engineering Leader of the Year. Beyond her industry contributions, she is passionate about giving back to the community by mentoring students and delivering guest lectures at universities, inspiring the next generation of technology professionals.
