Most people don't go into work excited to update their old code to slightly newer versions of APIs and figure out what has replaced the calls they were using. This is further complicated in Spark, where the new version will be dropping support for older language releases. This talk will explore how we can use tools to semi-automatically upgrade our Scala Spark code. As we look at the tools, we'll talk about limitations in the current tooling and how they affect what we can do with automatic upgrades.
We'll wrap up by talking about how to test that your upgraded Spark code is correct.
Holden is a transgender Canadian open source developer with a focus on Apache Spark, Airflow, and related “big data” tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that's a bit more out of date. She is a committer and PMC member on Apache Spark and a committer on the SystemML and Mahout projects. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal.