Spark + AI Summit is the largest big data event dedicated to Apache Spark and Artificial Intelligence. The conference is organized by Databricks and brought together over 2,300 attendees, including customers, partners, and some amazing guest speakers.
This year's European edition of Spark + AI Summit was held in Amsterdam. We attended as part of our partnership with Databricks.
The Spark + AI Summit serves as a showcase for new Spark features and related innovations. This year it ran over three days, with keynote talks alongside deep technical sessions across a range of topics.
The central focus was the introduction of Spark 3.0, which adds graph support to Spark. Spark Graph will bring property graphs and Cypher to Spark 3.0, where Cypher is to graphs roughly what SQL is to tables. Having graphs inside Spark should make it easier to work with Spark results as graphs; this was always an issue for us, because we had to take Spark results into additional tools for graph visualization. Other important features were announced this year as well, such as the Koalas project, which makes data scientists more productive by implementing the pandas DataFrame API on top of Apache Spark. MLflow also gained new features, including a model registry that lets you govern your models better. Finally, the Delta format, as used in Delta Lake, is expected to become a new standard format for Spark (Spark on ACID).
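To make the idea of a property graph and a Cypher-style pattern more concrete, here is a minimal sketch in plain Python. It is not the Spark Graph API: the `Node`, `Relationship`, and `PropertyGraph` names and the example data are hypothetical and chosen only for illustration. The `match` method selects roughly what the Cypher pattern `MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a, b` would select.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, Tuple


@dataclass
class Node:
    id: int
    label: str  # node label, e.g. "Person"
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    source: int    # id of the start node
    target: int    # id of the end node
    rel_type: str  # relationship type, e.g. "KNOWS"
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph: labelled nodes and typed,
    directed relationships, both carrying key/value properties."""

    def __init__(self, nodes: Iterable[Node], relationships: Iterable[Relationship]):
        self.nodes = {n.id: n for n in nodes}
        self.relationships = list(relationships)

    def match(
        self, source_label: str, rel_type: str, target_label: str
    ) -> Iterator[Tuple[Node, Relationship, Node]]:
        """Yield (a, r, b) for every directed relationship a -[r]-> b whose
        type and endpoint labels equal the requested ones."""
        for rel in self.relationships:
            a = self.nodes[rel.source]
            b = self.nodes[rel.target]
            if (
                rel.rel_type == rel_type
                and a.label == source_label
                and b.label == target_label
            ):
                yield a, rel, b


def build_example_graph() -> PropertyGraph:
    # Illustrative data only; none of it comes from the conference material.
    nodes = [
        Node(1, "Person", {"name": "Alice"}),
        Node(2, "Person", {"name": "Bob"}),
        Node(3, "Talk", {"title": "Koalas"}),
    ]
    relationships = [
        Relationship(1, 2, "KNOWS", {"since": 2019}),
        Relationship(1, 3, "ATTENDED"),
        Relationship(2, 3, "ATTENDED"),
    ]
    return PropertyGraph(nodes, relationships)


if __name__ == "__main__":
    graph = build_example_graph()
    for a, rel, b in graph.match("Person", "KNOWS", "Person"):
        print(a.properties["name"], rel.rel_type, b.properties["name"])
```

In a real graph engine the pattern would be written as a Cypher query string and planned by the engine; the loop above only shows what such a pattern means on a small graph.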
There were some interesting keynotes, such as the one introducing Unified Data Analytics and how it is helping data teams solve the world’s toughest problems (https://www.youtube.com/watch?time_continue=3&v=7t4lhzTWM5I), and "New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, and Koalas". One especially eye-catching keynote was Katie Bouman's talk about the first photo of a black hole (https://www.youtube.com/watch?v=iiAi6Y3yaI4).
I attended all the keynote talks, but I focused more on the sessions that discussed the Koalas and graph features of Spark 3.0 in detail. I believe Spark will become more usable with graph support, which should help to present Spark results in a graphical way. The sessions also included an AMA (Ask Me Anything), in which participants could talk directly to the keynote speakers and ask questions about the features or raise their own doubts.
There were also some networking events with food and drinks, where we got a chance to talk to other attendees and enjoy the music. This was one of the most insightful conferences I have attended: we learned about new features of Spark and about how Spark and AI are being brought together to achieve better results. I am eager to apply these ideas to my upcoming projects and make the most of them.
It is very exciting to be partners with Databricks and to collaborate on the development of additional features in Spark. I would gladly take the opportunity to be part of such an insightful event again in the future.