This post is a running knowledge dump for the “Learning Spark 2.0” book, where I’ll collect quotes that I find relevant (and hopefully you will too :]!)
In Spark’s supported languages, columns are objects with public methods (represented by the Column type).
Code example that uses expr() and withColumn() (col() shows up in the alternative form afterwards):
blogsDF.withColumn("Big Hitters", expr("Hits > 10000")).show()
The above adds a new column, Big Hitters, based on the conditional expression. Note that the expr(…) part can be replaced with the equivalent Column expression: col("Hits") > 10000