I gave a flash talk at Edinburgh PyData yesterday. It covered the merits and pitfalls of PySpark and Databricks as a big data processing platform. I thought I would take a moment to share my slides online in case anyone else would like to take a look. They can be found here on Google Slides.
TL;DR: a promising set of tools that is still under rapid development. The learning curve can be steep in places because of some departures from standard Python and dataframe design principles. The tools are very useful now (I am using them in production at QueryClick) and likely to get even better over time.