I gave a flash talk at Edinburgh PyData yesterday. It covered the merits and pitfalls of PySpark and Databricks as a big data processing platform.…
Tag: Big data
Databricks recently open-sourced a very useful new storage format for use with Spark called Delta. Delta is an extension to the Parquet…
Databricks is a very handy cloud platform for large-scale data processing and machine learning using Spark. However, it does have some idiosyncrasies. Here are…