Categorisation of images has long been possible with a variety of visual machine learning models. However, in almost all cases the categories have to be…
Category: Data Science
Sometimes a situation will crop up where you want to access functionality in Databricks which is not readily accessible via Python. In these cases Databricks allows…
I recently came across a strange little problem with a satisfying solution which people building path-based models might be interested in. The Problem Imagine…
The updates to graphing with PySpark on Databricks have made it much nicer to work with. Options exist to aggregate data directly in the graph, handle…
With the rise of Large Language Models, projects have arisen to try to make LLMs easier to use. One of the most prominent of these…
Recently Databricks made an exciting announcement. They have created a new library which allows you to use large language models to perform operations on PySpark…
Recently I noticed that the ArrayType in PySpark is missing some useful aggregation functions. Let's suppose you have a data frame created as follows: If…
A key concern when designing Machine Learning models is to try to avoid bias which might lead to unfair models. When thinking about how things…
When I moved from using mostly AWS to using mostly Azure at work, I did not find the transition too painful. However, while I mostly…
Exploding arrays is often very useful in PySpark. However, because row order is not guaranteed in PySpark DataFrames, it would be extremely useful to be…