{"id":957,"date":"2021-12-28T17:12:51","date_gmt":"2021-12-28T17:12:51","guid":{"rendered":"https:\/\/justinmatters.co.uk\/wp\/?p=957"},"modified":"2022-01-16T17:13:23","modified_gmt":"2022-01-16T17:13:23","slug":"using-stackoverflow-to-solve-common-pyspark-issues","status":"publish","type":"post","link":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/","title":{"rendered":"Using StackOverflow to Solve Common PySpark Issues"},"content":{"rendered":"<p><a href=\"https:\/\/stackoverflow.com\/\">StackOverflow<\/a> is a wonderful source of solutions to common yet tricky programming issues. However, there are a few things to be aware of when referring to it for PySpark. This article discusses those pitfalls and points out a few generally useful StackOverflow answers for PySpark.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-960\" src=\"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg\" alt=\"\" width=\"700\" height=\"171\" srcset=\"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg 700w, https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-300x73.jpg 300w, https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-768x188.jpg 768w, https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow.jpg 900w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<h2>Things to Bear in Mind<\/h2>\n<p>The accepted or most upvoted solution on StackOverflow is not always the best. This is partly because PySpark is still under development and the best approach may change over time, and partly because the best answer may be situational.<\/p>\n<p>Sometimes the only solution you can find to a Spark problem on StackOverflow is in Scala or Java rather than Python. 
The Spark community is fairly small, so question coverage is not complete. The good news is that Scala solutions tend to be similar in syntax to those of Python, making translation easy. Even Java answers can often give you strong clues about approach if not syntax.<\/p>\n<p>Some elements of PySpark are hard to search for because they depend on symbols. These include negation of filter and when conditions, where a tilde symbol ~ should be used, and the use of * syntax when accessing the contents of struct columns. In these cases the <a href=\"http:\/\/symbolhound.com\/\">SymbolHound<\/a> search engine may be your friend. Otherwise you may need to rephrase your question.<\/p>\n<p>There are inconsistencies around import calls for common functions like <code>pyspark.sql.functions.col<\/code>. Personally I like to use <code>from pyspark.sql import functions as fn, types as T<\/code> since this seems to be the most commonly used import convention, but <code>functions as F<\/code> or importing individual functions is also not uncommon. Just bear this in mind when searching.<\/p>\n<p>Finally, if you can&#8217;t find an answer, perhaps you have a novel question. You could submit a query of your own. Be sure to follow the <a href=\"https:\/\/stackoverflow.com\/help\/how-to-ask\">best practice for asking questions on StackOverflow<\/a>.<\/p>\n<h2>Useful Solutions<\/h2>\n<p>Here are some solutions I seem to keep using or referring other people to:<\/p>\n<h3>Summing a column to a Python variable<\/h3>\n<p>Let&#8217;s start with one of those little tricks you find yourself using all the time. The code required is<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\npython_variable = df.groupBy().sum().collect()&#x5B;0]&#x5B;0]\r\n<\/pre>\n<p>The trick is to collect and then reach into the Row object returned to extract the value by indexing. 
Credit where credit is due: this is where I first saw how to do it<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/47812526\/pyspark-sum-a-column-in-dataframe-and-return-results-as-int\">https:\/\/stackoverflow.com\/questions\/47812526\/pyspark-sum-a-column-in-dataframe-and-return-results-as-int<\/a><\/p>\n<p>You can also extract an entire column to a Python list<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/38610559\/convert-spark-dataframe-column-to-python-list\">https:\/\/stackoverflow.com\/questions\/38610559\/convert-spark-dataframe-column-to-python-list<\/a><\/p>\n<h3>Handle importing data with an unpredictable schema<\/h3>\n<p>Sometimes you cannot be sure in advance of the schema of the data you are importing, or it may be mismatched between source files. No problem, this solution has you covered<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/39083873\/spark-2-0-0-reading-json-data-with-variable-schema\">https:\/\/stackoverflow.com\/questions\/39083873\/spark-2-0-0-reading-json-data-with-variable-schema<\/a><\/p>\n<h3>Counting nulls by a grouping condition<\/h3>\n<p>Sometimes you may want to count the number of nulls present in groups defined by a column. This neat answer has you covered<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/55265954\/pyspark-dataframe-groupby-and-count-null-values\">https:\/\/stackoverflow.com\/questions\/55265954\/pyspark-dataframe-groupby-and-count-null-values<\/a><\/p>\n<h3>Cast structured columns to JSON strings<\/h3>\n<p>There can be a number of reasons to want to do this, most notably if you want to group or order by a map column, or if you want to export to a flat structure like a CSV. 
The simple answer is that you need to use to_json, but here is an interesting helper function (which can easily be extended to cover MapType columns).<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/41730369\/spark-cast-structtype-json-to-string\">https:\/\/stackoverflow.com\/questions\/41730369\/spark-cast-structtype-json-to-string<\/a><\/p>\n<h3>Extracting from map columns<\/h3>\n<p>We can extract keys and values from map type columns. The following StackOverflow article details how, though the documentation reference has now moved.<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/40602606\/how-to-get-keys-and-values-from-maptype-column-in-sparksql-dataframe\">https:\/\/stackoverflow.com\/questions\/40602606\/how-to-get-keys-and-values-from-maptype-column-in-sparksql-dataframe<\/a><\/p>\n<p>For the documentation on extracting keys with <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/api\/pyspark.sql.functions.map_keys.html\">map_keys see here<\/a>; for getting values with <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/api\/pyspark.sql.functions.map_values.html\">map_values check here<\/a>. 
Note that both these methods extract in an unordered fashion, so don&#8217;t expect that the two lists will necessarily match up<\/p>\n<p>Another neat trick is the following answer on how to extract a specific key value from an array of maps to a new array column<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/61761524\/pyspark-extract-values-from-from-array-of-maps-in-structured-streaming\">https:\/\/stackoverflow.com\/questions\/61761524\/pyspark-extract-values-from-from-array-of-maps-in-structured-streaming<\/a><\/p>\n<h3>Extract a column of lists of tuples to separate lists in separate columns<\/h3>\n<p>A little similar to the key value extraction above but for a different situation<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/48446595\/unzip-list-of-tuples-in-pyspark-dataframe\/48448731#48448731\">https:\/\/stackoverflow.com\/questions\/48446595\/unzip-list-of-tuples-in-pyspark-dataframe\/48448731#48448731<\/a><\/p>\n<h3>Dropping and adding nested columns<\/h3>\n<p>Here is a neat approach to dropping specific nested columns<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/45061190\/dropping-nested-column-of-dataframe-with-pyspark\">https:\/\/stackoverflow.com\/questions\/45061190\/dropping-nested-column-of-dataframe-with-pyspark<\/a><\/p>\n<p>While here is how to add a column to a nested struct<\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/48777993\/how-do-i-add-a-column-to-a-nested-struct-in-a-pyspark-dataframe\">https:\/\/stackoverflow.com\/questions\/48777993\/how-do-i-add-a-column-to-a-nested-struct-in-a-pyspark-dataframe<\/a><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>StackOverflow is a wonderful source of solutions to common yet tricky programming issues. 
However there are certainly a few things to be aware of when&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11,5],"tags":[54,56],"class_list":["post-957","post","type-post","status-publish","format-standard","hentry","category-data-science","category-problem-solving","tag-pyspark","tag-spark"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s Blog\" \/>\n<meta property=\"og:description\" content=\"StackOverflow is a wonderful source of solutions to common yet tricky programming issues. 
However there are certainly a few things to be aware of when&hellip;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/\" \/>\n<meta property=\"og:site_name\" content=\"Justin&#039;s Blog\" \/>\n<meta property=\"article:published_time\" content=\"2021-12-28T17:12:51+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-01-16T17:13:23+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg\" \/>\n<meta name=\"author\" content=\"justinmatters\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"justinmatters\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/\"},\"author\":{\"name\":\"justinmatters\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/#\\\/schema\\\/person\\\/7c3e0740e1fef74f705c19f175f6f321\"},\"headline\":\"Using StackOverflow to Solve Common PySpark 
Issues\",\"datePublished\":\"2021-12-28T17:12:51+00:00\",\"dateModified\":\"2022-01-16T17:13:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/\"},\"wordCount\":739,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/stack-overflow-700x171.jpg\",\"keywords\":[\"PySpark\",\"Spark\"],\"articleSection\":[\"Data Science\",\"Problem Solving\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/\",\"url\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/\",\"name\":\"Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/stack-overflow-700x171.jpg\",\"datePublished\":\"2021-12-28T17:12:51+00:00\",\"dateModified\":\"2022-01-16T17:13:23+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/#\\\/schema\\\/person\\\/7c3e0740e1fef74f705c19f175f6f321\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#primaryimage\",\"url\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/stack-overflow.jpg\",\"contentUrl\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/stack-overflow.jpg\",\"width\":900,\"height\":220},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/using-stackoverflow-to-solve-common-pyspark-issues\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Using StackOverflow to Solve Common PySpark 
Issues\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/#website\",\"url\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/\",\"name\":\"Justin's Blog\",\"description\":\"Justin&#039;s Coding and Geek Blog\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/#\\\/schema\\\/person\\\/7c3e0740e1fef74f705c19f175f6f321\",\"name\":\"justinmatters\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g\",\"caption\":\"justinmatters\"},\"description\":\"Data Scientist specialising in Python, PySpark, SQL and Machine Learning\",\"sameAs\":[\"https:\\\/\\\/justinmatters.co.uk\\\/wp\\\/\",\"https:\\\/\\\/uk.linkedin.com\\\/in\\\/justin-matters-edinburgh\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/","og_locale":"en_US","og_type":"article","og_title":"Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s Blog","og_description":"StackOverflow is a wonderful source of solutions to common yet tricky programming issues. However there are certainly a few things to be aware of when&hellip;","og_url":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/","og_site_name":"Justin&#039;s Blog","article_published_time":"2021-12-28T17:12:51+00:00","article_modified_time":"2022-01-16T17:13:23+00:00","og_image":[{"url":"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg","type":"","width":"","height":""}],"author":"justinmatters","twitter_card":"summary_large_image","twitter_misc":{"Written by":"justinmatters","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#article","isPartOf":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/"},"author":{"name":"justinmatters","@id":"https:\/\/justinmatters.co.uk\/wp\/#\/schema\/person\/7c3e0740e1fef74f705c19f175f6f321"},"headline":"Using StackOverflow to Solve Common PySpark Issues","datePublished":"2021-12-28T17:12:51+00:00","dateModified":"2022-01-16T17:13:23+00:00","mainEntityOfPage":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/"},"wordCount":739,"commentCount":0,"image":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg","keywords":["PySpark","Spark"],"articleSection":["Data Science","Problem Solving"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/","url":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/","name":"Using StackOverflow to Solve Common PySpark Issues - Justin&#039;s 
Blog","isPartOf":{"@id":"https:\/\/justinmatters.co.uk\/wp\/#website"},"primaryImageOfPage":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#primaryimage"},"image":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow-700x171.jpg","datePublished":"2021-12-28T17:12:51+00:00","dateModified":"2022-01-16T17:13:23+00:00","author":{"@id":"https:\/\/justinmatters.co.uk\/wp\/#\/schema\/person\/7c3e0740e1fef74f705c19f175f6f321"},"breadcrumb":{"@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#primaryimage","url":"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow.jpg","contentUrl":"https:\/\/justinmatters.co.uk\/wp\/wp-content\/uploads\/2022\/01\/stack-overflow.jpg","width":900,"height":220},{"@type":"BreadcrumbList","@id":"https:\/\/justinmatters.co.uk\/wp\/using-stackoverflow-to-solve-common-pyspark-issues\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/justinmatters.co.uk\/wp\/"},{"@type":"ListItem","position":2,"name":"Using StackOverflow to Solve Common PySpark Issues"}]},{"@type":"WebSite","@id":"https:\/\/justinmatters.co.uk\/wp\/#website","url":"https:\/\/justinmatters.co.uk\/wp\/","name":"Justin's Blog","description":"Justin&#039;s Coding and Geek 
Blog","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/justinmatters.co.uk\/wp\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/justinmatters.co.uk\/wp\/#\/schema\/person\/7c3e0740e1fef74f705c19f175f6f321","name":"justinmatters","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/27cf337940887c098b79716aa7025ce782bd51de3f6b07a9dcad710bbf576c59?s=96&d=mm&r=g","caption":"justinmatters"},"description":"Data Scientist specialising in Python, PySpark, SQL and Machine Learning","sameAs":["https:\/\/justinmatters.co.uk\/wp\/","https:\/\/uk.linkedin.com\/in\/justin-matters-edinburgh"]}]}},"_links":{"self":[{"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/posts\/957","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/comments?post=957"}],"version-history":[{"count":7,"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/posts\/957\/revisions"}],"predecessor-version":[{"id":966,"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/posts\/957\/revisions\/966"}],"wp:attachment":[{"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/media?parent=957"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/j
ustinmatters.co.uk\/wp\/wp-json\/wp\/v2\/categories?post=957"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/justinmatters.co.uk\/wp\/wp-json\/wp\/v2\/tags?post=957"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}