Justin’s Blog

Justin's Coding and Geek Blog

Tag: pandas

Data Science

Building a Custom Data Pipeline Using Curried Functions

Posted on July 22, 2019 by justinmatters

If you work in data science you have probably come across the pipeline model for handling data transformations. It is used by many machine learning…

Data Science

Pandas to PySpark Conversion Cheatsheet

Posted on June 1, 2019 (updated July 9, 2019) by justinmatters

This is a follow-on post to my last post about starting with PySpark and Databricks. Here is a link to a table I have…

Data Science

Using SQLAlchemy to Run SQL Procedures

Posted on April 4, 2019 (updated May 26, 2019) by justinmatters

Occasionally you may want to invoke a stored procedure from your Python code in order to manipulate data as part of a larger task. Naively…

Data Science

Using SQLAlchemy to Export Data from Pandas

Posted on March 16, 2019 (updated April 4, 2019) by justinmatters

In the last blog post I discussed using SQLAlchemy to import SQL database data into pandas for data analysis. But what if you wish…

Data Science

Using SQLAlchemy to Import Data to Pandas

Posted on February 24, 2019 (updated April 4, 2019) by justinmatters

Sometimes you may want to use Python to extract data from a SQL database to analyse using pandas. There are a couple of issues here. Firstly…

Data Science

Kaggle PUBG Competition Data Analysis

Posted on November 19, 2018 (updated December 6, 2018) by justinmatters

Currently there is a fun competition running over on the Kaggle Data Science website. The objective is to use metrics from a large data set…

Data Science

Edinburgh Bike Open Data – 2 of 4 – Data Cleaning

Posted on October 11, 2018 (updated December 6, 2018) by justinmatters

Now that we have obtained our dataset from the Edinburgh Open Data store, we need to tidy it up and see if we need to transform…

Data Science

Edinburgh Bike Open Data – 1 of 4 – Data Acquisition

Posted on October 11, 2018 (updated December 6, 2018) by justinmatters

As a keen cyclist I thought I would take a look at Edinburgh Council’s Bike Counter dataset. The website states that “The dataset includes bike…

© Copyright 2019 – Justin's Blog