Dytbaat Games
python pandas Data Science

Pandas: Append and Concat

  • March 3, 2021

This article is part of a series of practical guides for using the Python data processing library pandas. To view all the available parts, click here.

In this guide we will look at a few ways to add pandas DataFrames together vertically, stacking them one on top of the other. We will cover two pandas tools, concat and append, and a third approach that uses plain Python lists. This last approach can often be much faster than working with DataFrames directly, especially if we want to repeatedly append one row at a time to a DataFrame.

If you are looking at joining tables, or adding two tables together horizontally, try the guide on joining tables.
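
As a rough illustration of the approaches described above, here is a minimal sketch. The column names and values are made up for the example. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the sketch uses pd.concat for stacking and a plain Python list for row-by-row appends.

```python
import pandas as pd

# Two small hypothetical DataFrames with the same columns.
first = pd.DataFrame({"player": ["ann", "bob"], "score": [10, 12]})
second = pd.DataFrame({"player": ["cat"], "score": [7]})

# pd.concat stacks the frames vertically.
# ignore_index=True renumbers the rows 0..n-1 instead of keeping the old indices.
stacked = pd.concat([first, second], ignore_index=True)

# Appending one row at a time: collect plain dicts in a list,
# then build the DataFrame once at the end.
rows = []
for player, score in [("dan", 3), ("eve", 9)]:
    rows.append({"player": player, "score": score})
from_rows = pd.DataFrame(rows)

print(stacked)
print(from_rows)
```

Building the DataFrame once at the end avoids copying the whole table on every append, which is why the list-based approach tends to be faster inside loops.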

python pandas Data Science

Pandas: Joining tables

  • February 8, 2021

One of the most fundamental concepts in data science and data work in general is joining two tables together based on some shared column or index. In SQL it is a JOIN. In Excel it is INDEX-MATCH or VLOOKUP. In pandas, two methods are available to join tables together: merge and join. We will look at both of those methods in this guide.
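
To make the two methods concrete, here is a minimal sketch using two made-up tables (orders and customers are hypothetical names, not from the guide). It shows merge joining on a shared column and join matching a column against the other table's index.

```python
import pandas as pd

# Hypothetical example tables sharing a customer_id column.
orders = pd.DataFrame({"customer_id": [1, 2, 2, 3], "amount": [20, 35, 15, 50]})
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bob"]})

# merge joins on a shared column. how="left" keeps every row of orders,
# filling in NaN where no matching customer exists (like a SQL LEFT JOIN).
merged = orders.merge(customers, on="customer_id", how="left")

# join matches against the index of the other table, so we move
# customer_id into the index of customers first. join is a left join by default.
joined = orders.join(customers.set_index("customer_id"), on="customer_id")

print(merged)
print(joined)
```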

python pandas Data Science

Pandas: How to Pivot data

  • February 1, 2021

When I was starting out with pandas, I was coming from an Excel and SQL background. Having spent a solid 8 years with Excel as my primary data munging and modeling tool, I was very comfortable using pivot tables, a tool I found extremely powerful and later discovered to be strangely controversial. My workflow started to involve pivot tables so regularly that my SQL queries were often written to extract data in a format that would make it simpler to aggregate in a pivot table.

Naturally, when I started learning pandas, one of the first things I wanted to learn was “how can I recreate the functionality of an Excel pivot table in pandas?” In this guide we will look at several ways to do just that.
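
One way to get Excel-style pivot behaviour is the pivot_table method. The sketch below uses a small made-up sales table (the column names are assumptions for illustration) and is only one of several possible approaches.

```python
import pandas as pd

# Hypothetical sales data in "long" format, one row per transaction.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "product": ["a", "b", "a", "a", "b"],
    "revenue": [100, 150, 80, 20, 60],
})

# Rows become regions, columns become products, and each cell holds
# the summed revenue, much like dragging fields into an Excel pivot table.
pivot = sales.pivot_table(
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
    fill_value=0,  # show 0 instead of NaN for empty combinations
)

print(pivot)
```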

python pandas Data Science

Pandas: Advanced Aggregation

  • January 20, 2021 (updated February 1, 2021)

Building on the basic aggregation guide, in this guide we will look at some more advanced ways we can aggregate data using pandas. We are going to cover three techniques:

  1. Aggregating using different methods at the same time, for example, summing one column and taking the average of another.
  2. Defining and using custom aggregation functions which we can use to calculate aggregates that are not available “out of the box”.
  3. The transform method, which can be used to do some very useful things with aggregated values.
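
The three techniques above can be sketched roughly as follows. The data, the column names, and the helper value_range are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical per-player data grouped by team.
df = pd.DataFrame({
    "team": ["x", "x", "y", "y"],
    "points": [10, 20, 5, 15],
    "minutes": [30, 50, 20, 40],
})


def value_range(series):
    """Custom aggregation: spread between the largest and smallest value."""
    return series.max() - series.min()


# 1 and 2: different aggregations per column, including a custom function.
summary = df.groupby("team").agg(
    total_points=("points", "sum"),
    mean_minutes=("minutes", "mean"),
    points_range=("points", value_range),
)

# 3: transform returns one value per original row, so the group total
# can be used directly in a row-level calculation.
team_totals = df.groupby("team")["points"].transform("sum")
df["share_of_team_points"] = df["points"] / team_totals

print(summary)
print(df)
```

The key difference is the shape of the result: agg returns one row per group, while transform returns a value for every row of the original DataFrame.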
python pandas Data Science

Pandas: Aggregation

  • January 8, 2021February 1, 2021

A fundamental tool for working in pandas, and with tabular data more generally, is the ability to aggregate data across rows. Thankfully pandas gives us some easy-to-use methods for aggregation, which include a range of summary statistics such as sums, min and max values, means and medians, variances and standard deviations, and even quantiles. In this guide we will walk through the basics of aggregation in pandas, hopefully giving you the building blocks to go on to more complex aggregations.
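
A few of the summary statistics mentioned above, sketched on a tiny made-up DataFrame (the column names are assumptions for the example):

```python
import pandas as pd

# Hypothetical data: a numeric column and a grouping column.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 3.0, 10.0, 20.0],
})

# Aggregations over a whole column.
total = df["value"].sum()
median = df["value"].median()
spread = df["value"].std()
upper_quartile = df["value"].quantile(0.75)

# The same kind of aggregation, calculated separately for each group.
group_means = df.groupby("group")["value"].mean()

print(total, median, spread, upper_quartile)
print(group_means)
```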

python pandas Data Science

Pandas: SettingWithCopyWarning

  • December 21, 2020

For many users starting out with pandas, a common and frustrating warning that pops up sooner or later is the following:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.

Try using .loc[row_indexer,col_indexer] = value instead

To the uninitiated, it can be hard to know what it means or if it even matters. In this guide, we’ll walk through what the warning means, why you are seeing it, and what you can do to avoid it.
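
As a minimal sketch of the two usual ways to avoid the warning, consider the made-up DataFrame below (quality and label are hypothetical column names). The problematic pattern is shown only in a comment.

```python
import pandas as pd

df = pd.DataFrame({"quality": [5, 7, 8], "label": ["", "", ""]})

# Chained indexing such as
#     df[df["quality"] > 6]["label"] = "good"
# may assign to a temporary copy rather than to df itself,
# which is the situation the warning is telling us about.

# Option 1: select rows and column in a single .loc call,
# as the warning message suggests. This modifies df directly.
df.loc[df["quality"] > 6, "label"] = "good"

# Option 2: if we really want a separate table, take an explicit copy
# so it is clear that changes will not flow back to df.
high = df[df["quality"] > 6].copy()
high["label"] = "great"

print(df)
print(high)
```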

python pandas Data Science

Pandas: Advanced booleans

  • December 16, 2020 (updated January 5, 2021)

In other sections in this series, we’ve looked at how we can use booleans (a value that is either True or False) in pandas. Specifically, we’ve looked at how a list or array of booleans can be used to filter a DataFrame. In those examples we generated lists of booleans using simple comparisons like “are the values in the fixed acidity column > 12?” However, simple comparisons like this are only one of many ways we can create booleans. In this guide we are going to look at a range of methods that allow us to do more complex comparisons, while also making our code more concise and easier to understand.
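
Here is a minimal sketch of a few such methods, using a tiny made-up table loosely modelled on the wine data mentioned above (the values and the style column are assumptions for illustration).

```python
import pandas as pd

wine = pd.DataFrame({
    "fixed acidity": [7.4, 12.5, 9.0, 13.1],
    "quality": [5, 6, 7, 8],
    "style": ["dry red", "sweet white", "dry white", "dry red"],
})

# between: True where the value lies in the range (inclusive by default).
in_range = wine["fixed acidity"].between(8, 13)

# isin: True where the value is one of the listed options.
top_quality = wine["quality"].isin([7, 8])

# str.contains: True where the text contains the given substring.
is_red = wine["style"].str.contains("red")

# Boolean Series combine with & (and), | (or) and ~ (not).
# The parentheses matter because these operators bind tightly.
selected = wine[(in_range | is_red) & ~top_quality]

print(selected)
```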

python pandas Data Science

Pandas: Filtering and segmenting

  • December 1, 2020 (updated December 13, 2020)

One of the most common ways you will interact with a pandas DataFrame is by selecting different combinations of columns and rows. This can be done using the numerical positions of columns and rows in the DataFrame, column names and row indices, or by filtering the rows by applying some criteria to the data in the DataFrame. All of these options (and combinations of them) are available, so let’s dig in!
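
The three selection styles described above can be sketched like this. The DataFrame and its row labels are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["ann", "bob", "cat"],
        "age": [31, 25, 40],
        "city": ["Oslo", "Rome", "Lima"],
    },
    index=["r1", "r2", "r3"],
)

# By numerical position: first two rows, first and third columns.
by_position = df.iloc[0:2, [0, 2]]

# By row index labels and column names.
by_label = df.loc[["r1", "r3"], ["name", "age"]]

# By criteria: a boolean filter on the rows, combined with a column name.
by_criteria = df.loc[df["age"] > 30, "name"]

print(by_position)
print(by_label)
print(by_criteria)
```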

python pandas Data Science

Pandas: Basic data interrogation

  • November 24, 2020 (updated February 1, 2021)

Once we have our data in a pandas DataFrame, the basic table structure in pandas, the next question is: how do we assess what we have? If you are coming from Excel or RStudio, you are probably used to being able to look at the data any time you want. In Python/pandas, we don’t have a spreadsheet to work with, and we don’t even have an equivalent of RStudio (although Jupyter notebooks are a similar concept), but we do have several tools available that can help you get a handle on what your data looks like.
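
As a quick sketch of the kind of tools available, the calls below give a first look at a small made-up DataFrame (the columns are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo", "Lima"],
    "temp_c": [4.0, 18.5, 6.0, 22.0],
})

print(df.head(2))                 # the first two rows
print(df.shape)                   # (number of rows, number of columns)
print(df.dtypes)                  # the data type of each column
df.info()                         # column types, non-null counts, memory use
print(df.describe())              # summary statistics for numeric columns
print(df["city"].value_counts())  # how often each value appears
```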

python pandas Data Science

Pandas: Reading in JSON data

  • November 23, 2020 (updated December 13, 2020)

When we are working with data in software development or when the data comes from APIs, it is often not provided in a tabular form. Instead it is provided in some combination of key-value stores and arrays broadly denoted as JavaScript Object Notation (JSON). So how do we read this type of non-tabular data into a tabular format like a pandas DataFrame?
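
One possible answer is pd.json_normalize, which flattens nested key-value structures into columns. The sketch below uses a made-up JSON payload, not data from the guide.

```python
import json

import pandas as pd

# A hypothetical API response: a list of records with a nested "user" object.
raw = """
[
  {"id": 1, "user": {"name": "ann", "country": "NO"}, "tags": ["a", "b"]},
  {"id": 2, "user": {"name": "bob", "country": "IT"}, "tags": []}
]
"""

records = json.loads(raw)

# Nested dictionaries become dotted column names such as "user.name".
# Lists (here "tags") are left as Python lists in a single column.
flat = pd.json_normalize(records)

print(flat)
```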

© Dytbaat 2019