
Exploratory Data Analysis (EDA)

Exploratory Data Analysis methods and techniques

Exploratory Data Analysis (EDA) is the act of getting intimate with your data.

This means you get a feeling for your data. You don’t simply know its characteristics (number of rows, columns, distributions, etc.); you actually feel it.

It may sound a bit corny, but after working with data for long enough, you gain the ability to understand a dataset on an intuitive level.

EDA is the process of initial exploration. Imagine you are in a deep, dark cave and all you have is a flashlight. You illuminate sections of the walls and the ground, and head down passages. EDA is the same process for exploring data.

Whenever we do Exploratory Data Analysis, you can bet we are analyzing:

  • The number of rows and columns
  • Column cardinality (how many unique values are there in each column?)
  • Correlations: which columns relate to each other?
  • What are the min/max of each column?
  • What do outliers (if any) say about the data?
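The checks in this list could be sketched with pandas roughly as follows. This is a minimal illustration rather than the sample linked from this page: the toy DataFrame, its column names, and the choice of the 1.5 × IQR rule for flagging outliers are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy dataset, invented purely for illustration.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Denver", "Boston", "Austin"],
    "units_sold": [10, 12, 11, 13, 12, 95],
    "revenue": [100.0, 118.0, 111.0, 131.0, 119.0, 940.0],
})

# Number of rows and columns
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# Column cardinality: how many unique values each column holds
cardinality = df.nunique()
print(cardinality)

# Work on the numeric columns only for the remaining checks
numeric = df.select_dtypes("number")

# Correlations: which numeric columns move together?
correlations = numeric.corr()
print(correlations)

# Min and max of each numeric column
min_max = numeric.agg(["min", "max"])
print(min_max)

# Outliers in one column, using the 1.5 * IQR rule
# (an assumed convention; other definitions of "outlier" are just as valid)
q1 = df["units_sold"].quantile(0.25)
q3 = df["units_sold"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["units_sold"] < q1 - 1.5 * iqr) | (df["units_sold"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]
print(outliers)
```

On a real dataset you would typically replace the toy DataFrame with something like `pd.read_csv(...)` and run the outlier check on whichever columns look interesting after the first few steps.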

There isn’t one right answer when doing EDA. The goal is to give yourself a launching point that leads to more analysis. You’ll know you are done when you are sufficiently inspired to take the next step in your analysis.

Let’s take a look at a Python EDA sample.

Link to code
