Exploratory Data Analysis made Easy with Pandas Profiling

Jari El
2 min read · Dec 13, 2020

Data scientists are often handed a large dataset and expected to bring insights and findings to light, which is not always easy to do by hand. This step is called Exploratory Data Analysis (EDA), and it is an essential one. You can get a jump start on it with only a few lines of code using Pandas Profiling, an open-source Python package. It generates a variety of helpful plots and reports statistics about the dataset, such as correlations, missing values, and collinearity.
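To make "by hand" concrete, here is a minimal sketch of a few of those checks written directly in pandas. The toy dataset and its column names are made up for illustration, and this is only a rough approximation of what a profiling report covers, not how the package computes things internally.

```python
import pandas as pd

# Hypothetical toy dataset, used only for illustration.
data = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, None],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "city": ["Oslo", "Lima", None, "Pune", "Oslo"],
})

# Summary statistics for the numeric columns.
summary = data.describe()

# Number of missing values in each column.
missing = data.isna().sum()

# Pairwise Pearson correlation between the numeric columns; values
# close to 1 or -1 hint at collinear features.
corr = data.select_dtypes("number").corr()

print(summary)
print(missing)
print(corr)
```

Each check is a single line, but every new dataset needs the same set repeated, plus the plots. That repetition is what a generated report saves.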

How to use Pandas Profiling:

Installing Pandas Profiling

pip install pandas-profiling

Implementing:

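import pandas as pd
from pandas_profiling import ProfileReport

data = pd.read_csv('data.csv')  # placeholder path: load your own dataset here
prof = ProfileReport(data)  # profile every column of the DataFrame
prof.to_file(output_file='output.html')  # write the report as an HTML file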

Pass in your DataFrame and that's it! Let's see some of the things you can do!
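As one possible illustration, the sketch below sets a report title and turns on minimal mode, which skips the more expensive computations and can help with large datasets. The toy data, the build_report helper, and the output file name are hypothetical. The code is written against the pandas_profiling import used in this post; newer releases of the package are published as ydata-profiling (import name ydata_profiling), so the import line may need adjusting for your installation.

```python
import importlib.util

import pandas as pd

# Hypothetical toy dataset, used only for illustration.
data = pd.DataFrame({
    "age": [23, 35, 41, 29, 52],
    "income": [31000.0, 52000.0, None, 45000.0, 80000.0],
    "segment": ["a", "b", "a", "c", "b"],
})


def build_report(df, title, minimal=False):
    """Create a ProfileReport; the import is deferred until the call."""
    from pandas_profiling import ProfileReport

    # `title` sets the report heading; `minimal=True` turns off the
    # more expensive computations.
    return ProfileReport(df, title=title, minimal=minimal)


# Only generate the report when the package is actually installed.
if importlib.util.find_spec("pandas_profiling") is not None:
    report = build_report(data, title="Toy dataset report", minimal=True)
    report.to_file("toy_report.html")  # standalone HTML file
```

In a Jupyter notebook, calling report.to_notebook_iframe() should display the same report inline instead of writing it to disk.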

You will find this very useful: it cuts down the time you spend on EDA, leaving you with more time for modeling. Enjoy.
