Data scientists are often handed a large dataset and expected to bring insights and findings to light. This step is called Exploratory Data Analysis (EDA), and it is essential, but it is not always easy to do by hand. You can get a jump start on it with only a few lines of code using Pandas Profiling, an open-source Python package. It generates a variety of helpful plots and gives you statistics about the dataset, such as correlations, missing values, and collinearity.
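To make those statistics concrete, here is a minimal sketch of computing a few of them by hand with plain pandas. The toy DataFrame and its column names are made up for illustration and do not come from any real dataset.

```python
import numpy as np
import pandas as pd

# Made-up toy data, purely for illustration (hypothetical columns).
data = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, np.nan],
    "weight_kg": [50.0, 60.0, 70.0, 80.0, 90.0],
    "city": ["Oslo", "Lima", None, "Oslo", "Lima"],
})

# Number of missing values in each column.
missing_counts = data.isna().sum()

# Pairwise Pearson correlation between the numeric columns.
# Rows with a missing value are skipped for the pair they affect.
correlations = data.select_dtypes("number").corr()

# Basic summary statistics (count, mean, std, quartiles) for numeric columns.
summary = data.describe()

print(missing_counts)
print(correlations)
print(summary)
```

Repeating this column by column on a wide dataset is the tedious part that a profiling report is meant to save you.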
How to use Pandas Profiling:
Installing Pandas Profiling:
pip install pandas-profiling
Implementing:
from pandas_profiling import ProfileReport
# `data` is the pandas DataFrame you want to profile
prof = ProfileReport(data)
# Write the finished report to an HTML file
prof.to_file(output_file='output.html')
Pass in your DataFrame and that's it! Let's see some of the things you can do!
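As a starting point, here is a minimal end-to-end sketch, assuming the package is installed under the name pandas_profiling (newer releases are published as ydata-profiling, so the import name may differ in your environment). The toy data, the helper names make_toy_data and build_report, and the title and minimal arguments are illustrative choices, not part of the snippet above.

```python
import numpy as np
import pandas as pd


def make_toy_data() -> pd.DataFrame:
    """Build a small, made-up DataFrame to profile (hypothetical columns)."""
    return pd.DataFrame({
        "age": [23, 35, 41, np.nan, 58],
        "income": [32000.0, 54000.0, 61000.0, 47000.0, 83000.0],
        "segment": ["a", "b", None, "a", "b"],
    })


def build_report(df: pd.DataFrame, output_file: str = "output.html"):
    """Profile df and write the report to an HTML file."""
    # Imported here so the rest of the script still loads
    # when the profiling package is not installed.
    from pandas_profiling import ProfileReport

    # minimal=True turns off the more expensive computations,
    # which helps on large datasets; drop it for the full report.
    prof = ProfileReport(df, title="Toy dataset report", minimal=True)
    prof.to_file(output_file=output_file)
    return prof
```

Calling build_report(make_toy_data()) should write output.html to the current directory, which you can then open in a browser. In a Jupyter notebook, prof.to_notebook_iframe() displays the same report inline instead.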
You will find this very useful, and it will cut down the time you spend on EDA, leaving you more time for modeling. Enjoy.