Introduction
In this guide, I will show you how to run a robust Exploratory Data Analysis (EDA) using Sweetviz. EDA is the process of analyzing datasets to summarize their main characteristics, often with visual methods, and to draw insights from the data.
Sweetviz is an open source Python library created by Francois Bertrand and a few contributors. It helps you visualize your datasets in no time.
Objective
You may wonder how to get started after collecting a dataset. EDA lets you discover data types, missing information, correlations, and more. You can also create insightful visualizations to kickstart the analysis.
During this process, we often repeat the same work to characterize each new dataset. Sweetviz automates much of this repetitive work. Its main features include target analysis, dataset comparison, and type inference.
Here, you will find everything you need to start a robust exploratory data analysis, so bookmark this short guide.
Prerequisites For Sweetviz
1. Install Anaconda Distribution
2. Jupyter Notebook for coding
3. Install and Import Sweetviz package
Install Sweetviz for Robust Exploratory Data Analysis
Let's get started
Step 1: Installation
All you need to do is install the Sweetviz library. It works on Windows, macOS, and Linux.
Now, install the package using the pip command. Type the command below and press ENTER.
pip install sweetviz
Once the command finishes, pip reports that the installation succeeded.
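If you would like to confirm the installation from Python as well, the following is a minimal sketch using the standard library's importlib.metadata. The helper name installed_version is hypothetical and not part of Sweetviz.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the Sweetviz version if pip installed it, otherwise None.
print(installed_version("sweetviz"))
```

If this prints None, the package was most likely installed into a different environment than the one your notebook is running in.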
Step 2: Dataset Collection
After installation, you need to import sweetviz and load the train and test datasets.
We will use the House Prices: Advanced Regression Techniques dataset from Kaggle.
Here, the goal is to analyze the “SalePrice” column of the dataset.
Step 3: Import Library
Once you have the dataset, type the code in a notebook cell and press Run.
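As a minimal sketch of this step, the cell could look like the code below. It assumes the Kaggle files are saved as train.csv and test.csv in the working directory; those file names and the helper load_datasets are my assumptions, so adjust them to your setup.

```python
import pandas as pd

# In the notebook you would also run:  import sweetviz as sv


def load_datasets(train_path="train.csv", test_path="test.csv"):
    """Read the train and test CSV files into pandas DataFrames.

    The default file names are assumptions; pass the paths you actually use.
    """
    train = pd.read_csv(train_path)
    test = pd.read_csv(test_path)
    return train, test
```

In the notebook, you would then call train, test = load_datasets() to get both DataFrames.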
Step 4: Verify Dataset
Now, we will identify the number of rows and columns in the train dataset using the following code:
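A minimal sketch of this check, assuming the data is held in a pandas DataFrame. The tiny stand-in frame and the helper describe_shape are illustrative only; in the notebook you would pass your own train DataFrame.

```python
import pandas as pd


def describe_shape(df):
    """Return the row and column counts of a DataFrame as a readable string."""
    rows, cols = df.shape
    return f"{rows} rows, {cols} columns"


# Tiny stand-in for the train DataFrame (illustrative values only).
sample = pd.DataFrame({"Id": [1, 2, 3], "SalePrice": [100, 200, 300]})

print(sample.shape)            # (3, 2)
print(describe_shape(sample))  # 3 rows, 2 columns
```

The shape attribute returns a (rows, columns) tuple, so train.shape alone is enough for a quick look.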
Step 5: Generating Report
We are going to create the report using the analyze() function.
You can also use the compare() function to compare two datasets (for example, train and test) and compare_intra() to compare two subsets of the same dataset. In this guide, we use analyze() to build the report.
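For reference, here is a hedged sketch of how the two comparison functions could be called. The wrapper names, the "Train"/"Test" labels, and the default subset names are my own choices, and the sweetviz import sits inside each function so the helpers can be defined before the package is installed.

```python
def compare_train_test(train, test, target="SalePrice"):
    """Build a Sweetviz report comparing the train and test DataFrames."""
    import sweetviz as sv  # lazy import: only needed when the function is called

    # Each dataset is passed as [DataFrame, display name].
    return sv.compare([train, "Train"], [test, "Test"], target_feat=target)


def compare_subsets(train, condition, names=("Matches", "Rest"), target="SalePrice"):
    """Build a Sweetviz report comparing two subsets of one DataFrame.

    `condition` is a boolean Series that splits the rows into two groups.
    """
    import sweetviz as sv  # lazy import: only needed when the function is called

    return sv.compare_intra(train, condition, list(names), target_feat=target)
```

Both functions return a report object, which you display the same way as the one returned by analyze().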
We then call show_html() on the report to save it as an HTML file.
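Putting the two calls together, a minimal sketch could look like this. The helper build_report and the output file name are assumptions of mine. The open_browser argument is available in recent Sweetviz releases, so drop it if your version rejects it.

```python
def build_report(train, target="SalePrice", out_path="sweetviz_report.html",
                 open_browser=True):
    """Analyze a DataFrame with Sweetviz and save the report as an HTML file."""
    import sweetviz as sv  # lazy import: only needed when the function is called

    # target_feat marks the column whose relationship to the others is highlighted.
    report = sv.analyze(train, target_feat=target)

    # Writes the HTML file and, if requested, opens it in the default browser.
    report.show_html(filepath=out_path, open_browser=open_browser)
    return report
```

In the notebook, build_report(train) would analyze the train DataFrame with “SalePrice” as the target and write the report next to your notebook.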
Step 6: Final Report
The report opens in your default web browser.
Conclusion: Robust Exploratory Data Analysis
Well, now is the best time to start exploratory data analysis. The above steps are all you need to visualize your datasets.
If you wish to add any information, feel free to let me know in the comment section below.
For more details:
Powerful EDA using Sweetviz
Sweetviz on GitHub
Do share this short guide with others who have wanted to visualize their data smoothly for some time.