Introduction to Matplotlib

What is Visualization?

Visualization is the graphical representation of data so that insights can be derived from a dataset. Such insights include the correlation between features, the count of distinct values for each feature, and, for time series data (data that changes with time, such as daily temperatures), the general direction of the data. In Python, various libraries enable data visualization. Two of the most commonly used are Matplotlib and Seaborn.

For the demonstrations in this post, I will be using the House Rent Prediction Dataset from Kaggle. If you are using an online notebook such as Google Colab, both libraries should already be installed. Otherwise, if you already have Python installed, you can install them from your local terminal with the following command:

pip install matplotlib seaborn

Matplotlib

To import the Matplotlib library, the code below is used:

import matplotlib.pyplot as plt

In this line, import tells Python to load the library that follows it. The next part is more interesting: we generally use the pyplot component of matplotlib, which gives it a MATLAB-like plotting interface. Finally, plt is the alias we give the module, so instead of typing matplotlib.pyplot.plot() we can just type plt.plot().
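To make the alias concrete, here is a minimal sketch of a first plot made entirely through plt. The day and temperature values are made up for illustration and are not part of the House Rent dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical values, chosen only so there is something to draw
days = [1, 2, 3, 4]
temperatures = [21.5, 23.0, 22.1, 24.3]

plt.figure()                      # start a new, empty figure
plt.plot(days, temperatures)      # same call as matplotlib.pyplot.plot(...)
plt.xlabel("Day")                 # label the x-axis
plt.ylabel("Temperature")         # label the y-axis
plt.title("A first plot through the plt alias")
```

In a notebook the figure appears when the cell finishes. In a plain script, you would typically add plt.show() at the end.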

The graph types in matplotlib that this post covers are:

Line Charts
Bar Graphs
Pie Charts
Scatter Plots

We can also display images, which will be covered at the end. Matplotlib also supports 3D plots, but those won't be covered in this post.

After loading the House Rent Dataset (into a pandas DataFrame named df, as used in the code that follows), we can see that its features are: Posted On, BHK, Rent, Size, Floor, Area Type, Area Locality, City, Furnishing Status, Tenant Preferred, Bathroom, and Point of Contact.
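The loading step itself could look roughly like the sketch below. It uses pandas, and because the name of the real CSV file is not given here, it reads three made-up rows from an in-memory string instead. The rows and the choice of four columns are assumptions for illustration only.

```python
import io
import pandas as pd

# Stand-in for the Kaggle CSV: three made-up rows using a few of the
# column names listed above. With the real file, you would pass its
# path to pd.read_csv instead of this in-memory string.
sample_csv = io.StringIO(
    "BHK,Rent,Size,City\n"
    "2,10000,1100,Kolkata\n"
    "1,7000,500,Mumbai\n"
    "3,25000,1500,Delhi\n"
)

df = pd.read_csv(sample_csv)   # parse the CSV text into a DataFrame
print(df.columns.tolist())     # list the feature (column) names
print(df.head())               # preview the first rows
```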

Line Charts

In very simple terms, line graphs show how data moves over time or how two features relate to each other. To plot a line graph, call the function plt.plot(). Let us plot a line graph where the x-axis is the size of the apartment and the y-axis is the rent, as shown below:

plt.plot(df["Size"], df["Rent"])

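However, you can see that the line is all over the place. plt.plot() connects the points in the order the rows appear in the dataset, and the rows are not ordered by size.

To fix this, we sort both features, using one as the reference. In this case, we will sort both with respect to size.

zl = zip(df["Size"], df["Rent"])  # pairs each size with its rent as (size, rent) tuples
sl = sorted(zl, key=lambda pair: pair[0])  # sort the pairs by size (the first value)
x, y = zip(*sl)  # unzip the sorted pairs back into x and y
plt.plot(x, y)  # plot the sorted values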

Now we get a better graph.
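If the data is already in a pandas DataFrame, the same sorting can also be done with sort_values. This is an alternative sketch rather than the approach used above, and the four rows are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the real dataset
df = pd.DataFrame({
    "Size": [1100, 500, 1500, 800],
    "Rent": [10000, 7000, 25000, 9000],
})

sorted_df = df.sort_values("Size")   # reorder whole rows by apartment size

plt.figure()
plt.plot(sorted_df["Size"], sorted_df["Rent"])  # line now moves left to right
plt.xlabel("Size")
plt.ylabel("Rent")
```

Sorting whole rows keeps each rent attached to its own size, which is the same guarantee the zip-and-sort version gives.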

Bar Graphs

Bar graphs show a numeric value for each category, i.e., the count for each category, a value associated with each category, etc. We will create a bar plot where the categories are the cities and the height of each bar is based on the rent. The code to create a bar graph is given below:

plt.bar(df["City"], df["Rent"])
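Note that passing the raw columns draws one bar per row, so rows for the same city land at the same x position and overlap. One common alternative, sketched below, is to aggregate first and plot one bar per city. Using the mean is my assumption here (a median or a count would work the same way), and the rows are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the real dataset
df = pd.DataFrame({
    "City": ["Mumbai", "Delhi", "Mumbai", "Delhi", "Kolkata"],
    "Rent": [30000, 15000, 50000, 25000, 10000],
})

# Average rent per city: one value per category
mean_rent = df.groupby("City")["Rent"].mean()

plt.figure()
plt.bar(mean_rent.index, mean_rent.values)  # one bar per city
plt.xlabel("City")
plt.ylabel("Average rent")
```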

Pie Charts

A pie chart is a circular representation of how a total is split across categories, i.e., the share of each category's count. In this example, we will create a pie chart for the cities:

city_counts = df["City"].value_counts()  # unique cities and how many rows each has
plt.pie(city_counts, labels=city_counts.index)  # use the unique values (cities) as the labels
plt.show()
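Pie slices are easier to compare when each one shows its percentage. The sketch below adds that with the autopct parameter of plt.pie. The four rows are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the real dataset
df = pd.DataFrame({"City": ["Mumbai", "Delhi", "Mumbai", "Kolkata"]})

city_counts = df["City"].value_counts()  # number of rows per city

plt.figure()
plt.pie(
    city_counts,
    labels=city_counts.index,   # city names as slice labels
    autopct="%1.1f%%",          # print each share with one decimal place
)
plt.title("Share of listings per city")
```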

Scatter Plots

A Scatter Plot essentially helps in visualizing the relationship between two features. In this example, we will create a scatter plot in which the x-axis is the Size and y-axis is the Rent:

# A scatter plot draws each point on its own, so the data does not need to be sorted first
plt.scatter(df["Size"], df["Rent"])
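With many listings, points tend to pile up on top of each other. A small sketch of one way to handle that is shown below: the alpha parameter makes points semi-transparent so dense regions look darker. The value 0.5 and the rows are assumptions chosen for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the real dataset
df = pd.DataFrame({
    "Size": [1100, 500, 1500, 800],
    "Rent": [10000, 7000, 25000, 9000],
})

plt.figure()
plt.scatter(df["Size"], df["Rent"], alpha=0.5)  # semi-transparent points
plt.xlabel("Size")
plt.ylabel("Rent")
plt.title("Rent against size")
```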

Finally, you can display images using the imshow() function, as shown below. To read the image we use the cv2 (OpenCV) library, which will be covered in another post.

import cv2 as cv

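image = cv.imread("/kaggle/input/transformers transformers_cars-HD-2149224348.jpg")  # reads the image (channels in BGR order)
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)  # convert to RGB, the order matplotlib expects
plt.imshow(image)  # displays the image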
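One detail worth knowing when mixing cv2 and matplotlib: cv.imread returns the colour channels in BGR order, while plt.imshow interprets a three-channel array as RGB. The sketch below illustrates the difference with a tiny synthetic NumPy array instead of a real image file, so the pixel values are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

# A 2x2 synthetic "image" stored in BGR order, the order cv.imread uses.
# Every pixel is pure blue: B=255, G=0, R=0.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reverse the last axis to turn BGR into RGB
rgb = bgr[..., ::-1]

plt.figure()
plt.imshow(rgb)  # shown as blue; plt.imshow(bgr) would show red instead
```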

If you would like to go more in depth on Matplotlib, here are a few resources:

Matplotlib Documentation
W3Schools

If you have any inquiries, or want to point out any corrections or have any feedback in general, feel free to let me know!
