In the age of big data, the ability to visualize and communicate data findings is a crucial skill. Python, with its strong set of libraries, has become a popular platform for exploratory data analysis. One of the major tools that Python offers for data visualization is Matplotlib.
Data visualization is the discipline of trying to understand data by placing it in a visual context, so that patterns, trends, and correlations that might otherwise go undetected can be exposed. Python offers several graphing libraries with a wide range of features. Whether you want to create interactive, live, or highly customized plots, Python has a library for the job.
To begin, install and set up Python and Matplotlib. Python can be downloaded from the official Python website, and Matplotlib can be installed using pip:
```shell
pip install matplotlib
```
Matplotlib supports many plot types, including line, bar, scatter, and histogram plots. Most of its plotting functions live in the `pyplot` submodule, which is conventionally imported under the alias `plt`:
```python
import matplotlib.pyplot as plt
```
For example, let's create a simple line plot.
```python
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Create a figure and axis
fig, ax = plt.subplots()

# Plotting
ax.plot(x, y)

# Show the plot
plt.show()
```
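Histograms, one of the plot types listed earlier, follow the same figure-and-axis pattern. The snippet below is a minimal sketch rather than a definitive recipe: the `scores` list is hypothetical data invented for illustration, and the choice of five bins is arbitrary.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, made up for this illustration
scores = [52, 58, 61, 63, 65, 67, 68, 70, 71,
          72, 74, 75, 77, 79, 82, 85, 88, 91]

# Create a figure and axis
fig, ax = plt.subplots()

# Group the values into 5 equal-width bins and draw one bar per bin.
# hist() returns the per-bin counts, the bin edges, and the drawn bars.
counts, bin_edges, patches = ax.hist(scores, bins=5)

# Show the plot
plt.show()
```

The returned `counts` and `bin_edges` can be inspected directly if you want to check how the values were grouped before trusting the picture.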
Bar charts can be created using the `bar` function. We can also add error bars using the `yerr` parameter.
```python
import matplotlib.pyplot as plt

# Sample data
languages = ['Python', 'Java', 'C', 'C++', 'JavaScript']
popularity = [100, 96, 85, 88, 91]

# Create a figure and axis
fig, ax = plt.subplots()

# Plotting
ax.bar(languages, popularity)

# Show the plot
plt.show()
```
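The example above does not use the `yerr` parameter, so here is a minimal sketch of how error bars might be added. The `errors` values are hypothetical numbers invented for illustration and do not come from any real measurement; `capsize` is an optional argument that draws small caps on the ends of each error bar.

```python
import matplotlib.pyplot as plt

# Sample data
languages = ['Python', 'Java', 'C', 'C++', 'JavaScript']
popularity = [100, 96, 85, 88, 91]

# Hypothetical uncertainty for each bar, made up for this illustration
errors = [3, 4, 2, 5, 3]

# Create a figure and axis
fig, ax = plt.subplots()

# yerr draws a vertical error bar of +/- the given size on each bar
bars = ax.bar(languages, popularity, yerr=errors, capsize=4)

# Show the plot
plt.show()
```

Passing a single number to `yerr` instead of a list applies the same error size to every bar.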
Scatter plots are primarily used to observe and show relationships between two numeric variables. The dots in a scatter plot not only report the values of individual data points; they also form a pattern that can suggest a relationship between the variables. A scatter plot can also show the relationship among three variables, in which case it is known as a 3-D scatter plot.
```python
import matplotlib.pyplot as plt

# Sample data
weight = [67, 81, 72, 79, 87, 69, 72, 84]
height = [171, 185, 179, 192, 189, 175, 174, 193]

# Create a figure and axis
fig, ax = plt.subplots()

# Plotting
ax.scatter(weight, height)

# Show the plot
plt.show()
```
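For the three-variable case, a 3-D scatter plot can be sketched along the following lines. This assumes a reasonably recent Matplotlib (3.2 or later), where `projection='3d'` works without an extra import. The `age` list is a hypothetical third variable invented for illustration.

```python
import matplotlib.pyplot as plt

# Sample data
weight = [67, 81, 72, 79, 87, 69, 72, 84]
height = [171, 185, 179, 192, 189, 175, 174, 193]

# Hypothetical third variable, made up for this illustration
age = [25, 31, 28, 35, 40, 23, 27, 38]

# Create a figure and a 3-D axis
fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Plotting: one point per (weight, height, age) triple
ax.scatter(weight, height, age)

# Label each axis so the three variables can be told apart
ax.set_xlabel('Weight')
ax.set_ylabel('Height')
ax.set_zlabel('Age')

# Show the plot
plt.show()
```

In an interactive window the 3-D axes can be rotated with the mouse, which often makes the relationship easier to judge than any single fixed angle.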
Customizing your plots by adding labels, legends, and a title can greatly enhance the communicative power of your visualizations.
```python
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Create a figure and axis
fig, ax = plt.subplots()

# Plotting
ax.plot(x, y)

# Add title and labels
ax.set_title('A Simple Line Plot')
ax.set_xlabel('X Values')
ax.set_ylabel('Y Values')

# Show the plot
plt.show()
```
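Legends are mentioned above but not shown, so here is a minimal sketch of one way to add them. A legend is most useful when a plot has more than one series, so this example adds a second line, `cubes`, which is an illustrative addition and not part of the original data.

```python
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
# Second series added purely for illustration
cubes = [value ** 3 for value in x]

# Create a figure and axis
fig, ax = plt.subplots()

# The label passed to each plot() call becomes that line's legend entry
ax.plot(x, squares, label='x squared')
ax.plot(x, cubes, label='x cubed')

# Draw the legend using the labels set above
ax.legend()

# Show the plot
plt.show()
```

By default Matplotlib tries to place the legend where it overlaps the data least; the `loc` argument of `legend` (for example `loc='upper left'`) overrides that choice.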
Data visualization with Python and Matplotlib is commonly used in fields like data science and business intelligence.