matplotlib Plotting Cookbook

Plotting boxplots

A boxplot lets you compare distributions of values by showing the median, quartiles, maximum, and minimum of a set of values at a glance.

How to do it...

The following script shows a boxplot for 100 random values drawn from a normal distribution:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.randn(100)

plt.boxplot(data)
plt.show()

A boxplot will appear that represents the samples we drew from the random distribution. Since the code uses a randomly generated dataset, the resulting figure will change slightly every time the script is run.

The preceding script will display the following graph:

How it works...

The data = np.random.randn(100) line generates 100 values drawn from a standard normal distribution. This is for demonstration purposes only; in practice, such values are typically read from a file or computed from other data. The plt.boxplot() function takes a set of values and computes the median, quartiles, and other statistical quantities on its own. The following points describe the preceding boxplot:

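  • The red bar is the median of the distribution.
  • The blue box includes 50 percent of the data from the lower quartile to the upper quartile. The median always lies within the box, though not necessarily at its center.
  • The lower whisker extends to the lowest value within 1.5 IQR from the lower quartile, where IQR is the interquartile range (the upper quartile minus the lower quartile).
  • The upper whisker extends to the highest value within 1.5 IQR from the upper quartile.
  • Values beyond the whiskers are shown with a cross marker.

These colors and markers are the defaults of matplotlib releases before 2.0. Later releases draw an orange median line, a black box, and circle markers by default.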
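The quantities described in the preceding list can also be computed by hand, which helps when checking what a boxplot is showing. The following is a minimal sketch with NumPy only, not matplotlib's own implementation; the seed value and the variable names (lower_fence, upper_fence, and so on) are choices made for this illustration:

```python
import numpy as np

# Seeded generator so the numbers are reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
data = rng.standard_normal(100)

# Median and quartiles; np.percentile interpolates linearly by default.
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1  # interquartile range: the height of the box

# Whiskers stop at the most extreme data points inside the 1.5 * IQR fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
lower_whisker = data[data >= lower_fence].min()
upper_whisker = data[data <= upper_fence].max()

# Anything outside the fences is drawn as an individual marker.
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(f"median={median:.3f}  box=[{q1:.3f}, {q3:.3f}]")
print(f"whiskers=[{lower_whisker:.3f}, {upper_whisker:.3f}]  outliers={outliers.size}")
```

Note that the whiskers end at actual data points, not at the fences themselves, so they are usually shorter than 1.5 IQR.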

There's more...

To show more than one boxplot in a single graph, calling pyplot.boxplot() once for each boxplot is not going to work: by default, each call draws its box at the same position, so the boxplots end up on top of each other in a messy, unreadable graph. However, we can draw several boxplots with a single call to pyplot.boxplot() as follows:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.randn(100, 5)

plt.boxplot(data)
plt.show()

The preceding script displays the following graph:

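The pyplot.boxplot() function accepts a list of lists or a two-dimensional array as the input. It renders one boxplot for each sublist or, as in the preceding script, for each column of the array, which here gives five boxplots of 100 values each.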
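Because the input can be a list of lists, the groups do not need to have the same number of values. The following sketch illustrates this with three groups of different sizes; the group sizes, the distribution parameters, and the seed are arbitrary choices for the example:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility

# Three groups of different sizes, passed as a list of lists.
groups = [
    rng.normal(0.0, 1.0, 50).tolist(),
    rng.normal(1.0, 2.0, 80).tolist(),
    rng.normal(-1.0, 0.5, 30).tolist(),
]

# One box is drawn per sublist.
artists = plt.boxplot(groups)

# boxplot() returns a dictionary of the drawn artists, keyed by element type.
print(len(artists["boxes"]), "boxes,", len(artists["medians"]), "medians")

# Call plt.show() here to display the figure interactively.
```

The returned dictionary is handy when the boxes need to be restyled after drawing, since each entry holds the line objects for one kind of element.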