Boxplots in Seaborn


This entry is part 5 of 8 in the series Matplotlib

How do you create boxplots in Python’s seaborn? Why would you want to create boxplots in the first place? Perhaps you are working on outlier detection. Seaborn is built on top of Matplotlib.

Suppose your DataFrame is called df and it has a numeric column called my_col. The code below draws a single vertical boxplot for that column.

import matplotlib.pyplot as plt
import seaborn as sns
# y= draws a vertical boxplot of the my_col column
sns.boxplot(y="my_col", data=df)
plt.show()

In the next example we’ll create three boxplots for three columns. We’ll set it up so that they are shown side by side. Suppose you have a DataFrame called df and in that DataFrame you have at least three columns called col_1, col_2 and col_3.

fig, axes = plt.subplots(1, 3, figsize=(15, 2))
fig.suptitle('Boxplots for outlier detection')
sns.boxplot(ax=axes[0], x=df['col_1'])
sns.boxplot(ax=axes[1], x=df['col_2'])
sns.boxplot(ax=axes[2], x=df['col_3'])
plt.show()
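When there are more than a few columns, repeating the sns.boxplot call by hand gets tedious. Here is a minimal sketch of the same side-by-side layout written as a loop. The values in the DataFrame are placeholders made up for illustration, and the x-axis labels are an optional extra.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder data, made up for illustration only.
df = pd.DataFrame({
    "col_1": [1, 2, 3, 4, 20],
    "col_2": [5, 6, 7, 8, 9],
    "col_3": [2, 4, 6, 8, 10],
})

columns = ["col_1", "col_2", "col_3"]

# One row of subplots, one subplot per column.
fig, axes = plt.subplots(1, len(columns), figsize=(15, 2))
fig.suptitle("Boxplots for outlier detection")

# Pair each subplot with a column and draw a horizontal boxplot on it.
for ax, col in zip(axes, columns):
    sns.boxplot(ax=ax, x=df[col])
    ax.set_xlabel(col)

plt.show()
```

To plot different columns, only the columns list needs to change.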

A Simple Example

import pandas as pd  # import the pandas library into Python
import matplotlib.pyplot as plt
import seaborn as sns
data = {'firstname': ['Bob', 'Sally', 'Suzie', 'Rohan', 'Sam', 'Linda', 'Susan', 'Gail'],
        'col_1': [12, 67, 33, 41, 17, 21, 23, 28],
        'col_2': [22, 57, 43, 44, 27, 38, 32, 35],
        'col_3': [30, 60, 44, 53, 32, 47, 49, 46]}
df = pd.DataFrame(data)
df


The seaborn website says: “A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable. The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be ‘outliers’ using a method that is a function of the inter-quartile range.”
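To make the inter-quartile range rule concrete, here is a rough sketch that applies it by hand to the col_1 values from the sample data. It assumes the common 1.5 × IQR cutoff, which matches seaborn's default whis=1.5 setting. The variable names are chosen for this example only.

```python
import pandas as pd

# The col_1 values from the sample DataFrame.
col_1 = pd.Series([12, 67, 33, 41, 17, 21, 23, 28], name='col_1')

# First and third quartiles, and the inter-quartile range between them.
q1 = col_1.quantile(0.25)
q3 = col_1.quantile(0.75)
iqr = q3 - q1

# Assumed cutoff: points more than 1.5 * IQR beyond the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = col_1[(col_1 < lower_fence) | (col_1 > upper_fence)]
print(outliers.tolist())  # prints [67]
```

Here the quartiles are 20 and 35, so the fences sit at -2.5 and 57.5, and the value 67 is the only one that falls outside them.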
