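How to give a pandas/matplotlib bar graph custom colors

I just started using pandas/matplotlib as a replacement for Excel to generate stacked bar charts. I am running into two problems:

(1) There are only 5 colors in the default colormap, so if I have more than 5 categories the colors repeat. How can I specify more colors? Ideally, I would give a gradient a start color and an end color and have a way to dynamically generate n colors in between.

(2) The colors are not very visually pleasing. How do I specify a custom set of n colors? A gradient would also work.

The example below illustrates both points: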

from matplotlib import pyplot
from pandas import DataFrame
import random

x = [{i:random.randint(1,5)} for i in range(10)]
df = DataFrame(x)

df.plot(kind='bar', stacked=True)

The result looks like this:

enter image description here


You can specify the color option as a list directly to the plot function.

from matplotlib import pyplot as plt
from itertools import cycle, islice
import pandas, numpy as np  # I find np.random.randint to be better


# Make the data
x = [{i:np.random.randint(1,5)} for i in range(10)]
df = pandas.DataFrame(x)


# Make a list by cycling through the colors you care about
# to match the number of columns (one color per stacked category).
my_colors = list(islice(cycle(['b', 'r', 'g', 'y', 'k']), None, len(df.columns)))


# Specify this list of colors as the `color` option to `plot`.
df.plot(kind='bar', stacked=True, color=my_colors)

To define your own custom list, you can do a few of the following, or just look up the Matplotlib techniques for defining a color item by its RGB values, etc. You can get as complicated as you want with this.

my_colors = ['g', 'b']*5 # <-- this concatenates the list to itself 5 times.
my_colors = [(0.5,0.4,0.5), (0.75, 0.75, 0.25)]*5 # <-- make two custom RGBs and repeat/alternate them over all the bar elements.
my_colors = [(x/10.0, x/20.0, 0.75) for x in range(len(df))] # <-- Quick gradient example along the Red/Green dimensions.

The last example yields the following simple gradient of colors for me:

enter image description here
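To get closer to what the question asks for (a start color, an end color, and n colors generated in between), here is a minimal sketch in plain Python. The helper name `gradient_colors` and the two endpoint colors are made up for illustration; it simply interpolates each RGB channel linearly, which is one of several reasonable ways to build a gradient.

```python
def gradient_colors(start, end, n):
    """Return n RGB tuples evenly spaced from start to end (both included)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [tuple(start)]
    # t runs from 0.0 (start color) to 1.0 (end color) in n equal steps.
    steps = [i / (n - 1) for i in range(n)]
    return [
        tuple(s + (e - s) * t for s, e in zip(start, end))
        for t in steps
    ]


# Hypothetical endpoints: a dark blue fading into a warm orange.
my_colors = gradient_colors((0.1, 0.2, 0.75), (0.9, 0.6, 0.25), 10)
```

The resulting list can be passed the same way as the lists above, for example df.plot(kind='bar', stacked=True, color=my_colors).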

I didn't play with it long enough to figure out how to force the legend to pick up the defined colors, but I'm sure you can do it.

In general, though, a big piece of advice is to just use the functions from Matplotlib directly. Calling them from Pandas is OK, but I find you get better options and performance calling them straight from Matplotlib.
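As a rough illustration of calling Matplotlib directly, the sketch below stacks bars by hand with `Axes.bar` and its `bottom` argument. The data, labels, and colors are invented for the example. Because each category is drawn with its own `color` and `label`, the legend picks up the chosen colors.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data: 3 categories (rows) across 5 bars (columns), values 1 to 5.
rng = np.random.default_rng(0)
values = rng.integers(1, 6, size=(3, 5))
labels = ["a", "b", "c"]
my_colors = [(0.2, 0.4, 0.75), (0.75, 0.75, 0.25), (0.5, 0.4, 0.5)]

fig, ax = plt.subplots()
positions = np.arange(values.shape[1])
bottom = np.zeros(values.shape[1])  # running height of each stack

for row, label, color in zip(values, labels, my_colors):
    # Draw this category on top of everything drawn so far.
    ax.bar(positions, row, bottom=bottom, color=color, label=label)
    bottom += row

ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to see the chart.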

I found the easiest way is to use the colormap parameter in .plot() with one of the preset color gradients:

df.plot(kind='bar', stacked=True, colormap='Paired')

enter image description here
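The colormap parameter also accepts a Matplotlib colormap object, not just a preset name, so a custom two-color gradient can be used the same way. The sketch below assumes two arbitrary hex colors and the name "my_gradient" (both made up for illustration) and builds the colormap with LinearSegmentedColormap.from_list.

```python
import numpy as np
import pandas as pd
from matplotlib.colors import LinearSegmentedColormap

# Hypothetical start and end colors; any Matplotlib color spec should work.
gradient = LinearSegmentedColormap.from_list("my_gradient", ["#2c7fb8", "#edf8b1"])

# Data shaped like the question's: 10 rows, 10 columns, one value per row.
rng = np.random.default_rng(0)
df = pd.DataFrame([{i: int(rng.integers(1, 6))} for i in range(10)])

# pandas samples the colormap once per column (one color per stacked category).
ax = df.plot(kind="bar", stacked=True, colormap=gradient)
```

With more categories, pandas samples more points along the same gradient, so the colors do not repeat.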

You can find a large list of preset colormaps here.

colormaps

For a more detailed answer on creating your own colormaps, I highly suggest visiting this page.

If that answer is too much work, you can quickly make your own list of colors and pass it to the color parameter. All the colormaps are in the matplotlib.cm module. Let's get a list of 30 RGB (plus alpha) color values from the reversed inferno colormap. To do so, first get the colormap and then pass it a sequence of values between 0 and 1. Here, we use np.linspace to create 30 equally spaced values between .4 and .8, which select that portion of the colormap.

import numpy as np
from matplotlib import cm
color = cm.inferno_r(np.linspace(.4, .8, 30))
color


array([[ 0.865006,  0.316822,  0.226055,  1.      ],
[ 0.851384,  0.30226 ,  0.239636,  1.      ],
[ 0.832299,  0.283913,  0.257383,  1.      ],
[ 0.817341,  0.270954,  0.27039 ,  1.      ],
[ 0.796607,  0.254728,  0.287264,  1.      ],
[ 0.775059,  0.239667,  0.303526,  1.      ],
[ 0.758422,  0.229097,  0.315266,  1.      ],
[ 0.735683,  0.215906,  0.330245,  1.      ],
.....

Then we can use this to plot, using the data from the original post:

import random
import pandas as pd
x = [{i: random.randint(1, 5)} for i in range(30)]
df = pd.DataFrame(x)
df.plot(kind='bar', stacked=True, color=color, legend=False, figsize=(12, 4))

enter image description here