How do I find Wally with Python?

Shamelessly jumping on the bandwagon :-)

Inspired by How do I find Waldo with Mathematica and the follow-up How to find Waldo with R, as a new Python user I'd love to see how this could be done. It seems that Python would be better suited to this than R, and we don't have to worry about licenses as we would with Mathematica or Matlab.

In an example like the one below, simply going after stripes clearly wouldn't work. It would be interesting if a simple rule-based approach could be made to work on difficult examples like this one.

[Image: At the beach]

I've added the [machine-learning] tag because I believe the correct answer will have to use machine-learning techniques, such as the restricted Boltzmann machine (RBM) approach advocated by Gregory Klopper in the original thread. There is some RBM code available in Python which might be a good place to start, but obviously that approach needs training data.

At the 2009 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2009) they ran a data analysis competition: Where's Wally?. The training data is provided in Matlab format. Note that the links on that website are dead, but the data (along with the source of an approach taken by Sean McLoone and colleagues) can be found here (see the SCM link). Seems like a good place to start.


You could try template matching, noting which location produced the highest resemblance, and then using machine learning to narrow it down further. That is also very difficult, and given the accuracy of template matching, it may just return every face or face-like image. I think you will need more than just machine learning if you hope to do this consistently.
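
A minimal sketch of the template-matching idea with OpenCV, assuming a cropped template of Wally's head is available (the file names and the 0.6 threshold are placeholders):

import cv2
import numpy as np

scene = cv2.imread('wheres_wally.jpg', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('wally_face.png', cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation; the peak is the best (not necessarily correct) match.
result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print("best match at", max_loc, "score", max_val)

# Keep everything above a threshold instead of just the peak, since the best
# match may well be a false positive that a later classifier has to filter out.
candidates = np.argwhere(result > 0.6)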

maybe you should start with breaking the problem into two smaller ones:

  1. create an algorithm that separates people from the background.
  2. train a neural network classifier with as many positive and negative examples as possible (a rough sketch of this step follows below).

those are still two very big problems to tackle...

BTW, I would choose C++ and OpenCV; they seem much more suited for this.
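
Since the question asks for Python, here is a minimal sketch of step 2 using scikit-learn; the patch and label files are hypothetical and would be produced by step 1:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical output of step 1: flattened image patches and 0/1 labels
# (1 = Wally, 0 = background), with pixel values scaled to [0, 1].
X = np.load('patches.npy')
y = np.load('labels.npy')

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# A small multi-layer perceptron as the classifier; any other classifier would do.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_val, y_val))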

This is not impossible, but very difficult, because you really have no example of a successful match. There are often multiple states (in this case, more examples of 'find Wally' drawings); you can then feed multiple pictures into an image recognition program, treat it as a hidden Markov model, and use something like the Viterbi algorithm for inference ( http://en.wikipedia.org/wiki/Viterbi_algorithm ).

That's the way I would approach it, assuming you have multiple images that you can give it as examples of the correct answer so it can learn. If you only have one picture, then I'm sorry, there may be another approach you need to take.
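
For reference, a minimal, generic Viterbi decoder of the kind linked above; defining the states, transition and emission probabilities for a 'find Wally' HMM is the hard part and is not shown here:

import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    # obs: sequence of observation indices; start_p: (S,); trans_p: (S, S); emit_p: (S, O).
    S, T = len(start_p), len(obs)
    log_delta = np.zeros((T, S))
    back = np.zeros((T, S), dtype=int)
    log_delta[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        scores = log_delta[t - 1][:, None] + np.log(trans_p)   # (from state, to state)
        back[t] = scores.argmax(axis=0)
        log_delta[t] = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    # Backtrack from the most probable final state.
    path = [int(log_delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]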

I recognized that there are two main features which are almost always visible:

  1. the red-white striped shirt
  2. dark brown hair under the fancy cap

So I would do it the following way:

search for striped shirts (a rough code sketch follows this list):

  • filter out red and white color (with thresholds on the HSV converted image). That gives you two mask images.
  • add them together -> that's the main mask for searching striped shirts.
  • create a new image with all the filtered out red converted to pure red (#FF0000) and all the filtered out white converted to pure white (#FFFFFF).
  • now correlate this pure red-white image with a stripe pattern image (I think all the Waldos have fairly perfect horizontal stripes, so rotation of the pattern shouldn't be necessary). Do the correlation only inside the above-mentioned main mask.
  • try to group together clusters which could have resulted from one shirt.
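
A rough sketch of the shirt search with OpenCV and NumPy; the image name, the HSV thresholds and the stripe period are all guesses that would need tuning:

import cv2
import numpy as np

img = cv2.imread('wheres_wally.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Red wraps around hue 0 in OpenCV's 0-180 hue range, so combine two ranges;
# white is low saturation and high value.
red_mask = cv2.inRange(hsv, (0, 120, 80), (10, 255, 255)) | \
           cv2.inRange(hsv, (170, 120, 80), (180, 255, 255))
white_mask = cv2.inRange(hsv, (0, 0, 200), (180, 40, 255))
main_mask = red_mask | white_mask

# Pure red / pure white image as described in the list above (BGR order).
pure = np.zeros_like(img)
pure[red_mask > 0] = (0, 0, 255)
pure[white_mask > 0] = (255, 255, 255)

# Horizontal stripe pattern: four rows of red, four rows of white, repeated.
pattern = np.zeros((16, 16, 3), np.uint8)
for row in range(16):
    pattern[row] = (0, 0, 255) if (row // 4) % 2 == 0 else (255, 255, 255)

# Correlate the pattern over the image and keep responses inside the main mask.
resp = cv2.matchTemplate(pure, pattern, cv2.TM_CCOEFF_NORMED)
full = np.zeros(main_mask.shape, np.float32)
full[:resp.shape[0], :resp.shape[1]] = resp
full[main_mask == 0] = 0

# Candidate shirt pixels; these still need to be grouped into clusters.
ys, xs = np.where(full > 0.5)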

If there is more than one 'shirt', that is to say more than one cluster of positive correlation, search for other features, like the dark brown hair:

search for brown hair (again, a rough sketch follows the list):

  • filter out the specific brown hair color using the HSV converted image and some thresholds.
  • search for a certain area in this masked image - not too big and not too small.
  • now search for a 'hair area' that sits just above a previously detected striped shirt and is within a certain distance of the center of the shirt.
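
A rough sketch of the hair search, again with placeholder thresholds and size limits:

import cv2

img = cv2.imread('wheres_wally.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Rough guess for dark brown in OpenCV HSV; would need tuning on real pages.
brown_mask = cv2.inRange(hsv, (5, 60, 20), (20, 255, 120))

# Connected components give the area and centroid of every brown blob.
n, labels, stats, centroids = cv2.connectedComponentsWithStats(brown_mask)
hair_candidates = [
    tuple(centroids[i])
    for i in range(1, n)                        # label 0 is the background
    if 20 < stats[i, cv2.CC_STAT_AREA] < 500    # not too big, not too small
]

# Each candidate would then be checked against the shirt clusters found above:
# keep shirts that have a hair blob a short distance directly above their center.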

Here's an implementation with mahotas

from pylab import imshow
import numpy as np
import mahotas
wally = mahotas.imread('DepartmentStore.jpg')


wfloat = wally.astype(float)
r,g,b = wfloat.transpose((2,0,1))

Split into red, green, and blue channels. It's better to use floating point arithmetic below, so we convert at the top.

w = wfloat.mean(2)

w is the white channel.

pattern = np.ones((24,16), float)
for i in range(2):
    pattern[i::4] = -1

Build up a pattern of +1,+1,-1,-1 on the vertical axis. This is Wally's shirt.

v = mahotas.convolve(r-w, pattern)

Convolve with red minus white. This will give a strong response where the shirt is.

mask = (v == v.max())
mask = mahotas.dilate(mask, np.ones((48,24)))

Look for the maximum value and dilate it to make it visible. Now, we tone down the whole image, except the region of interest:

wally = (wally - 0.8 * wally * ~mask[:, :, None]).astype(np.uint8)
imshow(wally)

And we get Waldo!

Here's a solution using neural networks that works nicely.

The neural network is trained on several solved examples that are marked with bounding boxes indicating where Wally appears in the picture. The goal of the network is to minimize the error between the predicted box and the actual box from training/validation data.

The network above uses the TensorFlow Object Detection API to perform training and prediction.
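
As a rough sketch of the inference side only (the paths are placeholders, and the SavedModel would come out of the Object Detection API training pipeline mentioned above):

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the exported detection model (hypothetical path).
detect_fn = tf.saved_model.load('exported_model/saved_model')

image = np.array(Image.open('wheres_wally.jpg'))
input_tensor = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.uint8)

detections = detect_fn(input_tensor)
boxes = detections['detection_boxes'][0].numpy()    # normalized [ymin, xmin, ymax, xmax]
scores = detections['detection_scores'][0].numpy()

best = int(np.argmax(scores))
print("most likely Wally box:", boxes[best], "score:", scores[best])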