Algorithm to detect corners of a paper sheet in a photo

What is the best way to detect the corners of an invoice/receipt/sheet of paper in a photo? This is to be used for subsequent perspective correction, before OCR.

My current approach has been:

RGB > greyscale > Canny edge detection with thresholding > dilate(1) > remove small objects(6) > clear border objects > pick largest blob based on convex area [corner detection - not implemented]

I can't help thinking there must be a more robust "intelligent"/statistical approach to handle this kind of segmentation. I don't have many training examples, but I could probably get 100 images together.

Broader context:

I'm using MATLAB to prototype, and plan to implement the system in OpenCV and Tesseract-OCR. This is the first of a number of image processing problems I need to solve for this specific application, so I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.

Here are some sample images that I'd like the algorithm to handle. If you want to take up the challenge, the large images are at http://madteckhead.com/tmp

case 1
(source: madteckhead.com)

case 2
(source: madteckhead.com)

case 3
(source: madteckhead.com)

case 4
(source: madteckhead.com)

The approach works in the best case:

case 1 - canny
(source: madteckhead.com)

case 1 - post canny
(source: madteckhead.com)

case 1 - largest blob
(source: madteckhead.com)

However, it fails easily on other cases:

case 2 - canny
(source: madteckhead.com)

case 2 - post canny
(source: madteckhead.com)

case 2 - largest blob
(source: madteckhead.com)

Thanks in advance for all the great ideas! I love SO!

EDIT: Hough transform progress

Q: What sort of algorithm would cluster the Hough lines to find corners? Following advice from the answers, I was able to use the Hough transform, pick lines, and filter them. My current approach is rather crude. I've made the assumption that the invoice will always be less than 15 degrees out of alignment with the image. If that's the case, I end up with reasonable results for lines (see below). But I'm not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate the corners. The Hough lines are not continuous, and in noisy images there can be parallel lines, so some form of distance-from-line-origin metric is required. Any ideas?

case 1 case 2 case 3 case 4
(source: madteckhead.com)


A student group at my university recently demonstrated an iPhone app (and python OpenCV app) that they'd written to do exactly this. As I remember, the steps were something like this:

  • Median filter to remove the text on the paper (this was handwritten text on white paper with fairly good lighting; it may not work with printed text, but there it worked very well). The reason is that it makes the corner detection much easier.
  • Hough Transform for lines
  • Find the peaks in the Hough Transform accumulator space and draw each line across the entire image.
  • Analyse the lines and remove any that are very close to each other and are at a similar angle (cluster the lines into one). This is necessary because the Hough Transform isn't perfect as it's working in a discrete sample space.
  • Find pairs of lines that are roughly parallel and that intersect other pairs to see which lines form quads.

This seemed to work fairly well and they were able to take a photo of a piece of paper or book, perform the corner detection and then map the document in the image onto a flat plane in almost realtime (there was a single OpenCV function to perform the mapping). There was no OCR when I saw it working.
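A minimal sketch of that pipeline in OpenCV-Python. The file name, filter sizes, and all thresholds below are illustrative guesses, not values from their app:

import cv2
import numpy as np

img = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input

# 1. median filter to suppress the text on the paper
smoothed = cv2.medianBlur(img, 21)

# 2./3. edge map, then standard Hough transform; each accumulator
#       peak is an infinite line in (rho, theta) form
edges = cv2.Canny(smoothed, 50, 150)
lines = cv2.HoughLines(edges, 1, np.pi / 180, 120)

# 4. merge near-duplicate peaks (close rho, similar theta)
clustered = []
for rho, theta in lines[:, 0]:
    if not any(abs(rho - r) < 25 and abs(theta - t) < np.pi / 18
               for r, t in clustered):
        clustered.append((rho, theta))

# 5. intersect non-parallel line pairs to propose quad corners
def intersection(l1, l2):
    (r1, t1), (r2, t2) = l1, l2
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    return np.linalg.solve(A, np.array([r1, r2]))  # singular if parallel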

After edge detection, use the Hough transform. Then put those points into an SVM (support vector machine) with their labels; if the examples have smooth lines on them, the SVM will have no difficulty dividing the relevant parts of the example from the rest. My advice on the SVM: use features like connectivity and length. That is, if points are connected and long, they are likely to be a line of the receipt. Then you can eliminate all the other points.
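A hedged sketch of that idea with scikit-learn. The feature set (length and angle standing in for "connectivity and length") and the hand-labelled training data are my assumptions, not the answerer's:

import numpy as np
from sklearn.svm import SVC

def segment_features(seg):
    """Features for one (x1, y1, x2, y2) segment from cv2.HoughLinesP."""
    x1, y1, x2, y2 = seg
    return [np.hypot(x2 - x1, y2 - y1),    # length
            np.arctan2(y2 - y1, x2 - x1)]  # orientation

def train_line_classifier(segments, labels):
    """labels: 1 for receipt-border segments, 0 for everything else."""
    X = np.array([segment_features(s) for s in segments])
    return SVC(kernel='rbf').fit(X, np.asarray(labels))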

I'm Martin's friend who was working on this earlier this year. This was my first ever coding project, and kinda ended in a bit of a rush, so the code needs some errr...decoding... I'll give a few tips from what I've seen you doing already, and then sort my code on my day off tomorrow.

First tip, OpenCV and python are awesome, move to them as soon as possible. :D

Instead of removing small objects and/or noise, lower the Canny thresholds so it accepts more edges, and then find the largest closed contour (in OpenCV use findContours() with some simple parameters; I think I used CV_RETR_LIST). It might still struggle when the paper is on a white background, but it was definitely providing the best results.
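That tip in modern cv2, roughly (thresholds and the file name are placeholders; CV_RETR_LIST is spelled cv2.RETR_LIST in the current API):

import cv2

img = cv2.imread('receipt.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input
edges = cv2.Canny(img, 10, 60)  # low thresholds so weak page edges survive
contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                               cv2.CHAIN_APPROX_SIMPLE)
page = max(contours, key=cv2.contourArea)  # largest closed contour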

For the HoughLines2() transform, try CV_HOUGH_STANDARD as opposed to CV_HOUGH_PROBABILISTIC: it gives rho and theta values, defining the line in polar coordinates, and you can then group the lines within a certain tolerance of each other.

My grouping worked as a look-up table: for each line output from the Hough transform it would give a rho and theta pair. If these values were within, say, 5% of a pair of values already in the table, they were discarded; if they were outside that 5%, a new entry was added to the table.

You can then do analysis of parallel lines or distance between lines much more easily.
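The grouping could look something like this, with absolute tolerances standing in for the 5% relative check (tolerance values are guesses):

import numpy as np

def group_lines(lines, rho_tol=20.0, theta_tol=np.pi / 36):
    """Greedy look-up-table grouping of (rho, theta) pairs: a line is
    discarded if a kept line is already within both tolerances."""
    table = []
    for rho, theta in lines:
        if not any(abs(rho - r) < rho_tol and abs(theta - t) < theta_tol
                   for r, t in table):
            table.append((rho, theta))
    return table

# e.g. lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)
#      grouped = group_lines(l[0] for l in lines)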

Hope this helps.

Instead of starting from edge detection, you could use corner detection.

Marvin Framework provides an implementation of the Moravec algorithm for this purpose. You could find the corners of the paper as a starting point. Below is the output of Moravec's algorithm:

(image: Moravec corner detection output)
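Marvin is a Java framework; if you are prototyping in OpenCV instead, a related detector (Shi-Tomasi, a descendant of Moravec's corner measure, not Moravec itself) gives a similar starting point. Parameters below are illustrative:

import cv2

gray = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input
corners = cv2.goodFeaturesToTrack(gray, maxCorners=50,
                                  qualityLevel=0.05, minDistance=30)
# each entry is a 1 x 2 array of (x, y); the paper corners should be
# among the strongest responses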

Here's what I came up with after a bit of experimentation:

import sys
import cv2
import numpy as np


def get_new(old):
    """Return a white canvas the same shape as `old`."""
    return np.full(old.shape, 255, np.uint8)


if __name__ == '__main__':
    orig = cv2.imread(sys.argv[1])

    # these constants are carefully picked
    MORPH = 9
    CANNY = 84
    HOUGH = 25

    img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
    img = cv2.GaussianBlur(img, (3, 3), 0)

    # dilation helps recognize white paper on a white background
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (MORPH, MORPH))
    dilated = cv2.dilate(img, kernel)

    edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)

    # draw the Hough segments back onto the edge map to close gaps
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, HOUGH)
    for line in lines:  # cv2 returns an N x 1 x 4 array of segments
        x1, y1, x2, y2 = line[0]
        cv2.line(edges, (x1, y1), (x2, y2), (255, 0, 0), 2, 8)

    # finding contours (OpenCV 2.x/4.x signature; 3.x also returns an image)
    contours, _ = cv2.findContours(edges.copy(), cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_TC89_KCOS)
    contours = [c for c in contours if cv2.arcLength(c, False) > 100]
    contours = [c for c in contours if cv2.contourArea(c) > 10000]

    # simplify contours down to polygons
    rects = []
    for cont in contours:
        rect = cv2.approxPolyDP(cont, 40, True).reshape(-1, 2)
        rects.append(rect)

    # that's basically it
    cv2.drawContours(orig, rects, -1, (0, 255, 0), 1)

    # show only contours
    new = get_new(img)
    cv2.drawContours(new, rects, -1, (0, 255, 0), 1)
    new = cv2.GaussianBlur(new, (9, 9), 0)
    new = cv2.Canny(new, 0, CANNY, apertureSize=3)

    cv2.namedWindow('result', cv2.WINDOW_NORMAL)
    for stage in (orig, dilated, edges, new):
        cv2.imshow('result', stage)
        cv2.waitKey(0)

    cv2.destroyAllWindows()

Not perfect, but at least works for all samples:

(result images for cases 1-4)

Here you have @Vanuan's code in C++:

#include <opencv2/opencv.hpp>
#include <vector>

// `mat` is the input BGR image, mirroring the Python version above
void detectPage(cv::Mat mat) {
    cv::cvtColor(mat, mat, cv::COLOR_BGR2GRAY);
    cv::GaussianBlur(mat, mat, cv::Size(3, 3), 0);
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(9, 9));
    cv::Mat dilated;
    cv::dilate(mat, dilated, kernel);

    cv::Mat edges;
    cv::Canny(dilated, edges, 0, 84, 3);  // thresholds as in the Python version

    std::vector<cv::Vec4i> lines;
    cv::HoughLinesP(edges, lines, 1, CV_PI / 180, 25);
    for (const cv::Vec4i& l : lines) {
        cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]),
                 cv::Scalar(255, 0, 0), 2, 8);
    }

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(edges, contours, cv::RETR_EXTERNAL,
                     cv::CHAIN_APPROX_TC89_KCOS);

    // keep only long contours that enclose a large area
    std::vector<std::vector<cv::Point>> contoursKept;
    for (const auto& c : contours) {
        if (cv::arcLength(c, false) > 100 && cv::contourArea(c) > 10000)
            contoursKept.push_back(c);
    }

    // simplify the surviving contours down to polygons
    std::vector<std::vector<cv::Point>> contoursDraw(contoursKept.size());
    for (size_t i = 0; i < contoursKept.size(); i++)
        cv::approxPolyDP(cv::Mat(contoursKept[i]), contoursDraw[i], 40, true);

    cv::Mat drawing = cv::Mat::zeros(mat.size(), CV_8UC3);
    cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0, 255, 0), 1);
}
  1. Convert to Lab colour space

  2. Use k-means to segment into 2 clusters

  3. Then use contours or Hough on one of the clusters (internal); see the sketch below
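A rough sketch of those three steps in OpenCV-Python; picking the brighter cluster (higher L) as the paper is my heuristic, not part of the answer:

import cv2
import numpy as np

img = cv2.imread('receipt.jpg')  # hypothetical input
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)

# k-means (k = 2) on the per-pixel Lab vectors
samples = lab.reshape(-1, 3).astype(np.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(samples, 2, None, criteria, 5,
                                cv2.KMEANS_RANDOM_CENTERS)

# assume the brighter cluster is the paper, then take its contours
paper = int(np.argmax(centers[:, 0]))
mask = (labels.reshape(lab.shape[:2]) == paper).astype(np.uint8) * 255
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)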

You can also use MSER (Maximally Stable Extremal Regions) on the result of the Sobel operator to find the stable regions of the image. For each region returned by MSER, you can apply a convex hull and polygon approximation to obtain something like this:

But this kind of detection is more useful for live detection than for a single picture, where it doesn't always return the best result.

(image: MSER detection result)
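A sketch of the MSER-over-Sobel idea (parameters are illustrative; the two-value detectRegions return is the OpenCV 3.4+/4.x Python binding):

import cv2

gray = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input

# gradient magnitude via Sobel, scaled back to 8-bit for MSER
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
mag = cv2.convertScaleAbs(cv2.magnitude(gx, gy))

mser = cv2.MSER_create()
regions, _ = mser.detectRegions(mag)
for pts in regions:
    hull = cv2.convexHull(pts.reshape(-1, 1, 2))
    poly = cv2.approxPolyDP(hull, 0.02 * cv2.arcLength(hull, True), True)
    if len(poly) == 4:  # candidate quadrilateral
        cv2.polylines(gray, [poly], True, 255, 2)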