Extracting text with OpenCV

I am trying to find the bounding boxes of text in an image and am currently using this approach:

// calculate the local variances of the grayscale image
Mat t_mean, t_mean_2;
Mat grayF;
outImg_gray.convertTo(grayF, CV_32F);
int winSize = 35;
blur(grayF, t_mean, cv::Size(winSize,winSize));
blur(grayF.mul(grayF), t_mean_2, cv::Size(winSize,winSize));
Mat varMat = t_mean_2 - t_mean.mul(t_mean);
varMat.convertTo(varMat, CV_8U);


// threshold the high variance regions
Mat varMatRegions = varMat > 100;

When given an image like this:

(image)

Then when I display varMatRegions I get this image:

(image)

As you can see, it somewhat combines the left block of text with the header of the card. For most cards this method works fine, but on busier cards it can cause problems.

The reason it is bad for those contours to connect is that the bounding box of such a contour then takes up almost the entire card.

Can anyone suggest a different way I could find the text, to ensure proper text detection?

200 points to whoever can find the text in these two cards.

(images of the two cards)


Here is an alternative approach that I used to detect the text blocks:

  1. Converted the image to grayscale.
  2. Applied thresholding (a simple binary threshold, with 150 as the threshold value).
  3. Applied dilation to thicken the lines in the image, leading to more compact objects and fewer whitespace fragments. Used a high value for the number of iterations, so the dilation is very heavy (13 iterations, also hand-picked for optimal results).
  4. Identified the contours of the objects in the resulting image using OpenCV's findContours function.
  5. Drew a bounding box (rectangle) around each contoured object; each of them frames a block of text.
  6. Optionally discarded regions that are unlikely to be the object you are searching for (e.g. text blocks), given that the algorithm above can also find intersecting or nested objects (like the entire top region of the first card), some of which may be uninteresting for your purposes.

Here is the code, written in Python with pyopencv; it should be easy to port to C++.

import cv2

image = cv2.imread("card.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # grayscale
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)  # threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
dilated = cv2.dilate(thresh, kernel, iterations=13)  # dilate
_, contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # get contours

# for each contour found, draw a rectangle around it on original image
for contour in contours:
    # get rectangle bounding contour
    [x, y, w, h] = cv2.boundingRect(contour)

    # discard areas that are too large
    if h > 300 and w > 300:
        continue

    # discard areas that are too small
    if h < 40 or w < 40:
        continue

    # draw rectangle around contour on original image
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)

# write original image with added contours to disk
cv2.imwrite("contoured.jpg", image)

The original image is the first image in your post.

After the preprocessing (grayscale, threshold and dilate, i.e. after step 3), the image looks like this:

(dilated image)

Here is the resulting image ("contoured.jpg" in the last line of the code); the final bounding boxes for the objects in the image look like this:

(image)

You can see that the text block on the left is detected as a separate block, delimited from its surroundings.

Using the same script with the same parameters (except for the thresholding type, which was changed for the second image as described below), here are the results for the other two cards:

(result images)

Tuning the parameters

The parameters (threshold value, dilation parameters) were optimized for this image and this task (finding text blocks) and can be adjusted, if needed, for other card images or other types of objects to be found.

For thresholding (step 2), I used a black threshold. For images where the text is lighter than the background, such as the second image in your post, a white threshold should be used, so replace the thresholding type with cv2.THRESH_BINARY. For the second image I also used a slightly higher threshold value (180). Varying the threshold value and the number of dilation iterations will result in different degrees of sensitivity in delimiting objects in the image.
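
As a minimal sketch of that change relative to the script above (only this line differs; 180 is the value mentioned for the second image):

# white threshold for light text on a dark background, instead of THRESH_BINARY_INV at 150
_, thresh = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)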

Finding other object types:

For example, decreasing the dilation to 5 iterations in the first image gives us a more fine-grained delimiting of objects in the image, roughly finding all of the words in the image (rather than text blocks):

(image)

Knowing the rough size of a word, I discarded here the regions that were too small (below 20 pixels in width or height) or too large (above 100 pixels in width or height) to ignore objects that are unlikely to be words, which gives the results in the image above.
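
A sketch of those two tweaks applied to the script above (5 dilation iterations and the 20-100 px size filter quoted here; everything else unchanged):

dilated = cv2.dilate(thresh, kernel, iterations=5)  # lighter dilation finds words rather than blocks
_, contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for contour in contours:
    [x, y, w, h] = cv2.boundingRect(contour)
    # keep only word-sized regions (roughly 20-100 px in width and height)
    if 20 <= w <= 100 and 20 <= h <= 100:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)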

You may try this method, developed by Chucai Yi and Yingli Tian.

They also share a piece of software (based on OpenCV 1.0, and it should run on Windows) that you can use (though no source code is available). It generates all the text bounding boxes in the image (shown in color shades). Applied to your sample images, you get the following results:

Note: to make the results more robust, you can further merge adjacent boxes.
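
One rough way to do such merging, not part of their tool, is to repeatedly union boxes that overlap or nearly touch; a sketch in Python (the 5 px padding is an arbitrary choice):

def merge_adjacent_boxes(boxes, pad=5):
    # boxes: list of (x, y, w, h) rectangles; union any two boxes that
    # overlap (or nearly touch, within `pad` pixels) until nothing changes
    boxes = [list(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        merged = []
        while boxes:
            x, y, w, h = boxes.pop()
            i = 0
            while i < len(boxes):
                x2, y2, w2, h2 = boxes[i]
                if (x - pad < x2 + w2 and x2 - pad < x + w and
                        y - pad < y2 + h2 and y2 - pad < y + h):
                    # union the two rectangles and keep scanning the rest
                    nx, ny = min(x, x2), min(y, y2)
                    w = max(x + w, x2 + w2) - nx
                    h = max(y + h, y2 + h2) - ny
                    x, y = nx, ny
                    boxes.pop(i)
                    changed = True
                else:
                    i += 1
            merged.append([x, y, w, h])
        boxes = merged
    return [tuple(b) for b in boxes]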


Update: if your ultimate goal is to recognize the text in the image, you can further check out GTText, which is free OCR software and a ground-truthing tool for text in color images. Source code is also available.

With this, you can get recognized text like:

You can detect text by finding close edge elements (inspired by LPD):

#include "opencv2/opencv.hpp"

std::vector<cv::Rect> detectLetters(cv::Mat img)
{
    std::vector<cv::Rect> boundRect;
    cv::Mat img_gray, img_sobel, img_threshold, element;
    cvtColor(img, img_gray, CV_BGR2GRAY);
    cv::Sobel(img_gray, img_sobel, CV_8U, 1, 0, 3, 1, 0, cv::BORDER_DEFAULT);
    cv::threshold(img_sobel, img_threshold, 0, 255, CV_THRESH_OTSU+CV_THRESH_BINARY);
    element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3));
    cv::morphologyEx(img_threshold, img_threshold, CV_MOP_CLOSE, element); //Does the trick
    std::vector< std::vector< cv::Point> > contours;
    cv::findContours(img_threshold, contours, 0, 1);
    std::vector<std::vector<cv::Point> > contours_poly( contours.size() );
    for( int i = 0; i < contours.size(); i++ )
        if (contours[i].size() > 100)
        {
            cv::approxPolyDP( cv::Mat(contours[i]), contours_poly[i], 3, true );
            cv::Rect appRect( boundingRect( cv::Mat(contours_poly[i]) ));
            if (appRect.width > appRect.height)
                boundRect.push_back(appRect);
        }
    return boundRect;
}

Usage:

int main(int argc, char** argv)
{
    //Read
    cv::Mat img1 = cv::imread("side_1.jpg");
    cv::Mat img2 = cv::imread("side_2.jpg");
    //Detect
    std::vector<cv::Rect> letterBBoxes1 = detectLetters(img1);
    std::vector<cv::Rect> letterBBoxes2 = detectLetters(img2);
    //Display
    for(int i = 0; i < letterBBoxes1.size(); i++)
        cv::rectangle(img1, letterBBoxes1[i], cv::Scalar(0,255,0), 3, 8, 0);
    cv::imwrite("imgOut1.jpg", img1);
    for(int i = 0; i < letterBBoxes2.size(); i++)
        cv::rectangle(img2, letterBBoxes2[i], cv::Scalar(0,255,0), 3, 8, 0);
    cv::imwrite("imgOut2.jpg", img2);
    return 0;
}

Results:

With element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3)): imgOut1, imgOut2

With element = getStructuringElement(cv::MORPH_RECT, cv::Size(30, 30)): imgOut1, imgOut2

The results are similar for the other images mentioned above.

I used a gradient-based method in the program below. Added the resulting images. Please note that I'm using a scaled-down version of the images for processing.

C++ version

The MIT License (MIT)


Copyright (c) 2014 Dhanushka Dangampola


Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:


The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.


THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


#include "stdafx.h"


#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>


using namespace cv;
using namespace std;


#define INPUT_FILE              "1.jpg"
#define OUTPUT_FOLDER_PATH      string("")


int _tmain(int argc, _TCHAR* argv[])
{
Mat large = imread(INPUT_FILE);
Mat rgb;
// downsample and use it for processing
pyrDown(large, rgb);
Mat small;
cvtColor(rgb, small, CV_BGR2GRAY);
// morphological gradient
Mat grad;
Mat morphKernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
morphologyEx(small, grad, MORPH_GRADIENT, morphKernel);
// binarize
Mat bw;
threshold(grad, bw, 0.0, 255.0, THRESH_BINARY | THRESH_OTSU);
// connect horizontally oriented regions
Mat connected;
morphKernel = getStructuringElement(MORPH_RECT, Size(9, 1));
morphologyEx(bw, connected, MORPH_CLOSE, morphKernel);
// find contours
Mat mask = Mat::zeros(bw.size(), CV_8UC1);
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(connected, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));
// filter contours
for(int idx = 0; idx >= 0; idx = hierarchy[idx][0])
{
Rect rect = boundingRect(contours[idx]);
Mat maskROI(mask, rect);
maskROI = Scalar(0, 0, 0);
// fill the contour
drawContours(mask, contours, idx, Scalar(255, 255, 255), CV_FILLED);
// ratio of non-zero pixels in the filled region
double r = (double)countNonZero(maskROI)/(rect.width*rect.height);


if (r > .45 /* assume at least 45% of the area is filled if it contains text */
&&
(rect.height > 8 && rect.width > 8) /* constraints on region size */
/* these two conditions alone are not very robust. better to use something
like the number of significant peaks in a horizontal projection as a third condition */
)
{
rectangle(rgb, rect, Scalar(0, 255, 0), 2);
}
}
imwrite(OUTPUT_FOLDER_PATH + string("rgb.jpg"), rgb);


return 0;
}

Python version

The MIT License (MIT)


Copyright (c) 2017 Dhanushka Dangampola


Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:


The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.


THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


import cv2
import numpy as np

large = cv2.imread('1.jpg')
rgb = cv2.pyrDown(large)
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = cv2.morphologyEx(small, cv2.MORPH_GRADIENT, kernel)

_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)
# using RETR_EXTERNAL instead of RETR_CCOMP
contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# For OpenCV 3+ comment the previous line and uncomment the following line
#_, contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

mask = np.zeros(bw.shape, dtype=np.uint8)

for idx in range(len(contours)):
    x, y, w, h = cv2.boundingRect(contours[idx])
    mask[y:y+h, x:x+w] = 0
    cv2.drawContours(mask, contours, idx, (255, 255, 255), -1)
    r = float(cv2.countNonZero(mask[y:y+h, x:x+w])) / (w * h)

    if r > 0.45 and w > 8 and h > 8:
        cv2.rectangle(rgb, (x, y), (x+w-1, y+h-1), (0, 255, 0), 2)

cv2.imshow('rects', rgb)

(result images)
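
The comment in the C++ filtering step suggests using the number of significant peaks in a horizontal projection as a third condition; a possible sketch of such a check (the 0.5 fraction is an arbitrary choice of mine, not from the answer):

import numpy as np

def horizontal_projection_peaks(bw_roi, frac=0.5):
    # bw_roi: binary (0/255) crop of `connected` for one candidate rect.
    # Horizontal projection = number of "on" pixels per row; a "significant peak"
    # here is a run of rows whose count exceeds frac * max (frac is arbitrary).
    proj = (bw_roi > 0).sum(axis=1)
    if proj.max() == 0:
        return 0
    on = proj > frac * proj.max()
    return int(np.count_nonzero(~on[:-1] & on[1:]) + int(on[0]))

A candidate rect would then additionally be required to contain at least one peak, e.g. horizontal_projection_peaks(connected[y:y+h, x:x+w]) >= 1, on top of the fill-ratio and size checks.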

@dhanushka's approach showed the most promise, but I wanted to play around in Python, so I went ahead and translated it for fun:

import cv2
import numpy as np
from cv2 import boundingRect, countNonZero, cvtColor, drawContours, findContours, getStructuringElement, imread, morphologyEx, pyrDown, rectangle, threshold

large = imread(image_path)
# downsample and use it for processing
rgb = pyrDown(large)
# apply grayscale
small = cvtColor(rgb, cv2.COLOR_BGR2GRAY)
# morphological gradient
morph_kernel = getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = morphologyEx(small, cv2.MORPH_GRADIENT, morph_kernel)
# binarize
_, bw = threshold(src=grad, thresh=0, maxval=255, type=cv2.THRESH_BINARY+cv2.THRESH_OTSU)
morph_kernel = getStructuringElement(cv2.MORPH_RECT, (9, 1))
# connect horizontally oriented regions
connected = morphologyEx(bw, cv2.MORPH_CLOSE, morph_kernel)
mask = np.zeros(bw.shape, np.uint8)
# find contours
im2, contours, hierarchy = findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# filter contours
for idx in range(0, len(hierarchy[0])):
    rect = x, y, rect_width, rect_height = boundingRect(contours[idx])
    # fill the contour
    mask = drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
    # ratio of non-zero pixels in the filled region
    r = float(countNonZero(mask)) / (rect_width * rect_height)
    if r > 0.45 and rect_height > 8 and rect_width > 8:
        rgb = rectangle(rgb, (x, y+rect_height), (x+rect_width, y), (0, 255, 0), 3)

Now to display the image:

from PIL import Image
Image.fromarray(rgb).show()

It's not the most Pythonic of scripts, but I tried to resemble the original C++ code as closely as possible so that readers can follow along.

It works almost as well as the original. I'd be happy to read suggestions on how it could be improved/fixed to match the original results exactly.

(result images)
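
One difference from the C++ original worth noting: there the fill ratio is computed only inside the bounding rect (the maskROI), which is zeroed before drawing, whereas this translation counts non-zero pixels over the whole mask. A sketch of the closer equivalent for the loop body (same variable names as above):

    mask[y:y+rect_height, x:x+rect_width] = 0                       # reset the bounding rect region first
    drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)  # fill the contour
    r = float(countNonZero(mask[y:y+rect_height, x:x+rect_width])) / (rect_width * rect_height)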

Java version of the above code: thanks @William

public static List<Rect> detectLetters(Mat img){
    List<Rect> boundRect = new ArrayList<>();

    Mat img_gray = new Mat(), img_sobel = new Mat(), img_threshold = new Mat(), element = new Mat();
    Imgproc.cvtColor(img, img_gray, Imgproc.COLOR_RGB2GRAY);
    Imgproc.Sobel(img_gray, img_sobel, CvType.CV_8U, 1, 0, 3, 1, 0, Core.BORDER_DEFAULT);
    //at src, Mat dst, double thresh, double maxval, int type
    Imgproc.threshold(img_sobel, img_threshold, 0, 255, 8);
    element = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(15, 5));
    Imgproc.morphologyEx(img_threshold, img_threshold, Imgproc.MORPH_CLOSE, element);
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    Mat hierarchy = new Mat();
    Imgproc.findContours(img_threshold, contours, hierarchy, 0, 1);

    List<MatOfPoint> contours_poly = new ArrayList<MatOfPoint>(contours.size());

    for( int i = 0; i < contours.size(); i++ ){

        MatOfPoint2f mMOP2f1 = new MatOfPoint2f();
        MatOfPoint2f mMOP2f2 = new MatOfPoint2f();

        contours.get(i).convertTo(mMOP2f1, CvType.CV_32FC2);
        Imgproc.approxPolyDP(mMOP2f1, mMOP2f2, 2, true);
        mMOP2f2.convertTo(contours.get(i), CvType.CV_32S);

        Rect appRect = Imgproc.boundingRect(contours.get(i));
        if (appRect.width > appRect.height) {
            boundRect.add(appRect);
        }
    }

    return boundRect;
}

And use the following code in practice:

System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
Mat img1 = Imgcodecs.imread("abc.png");
List<Rect> letterBBoxes1 = Utils.detectLetters(img1);

for(int i = 0; i < letterBBoxes1.size(); i++)
    Imgproc.rectangle(img1, letterBBoxes1.get(i).br(), letterBBoxes1.get(i).tl(), new Scalar(0, 255, 0), 3, 8, 0);
Imgcodecs.imwrite("abc1.png", img1);

Python implementation of @dhanushka's solution:

import cv2
import numpy as np

def process_rgb(rgb):
    hasText = False
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    morphKernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    grad = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, morphKernel)
    # binarize
    _, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # connect horizontally oriented regions
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
    connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, morphKernel)
    # find contours
    mask = np.zeros(bw.shape[:2], dtype="uint8")
    _, contours, hierarchy = cv2.findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    # filter contours
    idx = 0
    while idx >= 0:
        x, y, w, h = cv2.boundingRect(contours[idx])
        # fill the contour
        cv2.drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
        # ratio of non-zero pixels in the filled region
        r = cv2.contourArea(contours[idx]) / (w * h)
        if (r > 0.45 and h > 5 and w > 5 and w > h):
            cv2.rectangle(rgb, (x, y), (x+w, y+h), (0, 255, 0), 2)
            hasText = True
        idx = hierarchy[0][idx][0]
    return hasText, rgb
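
A minimal usage sketch for this function (the file names are placeholders; the downsampling mirrors the other versions):

img = cv2.pyrDown(cv2.imread('card.jpg'))   # read and downsample, as the other versions do
found, annotated = process_rgb(img)
if found:
    cv2.imwrite('card_text_regions.jpg', annotated)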

This is a C# version of dhanushka's answer, using OpenCvSharp:

Mat large = new Mat(INPUT_FILE);
Mat rgb = new Mat(), small = new Mat(), grad = new Mat(), bw = new Mat(), connected = new Mat();

// downsample and use it for processing
Cv2.PyrDown(large, rgb);
Cv2.CvtColor(rgb, small, ColorConversionCodes.BGR2GRAY);

// morphological gradient
var morphKernel = Cv2.GetStructuringElement(MorphShapes.Ellipse, new OpenCvSharp.Size(3, 3));
Cv2.MorphologyEx(small, grad, MorphTypes.Gradient, morphKernel);

// binarize
Cv2.Threshold(grad, bw, 0, 255, ThresholdTypes.Binary | ThresholdTypes.Otsu);

// connect horizontally oriented regions
morphKernel = Cv2.GetStructuringElement(MorphShapes.Rect, new OpenCvSharp.Size(9, 1));
Cv2.MorphologyEx(bw, connected, MorphTypes.Close, morphKernel);

// find contours
var mask = new Mat(Mat.Zeros(bw.Size(), MatType.CV_8UC1), Range.All);
Cv2.FindContours(connected, out OpenCvSharp.Point[][] contours, out HierarchyIndex[] hierarchy, RetrievalModes.CComp, ContourApproximationModes.ApproxSimple, new OpenCvSharp.Point(0, 0));

// filter contours
var idx = 0;
foreach (var hierarchyItem in hierarchy)
{
    idx = hierarchyItem.Next;
    if (idx < 0)
        break;
    OpenCvSharp.Rect rect = Cv2.BoundingRect(contours[idx]);
    var maskROI = new Mat(mask, rect);
    maskROI.SetTo(new Scalar(0, 0, 0));

    // fill the contour
    Cv2.DrawContours(mask, contours, idx, Scalar.White, -1);

    // ratio of non-zero pixels in the filled region
    double r = (double)Cv2.CountNonZero(maskROI) / (rect.Width * rect.Height);
    if (r > .45 /* assume at least 45% of the area is filled if it contains text */
        &&
        (rect.Height > 8 && rect.Width > 8) /* constraints on region size */
        /* these two conditions alone are not very robust. better to use something
        like the number of significant peaks in a horizontal projection as a third condition */
        )
    {
        Cv2.Rectangle(rgb, rect, new Scalar(0, 255, 0), 2);
    }
}

rgb.SaveImage(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "rgb.jpg"));

This is a VB.NET version of dhanushka's answer, using EmguCV.

A few functions and structures in EmguCV need different handling than the C# version with OpenCvSharp.

Imports Emgu.CV
Imports Emgu.CV.Structure
Imports Emgu.CV.CvEnum
Imports Emgu.CV.Util

Dim input_file As String = "C:\your_input_image.png"
Dim large As Mat = New Mat(input_file)
Dim rgb As New Mat
Dim small As New Mat
Dim grad As New Mat
Dim bw As New Mat
Dim connected As New Mat
Dim morphanchor As New Point(0, 0)

'//downsample and use it for processing
CvInvoke.PyrDown(large, rgb)
CvInvoke.CvtColor(rgb, small, ColorConversion.Bgr2Gray)

'//morphological gradient
Dim morphKernel As Mat = CvInvoke.GetStructuringElement(ElementShape.Ellipse, New Size(3, 3), morphanchor)
CvInvoke.MorphologyEx(small, grad, MorphOp.Gradient, morphKernel, New Point(0, 0), 1, BorderType.Isolated, New MCvScalar(0))

'// binarize
CvInvoke.Threshold(grad, bw, 0, 255, ThresholdType.Binary Or ThresholdType.Otsu)

'// connect horizontally oriented regions
morphKernel = CvInvoke.GetStructuringElement(ElementShape.Rectangle, New Size(9, 1), morphanchor)
CvInvoke.MorphologyEx(bw, connected, MorphOp.Close, morphKernel, morphanchor, 1, BorderType.Isolated, New MCvScalar(0))

'// find contours
Dim mask As Mat = Mat.Zeros(bw.Size.Height, bw.Size.Width, DepthType.Cv8U, 1)  '' MatType.CV_8UC1
Dim contours As New VectorOfVectorOfPoint
Dim hierarchy As New Mat

CvInvoke.FindContours(connected, contours, hierarchy, RetrType.Ccomp, ChainApproxMethod.ChainApproxSimple, Nothing)

'// filter contours
Dim idx As Integer
Dim rect As Rectangle
Dim maskROI As Mat
Dim r As Double
For Each hierarchyItem In hierarchy.GetData
    rect = CvInvoke.BoundingRectangle(contours(idx))
    maskROI = New Mat(mask, rect)
    maskROI.SetTo(New MCvScalar(0, 0, 0))

    '// fill the contour
    CvInvoke.DrawContours(mask, contours, idx, New MCvScalar(255), -1)

    '// ratio of non-zero pixels in the filled region
    r = CvInvoke.CountNonZero(maskROI) / (rect.Width * rect.Height)

    '/* assume at least 45% of the area Is filled if it contains text */
    '/* constraints on region size */
    '/* these two conditions alone are Not very robust. better to use something
    'Like the number of significant peaks in a horizontal projection as a third condition */
    If r > 0.45 AndAlso rect.Height > 8 AndAlso rect.Width > 8 Then
        'draw green rectangle
        CvInvoke.Rectangle(rgb, rect, New MCvScalar(0, 255, 0), 2)
    End If
    idx += 1
Next
rgb.Save(IO.Path.Combine(Application.StartupPath, "rgb.jpg"))

You can utilize a Python implementation of the SWT (Stroke Width Transform): the swtloc library.

Full disclosure: I am the author of this library.

To do that:

First and second images

Notice that the text_mode here is 'lb_df', which stands for Light Background Dark Foreground, i.e. the text in this image will be darker in color than the background.

import cv2
import numpy as np

from swtloc import SWTLocalizer
from swtloc.utils import imgshowN, imgshow

swtl = SWTLocalizer()
# Stroke Width Transform
swtl.swttransform(imgpaths='img1.jpg', text_mode='lb_df',
                  save_results=True, save_rootpath='swtres/',
                  minrsw=3, maxrsw=20, max_angledev=np.pi/3)
imgshow(swtl.swtlabelled_pruned13C)

# Grouping
respacket = swtl.get_grouped(lookup_radii_multiplier=0.9, ht_ratio=3.0)
grouped_annot_bubble = respacket[2]
maskviz = respacket[4]
maskcomb = respacket[5]

# Saving the results
_ = cv2.imwrite('img1_processed.jpg', swtl.swtlabelled_pruned13C)
imgshowN([maskcomb, grouped_annot_bubble], savepath='grouped_img1.jpg')

(result images)

Third image

Notice that the text_mode here is 'db_lf', which stands for Dark Background Light Foreground, i.e. the text in this image will be lighter in color than the background.

from swtloc import SWTLocalizer
from swtloc.utils import imgshowN, imgshow

swtl = SWTLocalizer()
# Stroke Width Transform
swtl.swttransform(imgpaths=imgpaths[1], text_mode='db_lf',
                  save_results=True, save_rootpath='swtres/',
                  minrsw=3, maxrsw=20, max_angledev=np.pi/3)
imgshow(swtl.swtlabelled_pruned13C)

# Grouping
respacket = swtl.get_grouped(lookup_radii_multiplier=0.9, ht_ratio=3.0)
grouped_annot_bubble = respacket[2]
maskviz = respacket[4]
maskcomb = respacket[5]

# Saving the results
_ = cv2.imwrite('img1_processed.jpg', swtl.swtlabelled_pruned13C)
imgshowN([maskcomb, grouped_annot_bubble], savepath='grouped_img1.jpg')

(result images)

You will also notice that the grouping is not that precise; to get the desired results as the images vary, try tuning the grouping parameters in the swtl.get_grouped() function.
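
For example, varying only the two grouping parameters already shown above (these values are arbitrary starting points, not recommendations):

# try a tighter lookup radius and a stricter height ratio, then inspect the regrouped result
respacket = swtl.get_grouped(lookup_radii_multiplier=0.8, ht_ratio=2.0)
imgshowN([respacket[5], respacket[2]], savepath='grouped_tuned.jpg')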