小开

There are several possible issues that can lead to a low-quality Depth Channel and Disparity Channel, and in turn to a low-quality stereo sequence. Here are six of them:

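Possible issue I

Incomplete formula

As the word uncalibrated implies, the stereoRectifyUncalibrated method computes a rectification transform for you when you don't know, or can't know, the intrinsic parameters of your stereo pair and its relative position in the environment.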

cv.StereoRectifyUncalibrated(pts1, pts2, fm, imgSize, rhm1, rhm2, thres)

where:

# pts1    -> an array of feature points in the first camera
# pts2    -> an array of feature points in the second camera
# fm      -> input fundamental matrix
# imgSize -> size of the image
# rhm1    -> output rectification homography matrix for the first image
# rhm2    -> output rectification homography matrix for the second image
# thres   -> optional threshold used to filter out outliers

Your call looks like this:

cv2.StereoRectifyUncalibrated(p1fNew, p2fNew, F, (2048, 2048))

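So three parameters are not taken into account: rhm1, rhm2 and thres. If threshold > 0, all point pairs that do not comply with the epipolar geometry are rejected before the homographies are computed. Otherwise, all points are considered inliers. A pair is rejected when:

|pts2[i]^T * fm * pts1[i]| > thres

# ^T  -> transpose of the point vector (in homogeneous coordinates)

So I think the visual errors may come from this incomplete call.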
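To make the rejection rule above concrete, here is a minimal sketch that evaluates the epipolar residual for a few point pairs and keeps only those under a threshold. The helper name epipolar_residuals, the sample points and the fundamental matrix (the ideal one for an already rectified pair) are assumptions made for this illustration; none of them come from OpenCV or from the question.

```python
import numpy as np


def epipolar_residuals(pts1, pts2, F):
    """Return |x2^T F x1| for every correspondence (points given as Nx2)."""
    pts1 = np.asarray(pts1, dtype=np.float64)
    pts2 = np.asarray(pts2, dtype=np.float64)
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])  # homogeneous coordinates, Nx3
    x2 = np.hstack([pts2, ones])
    # Row i of (x1 @ F.T) is F @ x1[i]; the row-wise dot with x2[i] gives x2^T F x1.
    return np.abs(np.sum(x2 * (x1 @ F.T), axis=1))


# Fundamental matrix of an ideal rectified pair (assumed for the example):
# with this F the residual reduces to the vertical offset |y1 - y2|.
F_rectified = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, -1.0],
                        [0.0, 1.0, 0.0]])

pts_left = [[10, 20], [30, 40], [50, 60]]
pts_right = [[5, 20], [28, 40], [45, 75]]  # the last pair is off by 15 rows

residuals = epipolar_residuals(pts_left, pts_right, F_rectified)
thres = 1.0
keep = residuals <= thres  # pairs that survive the threshold test
print(residuals, keep)
```

With thres = 0 no such filtering happens, so a badly matched pair like the third one would still influence the homographies.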

You can read more in Camera Calibration and 3D Reconstruction in the official documentation.

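Possible issue II

Interaxial distance

A robust interaxial distance between the left and right camera lenses must be not greater than 200 mm. When the interaxial distance is larger than the interocular distance, the effect is called hyperstereoscopy or hyperdivergence. It not only exaggerates depth in the scene but also causes physical discomfort for the viewer. Read Autodesk's Stereoscopic Filmmaking white paper to learn more about this topic.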
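A quick way to see why a wide interaxial distance exaggerates depth is the standard relation for a parallel, rectified pinhole pair: disparity = focal length x baseline / depth. The sketch below is only an illustration; the focal length, the depths and the 65 mm figure (roughly a human interocular distance) are assumed values, not numbers from the question.

```python
def disparity_px(focal_px, baseline_mm, depth_mm):
    """Disparity in pixels of a point at depth_mm for a parallel, rectified pair."""
    return focal_px * baseline_mm / depth_mm


focal_px = 1000.0                   # assumed focal length in pixels
near_mm, far_mm = 1000.0, 5000.0    # assumed nearest and farthest scene depths

for baseline_mm in (65.0, 200.0):
    d_near = disparity_px(focal_px, baseline_mm, near_mm)
    d_far = disparity_px(focal_px, baseline_mm, far_mm)
    print(f"baseline {baseline_mm:5.0f} mm: disparity range {d_far:.0f}..{d_near:.0f} px")
```

Under these assumptions the 200 mm baseline roughly triples the disparity range that the matcher has to search, which also means numDisparities has to grow accordingly.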

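Possible issue III

Parallel vs Toe-In camera mode

Visual errors in the resulting Disparity Map can occur because the camera mode was set up or calculated incorrectly. Many stereographers prefer Toe-In camera mode, while Pixar, for example, prefers Parallel camera mode.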

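Possible issue IV

Vertical alignment

In stereoscopy, a vertical shift (even if one of the views is shifted up by only 1 mm) ruins a robust stereo experience. So before generating a Disparity Map you must make sure that the left and right views of your stereo pair are aligned accordingly. See the Technicolor stereoscopic white paper on 15 common problems in stereo.

Stereo rectification matrix:

┌              ┐
| f  0  cx  tx |
| 0  f  cy  ty |   # use "ty" value to fix vertical shift in one image
| 0  0  1   0  |
└              ┘

Here is the StereoRectify method: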

cv.StereoRectify(cameraMatrix1, cameraMatrix2, distCoeffs1, distCoeffs2, imageSize, R, T, R1, R2, P1, P2, Q=None, flags=CV_CALIB_ZERO_DISPARITY, alpha=-1, newImageSize=(0, 0)) -> (roi1, roi2)
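One simple way to check for the vertical shift discussed above is to compare the y coordinates of matched points after rectification. This is a minimal sketch under the assumption that you already have matched point lists; the helper vertical_offset and the sample coordinates are made up for the illustration.

```python
from statistics import median


def vertical_offset(pts_left, pts_right):
    """Median of (y_right - y_left) over matched points, in pixels.

    The median is used so that a few bad matches do not skew the estimate.
    """
    return median(yr - yl for (_, yl), (_, yr) in zip(pts_left, pts_right))


# Hypothetical matches: the right view sits 3 px lower, the last match is an outlier.
pts_left = [(100.0, 50.0), (220.0, 80.0), (310.0, 120.0), (400.0, 200.0)]
pts_right = [(90.0, 53.0), (205.0, 83.0), (300.0, 123.0), (395.0, 190.0)]

dy = vertical_offset(pts_left, pts_right)
print(f"vertical offset: {dy} px")
```

A value close to zero suggests the rows are aligned. If it is not, one option is to translate one view by -dy rows (for example with cv2.warpAffine and the matrix [[1, 0, 0], [0, 1, -dy]]) before computing disparity.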

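Possible issue V

Lens distortion

Lens distortion is a very important topic in stereo compositing. Before generating a Disparity Map you need to undistort the left and right views, then generate the disparity channel, and then redistort both views again.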
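To illustrate what undistorting and redistorting means, here is a small sketch of a two-coefficient radial distortion model on normalized image coordinates, with the inverse found by fixed-point iteration. The coefficients k1 and k2 and the sample point are assumed values, and this is a toy model for intuition, not OpenCV's implementation.

```python
def distort(x, y, k1, k2):
    """Apply radial distortion to a normalized image point."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort(xd, yd, k1, k2, iterations=20):
    """Invert the radial model by fixed-point iteration."""
    x, y = xd, yd  # start from the distorted point
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


k1, k2 = -0.2, 0.05          # assumed mild barrel distortion
x, y = 0.3, -0.2             # ideal (undistorted) normalized point
xd, yd = distort(x, y, k1, k2)
xu, yu = undistort(xd, yd, k1, k2)
print((xd, yd), (xu, yu))
```

The round trip recovers the original point, which is the property the undistort, compute disparity, redistort workflow relies on.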

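Possible issue VI

Low-quality Depth channel without anti-aliasing

To create a high-quality Disparity Map you need left and right Depth Channels, which must be generated beforehand. When you work in a 3D package you can render a high-quality Depth Channel (with crisp edges) with a single click. But generating a high-quality depth channel from a video sequence is not easy, because the stereo pair has to move through your environment to produce the initial data for depth-from-motion algorithms. If there is no motion in a frame, the depth channel will be extremely poor.

The Depth channel itself has one more drawback: its edges do not match the edges of the RGB image because it has no anti-aliasing.

Disparity channel code snippet:

Here I'd like to present a quick approach to generating a Disparity Map: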

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

imageLeft = cv.imread('paris_left.png', 0)
imageRight = cv.imread('paris_right.png', 0)
stereo = cv.StereoBM_create(numDisparities=16, blockSize=15)
disparity = stereo.compute(imageLeft, imageRight)
plt.imshow(disparity, 'gray')
plt.show()
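One detail worth knowing when reading the output of the snippet above: compute() returns a 16-bit signed fixed-point map in which each value is the disparity multiplied by 16, and pixels without a valid match are set below the minimum disparity. The sketch below converts such a map to floating-point pixels; the helper name to_float_disparity and the sample array are assumptions for the illustration.

```python
import numpy as np


def to_float_disparity(raw, min_disparity=0):
    """Convert a fixed-point (x16) disparity map to float pixels, NaN where invalid."""
    raw = np.asarray(raw)
    disp = raw.astype(np.float32) / 16.0
    disp[disp < min_disparity] = np.nan  # mark unmatched pixels
    return disp


# Hypothetical raw output: -16 encodes "no match" when min_disparity is 0.
raw_example = np.array([[-16, 0, 24],
                        [160, 8, -16]], dtype=np.int16)
disp_example = to_float_disparity(raw_example)
print(disp_example)
```

Plotting the converted map with a colorbar then shows disparities in pixels instead of raw fixed-point values.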


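Best answer

TLDR; Use StereoSGBM (Semi-Global Block Matching) for images with smoother edges, and apply some post-filtering if you want the result smoother still.

OP didn't provide the original images, so I'm using Tsukuba from the Middlebury data set.

Result with regular StereoBM

Result with StereoSGBM (tuned)

Best result I could find in the literature

See the publication here for details.

Example of post-filtering (see link below)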

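Theory/Other considerations from OP's question

The large black areas in your calibrated, rectified images lead me to believe that calibration was not done very well for those. There are a variety of reasons that could be at play, maybe the physical setup, maybe the lighting when you did the calibration, etc., but there are plenty of camera calibration tutorials out there for that. My understanding is that you are asking for a way to get a better depth map from an uncalibrated setup (this isn't 100% clear, but the title seems to support it and I think that's what people will come here to find).

Your basic approach is correct, but the results can definitely be improved. This form of depth mapping is not among those that produce the highest quality maps (especially being uncalibrated). The biggest improvement will likely come from using a different stereo matching algorithm. The lighting may also be having a significant effect. The right image (at least to my naked eye) appears to be less well lit, which could interfere with the reconstruction. You could first try brightening it to the same level as the other, or gather new images if that is possible. From here on, I'll assume you have no access to the original cameras, so I'll consider gathering new images, altering the setup, or performing calibration to be out of scope. (If you do have access to the setup and cameras, then I would suggest checking the calibration and using a calibrated method, as this will work better.)

You used StereoBM to calculate your disparity (depth map), which does work, but StereoSGBM is much better suited to this application (it handles smoother edges better). You can see the difference below.

This article explains the differences in more depth:

Block matching focuses on high-texture images (think of a picture of a tree), while semi-global block matching focuses on sub-pixel level matching and pictures with smoother textures (think of a picture of a hallway).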
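To make the block matching idea concrete, here is a toy matcher that slides a small window along the same row of the right image and picks the shift with the lowest sum of absolute differences (SAD). It is a deliberately naive sketch on synthetic data, assumed for illustration only, and is not how OpenCV's StereoBM or StereoSGBM is implemented.

```python
import numpy as np


def sad_block_match(left, right, x, y, block=5, max_disp=8):
    """Disparity at (x, y) that minimizes the SAD between left and right blocks."""
    h = block // 2
    patch = left[y - h:y + h + 1, x - h:x + h + 1].astype(np.int32)
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if x - d - h < 0:  # candidate block would leave the image
            break
        cand = right[y - h:y + h + 1, x - d - h:x - d + h + 1].astype(np.int32)
        cost = int(np.abs(patch - cand).sum())
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic, highly textured pair: the right view is the left view shifted by 4 px.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(32, 48), dtype=np.uint8)
true_disp = 4
right = np.roll(left, -true_disp, axis=1)

est = sad_block_match(left, right, x=24, y=16)
print(est)
```

On a flat, textureless patch every candidate shift would have the same cost, which is why plain block matching struggles on smooth regions and why the smoothness penalties in the semi-global variant help there.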

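Without any explicit intrinsic camera parameters, specifics about the camera setup (such as focal length, distance between the cameras, distance to the subject, etc.), a known dimension in the image, or motion (to use structure from motion), you can only obtain a 3D reconstruction up to a projective transform; you won't have a sense of scale, or necessarily of rotation either, but you can still generate a relative depth map. You will likely suffer from some barrel and other distortions that proper camera calibration could remove, but you can get reasonable results without it as long as the cameras aren't terrible (the lens system isn't too distorted) and are set up fairly close to the canonical configuration (which basically means they are oriented so that their optical axes are as close to parallel as possible and their fields of view overlap sufficiently). This, however, doesn't appear to be the OP's issue, as he did manage to get reasonably rectified images with the uncalibrated method.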

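Basic procedure

1. Find well-matched points in both images that you can use to calculate the Fundamental Matrix. The fundamental matrix needs at least 7 correspondences, and the code below uses 8. (You can use any detector and matcher you like; I kept FLANN but used ORB for detection, because SIFT isn't in the main version of OpenCV for 4.2.0.)
2. Calculate the Fundamental Matrix, F, with findFundamentalMat
3. Undistort (rectify) your images with stereoRectifyUncalibrated and warpPerspective
4. Calculate the disparity (depth map) with StereoSGBM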

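The results are much better:

Matches with ORB and FLANN

Undistorted images (left, then right)

Disparity

StereoBM

This result looks similar to the OP's problems (speckling, gaps, wrong depths in some areas).

StereoSGBM (tuned)

This result looks much better and uses roughly the same method as the OP, minus the final disparity calculation, which makes me think the OP would see similar improvements on his images, had they been provided.

Post filtering

There's a good article about this in the OpenCV docs. I'd recommend looking at it if you need really smooth maps.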
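For intuition about what post filtering does to a disparity map, here is a much simpler stand-in than the filter described in the OpenCV article: a plain 3x3 median filter written with NumPy. It is swapped in purely as an illustration of removing isolated speckles; the function name median_filter3 and the sample map are assumptions, and it is not the filter from the OpenCV docs.

```python
import numpy as np


def median_filter3(disp):
    """3x3 median filter with edge replication; removes isolated speckles."""
    h, w = disp.shape
    padded = np.pad(disp, 1, mode="edge")
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    return np.median(windows, axis=0)


# Hypothetical disparity map: a constant surface with one speckle in the middle.
noisy = np.full((5, 5), 10.0)
noisy[2, 2] = 80.0
smoothed = median_filter3(noisy)
print(smoothed)
```

A median filter does not use the RGB image as a guide, so it will not align disparity edges with image edges; that is the kind of improvement a guided post-filter is meant to add.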

The example photos above are frame 1 from the scene ambush_2 in the dataset.

Full code (tested on OpenCV 4.2.0):

import cv2
import numpy as np
import matplotlib.pyplot as plt


imgL = cv2.imread("tsukuba_l.png", cv2.IMREAD_GRAYSCALE)  # left image
imgR = cv2.imread("tsukuba_r.png", cv2.IMREAD_GRAYSCALE)  # right image




def get_keypoints_and_descriptors(imgL, imgR):
    """Use ORB detector and FLANN matcher to get keypoints, descriptors,
    and corresponding matches that will be good for computing
    homography.
    """
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(imgL, None)
    kp2, des2 = orb.detectAndCompute(imgR, None)

    ############## Using FLANN matcher ##############
    # Each keypoint of the first image is matched with a number of
    # keypoints from the second image. k=2 means keep the 2 best matches
    # for each keypoint (best matches = the ones with the smallest
    # distance measurement).
    FLANN_INDEX_LSH = 6
    index_params = dict(
        algorithm=FLANN_INDEX_LSH,
        table_number=6,  # 12
        key_size=12,  # 20
        multi_probe_level=1,  # 2
    )
    search_params = dict(checks=50)  # or pass empty dictionary
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    flann_match_pairs = flann.knnMatch(des1, des2, k=2)
    return kp1, des1, kp2, des2, flann_match_pairs




def lowes_ratio_test(matches, ratio_threshold=0.6):
    """Filter matches using the Lowe's ratio test.

    The ratio test checks if matches are ambiguous and should be
    removed by checking that the two distances are sufficiently
    different. If they are not, then the match at that keypoint is
    ignored.

    https://stackoverflow.com/questions/51197091/how-does-the-lowes-ratio-test-work
    """
    filtered_matches = []
    for pair in matches:
        # The LSH index can return fewer than 2 neighbours for a keypoint;
        # skip those instead of failing on the unpacking below.
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio_threshold * n.distance:
            filtered_matches.append(m)
    return filtered_matches




def draw_matches(imgL, imgR, kp1, des1, kp2, des2, flann_match_pairs):
    """Draw the first 8 matches between the left and right images."""
    # https://docs.opencv.org/4.2.0/d4/d5d/group__features2d__draw.html
    # https://docs.opencv.org/2.4/modules/features2d/doc/common_interfaces_of_descriptor_matchers.html
    img = cv2.drawMatches(
        imgL,
        kp1,
        imgR,
        kp2,
        flann_match_pairs[:8],
        None,
        flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS,
    )
    cv2.imshow("Matches", img)
    cv2.imwrite("ORB_FLANN_Matches.png", img)
    cv2.waitKey(0)




def compute_fundamental_matrix(matches, kp1, kp2, method=cv2.FM_RANSAC):
    """Use the set of good matches to estimate the Fundamental Matrix.

    See  https://en.wikipedia.org/wiki/Eight-point_algorithm#The_normalized_eight-point_algorithm
    for more info.
    """
    pts1, pts2 = [], []
    fundamental_matrix, inliers = None, None
    # Only the first 8 matches are used for the estimate
    for m in matches[:8]:
        pts1.append(kp1[m.queryIdx].pt)
        pts2.append(kp2[m.trainIdx].pt)
    if pts1 and pts2:
        # You can play with the Threshold and confidence values here
        # until you get something that gives you reasonable results. I
        # used the defaults
        fundamental_matrix, inliers = cv2.findFundamentalMat(
            np.float32(pts1),
            np.float32(pts2),
            method=method,
            # ransacReprojThreshold=3,
            # confidence=0.99,
        )
    return fundamental_matrix, inliers, pts1, pts2




############## Find good keypoints to use ##############
kp1, des1, kp2, des2, flann_match_pairs = get_keypoints_and_descriptors(imgL, imgR)
good_matches = lowes_ratio_test(flann_match_pairs, 0.2)
draw_matches(imgL, imgR, kp1, des1, kp2, des2, good_matches)




############## Compute Fundamental Matrix ##############
F, I, points1, points2 = compute_fundamental_matrix(good_matches, kp1, kp2)




############## Stereo rectify uncalibrated ##############
h1, w1 = imgL.shape
h2, w2 = imgR.shape
thresh = 0
_, H1, H2 = cv2.stereoRectifyUncalibrated(
    np.float32(points1), np.float32(points2), F, imgSize=(w1, h1), threshold=thresh,
)


############## Undistort (Rectify) ##############
imgL_undistorted = cv2.warpPerspective(imgL, H1, (w1, h1))
imgR_undistorted = cv2.warpPerspective(imgR, H2, (w2, h2))
cv2.imwrite("undistorted_L.png", imgL_undistorted)
cv2.imwrite("undistorted_R.png", imgR_undistorted)


############## Calculate Disparity (Depth Map) ##############


# Using StereoBM
stereo = cv2.StereoBM_create(numDisparities=16, blockSize=15)
disparity_BM = stereo.compute(imgL_undistorted, imgR_undistorted)
plt.imshow(disparity_BM, "gray")
plt.colorbar()
plt.show()


# Using StereoSGBM
# Set disparity parameters. Note: disparity range is tuned according to
#  specific parameters obtained through trial and error.
win_size = 2
min_disp = -4
max_disp = 9
# OpenCV documents numDisparities as needing to be divisible by 16;
# the tuned range used here gives 13.
num_disp = max_disp - min_disp
stereo = cv2.StereoSGBM_create(
    minDisparity=min_disp,
    numDisparities=num_disp,
    blockSize=5,
    uniquenessRatio=5,
    speckleWindowSize=5,
    speckleRange=5,
    disp12MaxDiff=2,
    P1=8 * 3 * win_size ** 2,
    P2=32 * 3 * win_size ** 2,
)
disparity_SGBM = stereo.compute(imgL_undistorted, imgR_undistorted)
plt.imshow(disparity_SGBM, "gray")
plt.colorbar()
plt.show()
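The tuned range in the code above (min_disp = -4, max_disp = 9) gives 13 disparities, while numDisparities is documented as needing to be divisible by 16. If you want to follow that requirement on your own images, a small helper like the one below rounds the range up; the name round_up_num_disparities is an assumption for this sketch, and using it will change the search range compared with the tuned values.

```python
def round_up_num_disparities(min_disp, max_disp):
    """Smallest multiple of 16 that covers the range [min_disp, max_disp)."""
    span = max_disp - min_disp
    if span <= 0:
        raise ValueError("max_disp must be greater than min_disp")
    return ((span + 15) // 16) * 16


num_disp_padded = round_up_num_disparities(-4, 9)
print(num_disp_padded)
```

The result can be passed as numDisparities in place of num_disp; the other parameters may then need retuning.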

Depth map from uncalibrated stereo system