iOS: recovering the camera projection

I'm trying to estimate my device's position relative to a QR code in space. I'm using ARKit and the Vision framework, both introduced in iOS 11, but the answer to this question probably doesn't depend on them.

With the Vision framework, I can get the rectangle that bounds a QR code in the camera frame. I'd like to match this rectangle to the device translation and rotation necessary to transform the QR code from a standard position.

For instance, if I observe the frame:

*            *


        B
           C
  A
      D


*            *

whereas if I were 1 m away from the QR code, centered on it, and the QR code had sides of 10 cm, I would see:

*            *




    A0  B0


    D0  C0




*            *

What would the transformation of my device be between those two frames? I understand that an exact result may be impossible, because the observed QR code might be slightly non-planar and we are trying to estimate an affine transform on something that isn't a perfect one.

I guess sceneView.pointOfView?.camera?.projectionTransform is more helpful than sceneView.pointOfView?.camera?.projectionTransform?.camera.projectionMatrix, since the latter already takes into account a transform inferred from ARKit that I'm not interested in for this problem.

How would I fill in:

func getTransform(
    qrCodeRectangle: VNBarcodeObservation,
    cameraTransform: SCNMatrix4) {
    // qrCodeRectangle.topLeft etc. is the position of A0 in [0, 1] x [0, 1]

    // expected real-world position of the QR code in a reference coordinate system
    let a0 = SCNVector3(x: -0.05, y: 0.05, z: 1)
    let b0 = SCNVector3(x: 0.05, y: 0.05, z: 1)
    let c0 = SCNVector3(x: 0.05, y: -0.05, z: 1)
    let d0 = SCNVector3(x: -0.05, y: -0.05, z: 1)

    let A0, B0, C0, D0 = ?? // CGPoints representing the positions in the
    // camera frame for a camera at (0, 0, 0) facing Z+

    // then get the transform from (0, 0, 0) to the current position/rotation that sees
    // a0, b0, c0, d0 through the camera as qrCodeRectangle
}
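To make the stub above concrete, here is a minimal sketch of how A0, B0, C0, D0 could be derived for a camera sitting at the origin and facing Z+. It assumes a plain pinhole model, the intrinsics and image size printed near the end of this question, and a hypothetical project helper; none of this is a framework API:

import simd
import CoreGraphics

// Hypothetical helper: project a point given in the reference frame used above
// (x right, y up, z forward, camera at the origin facing Z+) into Vision-style
// normalized image coordinates (origin at the bottom-left, y up).
// Defaults are the intrinsics and image size reported later in the question.
func project(_ p: simd_float3,
             fx: Float = 1090.318, fy: Float = 1090.318,
             cx: Float = 618.661, cy: Float = 359.616,
             imageSize: CGSize = CGSize(width: 1280, height: 720)) -> CGPoint {
    let u = fx * p.x / p.z + cx          // pixel column (origin at the top-left)
    let v = cy - fy * p.y / p.z          // pixel row; the sign flips because y points up here
    return CGPoint(x: CGFloat(u) / imageSize.width,
                   y: 1 - CGFloat(v) / imageSize.height)   // back to Vision's y-up convention
}

// Expected normalized positions of the reference corners seen from the origin:
let A0 = project(simd_float3(-0.05, 0.05, 1))   // ≈ (0.44, 0.58)
let B0 = project(simd_float3( 0.05, 0.05, 1))
let C0 = project(simd_float3( 0.05, -0.05, 1))
let D0 = project(simd_float3(-0.05, -0.05, 1))

Matching these against the corners of qrCodeRectangle is then the pose-estimation problem the rest of this question is about.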

===== Edit =====

After trying a number of things, I ended up going for camera pose estimation using OpenCV's projection and perspective solver, solvePnP. This gives me a rotation and translation that should represent the camera pose in the QR code reference frame. However, when I use those values and place objects corresponding to the inverse transform (where the QR code should be in camera space), I get inaccurate shift values, and I can't get the rotation to work:

// some flavor of pseudo code below
func renderer(_ sender: SCNSceneRenderer, updateAtTime time: TimeInterval) {
    guard let currentFrame = sceneView.session.currentFrame, let pov = sceneView.pointOfView else { return }
    let intrinsics = currentFrame.camera.intrinsics
    let QRCornerCoordinatesInQRRef = [(-0.05, -0.05, 0), (0.05, -0.05, 0), (-0.05, 0.05, 0), (0.05, 0.05, 0)]

    // uses VNDetectBarcodesRequest to find a QR code and returns a bounding rectangle
    guard let qr = findQRCode(in: currentFrame) else { return }

    let imageSize = CGSize(
        width: CVPixelBufferGetWidth(currentFrame.capturedImage),
        height: CVPixelBufferGetHeight(currentFrame.capturedImage)
    )

    let observations = [
        qr.bottomLeft,
        qr.bottomRight,
        qr.topLeft,
        qr.topRight,
    ].map({ (imageSize.height * (1 - $0.y), imageSize.width * $0.x) })
    // image and SceneKit coordinates are not the same
    // replacing this by:
    // (imageSize.height * (1.35 - $0.y), imageSize.width * ($0.x - 0.2))
    // weirdly fixes an issue, see below

    // calls OpenCV solvePnP and gets the results
    let (rotation, translation) = openCV.solvePnP(QRCornerCoordinatesInQRRef, observations, intrinsics)

    let positionInCameraRef = -rotation.inverted * translation
    let node = SCNNode(geometry: someGeometry)
    pov.addChildNode(node)
    node.position = translation
    node.orientation = rotation.asQuaternion
}

Here is the output:

[screenshot of the output]

where A, B, C, D are the QR code corners, in the order they are passed to the program.

The predicted origin stays in place when the phone rotates, but it is shifted from where it should be. Surprisingly, if I shift the observation values, I can correct this:

// (imageSize.height * (1 - $0.y), imageSize.width * $0.x)
// replaced by:
(imageSize.height * (1.35 - $0.y), imageSize.width * ($0.x - 0.2))

[screenshot after adjusting the observations]

Now the predicted origin stays stable, but I don't understand where those shift values come from.
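One convention mismatch that could contribute to this kind of offset (this is a guess, not something verified here): OpenCV's camera frame has +X right, +Y down and +Z forward, while SceneKit's camera frame has +X right, +Y up and +Z toward the viewer, so a pose coming out of solvePnP would need its Y and Z axes flipped before being applied to an SCNNode. A sketch, assuming the solvePnP outputs are available as simd types:

import simd

// Hypothetical conversion: re-express a solvePnP pose (OpenCV camera convention)
// in the SceneKit camera convention by flipping the Y and Z axes.
// If p_cv = R * q + t in the OpenCV frame, then p_sk = (F * R) * q + F * t
// in the SceneKit frame, with F = diag(1, -1, -1).
let flipYZ = simd_float3x3(diagonal: simd_float3(1, -1, -1))

func toSceneKitPose(rotation: simd_float3x3,
                    translation: simd_float3) -> (simd_float3x3, simd_float3) {
    return (flipYZ * rotation, flipYZ * translation)
}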

Lastly, I tried to get an orientation that stays fixed relative to the QR code:

var n = SCNNode(geometry: redGeometry)
node.addChildNode(n)
n.position = SCNVector3(0.1, 0, 0)

n = SCNNode(geometry: blueGeometry)
node.addChildNode(n)
n.position = SCNVector3(0, 0.1, 0)

n = SCNNode(geometry: greenGeometry)
node.addChildNode(n)
n.position = SCNVector3(0, 0, 0.1)

The orientation is fine when I look straight at the QR code, but then it shifts by something that seems tied to the phone rotation:

[screenshot of the axis markers]

I have a few outstanding questions:

  • How do I solve for the rotation?
  • Where do the position shift values come from?
  • What simple relationship do the rotation, translation, QR corner coordinates, observations, and intrinsics satisfy? Is it O ~ K^-1 * (R_3x2 | T) Q? Because if so, that's off by a few orders of magnitude.
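For reference, the model solvePnP is based on applies the intrinsics directly rather than their inverse: s * o = K * (R * Q + T), with o the homogeneous pixel observation and s the depth. A small sketch of that reprojection check, assuming the solvePnP outputs and intrinsics are available as simd types (the names are placeholders, not from the code above):

import simd

// Reproject a QR corner expressed in the QR reference frame through the
// solvePnP pose and the intrinsics. Up to the coordinate-convention issues
// discussed above, the result (in pixels) should be close to the
// corresponding observation fed to solvePnP.
func reproject(_ q: simd_float3,
               rotation: simd_float3x3,
               translation: simd_float3,
               intrinsics: simd_float3x3) -> simd_float2 {
    let pCamera = rotation * q + translation   // QR frame -> camera frame
    let h = intrinsics * pCamera               // apply K, not K^-1
    return simd_float2(h.x / h.z, h.y / h.z)   // divide by depth
}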

In case it helps, here are a few numerical values:

Intrinsics matrix
Mat 3x3
1090.318, 0.000, 618.661
0.000, 1090.318, 359.616
0.000, 0.000, 1.000


imageSize
1280.0, 720.0
screenSize
414.0, 736.0

==== Edit 2 ====

I've noticed that the rotation works fine when the phone stays horizontal, parallel to the QR code (i.e. the rotation matrix is [[a, 0, b], [0, 1, 0], [c, 0, d]]), no matter what the actual QR code orientation is:

[screenshot]

Other rotations don't work.


Math (Trig.):

[image of the equation]

Notes: the bottom is l (the QR code length), the left angle is k, and the top angle is i (the camera).

[picture]

Coordinate systems' correspondence

Take into consideration that the Vision/CoreML coordinate system doesn't correspond to the ARKit/SceneKit coordinate system. For details, look at this post.
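As a concrete illustration of that mismatch, here is the commonly used conversion (a sketch, not code from either post): Vision's normalized points have their origin at the bottom-left with Y pointing up, while image pixel coordinates have their origin at the top-left with Y pointing down, so the Y component has to be flipped.

import CoreGraphics

// Hypothetical helper: convert a Vision normalized point (origin bottom-left, y up)
// into pixel coordinates of the captured image (origin top-left, y down).
func visionToImagePoint(_ normalized: CGPoint, imageSize: CGSize) -> CGPoint {
    return CGPoint(x: normalized.x * imageSize.width,
                   y: (1 - normalized.y) * imageSize.height)
}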

Rotation's direction

I suppose the problem is not in the matrix. It's in the placement of the vertices. For tracking 2D images you need to place the ABCD vertices counter-clockwise (the starting point is the A vertex, located at the imaginary origin x: 0, y: 0). I think the Apple documentation on the VNRectangleObservation class (info about projected rectangular regions detected by an image analysis request) is vague about this. You placed your vertices in the same order as in the official documentation:

var bottomLeft: CGPoint
var bottomRight: CGPoint
var topLeft: CGPoint
var topRight: CGPoint

But they need to be placed in the same order as a positive rotation (about the Z axis) proceeds in a Cartesian coordinate system:

[illustration of the vertex order]
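A small sketch of that counter-clockwise ordering, assuming the corners come from a VNRectangleObservation (the VNBarcodeObservation used in the question inherits from it):

import Vision

// List the corners counter-clockwise, starting from A at the bottom-left
// (the "imaginary origin"), instead of Vision's property order above.
func counterClockwiseCorners(of observation: VNRectangleObservation) -> [CGPoint] {
    return [
        observation.bottomLeft,   // A
        observation.bottomRight,  // B
        observation.topRight,     // C
        observation.topLeft,      // D
    ]
}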

World Coordinate Space in ARKit (as well as in SceneKit and Vision) always follows a right-handed convention (the positive Y axis points upward, the positive Z axis points toward the viewer and the positive X axis points toward the viewer's right), but is oriented based on your session's configuration. Camera works in Local Coordinate Space.

Rotation about any axis is positive when counter-clockwise and negative when clockwise. For tracking in ARKit and Vision this is critically important.

[illustration of rotation directions]

The order of rotation also matters. ARKit, as well as SceneKit, applies rotation relative to the node's pivot property in the reverse order of the components: first roll (about the Z axis), then yaw (about the Y axis), then pitch (about the X axis). So the rotation order is ZYX.
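As a minimal illustration of that ZYX order (arbitrary example angles, using simd quaternions rather than any SceneKit-specific call): quaternion products apply right to left, so the composition below rotates by roll first, then yaw, then pitch.

import simd

let pitch: Float = 0.3   // about X
let yaw: Float   = 0.5   // about Y
let roll: Float  = 0.7   // about Z

let qx = simd_quatf(angle: pitch, axis: simd_float3(1, 0, 0))
let qy = simd_quatf(angle: yaw,   axis: simd_float3(0, 1, 0))
let qz = simd_quatf(angle: roll,  axis: simd_float3(0, 0, 1))

// Applied to a vector, qz acts first, then qy, then qx: roll, then yaw, then pitch.
let combined = qx * qy * qz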