pypose.module.EPnP¶
- class pypose.module.EPnP(intrinsics=None, refine=True)[source]¶
Batched EPnP Solver - a non-iterative \(\mathcal{O}(n)\) solution to the Perspective-\(n\)-Point (PnP) problem for \(n \geq 4\).
- Parameters:
  - intrinsics (torch.Tensor, optional) – The camera intrinsics. The shape is (…, 3, 3). Default: None
  - refine (bool, optional) – refine the solution with the Gauss-Newton optimizer. Default: True
Assume the \(i\)-th of the \(n\) points is \(p^w_i\) in the world coordinate and \(p^c_i\) in the camera coordinate. Each is represented by a weighted sum of four virtual control points, \(c^w_j\) and \(c^c_j\), in the world and camera coordinates, respectively.
\[\begin{aligned} & p^w_i = \sum^4_{j=1}{\alpha_{ij}c^w_j} \\ & p^c_i = \sum^4_{j=1}{\alpha_{ij}c^c_j} \\ & \sum^4_{j=1}{\alpha_{ij}} = 1 \end{aligned} \]
Let the projection matrix be \(P = K[R|T]\), where \(K\) is the camera intrinsics, \(R\) is the rotation matrix, and \(T\) is the translation vector. Then we have
\[\begin{aligned} s_i p^{\text{img}}_i &= K\sum^4_{j=1}{\alpha_{ij}c^c_j}, \end{aligned} \]
where \(p^{\text{img}}_i = (u_i, v_i, 1)\) is the pixel in homogeneous form and \(s_i\) is the scale factor. Let the control point in the camera coordinate be \(c^c_j = (x^c_j, y^c_j, z^c_j)\). Rearranging the projection equation yields two linear equations for each of the \(n\) points:
\[\begin{aligned} \sum^4_{j=1}{\alpha_{ij}f_xx^c_j + \alpha_{ij}(u_0 - u_i)z^c_j} &= 0 \\ \sum^4_{j=1}{\alpha_{ij}f_yy^c_j + \alpha_{ij}(v_0 - v_i)z^c_j} &= 0 \end{aligned} \]
Let \(\mathbf{x} = \begin{bmatrix} c^{c^T}_1 & c^{c^T}_2 & c^{c^T}_3 & c^{c^T}_4 \end{bmatrix}^T\); then, considering all of the \(n\) points, the two equations per point form a linear system \(M\mathbf{x} = 0\). Its solution can be expressed as
\[\begin{aligned} \mathbf{x} &= \sum^4_{i=1}{\beta_iv_i}, \end{aligned} \]
where \(v_i\) are the eigenvectors of \(M^T M\) corresponding to its 4 smallest eigenvalues.
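As a concrete illustration, the construction above (control points, barycentric weights \(\alpha_{ij}\), the matrix \(M\), and its near-null space) can be sketched in plain PyTorch. This is a simplified sketch under assumed values, not the PyPose implementation; the control-point choice (centroid plus principal directions) follows the EPnP paper.

```python
import torch

torch.manual_seed(0)
points = torch.rand(6, 3) * 4          # n = 6 world points (illustrative)
pixels = torch.rand(6, 2) * 9          # their observed pixels (illustrative)
fx, fy, u0, v0 = 2., 2., 4.5, 4.5      # rectified intrinsics entries

# Control points: the centroid plus the principal directions of the data.
center = points.mean(dim=0)
_, _, Vt = torch.linalg.svd(points - center, full_matrices=False)
ctrl = torch.cat([center[None], center[None] + Vt], dim=0)   # (4, 3)

# Barycentric weights: solve sum_j alpha_ij [c_j; 1] = [p_i; 1], which
# enforces both the reconstruction and the unit-sum constraint at once.
C = torch.cat([ctrl, torch.ones(4, 1)], dim=1).T             # (4, 4)
P = torch.cat([points, torch.ones(6, 1)], dim=1)             # (6, 4)
alpha = torch.linalg.solve(C, P.T).T                         # (6, 4)

# Two rows of M per point, over x = [c^c_1; ...; c^c_4] (12 unknowns):
#   sum_j alpha_ij (fx x_j + (u0 - u_i) z_j) = 0
#   sum_j alpha_ij (fy y_j + (v0 - v_i) z_j) = 0
M = torch.zeros(12, 12)
for i in range(6):
    for j in range(4):
        M[2 * i,     3 * j]     = alpha[i, j] * fx
        M[2 * i,     3 * j + 2] = alpha[i, j] * (u0 - pixels[i, 0])
        M[2 * i + 1, 3 * j + 1] = alpha[i, j] * fy
        M[2 * i + 1, 3 * j + 2] = alpha[i, j] * (v0 - pixels[i, 1])

# Eigenvectors of M^T M for its 4 smallest eigenvalues span the solution.
_, eigvecs = torch.linalg.eigh(M.T @ M)   # eigenvalues in ascending order
V = eigvecs[:, :4]                        # basis vectors v_1, ..., v_4
```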
The final step involves calculating the coefficients \(\beta_i\). Optionally, the Gauss-Newton algorithm can be used to refine the solution of \(\beta_i\).
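As a minimal sketch of the simplest case (a single null-space vector, an assumption for illustration and not the PyPose implementation), \(\beta\) is the least-squares scale that makes the pairwise distances between the camera-frame control points match the known world-frame distances:

```python
import torch

torch.manual_seed(0)
cw = torch.rand(4, 3)          # world control points (known)
v = torch.rand(12)             # one null-space vector of M^T M
cc = v.view(4, 3)              # camera control points, up to scale beta

idx = torch.combinations(torch.arange(4))          # all 6 point pairs
dw = (cw[idx[:, 0]] - cw[idx[:, 1]]).norm(dim=1)   # world-frame distances
dc = (cc[idx[:, 0]] - cc[idx[:, 1]]).norm(dim=1)   # camera-frame distances

# Least-squares solution of beta * dc ≈ dw (distances are scale-covariant).
beta = (dc * dw).sum() / (dc * dc).sum()
```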
Example
>>> import torch, pypose as pp
>>> f, (H, W) = 2, (9, 9)  # focal length and image height, width
>>> intrinsics = torch.tensor([[f, 0, H / 2],
...                            [0, f, W / 2],
...                            [0, 0,   1  ]])
>>> object = torch.tensor([[2., 0., 2.],
...                        [1., 0., 2.],
...                        [0., 1., 1.],
...                        [0., 0., 1.],
...                        [1., 0., 1.],
...                        [5., 5., 3.]])
>>> pixels = pp.point2pixel(object, intrinsics)
>>> pose = pp.SE3([0., -8, 0., 0., -0.3827, 0., 0.9239])
>>> points = pose.Inv() @ object
>>> epnp = pp.module.EPnP(intrinsics)
>>> epnp(points, pixels)
SE3Type LieTensor:
LieTensor([ 3.9816e-05, -8.0000e+00,  5.8174e-05, -3.3186e-06, -3.8271e-01,
             3.6321e-06,  9.2387e-01])
Warning
Currently this module only supports batched rectified camera intrinsics, which can be defined in the form:
\[K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \]
The full form of camera intrinsics will be supported in a future release.
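For instance, a rectified intrinsics tensor of this form can be built and batched as follows (the focal and principal-point values here are illustrative), matching the (…, 3, 3) shape the module expects:

```python
import torch

# Rectified camera intrinsics: focal lengths on the diagonal,
# principal point in the last column, zero skew.
fx, fy, cx, cy = 2.0, 2.0, 4.5, 4.5
K = torch.tensor([[fx, 0., cx],
                  [0., fy, cy],
                  [0., 0., 1.]])

# Batch by adding a leading dimension, e.g. 8 identical cameras.
batched = K.expand(8, 3, 3)
```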
Note
The implementation is based on the paper
Francesc Moreno-Noguer, Vincent Lepetit, and Pascal Fua, EPnP: An Accurate O(n) Solution to the PnP Problem, International Journal of Computer Vision (IJCV), 2009.
- forward(points, pixels, intrinsics=None)[source]¶
- Parameters:
  - points (torch.Tensor) – 3D object points in the world coordinates. Shape (…, N, 3)
  - pixels (torch.Tensor) – 2D image points, which are the projection of the object points. Shape (…, N, 2)
  - intrinsics (torch.Tensor, optional) – camera intrinsics. Shape (…, 3, 3). Setting it to any non-None value will override the default intrinsics kept in the module.
- Returns:
  estimated pose (SE3Type) for the camera.
- Return type:
  LieTensor