Introduction

This site concerns posest, a C/C++ library for 3D pose estimation from point correspondences that is distributed as open source under the GNU General Public License (GPL). Pose estimation refers to the computation of position and orientation estimates that fully define the posture of a rigid object in space (6 DoF in total). The computation is based on a set of known 3D points and their corresponding 2D projections on an imaging sensor. The library estimates the relative motion between the 3D points and the camera. Depending on the application, it can estimate object pose when the camera is stationary and the points originate from a moving object, or camera pose when a camera moves freely in a mostly stationary scene. Both monocular and binocular camera systems are supported. Image points typically originate from local features; however, posest is agnostic to their origin. Mismatched 3D-2D point pairs (i.e., outliers) are tolerated via robust regression techniques. Pose is estimated by optimizing physically meaningful geometric criteria.

The development of posest has been partially supported by the EU FP7 and H2020 programmes under grants
270138 (DARWIN), 826506 (sustAGE) and 101017151 (FELICE).

Technical Overview

posest includes monocular and binocular pose estimation variants. Binocular pose estimation offers increased accuracy at the cost of increased computing time. The approach adopted in each case is outlined below; more details can be found in our related ICVS'13 publication.

Monocular robust pose estimation

Preliminary pose estimation: A P3P or P4P solver is embedded in a RANSAC framework that uses the redescending M-estimator sample consensus (MSAC) cost function for hypotheses scoring. RANSAC determines the best scoring hypothesis and classifies correspondences into inliers and outliers. In addition to pose, the P4P solver can also yield focal length estimates (i.e., it is actually a P4Pf solver).
Non-linear refinement: The preliminary pose is refined by minimizing the cumulative reprojection error for all inliers. To mitigate the influence of mislocalized 2D points, an M-estimate of the reprojection error is minimized rather than the squared Euclidean norm. This minimization is carried out iteratively with the Levenberg-Marquardt algorithm, initialized with the preliminary pose and using analytic Jacobians.
Global optimization: When the non-linear refinement is initiated far from the true minimum, it runs the risk of getting trapped in a local minimum. To counter this, multi-start global optimization can (optionally) be employed to explore the pose space using multiple local optimizations, each initialized with a different starting point.
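As an illustration of the hypothesis scoring step, the following is a minimal sketch of an MSAC cost evaluation. It is not taken from the posest sources: the function name `msac_score`, the array of squared reprojection errors `sqerr` and the inlier threshold `thresh` are assumptions made for this example. Each correspondence contributes its squared error when it falls below the threshold and a constant penalty otherwise, so the hypothesis with the lowest score wins.

```c
#include <assert.h>
#include <stddef.h>

/* MSAC score of a single pose hypothesis (illustrative, not posest code).
 * sqerr[i] : squared reprojection error of correspondence i (hypothetical input)
 * thresh   : inlier distance threshold in pixels (hypothetical parameter)
 * inlier   : if non-NULL, receives a 0/1 flag per correspondence
 * ninliers : if non-NULL, receives the number of inliers
 * Every residual contributes min(e^2, thresh^2); lower scores are better. */
double msac_score(const double *sqerr, size_t n, double thresh,
                  int *inlier, size_t *ninliers)
{
  double t2, score = 0.0;
  size_t i, cnt = 0;

  assert(thresh > 0.0);
  t2 = thresh * thresh;
  for (i = 0; i < n; ++i) {
    int isin = sqerr[i] < t2;        /* inlier test */
    score += isin ? sqerr[i] : t2;   /* truncated quadratic cost */
    cnt += (size_t)isin;
    if (inlier) inlier[i] = isin;
  }
  if (ninliers) *ninliers = cnt;
  return score;
}
```

Compared with plain RANSAC, which only counts inliers, this truncated cost also rewards hypotheses whose inliers fit more tightly.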

Binocular robust pose estimation

Extrinsic calibration: The two cameras are calibrated extrinsically with the aid of a calibration grid. This procedure is performed offline as the calibration is assumed to remain fixed.
Monocular pose estimation: Pose estimation is performed individually for each of the two cameras, as described above.
Joint pose refinement: A single pose is estimated for the binocular pair, enforcing the constraint that the relative pose of the employed cameras is known. The estimation employs the inliers of each monocular estimation and is based on the minimization of the binocular reprojection error in both images. Such an approach circumvents the error-prone reconstruction of points via triangulation, does not limit the baseline of the two views, and does not call for sparse feature or 3D point matching.
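The binocular criterion can be pictured as the sum of the squared reprojection errors measured in the two images. The sketch below is only an illustration under stated assumptions: `PL` and `PR` are hypothetical 3x4 row-major projection matrices that already combine each camera with the candidate pose, and the function names are made up for this example rather than part of the posest API. A plain squared error is used here; the library's robust weighting is not reproduced.

```c
#include <assert.h>

/* Project 3D point X with the 3x4 row-major matrix P.
 * Returns 1 and fills x on success, 0 if the homogeneous scale is zero. */
int project_point(const double P[12], const double X[3], double x[2])
{
  double u = P[0]*X[0] + P[1]*X[1] + P[2]*X[2]  + P[3];
  double v = P[4]*X[0] + P[5]*X[1] + P[6]*X[2]  + P[7];
  double w = P[8]*X[0] + P[9]*X[1] + P[10]*X[2] + P[11];

  if (w == 0.0) return 0;
  x[0] = u / w;
  x[1] = v / w;
  return 1;
}

/* Sum of squared reprojection errors for one image; points that cannot
 * be projected are skipped. */
double reproj_sqerr(const double P[12], double (*pts2D)[2], double (*pts3D)[3], int n)
{
  double x[2], dx, dy, sum = 0.0;
  int i;

  assert(n >= 0);
  for (i = 0; i < n; ++i) {
    if (!project_point(P, pts3D[i], x)) continue;
    dx = x[0] - pts2D[i][0];
    dy = x[1] - pts2D[i][1];
    sum += dx*dx + dy*dy;
  }
  return sum;
}

/* Binocular cost: left and right image errors are simply added. */
double binoc_sqerr(const double PL[12], double (*pts2DL)[2], double (*pts3DL)[3], int nL,
                   const double PR[12], double (*pts2DR)[2], double (*pts3DR)[3], int nR)
{
  return reproj_sqerr(PL, pts2DL, pts3DL, nL) + reproj_sqerr(PR, pts2DR, pts3DR, nR);
}
```

Note that the left and right point sets are independent: no 3D point needs to be seen in both images, which is why no triangulation is involved.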

More technical details are provided in the FAQ and the section Available Functions. The included demo programs posest_demo.c and binocposest_demo.c provide working examples of using posest. A MEX-file interface for using posest from within MATLAB is also provided.
It is stressed that posest does not include any means for detecting and matching point features between the 3D world and images. Such functionality can be supplied by other software such as, for example, Lowe's SIFT detector & descriptor, or the VLFeat and OpenSURF (archive) libraries.

Available Functions

posest provides two primary user-callable functions for performing pose estimation. Function prototypes and a brief explanation of their arguments are provided below. Some general conventions followed by both functions are as follows. Points (2D or 3D) are provided as 2D arrays where rows correspond to individual vectors and columns to vector coordinates. For example, the two 3D vectors (10., 20., 30.) and (70., 80., 90.) are laid out in memory as

10.	20.	30.
70.	80.	90.

which is of course equivalent to

10.	20.	30.	70.	80.	90.

2D matrices such as intrinsics or camera matrices should be supplied using the standard (i.e., row-major) C convention. Both functions employ the same return codes.
The included posest_demo.c sample program gives a working example of pose estimation from user-supplied data.
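The point layout convention can be checked with a few lines of C. In this sketch the helper `point_coord` and the array name `example_pts3D` are hypothetical and serve only to show that an array of 3-element rows and a flat buffer of six doubles describe the same memory.

```c
#include <assert.h>
#include <stddef.h>

/* The two example vectors, one per row. */
const double example_pts3D[2][3] = {
  {10., 20., 30.},
  {70., 80., 90.}
};

/* Return coordinate j of point i from a flat, row-major buffer holding
 * npts points of dimension dim (2 for image points, 3 for world points). */
double point_coord(const double *flat, size_t npts, size_t dim, size_t i, size_t j)
{
  assert(i < npts && j < dim);
  return flat[i*dim + j];
}
```

For instance, `point_coord((const double *)example_pts3D, 2, 3, 1, 0)` reads the first coordinate of the second vector, i.e. 70.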

`posest()`



/*
 * This function implements robust non-linear 3D pose estimation for the monocular case.
 * Arguments are as follows:
 *
 * pts2D & pts3D contain the corresponding 2D-3D points, possibly including outliers
 *
 * nmatches is the number of correspondences
 *
 * inlPcent specifies the expected fraction of inliers; this is only used to determine
 *       the number of RANSAC iterations
 *
 * pp upon return contains the estimated pose parameters, see the description below
 *       for their layout
 *
 * npp specifies the number of parameters to be estimated:
 *       a value of NUM_RTPARAMS (6) specifies that r & t are estimated and returned as vectors in that order
 *       a value of NUM_RTFPARAMS (7) specifies that r, t & focal length are estimated
 *
 * K is the 3x3 camera intrinsic calibration matrix
 *
 * NLrefine specifies the non-linear refinement to be carried out on inliers after RANSAC
 *       POSEST_REPR_ERR_NO_NLN_REFINE    does not perform non-linear refinement
 *       POSEST_REPR_ERR_NLN_REFINE       performs non-linear refinement to minimize reprojection error
 *       POSEST_REPR_ERR_NLN_MLSL_REFINE  performs non-linear refinement with the multistart scheme
 *
 * idxOutliers should point to sufficiently large memory for storing the indices of
 *       outlying corresponding pairs upon return. Specify NULL if this is not needed
 *
 * nbOutliers is the number of detected outliers (whose indices are in idxOutliers)
 *
 * verbose specifies the verbosity level
 *
 * In case two views are available, use posestBinoc() below.
 *
 * Returns POSEST_OK on success, POSEST_ERR otherwise
 *
 */

int posest(double (*pts2D)[2], double (*pts3D)[3], int nmatches, double inlPcent, double K[9],
           double *pp, int npp, int NLrefine, int *idxOutliers, int *nbOutliers, int verbose);
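To give a feeling for how an expected inlier fraction translates into a number of RANSAC iterations, the sketch below applies the textbook stopping rule: with w the fraction of correct correspondences and s the minimal sample size, the smallest N with (1 - w^s)^N <= 1 - p guarantees an outlier-free sample with probability p. This is not claimed to be the exact rule used inside posest; the function name and the `sample_size`, `confidence` and `max_iter` parameters are assumptions made for this example.

```c
#include <assert.h>

/* Smallest number of RANSAC iterations N such that, with probability at
 * least `confidence`, one of N random minimal samples is outlier-free.
 * inl_frac    : expected fraction of inliers, in (0, 1]
 * sample_size : correspondences per minimal sample (hypothetical, e.g. 3 or 4)
 * confidence  : desired success probability, in (0, 1) (hypothetical)
 * max_iter    : upper bound on the returned value (hypothetical) */
int ransac_iterations(double inl_frac, int sample_size, double confidence, int max_iter)
{
  double p_good = 1.0;  /* probability that one sample contains only inliers */
  double p_fail = 1.0;  /* probability that all samples drawn so far are contaminated */
  int i, n;

  assert(inl_frac > 0.0 && inl_frac <= 1.0);
  assert(sample_size > 0 && max_iter >= 1);
  assert(confidence > 0.0 && confidence < 1.0);

  for (i = 0; i < sample_size; ++i) p_good *= inl_frac;

  for (n = 1; n < max_iter; ++n) {
    p_fail *= 1.0 - p_good;          /* now equals (1 - w^s)^n */
    if (p_fail <= 1.0 - confidence) break;
  }
  return n;
}
```

The iteration count grows quickly as the inlier fraction drops, which is why a realistic estimate of that fraction matters for running time.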

`posestBinoc()`



/*
 * This function implements robust non-linear 3D pose estimation for the binocular case.
 * The pose is estimated with the first (left) camera as reference.
 * Arguments are as follows:
 *
 * pts2DL & pts3DL contain the corresponding 2D-3D points for the left frame
 *
 * nmatchesL is the number of left frame correspondences
 *
 * PextL is the 3x4 Euclidean camera projection matrix of the left frame  
 *
 * pts2DR & pts3DR contain the corresponding 2D-3D points for the right frame
 *
 * nmatchesR is the number of right frame correspondences
 *
 * PextR is the 3x4 Euclidean camera projection matrix of the right frame  
 *
 * inlPcent specifies the expected fraction of inliers; this is only used to determine
 *       the number of RANSAC iterations
 *
 * inscale is the scale factor used to scale input 3D points prior to further processing;
 *       use 1.0 if points are already properly scaled
 *
 * NLrefine specifies the non-linear refinement to be carried out for the monocular pose estimation;
 *       see the homonymous argument in posest()
 *
 * estScale specifies whether scale should be estimated. If estScale>0 then the scale
 *       returned is the product of inscale and that estimated
 *
 * rtLs upon return contains the estimated pose parameters (with the left frame as reference)
 *       and scale (if estScale>0). Parameter layout is r, t & scale
 *
 * idxOutliersLR should point to sufficiently large memory for storing the indices of
 *       outlying corresponding pairs in both frames upon return. Indices <nmatchesL refer to the
 *       left frame, whereas indices i>=nmatchesL should be interpreted as i-nmatchesL and refer
 *       to the right frame. Specify NULL if this information is not needed
 *
 * nbOutliersLR is the total number of detected outliers (left + right)
 *
 * verbose specifies the verbosity level
 *
 * In the single view case, use posest() above.
 *
 * Returns POSEST_OK on success, POSEST_ERR otherwise
 *
 */

int posestBinoc(double (*pts2DL)[2], double (*pts3DL)[3], int nmatchesL, double PextL[NUM_PPARAMS],
                double (*pts2DR)[2], double (*pts3DR)[3], int nmatchesR, double PextR[NUM_PPARAMS],
                double inlPcent, double inscale, double rtLs[NUM_RTPARAMS+1],
                int estScale, int NLrefine, int *idxOutliersLR, int *nbOutliersLR, int verbose);
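The combined outlier indexing described for idxOutliersLR can be unpacked into one mask per frame. The helper below is a hypothetical convenience routine, not part of posest; it only applies the rule stated in the comment above, namely that indices below nmatchesL belong to the left frame and any other index i maps to entry i-nmatchesL of the right frame.

```c
#include <assert.h>

/* Convert the combined outlier index list into per-frame 0/1 masks
 * (1 marks an outlier). maskL must hold nmatchesL ints and maskR
 * nmatchesR ints; both are provided by the caller. */
void split_outliers(const int *idxOutliersLR, int nbOutliersLR,
                    int nmatchesL, int nmatchesR, int *maskL, int *maskR)
{
  int i, k;

  for (i = 0; i < nmatchesL; ++i) maskL[i] = 0;
  for (i = 0; i < nmatchesR; ++i) maskR[i] = 0;

  for (k = 0; k < nbOutliersLR; ++k) {
    i = idxOutliersLR[k];
    assert(i >= 0 && i < nmatchesL + nmatchesR);
    if (i < nmatchesL)
      maskL[i] = 1;              /* left frame correspondence */
    else
      maskR[i - nmatchesL] = 1;  /* right frame correspondence */
  }
}
```

With nmatchesL = 3, nmatchesR = 2 and combined indices {1, 3, 4}, the left mask becomes {0, 1, 0} and the right mask {1, 1}.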

Contact Address

If you find this package useful or have any comments/questions/suggestions, please contact me at

Be warned that although we try to reply to most messages, it might take a long time to do so.
If you use posest in your published work, please cite this paper and acknowledge this site.


posest : A C/C++ Library for Robust 6DoF Pose Estimation from 3D-2D Correspondences
