If you are looking for a dense Levenberg-Marquardt C/C++ implementation, please have a look at levmar.

Introduction

This site provides sparseLM, a general-purpose software package for large-scale, arbitrarily sparse non-linear least squares that is distributed under the GNU General Public License. sparseLM implements a sparse variant of the Levenberg-Marquardt optimization algorithm in ANSI C and includes a Matlab MEX-interface.


Motivation

Non-linear model fitting is a standard procedure for inferring the values of parameters from observed data in a manner that reduces the impact of measurement error. This is achieved by employing non-linear least squares to minimize the total error pertaining to overdetermined sets of observations. The total error is expressed by a sum-of-squares cost function (i.e., an L2 norm), whose minimizer represents the statistically optimal estimate of the sought parameters under the assumption of Gaussian noise. Owing to their non-convexity, L2 cost functions are minimized with iterative non-linear optimization techniques, of which the Levenberg-Marquardt (LM) algorithm has become the de facto standard. LM operates by repeatedly linearizing the function to be minimized in the neighborhood of the current minimizer estimate and computing an improvement to it through the solution of a linear system defined with the aid of the Jacobian and known as the normal equations. Considering that each solution of a dense linear system has complexity O(N^3) in the number of unknown parameters N, it is clear that general-purpose LM implementations become computationally very demanding when employed to minimize functions involving a large number of parameters.

Fortunately, when dealing with large estimation problems arising in practice, the corresponding error typically exhibits lack of interdependence among certain subgroups of the parameters to be estimated. This observation translates to Jacobians for the least squares minimization that are sparse, that is, consist of mostly zero elements. In turn, sparse Jacobians yield normal equation systems with sparse block structure. By avoiding storing and operating on zero elements of the normal matrix during the course of LM, substantial memory and execution time benefits can be gained. The downside is that taking advantage of arbitrary sparseness is problem-specific and requires considerable implementation effort for developing LM variants adapted to particular sparse problems.


Overview

During the last few years, effective mechanisms for dealing with arbitrary sparseness have emerged in the form of a number of algorithms and corresponding implementations for the direct solution of large sparse linear systems of equations. Owing to a number of attractive properties (no need for preconditioners, fast convergence, computation of exact rather than approximate solutions, mature implementations), such direct solvers are well-suited for use as general-purpose sparse linear solvers. sparseLM builds upon this technology to fulfill the need for a high quality sparse Levenberg-Marquardt solver designed for general-purpose, arbitrarily sparse non-linear least squares minimization. The software has been designed with the twofold objective of exploiting sparseness to maximize performance while shielding the user from the algorithmic details associated with direct linear solvers.

sparseLM accepts sparse Jacobians encoded in either compressed row storage (CRS) or compressed column storage (CCS) (a.k.a. Harwell-Boeing) format, allowing user applications to choose the representation that is most natural to them. It also offers the possibility of numerically approximating the Jacobian using forward finite differences on data provided by successive invocations of the function to be minimized; in this case, only the sparsity pattern of the Jacobian needs to be specified by the user, not its numerical values. Although not recommended for performance reasons, sparseLM can even be instructed to attempt automatic detection of the sparsity pattern of the Jacobian of the function to be minimized. The core LM algorithm employed by sparseLM is identical to that in the levmar dense LM package, which is available here. As a result, sparseLM has a similar "look and feel" to levmar. More details on sparseLM can be found in this ECCV10 paper (bibtex entry).

sparseLM includes interfaces to a wide variety of sparse direct solvers, the list of which currently consists of HSL's MA57, MA77, MA47 & MA27, PARDISO, SuperLU, TAUCS, UMFPACK, CHOLMOD, LDL, CSparse, SPOOLES and MUMPS.
A few of the above solvers (e.g. PARDISO, MUMPS) employ shared-memory parallelism to improve performance, whereas others (such as MA77 & TAUCS) support out-of-core processing for dealing with very large matrices that cannot fit in RAM. CHOLMOD, an open-source sparse Cholesky factorization package, is sparseLM's default solver. Header files for CHOLMOD and libraries compiled with MSVC 2005 under WinXP are here.

To the best of my knowledge, sparseLM is the first and currently the only package of its kind to be released as free software.


Available Functions

sparseLM offers several user-callable functions whose suffix denotes whether the Jacobian is analytic or approximate and specifies the sparse format it is stored in. More specifically, sparseLM includes the functions below:

  • sparselm_dercrs(): analytic Jacobian supplied in CRS
  • sparselm_derccs(): analytic Jacobian supplied in CCS
  • sparselm_difcrs(): finite difference approximated Jacobian, zero pattern supplied in CRS
  • sparselm_difccs(): finite difference approximated Jacobian, zero pattern supplied in CCS

Both CRS and CCS employ C's native zero-based indexing convention. Notice that using finite differences to approximate the Jacobian results in repeated evaluations of the function to be fitted, thus negatively influencing performance. To reduce the total number of these evaluations, the finite difference approximation to the Jacobian is estimated with a scheme that exploits sparseness to compute several of the Jacobian's columns with a single function evaluation. All functions basically solve the same problem, i.e. they seek the parameter vector p that best describes (in terms of the L2 norm) the measurement vector x. More precisely, given a vector function f : R^m --> R^n with n >= m, they compute a p such that f(p) ~= x, i.e. the squared norm ||e||^2 = ||x - f(p)||^2 is minimized.
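To make the storage convention concrete, the following is a small, purely illustrative example of a 4x3 Jacobian with 6 nonzeros and its zero-based CRS encoding; the array names used here (val, colidx, rowptr) are chosen for the illustration and do not necessarily match the fields of sparseLM's own sparse matrix structures, which are declared in splm.h. The CCS encoding is analogous, with the roles of rows and columns interchanged.

/* J = | a 0 b |
 *     | 0 c 0 |     a..f denote the 6 nonzero entries
 *     | d e 0 |
 *     | 0 0 f |
 */
double val[6]    = { /*a*/1.0, /*b*/2.0, /*c*/3.0, /*d*/4.0, /*e*/5.0, /*f*/6.0 }; /* nonzeros, stored row by row */
int    colidx[6] = { 0, 2, 1, 0, 1, 2 };  /* column index of each nonzero */
int    rowptr[5] = { 0, 2, 3, 5, 6 };     /* index in val/colidx where each row starts; last entry equals nnz */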

Briefly, the arguments of the functions are as follows:

  • pointers to routines evaluating the vector function f and its Jacobian J,
  • pointers to the initial estimate of the parameter vector p and the measurement vector x,
  • the dimension of p and the number of leading elements of p that are to be kept fixed to their initial values,
  • the dimension of x,
  • the number of nonzero elements of J and the number of nonzero elements of the approximate Hessian J^t*J,
  • the maximum number of iterations,
  • a pointer to an array whose elements specify certain minimization options,
  • a pointer to an array into which various pieces of information regarding the result of the minimization will be returned,
  • and a pointer to application-specific data that might be necessary for computing the function to be minimized and its derivatives.

More details can be found below. Additionally, the included splmdemo.c sample program gives working examples of minimizing several nonlinear functions. The exact prototypes of sparselm_dercrs() and sparselm_difcrs() are given next, along with a description of each argument; I and O in these descriptions denote input and output arguments, respectively. Being similar, the prototypes of sparselm_derccs() and sparselm_difccs() are omitted.

sparselm_dercrs()


sparselm_difcrs()


Note that passing NULL as the value of the opts argument of both functions results in default settings being used. Consult Madsen et al's lecture notes for details on the roles of the opts and info arguments. The last argument (i.e. adata) is intended to help avoid the direct use of globals in the routines computing the function to be minimized and its Jacobian: a structure containing pointers to appropriate data can be set up and a pointer to it passed to the LM function, which then passes it on, uninterpreted, to each call of the user-supplied routines. For CCS Jacobians, the functions sparselm_derccs() and sparselm_difccs() should be used instead.
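To illustrate how the arguments described above fit together, here is a minimal, hypothetical sketch of fitting a two-parameter exponential model with sparselm_dercrs(). It is only a sketch: the callback signatures, the field names of the CRS Jacobian structure (val, colidx, rowptr) and the exact argument order are assumptions made for the purpose of the illustration and should be checked against the actual prototypes and definitions in splm.h and splmdemo.c before use.

#include <math.h>
#include <stdio.h>
#include "splm.h"  /* assumed to declare sparselm_dercrs() and struct splm_crsm */

#define NOBS 32

/* application-specific data passed through adata instead of using globals */
struct mydata{ double *t; };

/* evaluates f(p) at the measurement points; hx receives the nobs function values
 * (callback signature is an assumption modeled after the argument description above) */
static void myfunc(double *p, double *hx, int nvars, int nobs, void *adata)
{
    struct mydata *dat=(struct mydata *)adata;
    int i;

    for(i=0; i<nobs; ++i)
        hx[i]=p[0]*exp(-p[1]*dat->t[i]);  /* toy model: p0*exp(-p1*t) */
}

/* fills in the CRS Jacobian; in this toy problem every row has 2 nonzeros
 * (field names val, colidx, rowptr are assumptions; see splm.h) */
static void myjac(double *p, struct splm_crsm *jac, int nvars, int nobs, void *adata)
{
    struct mydata *dat=(struct mydata *)adata;
    int i, k;

    for(i=k=0; i<nobs; ++i){
        jac->rowptr[i]=k;
        jac->val[k]=exp(-p[1]*dat->t[i]);                 jac->colidx[k++]=0; /* df_i/dp0 */
        jac->val[k]=-p[0]*dat->t[i]*exp(-p[1]*dat->t[i]); jac->colidx[k++]=1; /* df_i/dp1 */
    }
    jac->rowptr[nobs]=k;  /* equals the total number of nonzeros */
}

int main(void)
{
    double p[2]={1.0, 0.0};  /* initial parameter estimate */
    double x[NOBS], t[NOBS];
    double info[10];         /* minimization statistics; see splm.h for the required size */
    struct mydata dat;
    int i, ret;

    for(i=0; i<NOBS; ++i){   /* synthesize noiseless measurements of 5*exp(-0.2*t) */
        t[i]=i*0.25;
        x[i]=5.0*exp(-0.2*t[i]);
    }
    dat.t=t;

    /* hypothetical call: 2 parameters (none held fixed), NOBS measurements, 2*NOBS nonzeros
     * in J, 4 nonzeros in J^t*J, at most 100 iterations, NULL opts selects default settings */
    ret=sparselm_dercrs(myfunc, myjac, p, x, 2, 0, NOBS, 2*NOBS, 4, 100, NULL, info, (void *)&dat);
    printf("sparseLM returned %d, p=(%g, %g)\n", ret, p[0], p[1]);

    return 0;
}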

In addition to the above, sparseLM also offers some utility functions that often prove handy. These include functions splm_ccsm2crsm() & splm_crsm2ccsm() which implement conversion of a sparse matrix between CRS and CCS formats. Furthermore, function splm_stm2ccsm() allows a Jacobian that has been constructed element-by-element in sparse triplet format to be converted to CCS for use by sparselm_derccs()/sparselm_difccs().
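As a purely illustrative sketch, the same 4x3 Jacobian used in the CRS example above would look as follows in sparse triplet (i.e. coordinate) format; sparseLM keeps triplets in its own structure (declared in splm.h), so the plain arrays below only convey the idea. Once all nonzeros have been recorded in this form, splm_stm2ccsm() produces the equivalent CCS matrix expected by sparselm_derccs()/sparselm_difccs().

/* each nonzero of J is recorded as a (row, column, value) triplet, in arbitrary order */
int    rowidx[6] = { 0, 0, 1, 2, 2, 3 };
int    colidx[6] = { 0, 2, 1, 0, 1, 2 };
double val[6]    = { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 };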


Contact Address

If you find this package useful or have any comments/questions/suggestions, please contact me at

Be warned that although I try to reply to most messages, it might take me a while to do so.
If you use sparseLM in your published work, please include a reference to this paper: [ bibtex entry ].


