Sparse Function Minimization

Compressive sensing (CS) makes heavy use of optimization to determine or approximate a sparse solution to an underdetermined linear system. The two essential issues are:

  1. How can we guarantee a sparse solution is unique and
  2. How can we guarantee that we can find or approximate the sparse solution with a low complexity algorithm

The latter problem is usually expressed as an optimization problem, especially in the presence of noise, attempting to find a sparse minimum of a least-squares cost function:

\displaystyle\min_x\|y-A x\|_2^2~\mathrm{s.t.}\|x\|_0\le K

where f(x)=\|y-A x\|_2 ^2 is a least square cost function to be minimized with a sparse variable x.

The commonly used least squares function, usually motivated by the presence of Gaussian noise in the measurements, is quite useful in most signal processing applications. However, more general cost functions are often necessary in other applications, such as machine learning, classification and detection and CS in the presence of severe quantization.

For that reason, Sohail Bahmani and Bhiksha Raj and I, desire to understand and solve the problem

\displaystyle\min_x f(x)~\mathrm{s.t.}\|x\|_0\le K.

Towards that end, we have developed a new algorithm, the Gradient Support Pursuit (GraSP), together with appropriate theory and solution guarantees. Our work generalizes the very well-known CoSaMP algorithm, the Restricted Isometry Property (RIP), and the signal reconstruction guarantees common in the CS literature. To demonstrate the algorithm in practice we also applied it to the sparse logistic regression problem [1, 2, 3]. Some of my earlier work on 1-bit CS algorithms, also uses the same ideas, although the theoretical guarantees do not immediately apply [1]. The overall algorithm is summarized in the figure below.

Pursuing the Support of the Gradient

A fundamental step in all sparse recovery algorithms is the correlation of the residual at the current signal estimate \widehat{x} with the measurement matrix, i.e. the operation A^*(y-A\widehat{x}) . The output of this operation is a good indicator of the important components to be used in describing the signal and is often referred to as a “proxy step,” a correlation step or a matched filter. This proxy also happens to be the gradient of f(x)=\|y-A x\|_2 ^2 evaluated at the current signal estimate \widehat{x}. This is the main intuition behind GraSP: if we want to reduce f(x) by changing only a small (sparse) set of coefficients, the largest gradient components indicate the location of the coefficients with the most impact in the cost.

Generalizing the proxy step of CoSaMP to a general gradient computation is one of the two major modifications introduced in GraSP. The second generalizes a restricted pseudoinverse computation in CoSaMP, replacing it with a restricted function minimization.

By replacing the function  f(x) with  f(x)=\|y-A x\|_2^2 , GraSP reverts to the familiar CoSaMP, summarized in the figure below, in a parallel fashion to the one above.

Theoretical Guarantees

To guarantee the performance of this algorithm, we generalized the RIP to the Stable Restricted Hessian (SRH) and the Stable Restricted Linearization (SRL) properties. The SRH describes how the hessian of the function behaves at any sparse point in the search space, characterizing the ratio of largest to smallest curvature around any sparse point, restricted to sparse directions. The SRL extends the guarantees to functions that are not necessarily differentiable. Both of these properties are used to bound the approximation error of the second-order Taylor expansion of the cost function, restricted to sparse canonical subspaces. In turn, this allows us to provide very good guarantees on the approximation performance of the algorithm.

For the standard least squares cost function, the SRH is identical to the RIP, and the guarantees provided are identical to the standard CS and CoSaMP guarantees. The interesting aspect, however, is that the cost function does not need to be convex in general but only when restricted to sparse inputs.

Sohail has put up a webpage for GraSP, where you can find code and examples to try.


P. T. Boufounos, “Greedy sparse signal reconstruction from sign measurements,” Proc. Asilomar Conf. on Signals Systems and Comput., pp. 1305-1309, Pacific Grove, CA, November 1-4, 2009.

[preprint] [Bibtex]

  Address =   {Pacific Grove, CA},
  Author =   {Boufounos, P. T.},
  Booktitle =   {Proc. Asilomar Conf. on Signals Systems and Comput.},
  Doi =     {10.1109/ACSSC.2009.5469926},
  Month =   {November 1-4},
  Organization = {IEEE},
  Pages =   {1305--1309},
  Pdf =     {},
  Title =   {Greedy sparse signal reconstruction from sign measurements},
  Url =     {},
  Year =   {2009}


S. Bahmani, B. Raj, and P. T. Boufounos, “Greedy Sparsity-Constrained Optimization,” Journal of Machine Learning Research, v. 14, pp. 807-841, March, 2013.

[preprint] [arXiv] [Bibtex]

  Arxivurl =   {},
  Author =   {Bahmani, S. and Raj, B. and Boufounos, P. T.},
  Month =   {March},
  Journal =   {Journal of Machine Learning Research},
  Url =     {},
  Pdf =     {},
  Title =   {Greedy Sparsity-Constrained Optimization},
  Year =   2013,
  volume =   14,
  pages =   {807--841}


S. Bahmani, P. T. Boufounos, and B. Raj, “Greedy Sparsity-Constrained Optimization,” Proc. Asilomar Conf. Signals Systems and Computers, Pacific Grove, CA, November 6-9, 2011.

[preprint] [Bibtex]

  Address =   {Pacific Grove, CA},
  Author =   {Bahmani, S. and Boufounos, P. T. and Raj, B.},
  Booktitle =   {Proc. Asilomar Conf. Signals Systems and Computers},
  Month =   {November 6-9},
  Doi =     {10.1109/ACSSC.2011.6190194},
  Url =     {},
  Pdf =          {},
  Title =   {Greedy Sparsity-Constrained Optimization},
  Year =   {2011}