# Learning from Their Mistakes: Self-Calibrating Sensors

The Internet of Things (IoT) contains billions of sensors that provide information for a large number of measurands. Many of these sensors are embedded in complex systems and can be deployed in remote locations. Careful calibration of such sensors, which is essential for optimal results, is often difficult or even impossible to achieve in practice. Indeed, the need for precise calibration of sensing devices—ranging from tiny sensors to space telescopes—manifests itself as a major roadblock in many scientific and technological endeavors beyond the IoT. In this context, calibration is an effort to correct for specific uncertainties or aberrations in the measurement process.

Consider the calibration of antenna arrays to correct gain/phase offsets in received data, a common problem in *direction-of-arrival estimation* when engineers attempt to calculate the direction of propagating waves based on data received from the antennae (see Figure 1) [6, 7]. Another instance is *blind deconvolution*, the issue of recovering a signal from its noisy convolution with a poorly-known or unknown point spread function [3]. This problem occurs in diverse fields, such as astronomical imaging and audio processing. Other instances arise in wireless communication [4], cryo-electron microscopy [12], and X-ray crystallography [11].

**Figure 1.**Direction-of-arrival estimation.

**1a.**Multiple signals impinge on an array of antennae. The goal is to estimate all angles xi using the (noisy) data received at the array. Here we assume that the antenna gains are unknown.

**1b.**A typical example of how one of the methods can solve this self-calibration problem. Figure 1a courtesy of authors and 1b courtesy of [6, 9].

It is therefore highly desirable to equip sensors and systems with the capability for *self-calibration* using information collected by the system to simultaneously estimate the calibration parameters and perform the system’s intended function (image reconstruction, signal estimation, target detection, etc.) [7, 9]. With some poetic license, we might say that sensors learn from their mistakes; instead of physically recalibrating themselves (usually an impractical or impossible task), they conduct a form of *virtual self-calibration* by correcting errors in the sensing process via carefully-constructed algorithms.

We can express many self-calibration problems in the following mathematical form:

\[y = A(\theta,x) + w, \tag1 \]

where \(y\) represents the measurements, \(A(\theta\),\(\cdot)\) is the sensing operator dependent on some unknown or imprecisely known parameters, \(\theta\), \(x\) is the information we wish to recover, and \(w\) denotes additive noise. \(A\) may depend linearly or nonlinearly on \(\theta\), while the properties of sensing uncertainty \(\theta\) depend largely on the application.

The generality of (1) renders it incapable of developing a rigorous and efficient framework for its solution. So we focus instead on some important special cases. A seemingly simple yet surprisingly useful case is the model

\[y = D(\theta) A x + w, \tag2 \]

where the matrix \(A\) and vector \(y\) are known but the diagonal matrix \(D(\theta)\) and vector \(x\) are not. As before, \(w\) is additive noise and thus unknown as well, though we may have statistical information about it. We can also interpret the goal of finding \(x\) as solving a linear system where an *unknown* gain \(D_i(\theta)\) rescales the \(i\)th row of the system matrix \(A.\)

One may assume that we have reached a level of simplicity with (2) that is too trivial to analyze mathematically and no longer useful in practice. However, as is often the case in mathematics, an outwardly simple model can be deceptive. On the one hand, a rigorous analysis of the aforementioned diagonal calibration model requires nontrivial mathematical tools; on the other, this “simple” model arises in numerous important applications [9].

One way to tackle this situation is by solving a nonlinear least squares problem

\[\min\limits_{\theta, x} \| D(\theta) A x - y\|\:^2. \tag3 \]

This scenario is far more intricate than linear least squares, where a well-defined solution often exists. The objective function’s nonconvexity may result in many local minima. In fact, recovering \(x\) is not necessarily an easy task even if \(\theta\) is completely known, especially when the linear system is underdetermined. If \(D\) depends linearly on \(\theta\), then (3) becomes a bilinear problem, which should make its solution easier; alas, bilinear problems are non-deterministic polynomial-time-hard in general. We will describe how recent advances in optimization can resolve these issues in many cases of interest.

Compressive sensing [2, 5] has become a game changer in modern signal processing. Using convex optimization, we can exploit the signal’s sparsity and accelerate the sensing process tremendously. Unlike additive perturbations in the measurement matrix, sparse signal reconstruction from compressive measurements is sensitive to multiplicative perturbations. The linear dependence of \(D\) on \(\theta\) results in a *bilinear compressive sensing* problem: how do we estimate the calibration parameter \(\theta\) and sparse \(x\) from their *bilinear* measurements? For example, after proper discretization in direction-of-arrival estimation, the location of nonzero entries of \(x\) represents the direction of waves and \(D(\theta)\) denotes the antennas’ unknown gains [6, 7].

*SparseLift*to tackle such bilinear compressive sensing problems [9]. Linear algebra reveals that the recovery of two vectors \(\theta\),\(x\) is equivalent to estimating a sparse rank-1 matrix \(\theta x^{\top}\) in this bilinear inverse problem. Due to the bilinearity, the measurement \(y\) is actually a

*linear*function on the matrix space. We denote this linear map as \(\mathcal{A}\) and estimate \(\theta x^{\top}\) by exploiting the sparsity of the “lifted” rank-1 matrix \(\theta x^{\top}\) via \(ell_1\)-regularization:

\[\min \|Z\|_1, \quad \text{s.t.} \quad \mathcal{A}(Z) = y \tag4 \]

where \(\|Z\|_1\) is the \(\ell_1\)-norm of the vectorized \(Z\). Ideally we have \(\theta x^{\top}= Z\), but in practice we compute the leading left and right singular vectors of \(Z\) to extract \(\theta\) and \(x\) from \(Z\) (up to a scalar). Indeed, SparseLift recovers \((\theta\), \(x)\) successfully if the number of constraints equals \(\tilde{\Omega}(\|\!\:\theta x^{\top}\!\|_0)\), where \(\|\theta x^{\top}\|_0\) is the cardinality of nonzero entries in \(\theta x^{\top}\). Many variations and extensions of (4) naturally exist.

**Figure 2.**Blind deconvolution.

**2a.**Blurred image of a flying swan with unknown blurring kernel.

**2b.**Recovered image via regularized gradient descent method. Original unblurred image (not pictured) courtesy of Steve Byland.

Besides direction-of-arrival estimation with unknown antenna gains, we can express various other problems—either directly or after some proper transform—in the form of (2). The blind deconvolution problem is perhaps the most widely-known example. How can we reconstruct two signals \((f\), \(g)\) from their convolution \(y = f\ast g\)? This highly ill-posed inverse problem pervades many areas of science and technology, such as image deblurring, wireless communication [1], and spike detection in neuroscience. The blind deconvolution problem is equivalent to the self-calibration model (2) in the frequency domain. If we take the Fourier transform of \(f\), then \(\widehat{y} = \widehat{f}\circ\widehat{g}\), where \(\widehat{f}\) represents the Fourier transform of \(f\) and "\(\circ\)" signifies entrywise multiplication. Now this exactly fits (2) by setting \(D(\theta) = \widehat{f}\) and \(A = \widehat{W}\), where \(g= Wx\) is the image/function, \(W\) is a dictionary to represent \(g\) (e.g., wavelet basis), and \(x\) denotes the coefficients (possibly sparse or approximately sparse).

We propose a simple gradient-descent-based algorithm to this blind deconvolution problem [8]. The algorithm consists of two steps: construction of a suitable initial guess via the spectral method and application of the gradient descent method to the objective function. The proposed algorithm is robust in the presence of noise and comes with rigorous convergence guarantees under relatively mild assumptions.

Figure 2 depicts an example. Figure 2a portrays convolution of an image with an unknown motion blurring kernel, while 2b shows the reconstructed image via the aforementioned method [8]. This approach outperforms convex alternatives in computation time and sampling complexity. The framework also allows us to solve joint blind deconvolution and demixing problems that can arise in multiuser communication scenarios of the IoT [10, 13].

While we have barely scratched the surface of self-calibration in this article, we hope we have communicated its versatility as a playground for mathematicians — full of difficult theoretical problems, interesting numerical challenges, and cutting-edge applications.

**References**

[1] Ahmed, A., Recht, B., & Romberg, J. (2014). Blind deconvolution using convex programming. *IEEE Trans. Inform. Theory*, *60*(3), 1711-1732.

[2] Candès, E.J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. *IEEE Trans. Inform. Theory*, *52*(2), 489-509.

[3] Chaudhuri, S., Velmurugan, R., & Rameshan, R. (2014). *Blind Image Deconvolution: Methods and Convergence.* New York, NY: Springer.

[4] Ding, Z., & Li, Y. (2001). *Blind Equalization and Identification*. New York, NY: CRC Press.

[5] Donoho, D.L. (2006). Compressed sensing. *IEEE Trans. Inform. Theory*, *52*(4), 1289-1306.

[6] Friedlander, B., & Strohmer, T. (2014). Bilinear compressed sensing for array self-calibration. In 2*014 48th Asilomar Conference on Signals, Systems, and Computers*. Pacific Grove, CA.

[7] Friedlander, B., & Weiss, A.J. (1991). Self-calibration for high-resolution array processing. In S. Haykin (Ed.), *Advances in Spectrum Analysis and Array Processing *(Vol. II) (pp. 349-413). Upper Saddle River, NJ: Prentice Hall.

[8] Li, X., Ling, S., Strohmer, T., & Wei, K. (2018). Rapid, robust, and reliable blind deconvolution via nonconvex optimization. *Appl. Comput. Harmon. Anal*.

[9] Ling, S., & Strohmer, T. (2015). Self-calibration and biconvex compressive sensing. *Inv. Prob.*, *31*(11), 115002.

[10] Ling, S., & Strohmer, T. (2017). Regularized gradient descent: A non-convex recipe for fast joint blind deconvolution and demixing. *Inform. Infer. J. IMA*., iax022.

[11] Philipp, H.T., Ayyer, K., Tate, M.W., Elser, V., & Gruner, S.M. (2012). Solving structure with sparse, randomly-oriented x-ray data. *Opt. Exp*., *20*(12), 13129-13137.

[12] Singer, A. (2018). Mathematics for cryo-electron microscopy. Preprint, *arXiv*:*1803.06714*.

[13] Wunder, G., Boche, H., Strohmer, T., & Jung, P. (2015). Sparse signal processing concepts for efficient 5G system design. *IEEE Acc*., *3*, 195-208.