
A Deeper Way to Practice Deep Learning

Lessons from the 2020 SIAM Conference on Imaging Science

By Stacey Levine and Michael Elad

Deep learning (DL) is a revolution. The performance of DL solutions in image processing and computational imaging has taken a clear lead, pushing aside a wealth of classical knowledge that has accumulated over many decades of extensive research. While theoreticians acknowledge this performance boost, many find DL unsettling due to its purely empirical nature and the absence of a strong theoretical backbone. DL also lacks predictability and explainability, deficiencies that could complicate real-world applications in fields like medical imaging and autonomous vehicles. Yet despite these shortcomings, more and more researchers from the mathematical imaging community are joining this new line of study.

But what about the classics in imaging science? Decades of powerful, theoretically sound, and successful methods have been built from different branches of mathematics, including variational approaches, partial differential equations (PDEs), harmonic analysis, sparsity-based models, and integral operators. Scientists have applied and intertwined these branches in various ways, resulting in powerful imaging techniques. Should we simply resign ourselves to the idea that these approaches might become obsolete?

Luckily, this DL fever has spread throughout the field of mathematical imaging in a more controlled, thoughtful, and “deep” manner than originally anticipated. Recall that DL’s initial framework involves choosing an arbitrary network architecture and training it end-to-end to match inputs to outputs in a supervised fashion. This strategy employs black box solutions that increase performance by leveraging massive amounts of data and computational power while neglecting physical connections and data models. In contrast, even as the imaging research community has embraced DL, many researchers have pursued work that remains harmonious with the classics. Recent research in the field includes novel DL architectures that are based on well-posed traditional imaging tools. In this way, DL provides new vantage points for understanding conventional models while simultaneously presenting fresh opportunities for constructing a comprehensive general theory for DL. The cooperation of these two worlds is inspiring the discovery of important connections, questions, and complementary approaches (see Figure 1).

Figure 1. The careful combination of classical imaging and deep learning (DL) methods is inspiring important connections, questions, and complementary approaches. Image courtesy of Stacey Levine and Michael Elad.

The 2020 SIAM Conference on Imaging Science repeatedly and extensively reinforced these themes. The plenary talks offered snapshots of DL’s impact across imaging domains, as well as the thoughtfulness with which leading researchers are effectively merging DL with the “classics.”

DL architectures that are motivated by variational and PDE-based models are generating impressive results for image synthesis, restoration, and reconstruction. Gabriel Peyré’s (CNRS and École Normale Supérieure) address kicked off this recurring theme. His talk connected the field of optimal transport with Ian Goodfellow’s generative adversarial networks, wherein fitting densities that are parametrized by deep networks becomes a powerful framework for both image generation and discrimination. Thomas Pock’s (Graz University of Technology) lecture on variational networks, which are inspired by the successful total variation functional, continued the same theme. Pock discussed image restoration architectures based on energy functionals that surpass their variational counterparts in performance, while still remaining well-positioned to establish stability and generalization results and to afford a much-desired Bayesian interpretation. Maarten V. de Hoop (Rice University) linked the Fourier integral operator and wave equation to DL architectures that researchers use for image reconstruction. He explained how these relationships give rise to important generalizability guarantees, a critical challenge when one employs supervised data-driven DL models in practice.
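To make the unrolling idea concrete, the following is a minimal sketch, in the spirit of variational networks, of a denoiser that unrolls a few gradient-descent steps on an energy with a learned regularizer. The layer sizes, the tanh potential, and the per-step filters are illustrative assumptions on our part, not the specific construction from Pock’s talk.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VariationalNet(nn.Module):
        # Unrolls T gradient-descent steps on E(x) = 0.5*||x - y||^2 + R(x),
        # where the gradient of the (assumed) prior R is parametrized per
        # step by learned filters K_t and a pointwise tanh: K_t^T tanh(K_t x).
        def __init__(self, steps=8, channels=16):
            super().__init__()
            self.steps = steps
            # One analysis convolution K_t per unrolled step; a transposed
            # convolution with the same weights plays the role of K_t^T.
            self.filters = nn.ModuleList(
                [nn.Conv2d(1, channels, 3, padding=1, bias=False)
                 for _ in range(steps)])
            # Learned step sizes alpha_t, one per unrolled iteration.
            self.alphas = nn.Parameter(torch.full((steps,), 0.1))

        def forward(self, y):
            x = y.clone()
            for t in range(self.steps):
                K = self.filters[t]
                # Gradient of the learned prior: K_t^T tanh(K_t x).
                prior_grad = F.conv_transpose2d(torch.tanh(K(x)),
                                                K.weight, padding=1)
                # One gradient step on data fidelity plus prior.
                x = x - self.alphas[t] * ((x - y) + prior_grad)
            return x

    # Usage (training loop omitted): denoise a toy image.
    net = VariationalNet()
    restored = net(torch.rand(1, 1, 64, 64))

Because every layer is a gradient step on an explicit energy, the architecture inherits the interpretability of its variational ancestor while leaving the regularizer free to be learned from data.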

Several plenaries demonstrated the power of fusing classical and DL approaches, utilizing DL only where classical or physics-based models fall short. Laura Waller (University of California, Berkeley) presented both the state of the art and current challenges in physics-based computational microscopy. She also offered keen insight into the specific parts of the problem that can benefit from data-driven DL models and spoke about how the fusion of these approaches is pushing boundaries. The efficacy of using data priors to train models more cleverly with smaller datasets was central to Michal Irani’s (Weizmann Institute of Science) talk, which described the ability of patch-based methods to improve a degraded image by learning intrinsically (without requiring training data that is external to the image). Irani’s presentation also illustrated how combining intrinsic learning with external data-driven DL models can supply users with the best of both worlds.
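As a toy illustration of intrinsic learning, the sketch below trains a tiny network using nothing but the degraded input image, in the spirit of zero-shot super-resolution: the training pair is the image and a re-upscaled copy of its own downscaled version. The architecture, residual formulation, and training schedule are illustrative assumptions, not the method described in the talk.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def internal_upscale(img, scale=2, iters=500):
        # img: (1, C, H, W) low-resolution tensor. Learns to upscale by
        # training only on a pair built from the image itself.
        C = img.shape[1]
        net = nn.Sequential(  # deliberately tiny image-to-image network
            nn.Conv2d(C, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, C, 3, padding=1))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        # "Son" image: the input downscaled once more. The network must map
        # its re-upscaled version back to the input, thereby learning this
        # image's own cross-scale patch statistics.
        son = F.interpolate(img, scale_factor=1 / scale, mode='bicubic')
        son_up = F.interpolate(son, size=img.shape[-2:], mode='bicubic')
        for _ in range(iters):
            opt.zero_grad()
            loss = F.mse_loss(net(son_up) + son_up, img)  # residual learning
            loss.backward()
            opt.step()
        # Apply the learned mapping to the upscaled input itself.
        big = F.interpolate(img, scale_factor=scale, mode='bicubic')
        with torch.no_grad():
            return net(big) + big

    # Usage: upscale a toy low-resolution image by a factor of two.
    sr = internal_upscale(torch.rand(1, 1, 32, 32))

The network thus learns the cross-scale patch recurrence of this particular image, which is precisely the internal statistic that patch-based methods exploit.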

The invited presentations likewise provided key insights into foundational questions that lie at the intersection of learning-based and classical approaches. William T. Freeman (Massachusetts Institute of Technology) identified the features of the human visual system that are most critical for replication in an artificial neural vision system; he connected these features with a range of examples and applications. On the computational side, Yuejie Chi (Carnegie Mellon University) bridged the gap between theory and practice in nonconvex approaches for the solution of low-rank matrix estimation problems, which are foundational in many machine learning and classical scenarios. Her talk addressed gradient descent-type algorithms with guarantees for computational complexity, statistical performance, and robustness properties while also emphasizing the need for more unified theory.
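For a concrete feel of the nonconvex formulation, here is a minimal sketch of low-rank matrix completion by gradient descent on the factored parametrization M ≈ LRᵀ with a standard spectral initialization; the step size and iteration count are illustrative choices, not the specific algorithms whose guarantees the talk analyzed.

    import numpy as np

    def low_rank_complete(Y, mask, rank, steps=500, lr=0.01):
        # Recover a low-rank matrix from the entries of Y observed where
        # mask == 1, by gradient descent on the nonconvex objective
        # f(L, R) = 0.5 * ||mask * (L @ R.T - Y)||_F^2.
        # Spectral initialization: top-r SVD of the zero-filled observations.
        U, s, Vt = np.linalg.svd(Y * mask, full_matrices=False)
        L = U[:, :rank] * np.sqrt(s[:rank])
        R = Vt[:rank].T * np.sqrt(s[:rank])
        for _ in range(steps):
            G = mask * (L @ R.T - Y)  # residual on the observed entries
            L, R = L - lr * (G @ R), R - lr * (G.T @ L)  # simultaneous step
        return L @ R.T

    # Usage on a synthetic rank-2 problem with half the entries observed.
    rng = np.random.default_rng(0)
    M = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
    mask = (rng.random(M.shape) < 0.5).astype(float)
    M_hat = low_rank_complete(M * mask, mask, rank=2)
    print(np.linalg.norm(M_hat - M) / np.linalg.norm(M))  # relative error

Despite the nonconvexity, the spectral initialization lands close enough to the truth that plain gradient descent succeeds on this toy instance; guarantees of the kind discussed in the talk make such behavior rigorous.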

A collection of minitutorials by Daniel Cremers (Technical University of Munich), Michael Moeller (University of Siegen), Jeffrey Fessler (University of Michigan), and Peyman Milanfar (Google Research) continued to build the bridge between classical and data-driven approaches by tackling applications in image restoration, medical image reconstruction, and computational photography. Minisymposia talks reiterated these themes and intertwined the classics with the new DL paradigm. The aforementioned examples provide just a snapshot of DL’s well-represented influence within the field of imaging science.

While many theoreticians initially doubted DL, this new paradigm no longer seems so unsettling, so long as scientists handle it thoughtfully. Analogues and connections to the classical body of imaging literature, ranging from vision modeling to informed DL architectures, are rich and growing. Such relationships lead to provable guarantees, as well as to efficient and well-motivated optimization tools that are critical to network training. They are also unveiling connections that turn seemingly “black box solutions” into something more akin to “illuminating approaches.”

Our community’s perspective seems less like “build it deeper and see what happens” and more like “build it carefully and seek a balance between performance, mathematical foundations, and insight.” We cannot ignore DL’s potential, nor should we try. But we are realizing that the classical knowledge and know-how in image processing and computer vision will play a central role in paving the way towards next-generation practice and understanding of DL solutions.

In light of these realizations, one might wonder whether DL has to be involved in every imaging science advancement. We do not believe that this is the case. Indeed, our community is currently making important advancements in various directions with purely classical approaches. New theoretical results in optimization, optimal transport, wave equations, harmonic analysis, variational methods, PDEs, patch-based methods, sparse representations, and other areas continue to impact the field in important ways. In fact, the imaging science community’s commitment to the classics—both within and outside the DL regime—is allowing DL to take its proper place in a productive and contextualized manner and is foundational to the field overall.

In summary, the imaging science research community is pursuing its own take on DL. It is a new playground, but we are utilizing our vast arsenal of classical skills so that we do not tackle each piece of equipment as if we have never seen it before. We are treating these new tools like playdough—molding architectures to complement our wealth of knowledge—while using DL to shape and evolve the classics, ultimately enabling the creation of things we never thought possible.

Stacey Levine and Michael Elad were the organizing committee co-chairs for the 2020 SIAM Conference on Imaging Science. Stacey Levine is a professor in the Department of Mathematics and Computer Science at Duquesne University. Michael Elad is a professor in the Computer Science Department at the Technion – Israel Institute of Technology. He is editor-in-chief of the SIAM Journal on Imaging Sciences. 
