| September 03, 2019

Smoothing and Thresholding: A Flexible Methodology for Image Segmentation

By Xiaohao Cai, Raymond Chan, Hongfei Yang, and Tieyong Zeng

Image segmentation aims to group objects with similar characteristics within an image. It is a fundamental task of image processing and computer vision, and has numerous engineering, medical, and commercial applications. Segmentation also serves as a preliminary step for higher-level computer vision tasks, like object recognition and interpretation.

The Mumford-Shah model, which achieves segmentation by solving a minimization problem, is one of the most influential segmentation models. However, the highly non-convex minimization problem makes the model numerically difficult to solve. A recently proposed smoothing and thresholding (SaT) model [1, 3, 7]—based on the Mumford-Shah model—provides a flexible methodology to produce superior segmentation results with fast and reliable numerical implementations.

For ease of discussion, we first consider grayscale image segmentation. One can represent a grayscale image with a function \(f\: : \: \Omega \rightarrow \mathbb{R}\), where \(\Omega\) is the image domain and \(f(x)\) is the image's density at \(x\). We aim to decompose the domain \(\Omega\) into \(K\) disjoint sub-domains \(\{\Omega_i\}^K_{i=1}\), where \(K\) depicts the number of phases. The SaT model is the first model to demonstrate that smoothing the observed image \(f\) before thresholding the smoothed image to obtain the subdomains can yield desirable segmentation results [3, 7]. The smoothing process in [3] solves the convex minimization problem

\[\inf_{g\in W^{1,2}(\Omega)} \Big\{\frac{\lambda}{2} \int_\Omega (f- Ag)^2 \textrm{d}x + \frac {\mu}{2} \int_\Omega |\bigtriangledown g|^2 \textrm{d}x + \int_\Omega |\bigtriangledown g| \textrm{d}x \Big\}, \tag1 \]

where \(\lambda\) and \(\mu\) are positive parameters. \(A\) is the blurring operator if the observed image is blurred; it represents the identity operator in the absence of blurring. The minimizer of \((1)\) is a smoothed approximation of \(f\). The first term is the data-fitting term, the second term ensures smoothness of the minimizer, and the third term guarantees regularity of the minimizer’s level sets.

After we obtain \(g\) in \((1)\), assume that we are given the thresholds

\[ \textrm{min}\{g\} = \rho_0 < \rho_1 < \cdots < \rho_{K -1} < \rho_K = \textrm{max} \{g\}.\]

We then threshold \(g\) by setting \(x \in \Omega\) in the subdomain \(\Omega_i\) if \(\rho_{i-1} \le g(x) < \rho_i\). Applying a popular clustering method—called the K-means method—on the intensity of \(g\) can yield the values \(\{\rho_i\}^{K-1}_{i=1}\). One can also obtain them by trial and error to produce a finer segmentation (see Figure 1).

Figure 1. Segmentations with Gaussian noise and blur. 1a. Original image. 1b. Degraded image. 1c. Chan-Vese method [9]. 1d. Smoothing and thresholding (SaT) segmentation with K-means thresholding [3]. Image adapted from [3].

The SaT approach has several advantages. First, the smoothing model with \((1)\) is strictly convex, which guarantees a unique solution that many optimization methods can efficiently solve. Secondly, the thresholding step is independent of the smoothing step; this lets the SaT approach accomplish segmentations with arbitrary phases, allowing one to easily try different thresholds without recalculating \((1)\). On the contrary, other segmentation methods require determination of the number of phases \(K\) prior to calculation, making it computationally expensive to regenerate a new segmentation for changes to \(K\).

Figure 2. Segmentations of a fractal image corrupted with Gamma noise and blur. 2a. Degraded image. 2b. Max-flow method [16]. 2c. Frame-based method [10]. 2d. Smoothing and thresholding (SaT) with user-provided thresholds [7]. For clarity, only the upper-left corner of the segmentations are shown. We see that the SaT method produces the best result, with the yellow segmentation line closest to the real boundary. Image adapted from [7].

The SaT approach is also very flexible. One can easily modify the smoothing step to better segment images with specific properties. For example, [7] applies a modified SaT method to images corrupted with multiplicative noises and blurs, and [13] uses a modified SaT method on tubular and medical images (see Figure 2). We combined the SaT approach with a Retinex model—which explains how human eyes perceive constant colors under various illuminations—to obtain a segmentation method for images with intensity inhomogeneity (see Figure 3) [8].

Figure 3. Three phase segmentations with intensity inhomogeneity. 3a. Original image. 3b. Level-set method [12]. 3c. Smoothing and thresholding (SaT) with Retinex segmentation [8]. The sky and the ground both exhibit uneven illuminations. Our SaT method with Retinex can successfully separate the three phases. Image adapted from [8].

There is clear evidence of the SaT approach’s superior performance. If we set the parameter \(\mu\) in \((1)\) to zero, we can show that the SaT method is equivalent to the famous Chan-Vese segmentation method, which is a simplified Mumford-Shah model [4]. Furthermore, numerical experiments show that a properly-selected \(\mu\) typically increases segmentation accuracies. Extending grayscale image segmentation techniques to methods for color images is not inconsequential. A color image is usually represented by a vector-valued function \(f = (f_1, f_2, f_3) : \Omega \rightarrow \mathbb{R}^3\), where \(f_1\), \(f_2\), and \(f_3\) generally signify red, green, and blue channels respectively. The difficulty of color image segmentation partly stems from the strong inter-channel correlation. The smoothing, lifting, and thresholding (SLaT) method is a novel extension of the SaT approach [1]. One first solves \((1)\) for \(f_1\), \(f_2\), and \(f_3\) to obtain three smooth functions: \(g_1\), \(g_2\), and \(g_3\). This is followed by transformation of \((g_1, g_2, g_3)\) to another color space \((\bar{g_1},\bar{g_2},\bar{g_3})\), which can reduce inter-channel correlation. The Lab color space is generally a good choice for this lifting process. During the thresholding step, performing K-means thresholds the lifted image with six channels \((g_1, g_2, g_3,\bar{g_1},\bar{g_2},\bar{g_3})\) to obtain the respective phases. The SLaT method is easy to implement and yields promising results (see Figure 4).

Figure 4. Color image segmentation for degraded images. The top row depicts degraded color images. The first three images are degraded by various noise and blur, and the last two are degraded by 60 percent information loss and noise. The middle row illustrates the convex minimal partition method [14], and the bottom row portrays the smoothing, lifting, and thresholding (SLaT) method [1]. Image adapted from [1].

Model \((1)\) in the smoothing step is also used in [15] (with \(\mu=0\)) and [11] for image restoration. Given this unexpected relationship, we recently explored smoothing steps where advanced image restoration models inspired minimization models. For example, the authors of [6] replace the last two terms (the regularization terms) in \((1)\) with a carefully-designed, non-convex regularization term. Non-convex regularization terms are known to produce better results in image restoration. Precise construction enables the new smoothing step to maintain an overall convex objective function with non-convex regularization (the convex non-convex (CNC) approach). Numerical experiments demonstrate improvements in segmentation results via the SaT approach with CNC regularization.

Figure 5. Hyperspectral image classification of the Indian Pines data set. 5a. Ground truth. 5b. Training set (10 percent of total pixels). 5c. Classification with smoothing and thresholding (SaT) (98.83 percent overall accuracy). Image adapted from [5].

Recent studies successfully apply the SaT approach to classification tasks, which assign a label to each data point. The core of a wide range of classification methods is a support-vector machine (SVM). One first trains the SVM on a set of training data to understand the relationship between labels and data points. After training, feeding the SVM data points from a test data set obtains scores for each label. The label with the largest score is assigned to the corresponding new data point. Our studies indicate that combining the SaT approach with SVM models can improve classification accuracies and achieve state-of-the-art results. We have tested classification models with SVM for hyperspectral image classifications and point cloud classifications (see Figures 5 and 6).

Figure 6. Three-moon classification. 6a. Ground truth. 6b. Training set (five percent of total data points). 6c. Classification with smoothing and thresholding (SaT) (99.5 percent accuracy) Image adapted from [2].

In conclusion, the SaT technique provides an efficient, flexible methodology for image segmentation and is easily adapted for various segmentation tasks. It connects the segmentation problem with the image restoration problem. Recent research also demonstrates the SaT method’s application to classification problems. We hope to inspire use of the method in more cross-disciplinary work.

Raymond Chan presented this work during an invited talk at the 2018 SIAM Conference on Imaging Science, held last year in Bologna, Italy.

References
[1] Cai, X., Chan, R., Nikolova, M., & Zeng, T. (2017). A three-stage approach for segmenting degraded color images: smoothing, lifting and thresholding (SLaT). J. Sci. Comput., 72(3), 1313-1332.
[2] Cai, X., Chan, R., Xie, X., & Zeng, T. (2019). A SaT strategy based segmentation method for high-dimensional data and point clouds. To be published.
[3] Cai, X., Chan, R., & Zeng, T. (2013). A two-stage image segmentation method using a convex variant of the Mumford-Shah model and thresholding. SIAM J. Imag. Sci., 6(1), 368-390.
[4] Cai, X., & Steidl, G. (2013). Multiclass Segmentation by Iterated ROF Thresholding. Energy Minimization Methods in Computer Vision and Pattern Recognition 2013, 237-250.
[5] Chan, R., Kan, K.K., Nikolova, M., & Plemmons, R.J. (2018). A two-stage method for spectral-spatial classification of hyperspectral images. Preprint, arXiv:1806.00836.
[6] Chan, R., Lanza, A., Morigi, S., & Sgallari, F. (2018). Convex non-convex image segmentation. Numer. Math., 138(3), 635-680.
[7] Chan, R., Yang, H., & Zeng, T. (2014). A two-stage image segmentation method for blurry images with Poisson or multiplicative gamma noise. SIAM J. Imag. Sci., 7(1), 98-127.
[8] Chan, R., Yang, H., & Zeng, T. (2019). Total variation and tight frame image segmentation with intensity inhomogeneity. Preprint, arXiv:1904.01760.
[9] Chan, T.F., & Vese, L.A. (2001). Active contours without edges. IEEE Trans. Imag. Process., 10(2), 266-277.
[10] Dong, B., Chien, A., & Shen, Z. (2011). Frame based segmentation for medical images. Comm. Math. Sci., 9(2), 551-559.
[11] Hintermüller, M., & Stadler, G. (2006). An infeasible primal-dual algorithm for total bounded variation-based inf-convolution-type image restoration. SIAM J. Sci. Comput., 28(1), 1-23.
[12] Li, C., Huang, R., Ding, Z., Gatenby, J.C., Metaxas, D.N., & Gore, J.C. (2011). A level set method for image segmentation in the presence of intensity inhomogeneities with application to MRI. IEEE Trans. Image Process., 20(7), 2007-2016.
[13] Liu, C., Ng, M.K.-P., & Zeng, T. (2018). Weighted variational model for selective image segmentation with application to medical images. Patt. Recog., 76, 367-379.
[14] Pock, T., Chambolle, A., Cremers, D., & Bischof, H. (2009). A convex relaxation approach for computing minimal partitions. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 810-817). Miami, FL: IEEE.
[15] Rudin, L.I., Osher, S., & Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D: Nonlin. Phenom., 60(1), 259-268.
[16] Yuan, J., Bae, E., & Tai, X.-C. (2010). A study on continuous max-flow and min-cut approaches. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 2217-2224). San Francisco, CA: IEEE.

Xiaohao Cai is a research associate at the University College London’s Mullard Space Science Laboratory. His research interests focus on numerical analysis, inverse problems in imaging science, and their applications. Raymond Chan is the Dean of Science at the City University of Hong Kong. His research interests encompass numerical linear algebra, imaging science, and financial mathematics. Hongfei Yang is a senior research associate in the Department of Mathematics at the City University of Hong Kong. His research interests include imaging science, machine learning, and function algebras. Tieyong Zeng is an associate professor in the Department of Mathematics at the Chinese University of Hong Kong. His research interests focus on the mathematics of data science, including image processing, inverse problems, optimization, and machine learning.