Learning-based Faulty State Estimation in Lithium-ion Batteries

By Geetika Vennam and Avimanyu Sahoo

Next-generation lithium-ion batteries (LIBs) in electric vehicles offer high energy density, high coulombic efficiency, and low self-discharge [5]. However, safety and reliability remain major concerns for these devices. A battery management system (BMS) controls battery charging and monitors internal states—such as the state of charge (SOC) and state of health (SOH)—to ensure battery safety and power. Accurate battery models and effective SOC and SOH estimation algorithms are crucial for battery behavior analysis and thermal management, ultimately extending the battery’s runtime and preventing overcharge/over-discharge. Advanced BMSs also include fault diagnosis algorithms that detect faults like voltage drift, overcharge/over-discharge current, and high temperatures [3].

Figure 1. State of health (SOH)-coupled model of a lithium-ion battery. 1a. Equivalent circuit model. 1b. Thermal model. 1c. Capacity fade model. Figure courtesy of Geetika Vennam.

Several recent studies use a two-state electrothermal model to detect internal thermal faults. These works incorporate core and surface temperatures as residuals and utilize adaptive thresholds to account for uncertainties in the battery model [1, 2, 9]. They employ SOC information [1] and internal resistance estimates [2, 9] to compute the core temperature, which is not directly measurable. Furthermore, one study relies on an online estimator of the battery’s internal resistance to represent changes in core temperature due to a fault, which means that the algorithm requires knowledge of the physics of the failure mechanism [9].

Although these model-based schemes can detect internal thermal faults, they do not consider SOH effects on the equivalent circuit model (ECM) or the battery’s thermal dynamics. The ECM and thermal parameters vary with internal degradation phenomena and external factors like \(C_{\text{rate}}\), temperature, and depth of discharge, which may render these fault detection schemes ineffective [7]. In addition, core temperature estimates must include the change in resistance and SOC due to normal and accelerated health degradation and internal faults. The aforementioned models [1, 2, 9] also do not address side reaction faults, such as dendrite growth, solid electrolyte interphase layer formation, and lithium plating [4, 7].

An SOH-coupled ECM thermal aging model for fault detection that incorporates adaptive residual thresholds can help reduce the number of false alarms by accounting for battery degradation [6]. This type of model may also improve the accuracy of key parameter estimations under fault conditions. To improve BMS capability and autonomy, we propose an SOH-coupled electrothermal-aging-model-based fault detection scheme and a neural network (NN)-based faulty state estimation scheme [6].

Model Formulation

Figure 1 depicts an SOH-coupled model that is integrated with ohmic resistance and terminal voltage dynamics for fault detection [8]. When we define the state vector \(x = [x_1~~x_2~~x_3~~x_4~~x_5~~x_6~~x_7~~x_8]^T \) with \(x_1=SOC\), \(x_2=V_{c_{p1}}\), \(x_3=V_{c_{p2}}\), \(x_4=T_c\), \(x_5=T_s\), \(x_6=SOH\), \(x_7=R_0\), and \(x_8=V_t\), we can then write the state space representation of the SOH-coupled nonlinear electrothermal aging model in control affine form as

\[\begin{aligned} \dot{x}& = f(x) + g(x)u\\ y &= Cx. \end{aligned} \tag1\]

Here, the state vector is \(x \in \mathbb{R}^8\) and the control input is \(u=[u_1 ~u_2]^T\in\mathbb{R}^2\), with \(u_1=I\) and \(u_2=T_a\).

Internal Fault Mapping

We consider the detection of four types of faults (convective cooling resistance faults, internal thermal resistance faults, thermal runaway faults [1], and side reaction faults [7]) via the state space model in \((1)\) (see Figure 2). However, internal faults also affect parameters that pertain to state \(x_2\), which in turn affects the states \(x_4\), \(x_5\), \(x_6\), \(x_7\), and \(x_8\). As such, we represent internal faults as additive faults on all states of the model \((1)\).

The state space model \((1)\) with faults is

\[\begin{aligned} \dot{x}_f& = f(x_f) + g(x_f)u + {\Gamma} (x_f,u)\\ y_f &= Cx_f, \end{aligned} \tag2\]

where \(x_f\) are the faulty states and \(\Gamma (x_f,u) = [\gamma_1(t) ~\gamma_2(t) ~\gamma_3(t) ~\gamma_4(t) ~\gamma_5(t) ~\gamma_6(t) ~\gamma_7(t) ~\gamma_8(t) ]^T\) represents the faults that are added to the battery dynamics [8].
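The additive fault structure in \((2)\) can be illustrated with a minimal Python sketch. The two-state drift `f`, input map `g`, fault magnitude, and fault time below are hypothetical placeholders and not the eight-state battery model of [8]; the sketch only shows how \(\Gamma\) enters the right-hand side during a forward-Euler integration.

```python
import numpy as np

def f(x):
    # Hypothetical drift term: a stable linear map standing in for the battery model.
    A = np.array([[-0.5, 0.1], [0.0, -0.2]])
    return A @ x

def g(x):
    # Hypothetical input matrix (state-independent here for simplicity).
    return np.array([[1.0, 0.0], [0.0, 0.2]])

def gamma(t, t_fault=5.0, magnitude=0.3):
    # Abrupt additive fault on the second state, switched on at t_fault (assumed values).
    return np.array([0.0, magnitude]) if t >= t_fault else np.zeros(2)

def simulate(x0, u, t_end=10.0, dt=0.01, faulty=False):
    # Forward-Euler integration of x_dot = f(x) + g(x) u, plus Gamma when faulty.
    steps = int(round(t_end / dt))
    x = np.array(x0, dtype=float)
    traj = np.empty((steps + 1, x.size))
    traj[0] = x
    for k in range(steps):
        dx = f(x) + g(x) @ u
        if faulty:
            dx = dx + gamma(k * dt)
        x = x + dt * dx
        traj[k + 1] = x
    return traj

u = np.array([0.1, 0.5])
healthy = simulate([1.0, 0.0], u)
faulted = simulate([1.0, 0.0], u, faulty=True)
```

The two trajectories coincide until the fault time and separate afterwards, which is the behavior that a residual-based detector exploits.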

Figure 2. Various fault mappings. Figure courtesy of the authors.

SOH-coupled Model-based Fault Detection Scheme

Figure 3 illustrates our proposed SOH-inclusive model-based fault diagnostics scheme. A nonlinear observer estimates the states of the SOH-coupled model, after which a comparison between the battery outputs (terminal voltage and surface temperature) and observer outputs generates two residuals. We then compare these residuals against an adaptive threshold value to detect the fault.

To estimate the healthy states via a nonlinear observer, we can rewrite the dynamics in \((1)\) as

\[\begin{aligned} \dot{x}& = Kx + f(x)- Kx + g(x)u\\ &= Kx +\Pi(x) + g(x)u, \end{aligned} \tag3\]

where \(K\in \mathbb{R}^{n\times n}\) is a Hurwitz matrix and \(\Pi(x)=f(x)-Kx\). According to \((3)\), we can represent the nonlinear observer with

\[\begin{aligned} \dot{\hat{x}} & =K\hat x+\Pi(\hat{x})+{g}(\hat{x})u+L^T(y-\hat{y}) \\ \hat{y} & = C\hat{x}, \end{aligned} \tag4\]

where \(\hat{x}\) is the estimated state, \(L \in \mathbb{R}^{m \times n}\) is the observer gain, and \(\Pi(\hat{x}) = {f}(\hat{x})-K\hat{x}\) [8].
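As a rough illustration of the observer in \((4)\), the following Python sketch runs it on a hypothetical two-state system with one measured output. The matrices `A`, `G`, `C`, `K`, and `L` and the `tanh` nonlinearity are assumptions chosen for the example, not values from the battery model, and the gain was picked by hand so that the estimation error decays.

```python
import numpy as np

# Hypothetical two-state system; none of these numbers come from the battery model.
A = np.array([[-0.5, 0.1], [0.0, -0.2]])
G = np.array([[1.0, 0.0], [0.0, 0.2]])
C = np.array([[0.0, 1.0]])      # only the second state is measured (m = 1)
K = -np.eye(2)                  # a Hurwitz matrix, chosen arbitrarily
L = np.array([[0.5, 2.0]])      # observer gain with shape (m, n), so L.T is (n, m)

def f(x):
    return A @ x + 0.1 * np.tanh(x)

def Pi(x):
    # Pi(x) = f(x) - K x, as in (3)
    return f(x) - K @ x

def observer_rhs(x_hat, y, u):
    # Right-hand side of (4): K x_hat + Pi(x_hat) + g(x_hat) u + L^T (y - y_hat)
    y_hat = C @ x_hat
    return K @ x_hat + Pi(x_hat) + G @ u + L.T @ (y - y_hat)

dt, steps = 0.01, 2000
u = np.array([0.1, 0.5])
x = np.array([1.0, 0.0])        # true state
x_hat = np.array([0.5, 0.3])    # deliberately wrong initial estimate
residuals = []
for _ in range(steps):
    y = C @ x
    residuals.append((y - C @ x_hat).item())   # output residual y - y_hat
    x_hat = x_hat + dt * observer_rhs(x_hat, y, u)
    x = x + dt * (f(x) + G @ u)

final_error = np.linalg.norm(x - x_hat)
```

In this toy setting, the residual starts at the initial output mismatch and shrinks as the estimate converges; in the scheme above, the analogous residuals are the quantities that are compared against the thresholds.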

Figure 3. Fault detection scheme of a lithium-ion cell. Figure courtesy of Geetika Vennam.

Adaptive Threshold Design for Fault Detection

We propose adaptive thresholds [1, 10] to account for the changes in residuals due to aging, shifts in operating temperature, and unmodeled dynamics (uncertainties). We calculate the adaptive thresholds \(R_{es_{1th}}\) and \(R_{es_{2th}}\) as

\[\begin{aligned} & R_{es_{1th}}= \tilde{y}_1(0)e^{- \sigma_{5}t}+\Psi_1 \\ & R_{es_{2th}}=\tilde{y}_2(0)e^{- \sigma_{8}t}+\Psi_2. \\ \end{aligned} \tag5\]

Here, we calculate \(\Psi_1,\Psi_2\) from the filter dynamics:

\[\begin{aligned} \dot{\Psi}_1 & =- \sigma_{5}\Psi_1+\eta_{5\text{max}},\\ \dot{\Psi}_2 & =- \sigma_{8} \Psi_2 +\eta_{8\text{max}}, \end{aligned} \tag6\]

where \(\sigma_5\) and \(\sigma_8\) are the convergence rates under no-fault conditions. The adaptive thresholds in \((5)\) prevent the generation of false alarms in the presence of parameter changes due to aging and modeling uncertainty, provided that the uncertainty terms satisfy \(\eta_i(x,u) \leq \eta_{i\text{max}}~\forall i=1,2, \cdots, 8\) [8].
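The threshold in \((5)\) and the filter in \((6)\) can be sketched numerically as follows. The values of `sigma`, `eta_max`, the initial output error, and the synthetic residual are hypothetical; the sketch also assumes that the threshold is compared against the residual magnitude, so it uses the absolute value of \(\tilde{y}(0)\).

```python
import math

def adaptive_threshold(y_tilde0, sigma, eta_max, t_end, dt, psi0=0.0):
    """Threshold |y_tilde(0)| exp(-sigma t) + Psi(t), with Psi from forward Euler on (6)."""
    steps = int(round(t_end / dt))
    psi = psi0
    thresholds = []
    for k in range(steps + 1):
        t = k * dt
        thresholds.append(abs(y_tilde0) * math.exp(-sigma * t) + psi)
        psi += dt * (-sigma * psi + eta_max)   # Psi_dot = -sigma Psi + eta_max
    return thresholds

def fault_flags(residuals, thresholds):
    # A fault is flagged whenever the residual magnitude exceeds the threshold.
    return [abs(r) > th for r, th in zip(residuals, thresholds)]

# Hypothetical numbers for illustration only.
sigma, eta_max, dt, t_end = 0.5, 0.01, 0.01, 40.0
thresholds = adaptive_threshold(0.05, sigma, eta_max, t_end, dt)

# Synthetic residual: small before step 3000 (t = 30 s), larger afterwards.
residuals = [0.001 if k < 3000 else 0.05 for k in range(len(thresholds))]
flags = fault_flags(residuals, thresholds)
first_alarm = flags.index(True)
```

With these numbers, the threshold decays from the initial output error towards the steady-state value `eta_max / sigma`, so a larger uncertainty bound or a slower convergence rate widens the band.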

Faulty State Estimation Via a Learning-based Observer

After the detection of an internal fault, we can implement a second observer that estimates the faulty states. In the state space representation of the LIB under fault conditions in \((2)\), the fault vector function \(\Gamma(x_f,u)\) is difficult to model and often comprises an unknown complex nonlinear function of the states and control input. Modeling the fault function also requires the physics of the failure mechanism, which is difficult to predict.

Figure 4. Two-layer neural network that approximates the nonlinear fault function. Figure courtesy of [8].

We can invoke the universal approximation property to approximate this function in a compact set. Specifically, we represent the function \(\Gamma(x_f,u)\) in parametric form via a multilayer NN called a random vector functional link NN (see Figure 4). In this architecture, we randomly select the input layer weights \(V\) during initialization and keep them fixed, so that only the output layer weights \(\theta\) are updated. The estimated states, measured states, and control input serve as the NN’s input. We can use prior information about the system to select the activation function \(\sigma(\cdot)\) to form a basis for the approximation. In terms of NN weights, we express the fault vector function as

\[\Gamma (x,u)=\theta^T\sigma(V^T,x)+\epsilon(x,u), \tag7\]

where \(\theta=\mbox{diag}\{{\theta}_{\gamma_1},{\theta}_{\gamma_2},\cdots,{\theta}_{\gamma_8}\}\) is the unknown target NN weight matrix with each \(\theta_{\gamma_i}^T\in \mathbb{R}^{1\times l}\). The basis or activation function is denoted by \(\sigma=\left[\begin{array}{cccccccc} \sigma_{\gamma_1}^T & \sigma_{\gamma_2}^T& \cdots & \sigma_{\gamma_8}^T\end{array}\right]^T_{ln\times 1}\), with each \(\sigma_{\gamma_i}\in \mathbb{R}^{l\times 1}, \forall i=1,...,8\). Here, \(l\) is the number of neurons in the NN architecture and \(\epsilon(x,u)\) is the approximation error.
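A random vector functional link approximator of the form \((7)\) might look like the Python sketch below. Several choices are assumptions made for illustration: the layer sizes, the `tanh` activation, the bias row in `V`, the made-up target function `toy_fault`, and a single hidden basis shared by all outputs (so that the output weights form one \(l \times n\) matrix rather than the block-diagonal arrangement in the text). A batch least-squares fit stands in for the online update law, purely to show that only the output weights are trained.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, l = 3, 2, 30    # hypothetical sizes: input [x; u], fault outputs, neurons

# Input-layer weights (with a bias row) are drawn once and never trained.
V = rng.normal(size=(n_in + 1, l))

def basis(z):
    """sigma(V^T [z; 1]) for a batch of inputs z with shape (N, n_in)."""
    z_aug = np.hstack([z, np.ones((z.shape[0], 1))])
    return np.tanh(z_aug @ V)          # shape (N, l)

def toy_fault(z):
    # Made-up smooth fault function used only to exercise the approximator.
    return np.column_stack([0.3 * np.sin(z[:, 0]), 0.1 * z[:, 1] * z[:, 2]])

Z = rng.uniform(-1.0, 1.0, size=(200, n_in))
Phi = basis(Z)
Y = toy_fault(Z)

# Batch least squares replaces the online update law here; only theta is fitted.
theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)   # shape (l, n_out)

def gamma_hat(z):
    # Approximation theta^T sigma(.), evaluated row-wise for a batch.
    return basis(z) @ theta

rmse = np.sqrt(np.mean((gamma_hat(Z) - Y) ** 2))
baseline = np.sqrt(np.mean(Y ** 2))    # error of the all-zero predictor
```

Because `V` is fixed, the approximation is linear in `theta`, which is what makes an adaptive, Lyapunov-based weight update tractable in the observer setting.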

Using this approximation of the fault function, we can express the fault dynamics \((2)\) of the cell as

\[\begin{aligned} \dot{{x}}_f&= Kx_f +\Pi(x_f) + g(x_f)u+\theta^T \sigma(x_f,u)+\epsilon(x_f,u) \\ y_f&=Cx_f. \end{aligned} \tag8\]

Here, \(x_f\) is the faulty state of the model, \(y_f=[y_{1f}~y_{2f}]^T\) is the faulted battery output, and \(\epsilon(x_f,u)\) is the approximation error. We can subsequently write the NN-based fault observer as

\[\begin{aligned} &\dot{\check{x}}=K \check {x}+\Pi(\check{x})+ g (\check{x})u+\hat{\theta}^T \sigma(\check{x},u)+{L}^T C\tilde{x} \\ &\check{y} = C\check{x}, \end{aligned} \tag9\]

where \(\check{x}\) and \(\check{y}\) are respectively the state and output estimation vectors in the presence of faults and \(\hat{\theta}\) represents the estimated weights of the NN’s output layer. The state estimation error \(\tilde{x}=x_f-\check{x}\) is the difference between the faulty battery state and observer state in \((9)\).

In the context of LIBs, a major challenge when training NN weights is the limited availability of measurements to design the update law: The only accessible outputs are the battery’s terminal voltage and surface temperature under fault. To address this challenge, we employ the healthy states \((\hat{x})\) estimated by the nonlinear observer in \((4)\) as a substitute for the unmeasurable states, in addition to the faulty measured output, in order to account for the aging-based change in cell parameters. The augmented output vector is a combination of the healthy states and faulty measured outputs: \(\bar{X}=[\hat{SOC},\hat{V}_{cp1},\hat{V}_{cp2},\hat{T}_c, y_{1f},\hat{SOH},\hat{R}_0, y_{2f}]^T\). We then employ the augmented error vector \(\Xi=\bar{X}-\check{x}\) to tune the NN weight estimates.

The weight update law is based on the subsequent stability analysis:

\[\dot{\hat{\theta}}=-\sigma(\check{x},{u})\Xi^T\upsilon-\sigma(\check{x},{u})\sigma(\check{x},{u})^T \hat{\theta}\Upsilon, \tag{10}\]

where \(\upsilon\) and \(\Upsilon \in \mathbb{R}^{n \times n}\) are the learning gains [8].
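For a discrete-time implementation, one forward-Euler step of \((10)\) could be written as in the Python sketch below. It transcribes the two terms of \((10)\) as printed and builds the augmented error \(\Xi\) from the definition of \(\bar{X}\); it does not reproduce the stability analysis of [8]. The shared-basis weight shape \(l \times n\), the gain values, and the step size are assumptions made for illustration.

```python
import numpy as np

MEASURED_IDX = (4, 7)   # zero-based positions of y_1f (x_5) and y_2f (x_8) in X_bar

def augmented_error(x_hat, y_f, x_check):
    """Xi = X_bar - x_check, where X_bar is x_hat with the measured entries replaced by y_f."""
    x_bar = np.array(x_hat, dtype=float)
    x_bar[list(MEASURED_IDX)] = y_f
    return x_bar - np.asarray(x_check, dtype=float)

def weight_update_step(theta_hat, sigma_vec, xi, upsilon, Upsilon, dt):
    """One Euler step of theta_dot = -sigma Xi^T upsilon - sigma sigma^T theta_hat Upsilon."""
    error_term = np.outer(sigma_vec, xi) @ upsilon                         # shape (l, n)
    damping_term = np.outer(sigma_vec, sigma_vec) @ theta_hat @ Upsilon    # shape (l, n)
    return theta_hat + dt * (-error_term - damping_term)

# Hypothetical values for illustration only.
n, l, dt = 8, 2, 0.1
upsilon = 2.0 * np.eye(n)
Upsilon = 0.1 * np.eye(n)
sigma_vec = np.array([1.0, 2.0])

xi = augmented_error(np.arange(8.0), [10.0, 20.0], np.zeros(8))
theta_next = weight_update_step(np.zeros((l, n)), sigma_vec, xi, upsilon, Upsilon, dt)
```

The second term acts only on the current weight estimate, so it pulls the weights towards zero along the direction of the active basis vector and keeps them bounded when the error signal is small.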

Figure 5. Residual responses under a convective cooling resistance fault that is injected at 206 seconds. 5a. Surface temperature error. 5b. Output voltage error. Figure courtesy of Geetika Vennam.

Simulation Results

We verified the effectiveness of the fault detection scheme by injecting an abrupt convective cooling resistance fault of \(0.4R_u\) at \(t=206\) seconds. As evidenced in Figure 5, the residuals remain within the adaptive threshold value before the occurrence of the fault, which validates the estimation accuracy of the nonlinear observer. The \(T_s\) and \(V_t\) residuals exceed the adaptive threshold value upon the fault’s occurrence, hence implying the detection of a convective cooling resistance fault and its diagnosis at an incipient stage.

After detection of the fault, the NN-based observer is deployed to learn the convective cooling resistance fault in an online fashion. Once the fault is detected at \(t= 209\) seconds, the NN learns the fault dynamics, as verified by the convergence of the \(T_s\) and \(V_t\) residuals towards zero with respective root mean square errors (RMSEs) of \(0.0041\) and \(0.0036\) (see Figures 6a and 6b). The residuals thus return to below the threshold value. Figure 6c illustrates the state estimation errors’ convergence towards zero with RMSEs within the one percent band, and Figure 6d shows the convergence of the estimated NN weights.

Figure 6. Residual responses with a neural network (NN) under a convective cooling resistance fault that is injected at 206 seconds. 6a. Surface temperature error. 6b. Output voltage error. 6c. State estimation error. 6d. NN weight estimation. Figure courtesy of Geetika Vennam.

Conclusions

Our NN-based fault detection scheme finds thermal and side reaction faults in LIBs. A nonlinear observer estimates the states and parameters of the SOH-coupled electrothermal aging model, while a comparison between the observer’s output residuals and an adaptive threshold detects thermal and side reaction faults. Faulty state estimation with an NN-based observer demonstrates the convergence of the state estimation errors and NN weight updates, which indicates the learning-based scheme’s potential to learn the system’s internal fault dynamics and accurately estimate the battery’s core temperature, SOC, and SOH during faults. This fault detection and learning algorithm can inspire new directions in battery health monitoring, power/energy management, and risk mitigation for pack overheating and thermal runaway due to internal faults.

Geetika Vennam delivered a contributed presentation on this research at the 2023 SIAM Conference on Applications of Dynamical Systems, which took place in Portland, Ore., last year.

References
[1] Dey, S., Biron, Z.A., Tatipamula, S., Das, N., Mohon, S., Ayalew, B., & Pisu, P. (2016). Model-based real-time thermal fault diagnosis of lithium-ion batteries. Control Eng. Pract., 56, 37-48.
[2] Dong, G., & Lin, M. (2021). Model-based thermal anomaly detection for lithium-ion batteries using multiple-model residual generation. J. Energy Storage, 40, 102740.
[3] Hannan, M.A., Lipu, M.S.H., Hussain, A., & Mohamed, A. (2017). A review of lithium-ion battery state of charge estimation and management system in electric vehicle applications: Challenges and recommendations. Renew. Sustain. Energy Rev., 78, 834-854.
[4] Hu, X., Zhang, K., Liu, K., Lin, X., Dey, S., & Onori, S. (2020). Advanced fault diagnosis for lithium-ion battery systems: A review of fault mechanisms, fault features, and diagnosis procedures. IEEE Ind. Electron. Mag., 14(3), 65-91.
[5] Tarascon, J.-M., & Armand, M. (2010). Issues and challenges facing rechargeable lithium batteries. In V. Dusastre (Ed.), Materials for sustainable energy: A collection of peer-reviewed research and review articles from Nature Publishing Group (pp. 171-179). Singapore: World Scientific Publishing.
[6] Vennam, G., Sahoo, A., & Ahmed, S. (2022). A novel coupled electro-thermal-aging model for simultaneous SOC, SOH, and parameter estimation of lithium-ion batteries. In 2022 American Control Conference (ACC) (pp. 5259-5264). Atlanta, GA: Institute of Electrical and Electronics Engineers.
[7] Vennam, G., Sahoo, A., & Ahmed, S. (2022). A survey on lithium-ion battery internal and external degradation modeling and state of health estimation. J. Energy Storage, 52(A), 104720.
[8] Vennam, G., Sahoo, A., & Yen, G.G. (2023). Learning-based faulty state estimation using SOH-coupled model under internal thermal faults in lithium-ion batteries. IEEE Trans. Transp. Electrif.
[9] Wei, J., Dong, G., & Chen, Z. (2019). Lyapunov-based thermal fault diagnosis of cylindrical lithium-ion batteries. IEEE Trans. Ind. Electron., 67(6), 4670-4679.
[10] Zhang, X., Polycarpou, M.M., & Parisini, T. (2002). A robust detection and isolation scheme for abrupt and incipient faults in nonlinear systems. IEEE Trans. Automat. Control, 47(4), 576-593.

Geetika Vennam holds a master’s degree and Ph.D. in electrical engineering from Oklahoma State University and an Integrated M.Tech. degree in electrical engineering from the Shanmugha Arts, Science, Technology & Research Academy. She is a postdoctoral researcher at Idaho National Laboratory. Vennam’s research interests are lithium-ion battery fault diagnostics and prognostics, fast charging in lithium-ion batteries, and adaptive control.

Avimanyu Sahoo holds a Ph.D. in electrical engineering from Missouri University of Science and Technology and an M.Tech. from the Indian Institute of Technology (BHU) Varanasi. He is currently an assistant professor in the Department of Electrical and Computer Engineering at the University of Alabama in Huntsville and was previously an associate professor in the Division of Engineering Technology at Oklahoma State University. Sahoo’s research focuses on the development of intelligent battery management systems for lithium-ion battery packs and communication-efficient distributed intelligent control schemes for cyber-physical systems using approximate dynamic programming, reinforcement learning, and distributed adaptive state estimation.