\[\begin{equation}\tag{1} CSS_j = \{\sum_{i=1,n} ~~[\sum_{k=1,n} (\partial y'_k/\partial b_j)b_j(\omega^{1/2})_{ki}]^2\}^{1/2},\\ j = 1,np; \end{equation}\]
\[\begin{equation}\tag{2} PCC_{k,j} = {\boldsymbol{V}}_{k,j} ({\boldsymbol{b}})/[{\boldsymbol{V}}_{j,j} ({\boldsymbol{b}})^{1/2}{\boldsymbol{V}}_{k,k} ({\boldsymbol{b}})^{1/2}],\\ {\boldsymbol{V}}_{k,j} ({\boldsymbol{b}}) = [s^2({\boldsymbol{X}}^T{\boldsymbol{\omega}} {\boldsymbol{X}})^{-1}]_{k,j}, ~~k = 1,np; ~~ j = 1,np. \end{equation}\]
For the results presented in this work, \(n\) is the number of observations; \((\partial y'_k/\partial b_j)\) the sensitivity of the \(k\)th simulated value \(y'_k\) to the \(j\)th parameter \(b_j\); and \(np\) the number of parameters. Sensitivities were calculated by MODFLOW-2000 [5] with the sensitivity-equation method in \(np + 1\) model runs, or by perturbation with central differences in \((2 \times np) + 1\) model runs using UCODE_2005 [11]. \(\boldsymbol{V}_{i,j}(\boldsymbol{b})\) is the entry in the parameter variance–covariance matrix for parameters \(i\) and \(j\); this is a variance for \(i = j\), a covariance for \(i \neq j\). \(\boldsymbol{X}\) is a matrix of sensitivities with entries equal to \((\partial y'_k/\partial b_j)\), \(\boldsymbol{\omega}\) is the weight matrix, and \(s^2\) is the unbiased regression variance. PCC for extremely correlated parameters can be calculated through creative use of round-off error [6,7].
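As a rough illustration of the perturbation approach and of equation (2), the sketch below approximates sensitivities with central differences, counts the model runs, and then computes the correlation coefficient for a two-parameter problem. The straight-line `line_model`, the `rel_step` perturbation size, the diagonal weights, and the closed-form inverse of the 2-by-2 matrix are all assumptions made for this example; it is not meant to represent how MODFLOW-2000 or UCODE_2005 is implemented.

```python
import math

def central_difference_sensitivities(model, b, rel_step=0.01):
    """Approximate X[k][j] = dy'_k/db_j at b; also return the number of model runs."""
    y_base = model(b)                       # one run at the base parameter values
    runs = 1
    X = [[0.0] * len(b) for _ in y_base]
    for j, bj in enumerate(b):
        h = rel_step * abs(bj) if bj != 0.0 else rel_step
        b_up = list(b)
        b_up[j] = bj + h
        b_dn = list(b)
        b_dn[j] = bj - h
        y_up, y_dn = model(b_up), model(b_dn)   # two runs per parameter
        runs += 2
        for k in range(len(y_base)):
            X[k][j] = (y_up[k] - y_dn[k]) / (2.0 * h)
    return X, runs

def pcc_two_parameters(X, weights):
    """PCC_{0,1} from V = s^2 (X^T w X)^{-1}, assuming np = 2 and diagonal weights.

    s^2 cancels in the ratio of equation (2), so it is not needed here.
    """
    a = sum(w * row[0] * row[0] for row, w in zip(X, weights))
    c = sum(w * row[0] * row[1] for row, w in zip(X, weights))
    d = sum(w * row[1] * row[1] for row, w in zip(X, weights))
    det = a * d - c * c
    v00, v01, v11 = d / det, -c / det, a / det   # entries of (X^T w X)^{-1}
    return v01 / math.sqrt(v00 * v11)

# Hypothetical model: y'_k = b_0 + b_1 * x_k at four observation locations.
x_obs = [0.0, 1.0, 2.0, 3.0]

def line_model(b):
    return [b[0] + b[1] * x for x in x_obs]

X, runs = central_difference_sensitivities(line_model, [2.0, 0.5])
pcc = pcc_two_parameters(X, [1.0] * len(x_obs))
print(runs, round(pcc, 3))
```

With two parameters the run count is 5, that is, \((2 \times np) + 1\). A general implementation would replace the closed-form inverse with a linear-algebra routine that handles any \(np\).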
Figure 3. Importance of observations to predictions of transport within the Nevada National Security Site (NNSS). Left, the NNSS is outlined in gray and the model boundary in black, and the transport locations are represented schematically. The observation–prediction (OPR) statistic is used to measure observation importance [16]. The 501 existing hydraulic head observations are ranked. Right, evaluation of one potential new head measurement anywhere in model layer 1. The most important observations in the southwestern part of the model occur largely because the rocks there are defined in the model as hydraulically similar to the rocks in the NNSS, but their occurrence here under steep head gradients facilitates estimation of the parameter value. Each of these results required 49 parallelizable model runs. (Modified from [7] and [15].)
Figure 3 addresses the question: What observations are important to predictions? The importance of existing observations and of potential new observations is considered. In both cases, the importance of different observations depends on choices made in model construction, and the results shown in Figure 3 reveal consequences of such decisions. The computationally frugal observation–prediction (OPR) statistic measures how much a calculated confidence interval would increase if existing observations were removed and how much it would decrease if new observations were added [16]. The equations are:
\[\begin{equation}\tag{3a} OPR_i = 100 \times (s_{z(i)}- s_z)/s_z \end{equation}\]
\[\begin{equation}\tag{3b} s_{z(i)} = [(\partial z/\partial \boldsymbol{b})^T[s^2(\boldsymbol{X}_{(i)}^T \boldsymbol{\omega}_{(i)}\boldsymbol{X}_{(i)})^{-1}](\partial z/\partial \boldsymbol{b})]^{1/2}, \end{equation}\]
\[\begin{equation}\tag{3c} s_{z} = [(\partial z/\partial \boldsymbol{b})^T[s^2(\boldsymbol{X}^T \boldsymbol{\omega}\boldsymbol{X})^{-1}](\partial z/\partial \boldsymbol{b})]^{1/2}, \end{equation}\]
where \(i\) identifies the observation, and \(\boldsymbol{X}_{(i)}\) and \(\boldsymbol{\omega}_{(i)}\) indicate that the sensitivity matrix \(\boldsymbol{X}\) and weight matrix \(\boldsymbol{\omega}\) have been modified by the addition of rows and columns related to new observation \(i\). The importance of existing observations is evaluated by the removal of rows and columns of \(\boldsymbol{X}\) and \(\boldsymbol{\omega}\), as indicated by \((-i)\); in practice, entries in the weight matrix are set to zero. The weight matrix is determined through an analysis of errors, as required to obtain minimum variance parameter estimates ([7], Appendix C). The use of a standard deviation in equation (3) instead of a confidence-interval width is consistent with an assumed Gaussian distribution.
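Equations (3a) to (3c) can be sketched for the case of omitting an existing observation, which is done here by setting its weight to zero as described above. The two-parameter restriction, the diagonal weights, the sensitivity values in `X`, the prediction sensitivities `dz_db`, and the choice to hold \(s^2\) fixed at 1 (it cancels in equation (3a) when the same value is used in both terms) are assumptions for illustration only; this is not the OPR-PPR program [16].

```python
import math

def prediction_sd(X, weights, dz_db, s2=1.0):
    """s_z = [g^T s^2 (X^T w X)^{-1} g]^{1/2}, assuming np = 2 and diagonal weights."""
    a = sum(w * row[0] * row[0] for row, w in zip(X, weights))
    c = sum(w * row[0] * row[1] for row, w in zip(X, weights))
    d = sum(w * row[1] * row[1] for row, w in zip(X, weights))
    det = a * d - c * c
    g0, g1 = dz_db
    # Quadratic form g^T (X^T w X)^{-1} g written out for the 2-by-2 case.
    quad = (g0 * g0 * d - 2.0 * g0 * g1 * c + g1 * g1 * a) / det
    return math.sqrt(s2 * quad)

def opr_remove(X, weights, dz_db, i):
    """Percent increase in s_z when existing observation i is omitted (equation 3a)."""
    s_z = prediction_sd(X, weights, dz_db)
    w_without_i = list(weights)
    w_without_i[i] = 0.0                 # a zero weight removes the observation
    s_z_i = prediction_sd(X, w_without_i, dz_db)
    return 100.0 * (s_z_i - s_z) / s_z

# Hypothetical sensitivities for y'_k = b_0 + b_1 * x_k at x = 0, 1, 2, 3.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
weights = [1.0, 1.0, 1.0, 1.0]
dz_db = [1.0, 5.0]                       # prediction z = b_0 + 5 * b_1
opr = [opr_remove(X, weights, dz_db, i) for i in range(len(X))]
for i, value in enumerate(opr):
    print(i, round(value, 1))
```

In this toy problem the observation at \(x = 3\), the one closest to the prediction location, has the largest OPR value. Evaluating a potential new observation would follow the same pattern, with a row appended to `X` and a weight appended to `weights` instead of a weight set to zero.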
The brief analysis and references presented here suggest that, as models become more robust, the needs of environmental modeling will increasingly be best served by two things: a full uncertainty toolbox that combines computationally frugal methods, such as those based on local derivatives, with computationally demanding methods, such as MCMC; and presentation of results in the context of fundamental questions, in ways that facilitate comparisons between different models and hypotheses.
Acknowledgments: The references were selected from areas of environmental modeling to form an introduction for interested readers. A comprehensive list would be inordinately long for this publication; additional information can be found in works cited in some of the references listed. The authors thank Jeremy White of the U.S. Geological Survey and Luis Tenorio of the Colorado School of Mines for their reviews of this article. Mary Hill was funded by the U.S. Geological Survey programs NAWQA (National Water Quality Assessment), GWRP (Groundwater Resources Program), and NRP (National Research Program). Ming Ye and Dan Lu were funded by NSF–EAR grant 0911074 and DOE Early Career Award DE–SC0008272.
References
[1] R.C. Aster, B. Borchers, and C.H. Thurber, Parameter Estimation and Inverse Problems, Academic Press, Amsterdam, 2012.
[2] J. Doherty, PEST, Watermark Computing, Brisbane, Australia, 2012.
[3] A. Efstratiadis and D. Koutsoyiannis, One decade of multi-objective calibration approaches in hydrological modelling: A review, Hydrol. Sci. J., 55:1 (2010), 58–78.
[4] R.T. Hanson, L.K. Kauffman, M.C. Hill, J.E. Dickinson, and S.W. Mehl, Advective Transport Observations with MODPATH-OBS—Documentation of the MODPATH Observation Process Using Four Types of Observations and Predictions, in U.S. Geological Survey Techniques and Methods, 6–A42, 2012.
[5] M.C. Hill, E.R. Banta, A.W. Harbaugh, and E.R. Anderman, MODFLOW-2000, the U.S. Geological Survey modular ground-water model: User’s guide to the observation, sensitivity, and parameter-estimation process and three post-processing programs, U.S. Geological Survey Open-File Report 00–184, 2000, http://water.usgs.gov/nrp/gwsoftware/modflow2000/modflow2000.html.
[6] M.C. Hill and O. Østerby, Determining extreme parameter correlation in ground-water models, Ground Water, 41:4 (2003), 420–430.
[7] M.C. Hill and C.R. Tiedeman, Effective Calibration of Ground Water Models, with Analysis of Data, Sensitivities, Predictions, and Uncertainty, John Wiley & Sons, Hoboken, NJ, 2007.
[8] D. Kavetski and M.P. Clark, Ancient numerical daemons of conceptual hydrological modeling: 2. Impact of time stepping schemes on model analysis and prediction, Water Resour. Res., 46:W10511 (2010), doi:10.1029/2009WR008896.
[9] D. Lu, M. Ye, and M.C. Hill, Analysis of regression confidence intervals and Bayesian credible intervals for uncertainty quantification, Water Resour. Res., 48:W09521 (2012), doi:10.1029/2011WR011289.
[10] D.S. Oliver, A.C. Reynolds, and N. Liu, Inverse Theory for Petroleum Reservoir Characterization and History Matching, Cambridge University Press, UK, and New York, 2008.
[11] E.P. Poeter, M.C. Hill, E.R. Banta, S. Mehl, and S. Christensen, UCODE_2005 and six other computer codes for universal sensitivity analysis, calibration, and uncertainty evaluation, in U.S. Geological Survey Techniques and Methods, 6–A11, 2005, http://typhoon.mines.edu/freeware/ucode/.
[12] S. Razavi, B.A. Tolson, and D.H. Burns, Review of surrogate modeling in water resources, Water Resour. Res., 48:W07401 (2012), doi:10.1029/2011WR011527.
[13] B. Renard, D. Kavetski, E. Leblois, M. Thyer, G. Kuczera, and S.W. Franks, Toward a reliable decomposition of predictive uncertainty in hydrological modeling: Characterizing rainfall errors using conditional simulation, Water Resour. Res., 47:W11516 (2011), doi:10.1029/2011WR010643.
[14] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, et al., Global Sensitivity Analysis: The Primer, John Wiley & Sons, Hoboken, NJ, 2008.
[15] C.R. Tiedeman, D.M. Ely, M.C. Hill, and G.M. O’Brien, A method for evaluating the importance of system state observations to model predictions, with application to the Death Valley regional groundwater flow system, Water Resour. Res., 40:W12411 (2004), doi:10.1029/2004WR003313.
[16] M.J. Tonkin, C.R. Tiedeman, D.M. Ely, and M.C. Hill, OPR-PPR, a computer program for assessing data importance to model predictions using linear statistics, in U.S. Geological Survey Techniques and Methods, 6–E2, 2007, http://water.usgs.gov/software/OPR-PPR.html.
[17] J.A. Vrugt, C.J.F. ter Braak, M.P. Clark, J.M. Hyman, and B.A. Robinson, Treatment of input uncertainty in hydrologic modeling: Doing hydrology backward with Markov chain Monte Carlo simulation, Water Resour. Res., 44:W00B09 (2008), doi:10.1029/2007WR006720.
[18] E.F. Wood, et al., Hyperresolution global land surface modeling: Meeting a grand challenge for monitoring Earth’s terrestrial water, Water Resour. Res., 47:W05301 (2011), doi:10.1029/2010WR010090.
[19] M. Ye, P.D. Meyer, and S.P. Neuman, On model selection criteria in multimodel analysis, Water Resour. Res., 44:W03428 (2008), doi:10.1029/2008WR006803.
Mary C. Hill is a senior research hydrologist at the U.S. Geological Survey. Dmitri Kavetski is a professor of civil and environmental engineering at the University of Adelaide. Martyn Clark is a scientist at the National Center for Atmospheric Research. Ming Ye and Dan Lu are an associate professor and a postdoctoral fellow, respectively, in the Department of Scientific Computing at Florida State University.