Uncertainty Propagation for Common Operations (fiducia.stats)

Created on Tue Jul 21 12:11:12 2020

Common statistical operations

@author: Myles T. Brophy

Functions

simpsVariance(yUnc[, x, dx])

Propagates variance (\(\sigma^2\)) through Simpson’s rule numerical integration.

trapzVariance(yUnc[, x, dx])

Error propagation for trapezoidal rule integration using uniform or non-uniform grids.

gradientVariance(yUnc[, x, dx])

Propagates uncertainty for the gradient operator of an array with a given step size.

dotVariance(a, b[, aUncertainty, bUncertainty])

Propagates uncertainty for the dot product of matrix a and 1D vector b.

interpVariance(x, xp, fpUnc[, leftVar, …])

Propagate uncertainty for linear interpolation.

Derivations for propagating uncertainty for some common operations.

Weighted Summation

Given a vector \(X\) with \(N\) elements \(X_i\) and constant weights \(a_i\),

\[\operatorname{Var}\left(\sum_{i=1}^{N} a_i X_i\right) = \sum_{i=1}^{N} a_i^2 \operatorname{Var}(X_i) + 2 \sum_{1\leq i<j\leq N} a_i a_j \operatorname{Cov}(X_i, X_j)\]

A simplified version of this can be written as

\[\operatorname{Var}(ax+by) = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\sigma_{xy}^2\]

where \(\sigma_{xy}^2\) denotes \(\operatorname{Cov}(x, y)\), which vanishes when \(x\) and \(y\) are independent.
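As a quick sanity check, this rule can be verified with Monte Carlo samples. The sketch below is standalone NumPy, not part of fiducia.stats; all names and numbers are illustrative.

```python
import numpy as np

# Monte Carlo check of Var(ax + by) = a^2 s_x^2 + b^2 s_y^2 + 2ab Cov(x, y).
rng = np.random.default_rng(0)
a, b = 2.0, -3.0
sigma_x, sigma_y, cov_xy = 1.0, 0.5, 0.2

# Draw correlated (x, y) pairs with the chosen covariance matrix.
cov = np.array([[sigma_x**2, cov_xy],
                [cov_xy, sigma_y**2]])
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T

analytic = a**2 * sigma_x**2 + b**2 * sigma_y**2 + 2 * a * b * cov_xy
empirical = np.var(a * x + b * y)
print(analytic, empirical)  # ~3.85 for both, up to sampling error
```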

Trap Rule Variance

For \(N\) steps with \(\Delta x_{k} = x_{k+1} - x_{k}\) and \(f(x_k) = y_k\), where each \(y_k\) is independent, an integral can be approximated as

\[\int_a^b f(x) dx \approx\sum_{k=1}^{N} \frac{\Delta x_{k-1}}{2} (y_{k-1} + y_k)\]

To find the variance in the general case, apply the weighted-summation rules above.

\[ \begin{align}\begin{aligned}\operatorname{Var}\left(\sum_{k=1}^{N} \frac{\Delta x_{k-1}}{2}(y_{k-1} + y_k) \right) = \operatorname{Var}\left(\sum_{k=1}^{N} \frac{\Delta x_{k-1}}{2} y_{k-1} + \sum_{k=1}^{N} \frac{\Delta x_{k-1}}{2} y_{k} \right)\\=\operatorname{Var}\left(\frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k-1}\right) + \operatorname{Var}\left(\frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k}\right) + 2\operatorname{Cov}\left(\frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k-1}, \frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k}\right)\end{aligned}\end{align} \]

The first two \(\operatorname{Var}\) terms are straightforward:

\[ \begin{align}\begin{aligned}\frac{1}{4}\left(\sum_{k=1}^{N} \operatorname{Var}(\Delta x_{k-1} y_{k-1})\right) + \frac{1}{4}\left(\sum_{k=1}^{N} \operatorname{Var}(\Delta x_{k-1} y_{k})\right)\\= \frac{1}{4} \left(\sum_{k=1}^{N} \Delta x_{k-1}^2 \operatorname{Var}(y_{k-1}) + \sum_{k=1}^{N} \Delta x_{k-1}^2 \operatorname{Var}( y_{k})\right)\\= \frac{1}{4} \left( \sum_{k=1}^{N} \Delta x_{k-1}^2 \sigma_{k-1}^2 + \sum_{k=1}^{N} \Delta x_{k-1}^2 \sigma_{k}^2 \right) = \frac{1}{4} \sum_{k=1}^{N} \Delta x_{k-1}^2 (\sigma_{k-1}^2 + \sigma_{k}^2)\end{aligned}\end{align} \]

where \(\sigma_k\) is the uncertainty of \(y_k\). For the covariance of the two summations, note that each interior \(y_k\) appears in both sums:

\[\operatorname{Cov}\left(\frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k-1}, \frac{1}{2} \sum_{k=1}^{N} \Delta x_{k-1} y_{k}\right) = \frac{1}{4} \sum_{k=1}^{N-1} \Delta x_{k-1} \Delta x_{k} \sigma_{k}^2\]

Combining the variance and covariance terms gives

\[\frac{1}{4} \left(\sum_{k=1}^{N} \Delta x_{k-1}^2 (\sigma_{k-1}^2 + \sigma_{k}^2) + 2 \sum_{k=1}^{N-1} \Delta x_{k-1} \Delta x_{k} \sigma_{k}^2 \right)\]

In the simple case of a uniform grid, where all \(\Delta x_k = \Delta x\), this expands to

\[\frac{\Delta x^2}{4} \left(\sigma_0^2 + 4\sigma_1^2 + 4\sigma_2^2 + ... + 4\sigma_{N-1}^2 + \sigma_N^2 \right)\]
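A minimal NumPy sketch of this propagation, assuming independent \(y_k\) and taking the variances \(\sigma_k^2\) directly as input; the name and signature are illustrative and need not match fiducia's trapzVariance.

```python
import numpy as np

def trapz_variance(y_var, x=None, dx=1.0):
    """Variance of a trapezoidal-rule integral of independent y values.

    y_var : array of variances Var(y_k) = sigma_k^2
    x     : sample positions for a non-uniform grid, or None to use dx
    """
    y_var = np.asarray(y_var, dtype=float)
    if x is None:
        h = np.full(len(y_var) - 1, dx, dtype=float)   # uniform spacings
    else:
        h = np.diff(np.asarray(x, dtype=float))        # Delta x_k
    # Per-interval variance: (h_k^2 / 4)(sigma_k^2 + sigma_{k+1}^2)
    var = 0.25 * np.sum(h**2 * (y_var[:-1] + y_var[1:]))
    # Covariance term: each interior y_k is shared by adjacent intervals.
    var += 0.5 * np.sum(h[:-1] * h[1:] * y_var[1:-1])
    return var
```

On a uniform grid this reduces to the \(\sigma_0^2 + 4\sigma_1^2 + \dots + 4\sigma_{N-1}^2 + \sigma_N^2\) expansion above, since interior points pick up equal contributions from the variance and covariance terms.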

Gradient Variance

From the numpy.gradient documentation, the gradient for discrete step sizes is approximated by

\[\nabla f(x_i) = \frac{h_{i-1}^2 f(x_i+h_i) + (h_i^2 - h_{i-1}^2) f(x_i) - h_i^2 f(x_i - h_{i-1})}{h_i h_{i-1}(h_i + h_{i-1})}\]

with the gradient at the first and the last data point being

\[\nabla f(x_1) = \frac{y_2 - y_1}{h_1}, \nabla f(x_N) = \frac{y_N - y_{N-1}}{h_{N-1}}\]

for a list of data points \(x_i\), with \(h_d = x_{i+1} - x_i = h_i\) and \(h_s = x_i - x_{i-1} = h_{i-1}\). From this, \(f(x_i) = y_i\), \(f(x_i + h_i) = y_{i+1}\), and \(f(x_i - h_{i-1}) = y_{i-1}\). Using the variance of weighted sums, where the weights are the \(h\) terms, we get

\[\operatorname{Var}(\nabla y_i) = \sigma_{\nabla y_i}^2= \frac{h_{i-1}^4 \sigma_{i+1}^2 + (h_i^2 - h_{i-1}^2)^2 \sigma_i^2 + h_i^4 \sigma_{i-1}^2}{(h_i h_{i-1} (h_i + h_{i-1}))^2}\]

where \(\sigma_i\) corresponds to the uncertainty in \(y_i\). Keep in mind that there are \(N\) \(y\) values but only \(N-1\) \(h\) values, because the \(h_i\) are the differences between consecutive data points. Note that, unlike in the trapezoidal rule integration, the covariance term is 0 because no repeated uncertainty terms appear in the sum. At the borders \(i = 1, N\) the variance is

\[\operatorname{Var}(\nabla y_1) = \frac{\sigma_2^2 + \sigma_1^2}{h_1^2}, \operatorname{Var}(\nabla y_N) = \frac{\sigma_N^2 + \sigma_{N-1}^2}{h_{N-1}^2}\]
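The sketch below mirrors numpy.gradient's interior stencil and one-sided edges on a possibly non-uniform grid; the function name and the variance-valued input are illustrative rather than fiducia's exact API.

```python
import numpy as np

def gradient_variance(y_var, x):
    """Variance of np.gradient(y, x) for independent y values.

    y_var : array of variances Var(y_i) = sigma_i^2
    x     : sample positions, possibly non-uniformly spaced
    """
    y_var = np.asarray(y_var, dtype=float)
    h = np.diff(np.asarray(x, dtype=float))  # N-1 spacings for N points
    hs, hd = h[:-1], h[1:]                   # h_{i-1} and h_i at interior i
    out = np.empty_like(y_var)
    # Interior points: squared weights of the second-order stencil.
    out[1:-1] = (hs**4 * y_var[2:]
                 + (hd**2 - hs**2)**2 * y_var[1:-1]
                 + hd**4 * y_var[:-2]) / (hs * hd * (hd + hs))**2
    # Boundaries: one-sided differences, so the two variances simply add.
    out[0] = (y_var[1] + y_var[0]) / h[0]**2
    out[-1] = (y_var[-1] + y_var[-2]) / h[-1]**2
    return out
```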

Dot Product Variance

The dot product of vectors \(X,Y\) is defined as

\[X \cdot Y = \sum_{i=1}^N x_i y_i\]

Uncertainty propagation when multiplying two variables, \(f(u,v) = a u v\), with \(a\) as a constant, is given by

\[\operatorname{Var}(f(u,v)) = (au\sigma_v)^2 + (av\sigma_u)^2 + 2a^2 uv\sigma_{uv}^2\]

For the dot product we assume there is no covariance between \(X\) and \(Y\) because they are independent. This gets us

\[\operatorname{Var}(X \cdot Y) = \sum_{i=1}^N \left[(x_i \sigma_{y_i})^2 + (y_i \sigma_{x_i})^2\right]\]
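A sketch of this result for the 1D vector-vector case; fiducia's dotVariance also accepts a matrix for a, which this illustration does not handle.

```python
import numpy as np

def dot_variance(x, y, x_var, y_var):
    """Variance of np.dot(x, y) for independent 1D vectors x and y.

    x_var, y_var : elementwise variances of x and y
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    x_var = np.asarray(x_var, dtype=float)
    y_var = np.asarray(y_var, dtype=float)
    # Each term x_i*y_i contributes (x_i sigma_{y_i})^2 + (y_i sigma_{x_i})^2.
    return np.sum(x**2 * y_var + y**2 * x_var)
```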

Linear Interpolation Variance

Linear interpolation at a point \(x\), where \(x_0 < x < x_1\), is given by

\[y = \frac{y_0 ( x_1 - x) + y_1 (x- x_0)}{x_1 - x_0}\]

To make this calculation easier, we assume there is no uncertainty in the \(x_i\) terms. This is a shortcut, but it's all that's required for the application in which this linear interpolation is implemented. Starting with the numerator, we can use the weighted-summation rules to say

\[\operatorname{Var}(y_0 (x_1 - x) + y_1 (x - x_0))= (x_1-x)^2 \sigma_{y_0}^2 + (x - x_0)^2 \sigma_{y_1}^2\]

where there is no covariance term because we assume \(y_0\) and \(y_1\) are independent. Then, treating the denominator as a constant (since all \(x\) values are assumed to have no uncertainty), we get a variance of

\[\operatorname{Var}(y) = \frac{1}{(x_1 - x_0)^2} ( (x_1-x)^2 \sigma_{y_0}^2 + (x - x_0)^2 \sigma_{y_1}^2 )\]
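A sketch of this propagation to accompany numpy.interp, assuming exact \(x\) values and independent \(y\) values; the name and the behavior outside the data range are illustrative.

```python
import numpy as np

def interp_variance(x, xp, fp_var):
    """Variance of np.interp(x, xp, fp) for exact xp and independent fp.

    fp_var : variances sigma^2 of the fp data points
    """
    xp = np.asarray(xp, dtype=float)
    fp_var = np.asarray(fp_var, dtype=float)
    # Left endpoint x0 of the interval [x0, x1] containing each x.
    i = np.clip(np.searchsorted(xp, x) - 1, 0, len(xp) - 2)
    x0, x1 = xp[i], xp[i + 1]
    # Var(y) = ((x1-x)^2 s_{y0}^2 + (x-x0)^2 s_{y1}^2) / (x1-x0)^2
    return ((x1 - x)**2 * fp_var[i]
            + (x - x0)**2 * fp_var[i + 1]) / (x1 - x0)**2
```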