0%

误差理论:协方差传播定律

在实际工作中,往往会遇到某些量的大小并不是直接测定的,而是由观测值通过一定的函数关系间接计算出来的。那么,观测值的函数的中误差与观测值的中误差之间存在着怎样的关系?实际上,这样的关系可以通过协方差的运算规则得到,因此被称作协方差传播定律,又称误差传播定率。

观测值线性函数的方差

设有观测值\(X\in\mathbb{R}^n\),其均值为\(\mu_X\in\mathbb{R}^n\),协方差矩阵为\(\Sigma_{XX}\in\mathbb{R}^{n\times n}\),又设有\(X\)的线性函数 \[ Z = K^{\mathrm{T}}X+k_0 \] 其中\(Z\in\mathbb{R}\)\(K\in\mathbb{R}^n\)是常量,\(k_0\)是常数。

现求\(Z\)的方差\(\sigma_Z^2\in\mathbb{R}\)

根据上述的线性关系,可求得 \[ \begin{eqnarray} E(Z) & = & E(K^\mathrm{T}X+k_0) \\ & = & K^{\mathrm{T}} E(X) + k_0 \\ & = & K^{\mathrm{T}} \mu_X + k_0 \end{eqnarray} \] 于是, \[ \begin{eqnarray} \sigma_Z^2 & = & E((Z - E(Z))^2) \\ & = & E((K^{\mathrm{T}}X + k_0 - K^{\mathrm{T}} \mu_X - k_0)^2) \\ & = & E((K^{\mathrm{T}}X- K^{\mathrm{T}} \mu_X) (K^{\mathrm{T}}X- K^{\mathrm{T}} \mu_X)^{\mathrm{T}}) \\ & = & K^{\mathrm{T}} E((X - \mu_X)(X - \mu_X)^{\mathrm{T}}) K \\ & = & K^{\mathrm{T}} \Sigma_{XX} K \end{eqnarray} \]\[ \sigma_Z^2 = K^{\mathrm{T}} \Sigma_{XX} K \] 上式就是协方差传播定律。当\(X\)各分量线性无关时,\(\Sigma_{XX}\)是对角的,即 \[ \left( \Sigma_{XX} \right)_{ij} = \left\{ \begin{eqnarray} \sigma_i^2, & \; &i=j \\ \\ 0,& \; & i\neq j \end{eqnarray} \right. \] 此时\(\sigma_Z^2\)可以表示为\(X\)各分量方差的加权和: \[ \begin{eqnarray} \sigma_Z^2 & = & k_1^2 \sigma_1^2 + k_2^2 \sigma_2^2 + \cdots + k_n^2 \sigma_n^2 \\ & = & \sum_{i=1}^n k_i^2 \sigma_i^2 \end{eqnarray} \]

多个观测值线性函数的协方差矩阵

设有观测值\(X\in\mathbb{R}^n\),其均值为\(\mu_X\in\mathbb{R}^n\),协方差矩阵为\(\Sigma_{XX}\in\mathbb{R}^{n\times n}\),又设有\(X\)\(t\)个线性函数 \[ \begin{eqnarray} Z_1 & = & k_{11}X_1 + k_{12}X_2 + k_{1n}X_n + k_{10} \\ Z_2 & = & k_{21}X_1 + k_{22}X_2 + k_{2n}X_n + k_{20} \\ & \cdots & \\ Z_t & = & k_{t1}X_1 + k_{t2}X_2 + k_{tn}X_n + k_{t0} \end{eqnarray} \]\(Z = (Z_1, Z_2, \cdots,Z_t)^\mathrm{T}\),则上式可写作 \[ Z = K^\mathrm{T}X + K_0^\mathrm{T} \] 其中, \[ K^\mathrm{T} = \left( \begin{array}{} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ k_{t1} & k_{t2} & \cdots & k_{tn} \\ \end{array} \right) \in \mathbb{R}^{t\times n}, \\ K_0^\mathrm{T} = \left( k_{10},\ k_{20},\ \cdots,\ k_{t0} \right)^\mathrm{T} \in \mathbb{R}^t \] 按照上一节中的推导,\(Z\)的协方差矩阵为 \[ \Sigma_{ZZ} = K^\mathrm{T} \Sigma_{XX} K^\mathrm{T} \] 这与上一节得到的协方差具有一致的形式。实际上,该式是协方差传播定律的一般形式。

非线性函数的情况

设有观测值\(X\)的非线性函数 \[ Z = f(X) = f(X_1, X_2, \cdots, X_n) \] 已知观测值\(X\)的近似值\(X^0 = (X_1^0, X_2^0, \cdots, X_n^0)^\mathrm{T}\)\(X\)的协方差矩阵\(\Sigma_{XX}\),现求\(Z\)的方差\(\sigma_Z^2\)

利用泰勒展开将\(f(X)\)\(X^0\)处线性化:将其在\(X^0\)处展开 \[ Z = f(X_1^0, X_2^0, \cdots, X_n^0) + \left.\cfrac{\partial f}{\partial X_1}\right|_0 (X_1 - X_1^0) + \left.\cfrac{\partial f}{\partial X_2}\right|_0 (X_2 - X_2^0) + \cdots + \left.\cfrac{\partial f}{\partial X_n}\right|_0 (X_n - X_n^0) + o(X-X^0) \] 由于\(X^0\)是对\(X\)的近似估计,因此可以认为其差的高阶项是小量,故忽略之,于是上式可写为 \[ Z = \left.\cfrac{\partial f}{\partial X_1}\right|_0 X_1 + \left.\cfrac{\partial f}{\partial X_2}\right|_0 X_2 + \cdots + \left.\cfrac{\partial f}{\partial X_n}\right|_0 X_n + f(X_1^0, X_2^0, \cdots, X_n^0) - \sum_{i=1}^{n} \left.\cfrac{\partial f}{\partial X_i}\right|_0 X_i^0 \]\[ \begin{eqnarray} K^\mathrm{T} = \left[ \begin{array}{} k_1 & k_2 & \cdots\ & k_n \end{array} \right] = \left[ \begin{array}{} \left.\cfrac{\partial f}{\partial X_1}\right|_0 & \left.\cfrac{\partial f}{\partial X_2}\right|_0 & \cdots & \left.\cfrac{\partial f}{\partial X_n}\right|_0 \end{array} \right] \\ k_0 = f(X_1^0, X_2^0, \cdots, X_n^0) - \sum_{i=1}^{n} \left.\cfrac{\partial f}{\partial X_i}\right|_0 X_i^0 \end{eqnarray} \] 那么\(Z\)则表示为与第一节中一致的形式 \[ Z = K^\mathrm{T}X + k_0 \] 于是可以求得\(Z\)的方差\(\sigma_Z^2\)\[ \sigma_Z^2 = K^\mathrm{T} \Sigma_{XX} K \] 类似地,如果有\(t\)个非线性函数 \[ \begin{eqnarray} Z_1 & = & f_1(X_1, X_2, \cdots, X_n) \\ Z_2 & = & f_2(X_1, X_2, \cdots, X_n) \\ & \cdots & \\ Z_t & = & f_t(X_1, X_2, \cdots, X_n) \\ \end{eqnarray} \] 此时

\[ K^\mathrm{T} = \left[ \begin{array}{} \left.\cfrac{\partial f_1}{\partial X_1}\right|_0 & \left.\cfrac{\partial f_1}{\partial X_n}\right|_0 & \cdots & \left.\cfrac{\partial f_1}{\partial X_n}\right|_0 \\ \left.\cfrac{\partial f_2}{\partial X_1}\right|_0 & \left.\cfrac{\partial f_2}{\partial X_n}\right|_0 & \cdots & \left.\cfrac{\partial f_2}{\partial X_n}\right|_0 \\ \vdots & \vdots & \ddots & \vdots\\ \left.\cfrac{\partial f_t}{\partial X_1}\right|_0 & \left.\cfrac{\partial f_t}{\partial X_n}\right|_0 & \cdots & \left.\cfrac{\partial f_t}{\partial X_n}\right|_0 \\ \end{array} \right] \in \mathbb{R}^{t\times n} \]

进而有 \[ \Sigma_{ZZ} = K^\mathrm{T}\Sigma_{XX}K \]

应用举例:时间序列的最小二乘拟合

现考虑一个关于时间序列拟合的例子,使用普通最小二乘的方法拟合出一个时间序列线性趋势、三角周期的振幅和初始相位,即\(f(t) = a + bt + Asin(2\pi t + \phi)\),以及相应的标准差。

由于最小二乘解决问题须要模型是线性的,不能直接拟合模型中的\(A, \phi\)及相应的标准差\(\sigma_A, \sigma_\phi\),因此我们考虑的模型应该是 \[ f(t) = a + bt + c\sin (2\pi t) + d\cos (2\pi t) \] 进而计算振幅和初始相位 \[ \begin{eqnarray} A=\sqrt{c^2 + d^2} \\ \phi = \arctan{\cfrac{d}{c}} \end{eqnarray} \] 现已知最小二乘的结果 \[ \hat{f} (t) = \hat{a} + \hat{b}t + \hat{c}\sin (2\pi t) + \hat{d}\cos (2\pi t) \] 协方差矩阵是 \[ \Sigma = \left[ \begin{array}{} \sigma_a^2 & \sigma_{ab} & \sigma_{ac} & \sigma_{ad} \\ \sigma_{ba} & \sigma_b^2 & \sigma_{bc} & \sigma_{bd} \\ \sigma_{ca} & \sigma_{cb} & \sigma_c^2 & \sigma_{cd} \\ \sigma_{da} & \sigma_{db} & \sigma_{dc} & \sigma_d^2 \end{array} \right] \] 其中\(\sigma_{ij} = \sigma_{ji},\ i,j \in \set{a,b,c,d}\)

线性项及相应不确定度至此可以直接得到答案了。同时可以求得振幅和相位\(\hat{A}=\sqrt{\hat{c}^2 + \hat{d}^2},\ \hat{\phi} = \arctan{\cfrac{\hat{d}}{\hat{c}}}\),现在我们关注它们的标准差:

\(A,\phi\)都是\(c,d\)的非线性函数,先求对应的矩阵\(K\)

\[ K^\mathrm{T} = \left.\left[ \begin{array}{} \cfrac{\partial A}{\partial c} & \cfrac{\partial A}{\partial d} \\ \cfrac{\partial \phi}{\partial c} & \cfrac{\partial \phi}{\partial d} \end{array} \right] \right|_{(c=\hat{c}, d=\hat{d})} = \left[ \begin{array}{} \cfrac{\hat{c}}{\sqrt{\hat{c}^2 + \hat{d}^2}} & \cfrac{\hat{d}}{\sqrt{\hat{c}^2 + \hat{d}^2}} \\ \cfrac{-\hat{d}}{\hat{c}^2 + \hat{d}^2} & \cfrac{\hat{c}}{\hat{c}^2 + \hat{d}^2} \end{array} \right] \]

进一步可以求得

\[ \begin{eqnarray} \Sigma_{A\phi} & = & \left[ \begin{array}{} \sigma_{A}^2 & \sigma_{A\phi} \\ \sigma_{\phi A} & \sigma_{\phi}^2 \end{array} \right] \\ & = & K^\mathrm{T} \Sigma_{cd} K \\ & = & \left[ \begin{array}{} \cfrac{\hat{c}}{\sqrt{\hat{c}^2 + \hat{d}^2}} & \cfrac{\hat{d}}{\sqrt{\hat{c}^2 + \hat{d}^2}} \\ \cfrac{-\hat{d}}{\hat{c}^2 + \hat{d}^2} & \cfrac{\hat{c}}{\hat{c}^2 + \hat{d}^2} \end{array} \right] \left[ \begin{array}{} \sigma_{c}^2 & \sigma_{cd}\\ \sigma_{dc} & \sigma_{d}^2 \end{array} \right] \left[ \begin{array}{} \cfrac{\hat{c}}{\sqrt{\hat{c}^2 + \hat{d}^2}} & \cfrac{-\hat{d}}{\hat{c}^2 + \hat{d}^2} \\ \cfrac{\hat{d}}{\sqrt{\hat{c}^2 + \hat{d}^2}} & \cfrac{\hat{c}}{\hat{c}^2 + \hat{d}^2} \end{array} \right] \\ & = & \left[ \begin{array}{} \cfrac{1}{\hat{c}^2 + \hat{d}^2} \left( \hat{c}^2 \sigma_c^2 + 2\hat{c}\hat{d}\sigma_{cd} + \hat{d}^2 \sigma_d^2 \right) & * \\ * & \cfrac{1}{\left(\hat{c}^2 + \hat{d}^2\right)^2 } \left( \hat{d}^2 \sigma_c^2 - 2\hat{c}\hat{d}\sigma_{cd} + \hat{c}^2 \sigma_d^2\right) \end{array} \right] \end{eqnarray} \]

至此得到相应的方差,开方即得相应的标准差。