Error propagation

From FusionWiki
Jump to: navigation, search

Proper reporting of experimental measurements requires the calculation of error bars or "confidence intervals". The appropriate and satisfactory calibration of data and analysis of errors is essential to be able to judge the relevance of observed trends. Below, a brief definition of the main concepts and a discussion of generic ways to obtain error estimates is provided. [1] [2] Of course, any particular measuring device generally requires specific techniques.

The measurement process

The measuring device performs measurements on a physical system P. As a result, it produces estimates of a set of physical parameters {p}. One may think of p as loose numbers (e.g., a confinement time), data along a spatial chord at a single time (e.g., a Thomson scattering profile), data at a point in space with time resolution (e.g., magnetic field fluctuations from a Mirnov coil), or data having both time and space resolution (e.g., tomographic data from Soft X-Ray arrays). The actual measurement hardware does not deliver the parameters {p} directly, but produces a set of numbers {s}, usually expressed in Volts, Amperes, or pixels.

Calibration

The first task of the experimentalist is to translate the measured signals {s} into the corresponding physical parameters {p}. The second task is to provide error estimates (discussed below). Generally, the translation of {s} into {p} requires having a (basic) model for the experiment studied and its interaction with the measuring device. In the simplest cases, the relation between {s} and {p} is linear (e.g. conversion of the measured voltages from Mirnov coils to magnetic fields). Taking s and p to be vectors, such a conversion can be written as

$ p = A \cdot(s - b), $

where A is a (possibly diagonal) calibration matrix and b a vector for offset correction. However, in fusion science it is more common that the conversion from s to p involves some (non-linear) numerical modelling of the physical (and measurement) system. In this case, rather than assuming a linear relation, one assumes a non-linear map Mp between s and p: p = Mp(s). The subscript p indicates that Mp may depend on p. In principle, determining p from s now requires an iterative numerical approach. The map Mp should be tested to check that it is not ill-conditioned (i.e. small variations in s produce large variations in p), since that would render the measurements useless; ill-conditioning leads to error amplification. Should this be the case, then the measurement set-up should be changed until the undesired ill-conditioning is removed.

Error estimate (experimental error known)

When the error level in s is known (from experimental measurements performed on the measuring device itself), some techniques are available to calculate the error in p. In the linear case of Eq. (1), and even in slightly more complex situations, standard error propagation techniques can be used to compute the error in p from the error in s. Standard error propagation proceeds as follows:

$ z = f(x, y, ...)\, $
$ (\Delta z)^2 = \left ( \frac{\partial f}{\partial x}\right )^2 \Delta x^2 + \left ( \frac{\partial f}{\partial y}\right )^2 \Delta y^2 + ... $

This formula holds exclusively for a Gaussian (normal) distribution of errors (assuming the errors are small and that the independent variables x, y, ... are indeed independent). [3] One should be aware that many situations exist where error distributions are not normal (see below). One can easily check whether the error distribution is normal by doing repeated experiments under the same conditions and observing the resulting distribution of s. When propagating errors, one should be aware that the calibration matrix A and the offset correction b may also contain errors, which should therefore also be propagated. An important topic, and cause of many errors in the calculation of error propagation (and parameter fitting) is the issue of collinearity (linear dependencies between elements of s and/or p). The presence of collinearity may affect error levels enormously. A quick check of possible problems in this sense can be made using the Monte Carlo approach (see below). Several techniques are available to handle collinearity, such as Principal component analysis (basically, by orthogonalization of the correlation matrix of s). The Monte Carlo approach also provides a simple method for error estimation for the much more difficult problem of a non-linear mapping Mp. This technique proceeds as follows. To compute the error bar of p, simulated measurements s are varied randomly within their (known) error bars, using the (known) error distribution, and the standard deviation of the resulting p is determined, the latter being equal to the error estimate. This technique also provides a quick method to check for possible problems such as ill-conditioning, cited above. When the model relating s and p is known, as well as the error distributions (and the latter may either be Gaussian or not), a more systematic approach to error propagation is provided by a technique known as the maximum likelihood method. [4] This technique is simply the generalisation of standard error propagation to general error distributions (i.e. not limited to Gaussians).

Systematic and random errors

It is important to distinguish systematic and random errors. For example, the Monte Carlo technique discussed above will only yield random errors. The estimations of p may, however, be systematically wrong because the model Mp used is wrong or based on a flawed hypothesis. To check this, it is vital to cross-check the obtained values of p against the parameters obtained from another, independent measurement device. If this is not possible, the data should be reinterpreted using an independent analysis code. There is no alternative to determining systematic errors, except these two techniques (cross-checking between diagnostics and/or using independent models). Careful analysis of possible hidden or tacit assumptions is always recommendable.

Error estimate (experimental error unknown)

Unfortunately, and although this should never occur, often the error in the original signals s is not even known. We note, in any case, that a lower limit can always be set on s, namely, the bit resolution of the ADC acquiring the signal, or the pixel size. If there is no reasonable way to estimate the error in s, then a comparison with other diagnostics is even more important than before. A final recourse to error estimation is provided by performing repeated measurements in identical discharges. Repeating the measurement s on experiments that have carefully been prepared in the same state (p) will provide a set of values s that varies across the experiments. The standard deviation of s is then equal to the error bar of s. This technique can also be applied to a single experiment for time-resolved measurements. A steady-state discharge can be used for this purpose. The fluctuation amplitude of the signal s will then be equal to its error bar. We note, however, that this poor man's approach to error estimation will always provide an upper limit of the error bars, since the actual (physical) variability of the signal is added to the random error, whereas it provides no indication of the systematic error.

Test of statistical validity of the model

If a model is characterised by a number of free (fit) parameters αi, i = 1, ..., n and used to predict (or fit) some measurements, then, once error estimates are available, it can (and should) be subjected to a χ2-test. The value of χ2 obtained should be close to the number of free parameters; if it isn't, the number of free parameters n should be modified until it is.

Fluctuations and noise

The separation of noise and fluctuations is a highly non-trivial topic. The simplest case is when the physically interesting phenomenon is slowly varying in time. Random noise is usually characterised by a high frequency, so that a filter in frequency space can then separate signal and noise neatly. [5] However, when the physically interesting information is fluctuating, this signal-noise separation by frequency is not feasible, and much care is needed when analysing data. The application of a set of techniques is required to understand such signals (cross correlation, conditional averaging, spectral analysis, bi-spectral analysis, [6], Biorthogonal decomposition, determination of fractal dimension, mutual information, reconstruction of chaotic attractor, [7] ...).

Non-Gaussian statistics

The distribution of random variations of a signal s around its mean value need not be Gaussian. E.g., photon statistics are typically of the Poisson type, which is especially important for low signal levels. [8] In other cases, the random component of the signal s is simply a non-linear function of a (Gaussian) noise source, causing the distribution to be skewed or distorted. Or the random component of the measured signal could correspond to the maximum or minimum value of a (Gaussian) random number, leading to extremal (Gumbel) distributions. [9] The log-normal distribution is also quite common (e.g. in potential fluctuations). [10] However, all the previous distributions can be obtained by suitable manipulations of Gaussian random variables. A totally different class of statistics is known as Lévy distributions (of which the Gaussian distribution is only a special case), which is the class of distributions satisfying the requirement that the sum of independent random variables with a distribution P again has a distribution P (generalisation of the Central Limit Theorem). Such distributions are expected to appear in self-organised systems (such as plasmas). In general, the detection of this type of non-Gaussian statistics is difficult. Some techniques are however available, such as renormalisation, rescaled-range analysis, [11] the detection of long-range time dependence, [12] finite-size Lyapunov exponents, [13] etc. Sometimes it is possible to obtain information on the nature of the errors by averaging experimental data (in space or time) - this is the renormalisation technique referred to above. When averaging over N samples, the variation of the N-averaged (or smoothed) data is less than that of the original data. The way in which the variance (and other statistical moments) decreases with N provides information both on the type of statistics involved (Gaussian or otherwise) and on the random or non-random nature of the data variability (random contributions decaying to zero as N → ∞).

Integrated data analysis

Often, various different diagnostics provide information on the same physical parameter (e.g., in a typical fusion plasma experiment, the electron temperature Te is possibly measured by Thomson Scattering, ECE, and indirectly also by SXR, although mixed with information on the electron density ne and Zeff. The electron density is measured directly by Thomson Scattering, the HIBP, reflectometry, and interferometry, and indirectly by SXR). Part of this information is local, and part is line-integrated. Instead of cross-checking these diagnostics for one or a few discharges, one could decide to make an integrated analysis of data. This means using all information available to make the best possible reconstruction of, e.g., the electron density and temperature that is compatible with all diagnostics simultaneously. To do this, the following conditions must apply: 1) The data should not contradict each other mutually. This requires a previous study concerning the mutual compatibility, i.e. data validation. 2) The data should be available with proper calibration and independent error estimates in a routine fashion. This means regular calibrations of the measuring device and crosschecks. 3) A suitably detailed model of the physical system should be available, capable of modelling all experimental conditions and all corresponding measurement data. Techniques based on e.g. Bayesian statistics then allow finding the most probable value of all physical parameters in the model, compatible with all measured signals.

Summary

A proper analysis of error propagation requires having a reasonable model Mp that relates the measured signals s to the corresponding physical parameters p. Such a model can initially be rudimentary, implying the probable existence of large systematic errors. The systematic observation and analysis of the results p and their properly propagated random errors Δp, and their comparison with similar results from other diagnostics should allow improvement of the model, thus reducing the systematic error and improving the agreement between independent measurements (and/or models). This gradual improvement of the physics model is the basis of scientific progress.

References

  1. W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in FORTRAN (Cambridge University Press, 1992), 2nd ed.
  2. P. Bevington and D. Robinson, Data Reduction and Error Analysis for the Physical Sciences (McGraw-Hill, UK, 2003), 3rd ed. ISBN 978-0072472271
  3. Error Propagation (MathWorld)
  4. Particle Data Group, Eur. Phys. J. C 3, 1 (1998)
  5. D. Newland, An Introduction to Random Vibrations, Spectral and Wavelet Analysis (Dover, New York, 1993) ISBN 0486442748
  6. J. van den Berg, ed., Wavelets in Physics (Cambridge University Press, 1999) ISBN 978-0521593113
  7. H. Abarbanel, R. Brown, J. Sidorowich, and L. S. Tsimring, Rev. Mod. Phys. 65, 1331 (1993)
  8. B. van Milligen, I. Classen, and C. Barth, Rev. Sci. Instrum. 74, 3998 (2003)
  9. B. van Milligen, R. Sánchez, B. Carreras, V. Lynch, B. LaBombard, M. Pedrosa, C. Hidalgo, B. Gonçalves, and R. Balbín, Phys. Plasmas 12, 052507 (2005)
  10. F. Sattin, N. Vianello, and M. Valisa, Phys. Plasmas 11, 5032 (2004)
  11. B. Carreras, B. van Milligen, M. Pedrosa, R. Balbín, C. Hidalgo, D. Newman, E. Sánchez, R. Bravenec, G. McKee, I. García-Cortés, et al., Phys. Plasmas 6, 1885 (1999)
  12. B. Carreras, D. Newman, B. van Milligen, and C. Hidalgo, Phys. Plasmas 6, 485 (1999)
  13. B. Carreras, V. Lynch, and G. Zaslavski, Phys. Plasmas 8, 5096 (2001)