Quarter 2: Error analysis and Statistical methods | Dr. Ivan Marti Vidal
Lecture 1: Introduction. Random variables. Probability density function (pdf). Statistical estimates of expected values from a finite dataset. The Central Limit Theorem. Uncertainties in the statistical estimates. The Chi Squared distribution.
Lecture 2: Error propagation. Correlation between random variables. The Covariance matrix. Propagation of uncertainties through the space of measurements (Jacobian approach). Non-linear error propagation; turning randomness into systematics.
Lecture 3: Data modelling (part 1). Principle of Maximum Likelihood (ML). Propagation of uncertainties into the space of fitting parameters. The case of Gaussian-distributed measurements; least-squares fitting.
Lecture 4: Data modelling (part 2). Nonlinear least squares. ML in the case of non-Gaussian noise.
Lecture 5: Tests of hypotheses. Signal, null hypothesis, and alternative hypothesis. The critical probability. Collecting evidence: introduction to Bayesian analysis.
Lecture 6: Methods of Monte Carlo (MC). Statistics from simulated data. MC applied to multi-dimensional integration with non-trivial boundaries. The theorem of the inverse cumulative function. Simulation of a dataset with a generic probability density function.
Lecture 7: Thermal noise in 2D images. The effect of beam convolution vs. image size. Spurious sources in deep surveys.
Lecture 1: April 12, 09:30-12:00 in 0.02 | video
Lecture 2: April 13, 09:30-12:00 in 0.01 | video
Lecture 3: April 14, 09:30-12:00 in EK.09 | video
Lecture 4: April 15, 09:30-12:00 in 0.01 | video
Lecture 5: April 19, 09:30-12:00 in 0.01 | video
Lecture 6: April 20, 09:30-12:00 in 0.01 | video
Lecture 7: April 20, 09:30-12:00 in 0.01 | video
Scripts of lecture 7: Monte Carlo simulations of the thermal noise in an image (parallelized code). A useful script to estimate the chance of false detection of faint sources (e.g., in a deep survey). A second script computes the theoretical probability of false detection as a function of the image size, beam width, and noise level, to be compared to the results from the first script.
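The comparison between theory and simulation described above can be sketched in a few lines of standard-library Python. This is not the course script: it assumes that the image behaves as a set of independent Gaussian noise samples, one per beam area, which is only an approximation for beam-convolved noise. It also counts only positive noise peaks. The image size, beam width, threshold, and all function names are hypothetical.

```python
import math
import random

def n_independent_beams(image_side_pix, beam_fwhm_pix):
    """Approximate number of independent beams in a square image.
    Assumes a circular Gaussian beam of area pi * FWHM^2 / (4 ln 2)."""
    beam_area = math.pi * beam_fwhm_pix ** 2 / (4.0 * math.log(2.0))
    return image_side_pix ** 2 / beam_area

def p_false_detection(n_beams, threshold_sigma):
    """Chance that at least one of n_beams independent Gaussian noise
    samples exceeds the threshold (in units of the noise rms)."""
    q = 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))
    return 1.0 - (1.0 - q) ** n_beams

def mc_false_detection(n_beams, threshold_sigma, n_trials, rng):
    """Monte Carlo estimate of the same probability."""
    n = round(n_beams)
    hits = 0
    for _ in range(n_trials):
        # any() stops drawing as soon as one sample exceeds the threshold
        if any(rng.gauss(0.0, 1.0) > threshold_sigma for _ in range(n)):
            hits += 1
    return hits / n_trials

# Hypothetical setup: 256 x 256 pixel image, 12 pixel FWHM beam, 3 sigma cut
n_beams = n_independent_beams(256, 12.0)
p_theory = p_false_detection(n_beams, 3.0)
p_mc = mc_false_detection(n_beams, 3.0, 2000, random.Random(7))

print(f"independent beams: {n_beams:.1f}")
print(f"theory: {p_theory:.3f}   Monte Carlo: {p_mc:.3f}")
```

Raising the threshold or shrinking the image lowers both numbers, which is the dependence on image size, beam width, and noise level mentioned above.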
Scripts of lecture 6(b): Monte Carlo simulations of source distributions in an isotropic Universe. We will see a nice application of the theorem of the inverse cumulative function.
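As an illustration of the inverse cumulative function at work, here is a minimal sketch (not the course script) that draws sources with uniform density inside a sphere. It assumes a Euclidean, uniformly filled volume of radius `R_MAX`, a hypothetical parameter. The cumulative distribution of the distance is then (r/R_MAX)^3, and that of cos(theta) is linear, so both can be inverted analytically.

```python
import math
import random
import statistics

R_MAX = 1.0  # hypothetical radius of the simulated volume

def sample_source(r_max, rng):
    """Draw (r, cos_theta, phi) for a source uniformly distributed in a sphere.
    CDF(r) = (r / r_max)**3, so r = r_max * u**(1/3) with u uniform in [0, 1).
    Isotropy means cos(theta) is uniform in [-1, 1] and phi in [0, 2 pi)."""
    r = r_max * rng.random() ** (1.0 / 3.0)
    cos_theta = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return r, cos_theta, phi

rng = random.Random(11)
sources = [sample_source(R_MAX, rng) for _ in range(20000)]

mean_r = statistics.fmean(s[0] for s in sources)
mean_cos = statistics.fmean(s[1] for s in sources)
frac_inner = sum(s[0] < 0.5 * R_MAX for s in sources) / len(sources)

# Expected values: <r> = 3/4 r_max, <cos theta> = 0, and 1/8 of the
# sources inside half the radius (volume scales as r**3).
print(f"<r> = {mean_r:.3f}  <cos theta> = {mean_cos:.3f}  "
      f"fraction with r < r_max/2 = {frac_inner:.3f}")
```

Drawing r uniformly instead would pile sources up near the centre, which is the usual mistake this theorem avoids.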
Scripts of lecture 6: An example of the power of Monte Carlo to perform integrals within non-trivial boundaries. Suppose that you have a sample of sources taken from a survey of a given portion of the sky. You want to study the source clustering, so you compute the number of sources around a given point as a function of the distance to that point. Then, dividing the number of sources by the volume covered at each distance, you obtain an estimate of the density of sources as a function of distance. However, because of the finite sky coverage of the survey, the covered volume at a given distance is not that of a full sphere: some sources that should be counted fall outside the coverage of the survey. This is known as the "window effect". This script shows an example of how to deal with it.
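The idea can be sketched as follows, under simplifying assumptions that are not taken from the course script: the survey window is a hypothetical unit box rather than a real sky footprint, and the covered volume is estimated by drawing points uniformly inside the sphere and counting those that land inside the window. The box was chosen because the answer is then known analytically (a sphere minus a spherical cap), which gives a check on the Monte Carlo result.

```python
import math
import random

def inside_survey(x, y, z):
    """Hypothetical survey window: the unit box [0, 1]^3.
    A real survey would test sky coordinates against its footprint."""
    return 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 and 0.0 <= z <= 1.0

def covered_fraction(center, radius, n_samples, rng):
    """Fraction of a sphere around `center` that lies inside the window,
    from points drawn uniformly in the sphere by rejection sampling."""
    cx, cy, cz = center
    hits = 0
    accepted = 0
    while accepted < n_samples:
        dx = rng.uniform(-radius, radius)
        dy = rng.uniform(-radius, radius)
        dz = rng.uniform(-radius, radius)
        if dx * dx + dy * dy + dz * dz > radius * radius:
            continue  # outside the sphere: draw again
        accepted += 1
        if inside_survey(cx + dx, cy + dy, cz + dz):
            hits += 1
    return hits / n_samples

center, radius = (0.1, 0.5, 0.5), 0.3
frac_mc = covered_fraction(center, radius, 50000, random.Random(5))
volume_mc = frac_mc * 4.0 / 3.0 * math.pi * radius ** 3

# Analytic check: a spherical cap of height h = 0.2 sticks out of the box.
h = radius - center[0]
frac_exact = 1.0 - h * h * (3.0 * radius - h) / (4.0 * radius ** 3)

print(f"covered fraction: MC {frac_mc:.4f}, exact {frac_exact:.4f}")
print(f"effective volume: {volume_mc:.5f}")
```

Dividing the source counts by `volume_mc` instead of the full sphere volume corrects the density estimate near the survey edge.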
Scripts of lecture 5: A silly script to check the smart Bayes relation (or the "inverse conditional probability" relation). In this script, we compute the chance that a supernova is radio-loud, based on different kinds of "evidence". Attend the lecture for an in-depth discussion of this!
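A minimal sketch of the Bayes relation applied to this kind of question is given below. All the probabilities are made-up numbers for illustration only, not values from the lecture, and the function name is hypothetical. The second update reuses the first posterior as a prior, which assumes that the two pieces of evidence are conditionally independent.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes relation: P(H | E) = P(E | H) P(H) / P(E), with
    P(E) = P(E | H) P(H) + P(E | not H) (1 - P(H))."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1.0 - prior))

# Hypothetical numbers: 10% of supernovae are radio-loud a priori.
prior_radio_loud = 0.1

# First piece of evidence: seen in 80% of radio-loud supernovae,
# but also in 20% of the radio-quiet ones.
p_after_first = posterior(prior_radio_loud, 0.8, 0.2)

# Second, independent piece of evidence: 60% versus 30%.
p_after_second = posterior(p_after_first, 0.6, 0.3)

print(f"prior: {prior_radio_loud:.3f}")
print(f"after first evidence:  {p_after_first:.3f}")
print(f"after second evidence: {p_after_second:.3f}")
```

Even evidence that is four times more likely for radio-loud objects leaves the posterior below 50% here, because the prior is low.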
Scripts of lecture 4(b): What happens if you observe a faint source with an interferometer and all your phases get corrupted? Would you throw away all the data? Could you still do something with them? Would you estimate the flux density from the amplitude average? Are you sure? A nice example of the Maximum Likelihood Principle and how to apply it in the case of non-Gaussian random distributions. (What's the trick in this problem? With a good estimate of the noise level in your visibilities, you can obtain a precise estimate of the source flux density from the amplitudes alone.)
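One way to sketch this problem (an assumption about the approach, not the course script) is to note that the amplitude of a visibility with Gaussian noise of rms sigma in its real and imaginary parts follows a Rice distribution, p(A | S) = (A / sigma^2) exp(-(A^2 + S^2) / (2 sigma^2)) I0(A S / sigma^2). The code below simulates such amplitudes for a hypothetical source of flux density `S_TRUE`, with `SIGMA` assumed known, and maximizes the Rice log-likelihood with a golden-section search, which assumes a single maximum in the search interval.

```python
import math
import random
import statistics

S_TRUE, SIGMA, N_VIS = 1.0, 1.0, 2000  # hypothetical source and noise

def log_bessel_i0(x):
    """log of the modified Bessel function I0(x), from its power series."""
    term, total, k = 1.0, 1.0, 0
    q = 0.25 * x * x
    while term > 1e-15 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return math.log(total)

def log_likelihood(s, amps, sigma):
    """Rice log-likelihood of flux density s (terms independent of s dropped)."""
    bessel_part = sum(log_bessel_i0(a * s / sigma ** 2) for a in amps)
    return bessel_part - len(amps) * s * s / (2.0 * sigma ** 2)

def golden_max(f, lo, hi, tol=1e-3):
    """Golden-section search for the maximum of a unimodal function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

# Corrupted phases do not change the amplitudes, so only those are simulated.
rng = random.Random(4)
amps = [math.hypot(S_TRUE + rng.gauss(0.0, SIGMA), rng.gauss(0.0, SIGMA))
        for _ in range(N_VIS)]

amp_mean = statistics.fmean(amps)
s_ml = golden_max(lambda s: log_likelihood(s, amps, SIGMA), 0.0, 4.0)

print(f"true flux density: {S_TRUE}")
print(f"amplitude average: {amp_mean:.3f}  (biased by the noise)")
print(f"maximum likelihood: {s_ml:.3f}")
```

The amplitude average overestimates the flux density because amplitudes are always positive, while the likelihood fit accounts for the noise contribution explicitly.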
Scripts of lecture 4: A basic example of a nonlinear least-squares fit. The (synthetic) data included represent the time evolution of the position angle of a jet. A simple model of precession must be fitted (a sine wave with a given amplitude and period). This script is only intended to help understand the basics of nonlinear least-squares fitting. Other more elaborate (and robust) programs should be used to solve real-life problems.
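For readers who want to see the mechanics, here is a minimal Gauss-Newton sketch with step halving, written with the standard library only. It is not the course script: the synthetic data, the three-parameter model (amplitude, period, phase), the starting guess, and all names are assumptions. Like any local method, it needs a starting point reasonably close to the solution.

```python
import math
import random

def model(t, p):
    """Precession model: amplitude * sin(2 pi t / period + phase)."""
    amp, period, phase = p
    return amp * math.sin(2.0 * math.pi * t / period + phase)

def jacobian_row(t, p):
    """Partial derivatives of the model w.r.t. (amplitude, period, phase)."""
    amp, period, phase = p
    arg = 2.0 * math.pi * t / period + phase
    return [math.sin(arg),
            -amp * math.cos(arg) * 2.0 * math.pi * t / period ** 2,
            amp * math.cos(arg)]

def sum_sq(ts, ys, p):
    return sum((y - model(t, p)) ** 2 for t, y in zip(ts, ys))

def normal_equations(ts, ys, p):
    """Return J^T J and J^T r for equal-weight data."""
    rows = [jacobian_row(t, p) for t in ts]
    res = [y - model(t, p) for t, y in zip(ts, ys)]
    jtj = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    jtr = [sum(r[i] * e for r, e in zip(rows, res)) for i in range(3)]
    return jtj, jtr

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[piv] = m[piv], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def gauss_newton(ts, ys, p0, n_iter=50):
    p = list(p0)
    for _ in range(n_iter):
        jtj, jtr = normal_equations(ts, ys, p)
        dp = solve_linear(jtj, jtr)
        old, step = sum_sq(ts, ys, p), 1.0
        while step > 1e-6:  # halve the step until the fit improves
            trial = [pi + step * d for pi, d in zip(p, dp)]
            if sum_sq(ts, ys, trial) < old:
                p = trial
                break
            step *= 0.5
        else:
            break  # no improving step found: treat as converged
    return p

# Synthetic position angles (degrees) with Gaussian noise of known rms
SIGMA, TRUE_P = 1.0, (10.0, 4.0, 0.5)
rng = random.Random(3)
ts = [0.25 * i for i in range(41)]
ys = [model(t, TRUE_P) + rng.gauss(0.0, SIGMA) for t in ts]

fit = gauss_newton(ts, ys, p0=(8.0, 4.2, 0.3))
# Linearized parameter covariance: sigma^2 (J^T J)^-1; period is index 1
jtj, _ = normal_equations(ts, ys, fit)
period_err = SIGMA * math.sqrt(solve_linear(jtj, [0.0, 1.0, 0.0])[1])
print(f"amplitude {fit[0]:.2f}, period {fit[1]:.3f} +/- {period_err:.3f}, "
      f"phase {fit[2]:.2f}")
```

The last lines propagate the measurement noise into the space of fitting parameters through the linearized covariance matrix.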
Scripts of lecture 2: Estimates of the uncertainties of the brightness temperature and spectral index of a source, based on the Monte Carlo approach. The results are then compared to those coming from the Jacobian approach (i.e., the linear approximation of the error propagation). You will see an interesting, unexpected "side" result: random noise can map into systematic effects in your estimates!
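The comparison for the spectral index can be sketched as follows. This is an illustrative example, not the course script: the frequencies, flux densities, and noise levels are hypothetical, the two measurements are assumed uncorrelated, and the spectral index is defined through S proportional to nu^alpha. The rare Monte Carlo draws with non-positive flux density are discarded, since the logarithm is undefined for them.

```python
import math
import random
import statistics

# Hypothetical measurements: frequencies in GHz, flux densities in Jy
NU1, NU2 = 1.4, 5.0
S1, S2 = 10.0, 5.0
SIG1, SIG2 = 1.0, 1.0
LOG_RATIO = math.log(NU1 / NU2)

def spectral_index(s1, s2):
    """alpha = ln(S1 / S2) / ln(nu1 / nu2)."""
    return math.log(s1 / s2) / LOG_RATIO

alpha_0 = spectral_index(S1, S2)

# Jacobian (linear) approach for uncorrelated errors:
# d(alpha)/dS1 = 1 / (S1 ln(nu1/nu2)), d(alpha)/dS2 = -1 / (S2 ln(nu1/nu2))
sigma_jac = math.sqrt((SIG1 / S1) ** 2 + (SIG2 / S2) ** 2) / abs(LOG_RATIO)

# Monte Carlo approach: perturb the measurements and recompute alpha
rng = random.Random(2)
samples = []
for _ in range(50000):
    s1 = rng.gauss(S1, SIG1)
    s2 = rng.gauss(S2, SIG2)
    if s1 > 0.0 and s2 > 0.0:  # discard unphysical draws
        samples.append(spectral_index(s1, s2))

alpha_mc = statistics.fmean(samples)
sigma_mc = statistics.stdev(samples)
bias = alpha_mc - alpha_0

print(f"noise-free alpha: {alpha_0:.4f}")
print(f"Jacobian sigma: {sigma_jac:.4f}   Monte Carlo sigma: {sigma_mc:.4f}")
print(f"Monte Carlo mean: {alpha_mc:.4f}   bias: {bias:+.4f}")
```

The nonzero bias comes from the curvature of the logarithm: symmetric noise in the flux densities shifts the average of their logarithms, so purely random errors produce a systematic offset in alpha.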
Scripts of lecture 1: Scripts to play with the Central Limit Theorem. You can check how the distribution of the averages from any set of random variables tends to a Gaussian, whatever the original distribution of the data (as long as it has a finite variance). You can also check the Law of Large Numbers and the distribution of the standard deviations of the averages (related to the Chi Squared distribution, to which we will come back in more depth in lecture 4).
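As a minimal sketch of this kind of experiment (not the course script itself), the following standard-library Python code averages draws from a strongly skewed exponential distribution and compares the averages with the Central Limit Theorem prediction. The choice of distribution, the sample sizes, and all names are assumptions made for illustration.

```python
import math
import random
import statistics

N_PER_MEAN = 50   # number of draws in each average
N_MEANS = 4000    # number of averages

def sample_means(draw, n_per_mean, n_means, rng):
    """Return n_means averages, each taken over n_per_mean calls to draw(rng)."""
    return [statistics.fmean(draw(rng) for _ in range(n_per_mean))
            for _ in range(n_means)]

rng = random.Random(1)
# Exponential distribution with mean 1 and standard deviation 1
means = sample_means(lambda r: r.expovariate(1.0), N_PER_MEAN, N_MEANS, rng)

mu = statistics.fmean(means)
sigma = statistics.stdev(means)
sigma_clt = 1.0 / math.sqrt(N_PER_MEAN)  # CLT prediction for the scatter

# For a Gaussian, about 68.3% of the values lie within one sigma of the mean
frac_1sigma = sum(abs(m - mu) < sigma for m in means) / len(means)

print(f"mean of averages: {mu:.3f} (expected 1)")
print(f"scatter of averages: {sigma:.4f} (CLT prediction {sigma_clt:.4f})")
print(f"fraction within 1 sigma: {frac_1sigma:.3f}")
```

Increasing `N_PER_MEAN` shrinks the scatter as one over its square root, which is the Law of Large Numbers seen from the same simulation.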