The Square-Root Ensemble Kalman Filter Demonstration with the Lorenz model

Timestep of Lorenz model
No of ensemble members
Initial conditions for truth x, y, z
Initial prior err std dev x, y, z
Model error: either no model error (perfect model), or model error std devs for x, y, z
Observations: for each of x, y and z, either do not observe, or observe with a given error std dev
Timesteps of assimilation period
Timesteps of forecast period
Approx No. of observation times
Random number seed


This is δt, the time-step of the numerical scheme that solves the Lorenz 63 equations.
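As an illustration of what a single step of length δt involves, here is a minimal sketch that advances the Lorenz 63 equations with a fourth-order Runge-Kutta scheme. The parameter values (σ = 10, ρ = 28, β = 8/3) are the commonly used ones, and the choice of Runge-Kutta is an assumption; the demo does not state which parameters or integration scheme it uses. The names `lorenz63_rhs` and `rk4_step` are hypothetical.

```python
# Assumed standard Lorenz 63 parameters; the demo's actual values are not stated.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz63_rhs(state):
    """Time derivative (dx/dt, dy/dt, dz/dt) of the Lorenz 63 system."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(state, dt):
    """Advance the state by one time-step dt (assumed scheme: classical RK4)."""

    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz63_rhs(state)
    k2 = lorenz63_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz63_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz63_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


# Illustrative run: 100 steps of dt = 0.01 from an arbitrary starting point.
state = (1.0, 1.0, 1.0)
for _ in range(100):
    state = rk4_step(state, 0.01)
```

A smaller δt gives a more accurate trajectory at the cost of more steps for the same simulated time.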

This is N, the number of ensemble members. The bounds on this parameter are 2 ≤ N ≤ 25. The more ensemble members are used, the better the ensemble square-root Kalman filter (EnSRKF) can represent the uncertainty. The state of the Lorenz 63 model has three variables, n = 3, so it is possible for the uncertainty to be well captured with small values of N. For systems with large n this is not necessarily the case.

These are the three components of xtruth(0). This state is used in the demo as a starting point to run the Lorenz model forward in time to generate synthetic observations throughout a specified time window. The data assimilation tries to recover the truth from these observations without referring back to the truth. The truth run and the subsequent data assimilation run are together known as an 'identical twin experiment'.

When performing the data assimilation, the ensemble must start from some values. The initial ensemble members are set according to the following (for each ensemble member l):
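$$x^f_l(0) = x^{\mathrm{truth}}(0) + N(0, \mathbf{B}), \quad \text{where } \mathbf{B} = \begin{pmatrix} \sigma_x^2 & 0 & 0 \\ 0 & \sigma_y^2 & 0 \\ 0 & 0 & \sigma_z^2 \end{pmatrix}.$$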
N(0, B) generates a three-element random vector from a normal (Gaussian) distribution that has mean zero and covariance matrix B, and σx, σy and σz are the initial prior error standard deviation values specified. (N.B., even though the truth is specified here, the information content of the truth is degraded by the noise.)
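The initial ensemble described above can be sketched as follows. Because B is diagonal, each component can be perturbed independently with its own standard deviation. The function name `initial_ensemble`, the seed and the numerical values are illustrative assumptions, not taken from the demo.

```python
import random


def initial_ensemble(x_truth0, prior_std, n_members, rng):
    """Draw each member as the truth plus zero-mean Gaussian noise.

    B is diagonal, so each component gets independent noise with its own
    standard deviation (sigma_x, sigma_y, sigma_z).
    """
    return [
        tuple(xt + rng.gauss(0.0, s) for xt, s in zip(x_truth0, prior_std))
        for _ in range(n_members)
    ]


rng = random.Random(0)  # hypothetical seed
# Illustrative values: truth (1, 1, 1), prior std dev 0.5 per component, N = 10.
ensemble = initial_ensemble((1.0, 1.0, 1.0), (0.5, 0.5, 0.5), 10, rng)
```

Larger prior standard deviations spread the members further from the truth, which is the sense in which the noise degrades the information the truth provides.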

If model error is switched off, every model run follows the Lorenz 63 model exactly, according to the following (for each ensemble member l):
$$x^f_l(t + \delta t) = \mathcal{M}(x^f_l(t)).$$
This is the perfect model scenario.
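If model error is switched on, then random noise is added at every time-step as the model runs, according to the following (for each ensemble member l):
$$x^f_l(t + \delta t) = \mathcal{M}(x^f_l(t)) + N(0, \mathbf{Q}), \quad \text{where } \mathbf{Q} = \begin{pmatrix} \hat{\sigma}_x^2 & 0 & 0 \\ 0 & \hat{\sigma}_y^2 & 0 \\ 0 & 0 & \hat{\sigma}_z^2 \end{pmatrix}.$$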
Here σ̂x, σ̂y and σ̂z are the model error standard deviation values specified. This is the imperfect model scenario.
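The difference between the perfect and imperfect model scenarios can be sketched as below. The model operator ℳ is represented here by a simple forward-Euler Lorenz 63 step with the commonly used parameter values; this is a stand-in chosen for brevity, not necessarily the scheme the demo uses. The names `euler_lorenz_step` and `forecast_step` are hypothetical.

```python
import random


def euler_lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Stand-in for the model operator M: one forward-Euler Lorenz 63 step."""
    x, y, z = state
    return (
        x + dt * sigma * (y - x),
        y + dt * (x * (rho - z) - y),
        z + dt * (x * y - beta * z),
    )


def forecast_step(member, model_step, model_err_std=None, rng=None):
    """Advance one member by one time-step.

    With model_err_std=None this is the perfect model scenario. Otherwise
    independent N(0, sigma_hat^2) noise is added to each component, which
    corresponds to a diagonal Q.
    """
    new_state = model_step(member)
    if model_err_std is None:
        return new_state
    return tuple(v + rng.gauss(0.0, s) for v, s in zip(new_state, model_err_std))


rng = random.Random(1)  # hypothetical seed
member = (1.0, 1.0, 1.0)
perfect = forecast_step(member, euler_lorenz_step)
imperfect = forecast_step(member, euler_lorenz_step, (0.1, 0.1, 0.1), rng)
```

In the imperfect case a fresh noise vector is drawn for every member at every step, so the noise accumulates along each trajectory.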

These specify which variables (x, y and/or z) are observed at specified time intervals, and the precision with which these observations are made. The observations are synthesized from the truth before the assimilation is started according to:
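$$y(t) = x^{\mathrm{truth}}(t) + N(0, \mathbf{R}), \quad \text{where } \mathbf{R} = \begin{pmatrix} \sigma_{\mathrm{ob}x}^2 & 0 & 0 \\ 0 & \sigma_{\mathrm{ob}y}^2 & 0 \\ 0 & 0 & \sigma_{\mathrm{ob}z}^2 \end{pmatrix}.$$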
Here σobx, σoby and σobz are the observation error standard deviations specified (for this example where all three components are observed).
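A minimal sketch of how one batch of synthetic observations might be produced, including the case where only some components are observed. Representing an unobserved component by `None` and returning a dictionary keyed by component index are assumptions made for this example; the function name `synthesize_observation` is hypothetical.

```python
import random


def synthesize_observation(x_truth, obs_std, rng):
    """Return {component index: noisy value} for the observed components.

    obs_std[i] is the observation error std dev of component i, or None if
    that component is not observed. R is diagonal, so each observed
    component receives independent Gaussian noise.
    """
    return {
        i: x_truth[i] + rng.gauss(0.0, s)
        for i, s in enumerate(obs_std)
        if s is not None
    }


rng = random.Random(2)  # hypothetical seed
# Illustrative case: observe x and z with std dev 1.0, do not observe y.
obs = synthesize_observation((1.5, -2.0, 20.0), (1.0, None, 1.0), rng)
```

Here `obs` holds entries for components 0 (x) and 2 (z) only, so the filter would have to infer y from the model dynamics and the ensemble covariances.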

This is Δtassim, which specifies the number of time-steps, counted from the beginning of the run, that the observations span. Over these time-steps the observations are assimilated at regular intervals, which should keep the model trajectory close to the truth.

This is Δtfore, which specifies the number of time-steps (after the assimilation period has finished) that the ensemble will be run as free forecasts.

This is Nobsbatches, the number of batches of observations to be assimilated over the Δtassim time-steps. These batches are uniformly spaced, and the observations assimilated are as specified above. (N.B., this value is approximate: the demo may adjust it to ensure that Δtassim ⁄ Nobsbatches has an integer value.)
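The demo's exact adjustment rule is not stated. One plausible rule, shown here purely as an assumption, is to pick the divisor of Δtassim closest to the requested number of batches, so that the spacing between batches is a whole number of time-steps. The function name `adjust_obs_batches` and the placement of the batches at the end of each interval are hypothetical.

```python
def adjust_obs_batches(n_assim_steps, n_batches_requested):
    """Return the divisor of n_assim_steps closest to the requested count.

    Ties go to the smaller divisor. Hypothetical rule, not the demo's code.
    """
    divisors = [d for d in range(1, n_assim_steps + 1) if n_assim_steps % d == 0]
    return min(divisors, key=lambda d: (abs(d - n_batches_requested), d))


n_assim_steps = 100
n_batches = adjust_obs_batches(n_assim_steps, 7)  # 7 does not divide 100
interval = n_assim_steps // n_batches             # time-steps between batches
# Assumed placement: one batch at the end of each interval.
obs_steps = [interval * k for k in range(1, n_batches + 1)]
```

With these illustrative numbers the request for 7 batches becomes 5 batches, one every 20 time-steps.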

The random number seed specifies the sequence of pseudo-random numbers that are generated. To repeat an experiment with different synthetic observations, different ensemble initial conditions and different model error noise, change this value.
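The effect of the seed can be illustrated with Python's standard `random` module, used here only as an example generator; the demo's own generator is not specified. The helper `draw_noise` is hypothetical.

```python
import random


def draw_noise(seed, n=3):
    """Draw n standard normal numbers from a generator started with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


same_a = draw_noise(42)
same_b = draw_noise(42)     # same seed: identical noise sequence
different = draw_noise(43)  # different seed: a different noise sequence
```

In this sketch `same_a` and `same_b` are identical, while `different` is not, which mirrors how changing the seed changes the synthetic observations, the initial ensemble and the model error noise together.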