Milestones in Atmospheric Data Assimilation

What is Data Assimilation?

Atmospheric data assimilation is the analytical process of estimating the entire state of the atmosphere from a set of observations. It is considered a crucial element of weather forecasting. No forecast model can be wholly correct, and the assimilation procedure aims to equip the numerical model with accurate initial conditions, thus encouraging the model to advance in a realistic direction. Forecasting is not the only beneficiary: the process is also applied to the creation of accurate and continuous research data sets, and to the diagnosis of model errors.

The technology has benefited from 70 years of research and is still a developing area. A major lesson is that while data assimilation is very important to forecasting and to atmospheric science, it has to be done well to be of value.

1. The Method of Least Squares

The method of least squares is the central principle behind most data assimilation schemes used in leading weather centres. It is a basic mathematical idea which seeks to minimize the sum of the squared departures between the state of the model and the incoming observations. Invented by Gauss in the late eighteenth century, it was first used in astronomy to determine the orbital parameters of planets and other bodies about the Sun. This application parallels the meteorological problem in the sense that both are instances of an inverse problem.
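As a minimal sketch of the least squares idea, consider a single quantity (say, a temperature) for which a model value and one observation are available, each with an assumed error standard deviation. The function name, variable names and numbers below are hypothetical and chosen only for illustration. Minimizing the weighted sum of squared departures gives an inverse-variance weighted mean.

```python
def least_squares_combine(model_value, sigma_model, observation, sigma_obs):
    """Minimize J(x) = (x - model_value)^2 / sigma_model^2
                     + (x - observation)^2 / sigma_obs^2.

    Setting dJ/dx = 0 gives the inverse-variance weighted mean.
    """
    w_model = 1.0 / sigma_model**2
    w_obs = 1.0 / sigma_obs**2
    analysis = (w_model * model_value + w_obs * observation) / (w_model + w_obs)
    # Error standard deviation of the combined estimate
    # (assumes the two errors are uncorrelated).
    sigma_analysis = (w_model + w_obs) ** -0.5
    return analysis, sigma_analysis


# Hypothetical values: a model temperature of 280 K (error 2 K)
# and an observation of 282 K (error 1 K).
analysis, sigma_analysis = least_squares_combine(280.0, 2.0, 282.0, 1.0)
print(analysis, sigma_analysis)  # the analysis is about 281.6 K
```

The result lies closer to the more accurate of the two inputs, and its estimated error is smaller than either input error.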

2. Numerical Weather Prediction

Numerical weather prediction was attempted first by Lewis F. Richardson in 1922. In his well-known experiment, the equations of atmospheric dynamics - which were meant to represent the air flow over central Europe - were solved numerically. Before digital computers this was a lengthy procedure and involved teams of assistants. Although the general philosophy of weather forecasting today is the same as that of Richardson, his trial failed. The observations had not been assimilated properly, leading to an unbalanced (badly initialized) starting state, and the initial errors were amplified by the model. This led to the belief that numerical weather prediction was not feasible, and further research stalled.

3. Fresh Attempts

Digital computers developed during the Second World War opened new opportunities for researchers to tackle the weather forecasting problem. A pioneering group at the Institute for Advanced Study in Princeton in the United States, consisting of J. Charney, R. Fjortoft and J. von Neumann, used the ENIAC (Electronic Numerical Integrator and Computer) - at the time a state-of-the-art computer - to solve a set of flow equations. They did not use the same equation set as Richardson, but instead applied a simplified equation (the barotropic vorticity equation), which could not support the unbalanced modes that destroyed Richardson's analysis. This choice behaved as a kind of filter, which has analogues in more modern assimilation schemes. The success of their experiment put numerical weather prediction back on the agenda.

4. Objective Analysis

Meteorological observations are generally non-uniformly distributed and have errors associated with them. Systematically putting them into a form which can be used by a numerical model was first done by H. Panofsky in 1949. He used a primitive form of initialization which involved simple interpolation of the measurements to the model's grid positions by curve fitting and with a weighting which depended upon the accuracy of each measurement. This analysis used the method of least squares.
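The kind of weighted curve fitting described above can be sketched in one dimension as follows. This is only an illustration under stated assumptions, not a reconstruction of the original scheme: the station positions, temperatures, error values, polynomial degree and grid are all hypothetical. The fit uses NumPy's `polyfit`, whose `w` argument takes weights of 1/sigma for Gaussian errors.

```python
import numpy as np

# Hypothetical, irregularly spaced station positions (km) and temperatures (K).
x_obs = np.array([0.0, 120.0, 310.0, 450.0, 700.0])
t_obs = np.array([281.0, 282.5, 284.0, 283.0, 280.5])
# Assumed error standard deviation of each measurement (K).
sigma = np.array([0.5, 1.0, 0.5, 2.0, 0.5])

# Weighted least squares fit of a quadratic: minimizes
# sum(((t_obs - p(x_obs)) / sigma)**2), so accurate stations count for more.
coeffs = np.polyfit(x_obs, t_obs, deg=2, w=1.0 / sigma)

# Evaluate the fitted curve at regularly spaced (hypothetical) grid positions.
grid = np.linspace(0.0, 700.0, 8)
analysis = np.polyval(coeffs, grid)
print(analysis)
```

The measurement with the largest assumed error (here the station at 450 km) has the least influence on the fitted curve.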

5. The Data Assimilation Cycle

Simple interpolation of existing observations to data voids is not necessarily consistent with the overall fluid flow. B. Gilchrist and G. Cressman recognized that although some regions lacked observations, information could be provided by the previous forecast for the purposes of initialization. This furthered the objective analysis procedure and gave rise to the notion of the data assimilation cycle incorporating a background field. These ideas were developed further by P. Bergthorsson and B. Doos (1955) and P. Thompson (1961).

By the early 1960s, the major national weather centres of the world were using these numerical weather prediction ideas - with the data assimilation cycle - on an operational basis to guide forecasters.

6. Optimal Interpolation

The 'interpolation' procedures used to form initial conditions were formalized in the 1970s to account properly for both model and observation errors. The use of observations with models, combined in a statistical sense via the method of least squares, was intended to give a 'best fit'. This was named "optimal interpolation". In practice, the approximations needed to make the method workable meant that the results of optimal interpolation were far from optimal. One major limitation was the computer power available, which meant that a truly global analysis was not possible, and the assimilation of data into separate domains was used instead.
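The statistical combination described above is commonly written as an analysis equation in which the background is corrected by a weighted difference between the observations and the background. The sketch below uses notation that does not appear in the text and is assumed here: `x_b` for the background, `B` and `R` for the background and observation error covariances, and `H` for a linear operator that maps the model state to the observed quantities. The three-point grid and all numbers are hypothetical.

```python
import numpy as np


def optimal_interpolation(x_b, B, y, H, R):
    """Least squares analysis x_a = x_b + K (y - H x_b),
    with gain K = B H^T (H B H^T + R)^-1 (linear H assumed)."""
    innovation = y - H @ x_b                      # observation minus background
    S = H @ B @ H.T + R                           # innovation covariance
    x_a = x_b + B @ H.T @ np.linalg.solve(S, innovation)
    A = B - B @ H.T @ np.linalg.solve(S, H @ B)   # analysis error covariance
    return x_a, A


# Hypothetical three-point grid: background temperatures (K) whose errors
# have variance 4 K^2 and are correlated between neighbouring points.
idx = np.arange(3)
x_b = np.array([280.0, 281.0, 282.0])
B = 4.0 * np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)

# Observations exist only at the two end points (error variance 1 K^2).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
y = np.array([281.0, 283.0])
R = np.eye(2)

x_a, A = optimal_interpolation(x_b, B, y, H, R)
print(x_a)
print(np.diag(A))
```

Although the middle grid point has no observation, it is still adjusted, because the background error correlations in `B` spread the information from the neighbouring observations.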

7. Variational Data Assimilation

A new approach to producing the best fit, variational data assimilation ("Var."), came into operational use in the mid-1990s. The method itself - which finds the best fit by again minimizing the square of the deviation between the analysis and the background/observations in an iterative fashion (using a descent algorithm) - was actually conceived in the 1970s. The Var. procedure allows the assimilation of observations which are not necessarily model variables. This advantage means that radiances, as measured by the many 'Earth observation' satellites, can be exploited to the full. The advances in computational power available by the 1990s meant that a truly global analysis could be made. The 'Holy Grail' of data assimilation is the so-called 4d-Var., in which observations are digested in the model at their proper time. A 'cut-down' version, 3d-Var., is often used instead. It requires somewhat less computer power but does not resolve the true observation times: all measurements made inside a time window (normally 6 hours) are taken as simultaneous.

Pioneers of the Var. idea were P. Morel, G. Marchuk, O. Talagrand and P. Courtier.
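The iterative minimization described in this section can be sketched for a small 3d-Var.-like problem. The notation is assumed rather than taken from the text: the cost function is J(x) = 0.5 (x - x_b)^T B^-1 (x - x_b) + 0.5 (y - H x)^T R^-1 (y - H x), with `x_b` the background, `B` and `R` the error covariances, and `H` a linear observation operator. Steepest descent with an exact line search is used here only for brevity; it stands in for whichever descent algorithm a real scheme would use, and all numbers are hypothetical.

```python
import numpy as np


def var_minimize(x_b, B, y, H, R, tol=1e-10, max_iter=1000):
    """Minimize the quadratic cost J(x) by steepest descent.

    Assumes a linear observation operator H, so J is quadratic and
    the optimal step length along the gradient is known exactly.
    """
    B_inv = np.linalg.inv(B)   # acceptable for a tiny illustrative problem
    R_inv = np.linalg.inv(R)
    hessian = B_inv + H.T @ R_inv @ H
    x = x_b.copy()             # start the descent from the background
    for _ in range(max_iter):
        grad = B_inv @ (x - x_b) - H.T @ R_inv @ (y - H @ x)
        if np.linalg.norm(grad) < tol:
            break
        step = (grad @ grad) / (grad @ hessian @ grad)  # exact line search
        x = x - step * grad
    return x


# Hypothetical three-point problem with observations at the two end points.
idx = np.arange(3)
x_b = np.array([280.0, 281.0, 282.0])
B = 4.0 * np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
y = np.array([281.0, 283.0])
R = np.eye(2)

x_var = var_minimize(x_b, B, y, H, R)

# For a linear H the minimum can also be written in closed form,
# which gives a check on the iterative result.
x_direct = x_b + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, y - H @ x_b)
print(x_var, x_direct)
```

For this linear case the iterative answer agrees with the closed-form least squares solution. The attraction of the iterative form is that it only needs the cost function and its gradient, which is what allows observation operators that are not simple selections of model variables.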

8. The Future

Data assimilation remains a rich and constantly evolving area for development, with many centres around the world trying to use and refine the method. With the availability of increasing amounts of good-quality measurements, particularly from satellites, data assimilation has proved to be the best way of using these data for better weather forecasts and for research data sets to help scientists understand our environment.

There is always scope for improvements to the methodologies and algorithms. Although researchers are now aware of the 'correct' approach to data assimilation, its computational cost is still usually prohibitive. Computational power is increasing, but there is always a need for more accurate, efficient and imaginative solutions.