1. Introduction

The proposed CF aggregation rules allow CF fields to be aggregated across multiple dimensions. Because these rules are based solely on the fields’ metadata, such aggregations can be reliably automated. Large numbers of aggregations may therefore be created, which makes an efficient way of storing them desirable.

This document describes how the data arrays of such aggregations could be stored in the memory of an application and proposes a convention – the NCA (netCDF aggregate) convention – for their efficient file storage.

The key feature of NCA files is that they are CF-like netCDF files; extra processing is required only to realise their aggregated data arrays.

These ideas have been implemented in cf-python. In particular, its Large Amounts of Massive Arrays (LAMA) functionality stores arrays in this fashion, and it can read and write files according to the NCA convention described here.

This style of aggregation storage is a generalization of NetCDF Markup Language (NcML) storage, which has long-standing use in the community, in that it allows:

  • Simultaneous aggregation across more than one dimension.
  • Aggregation accounting for different but equivalent variable properties (order of dimensions, number of size 1 dimensions, directions in which dimensions run, units of the data values, missing data values).
  • Storage of changes to arbitrary (not necessarily contiguous) parts of the aggregated data array.
  • Aggregation of arrays stored in a mixture of formats (in-memory, netCDF, PP, etc.).
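
The second point above can be illustrated with a minimal NumPy sketch. It is not cf-python's API or the NCA convention itself: the `conform` helper, its parameters, and the example values are all hypothetical and exist only to show the idea. The sketch assumes two in-memory subarrays that together cover a (time, latitude) array, where the second subarray is stored with its dimensions in a different order, its latitude axis running the other way, different units, and a different missing data value. Size 1 dimensions are not covered.

```python
import numpy as np


def conform(sub, axis_order, reversed_axes, scale, fill_value, master_fill):
    """Hypothetical helper: rearrange `sub` to match the master array.

    axis_order    -- permutation that puts the stored dimensions into master order
    reversed_axes -- master axes along which the stored data run the other way
    scale         -- factor converting the stored units to the master units
    fill_value    -- missing data value used by the stored subarray
    master_fill   -- missing data value used by the master array
    """
    sub = np.asarray(sub, dtype=float)
    missing = sub == fill_value
    # Convert units, mapping missing values to the master's missing value.
    out = np.where(missing, master_fill, sub * scale)
    # Put the dimensions into the master's order.
    out = np.transpose(out, axis_order)
    # Reverse any axes that run in the opposite direction.
    for axis in reversed_axes:
        out = np.flip(out, axis=axis)
    return out


# The values the aggregated (time=4, lat=3) array should hold, in metres.
truth = np.arange(12, dtype=float).reshape(4, 3) * 1000.0

# Subarray A: times 0-1, already in the master's layout and units.
part_a = truth[:2].copy()

# Subarray B: times 2-3, stored as (lat, time), latitude reversed, in km,
# with -1.0 as its missing data value.
part_b = (truth[2:] / 1000.0)[:, ::-1].T.copy()
part_b[0, 0] = -1.0  # mark one element as missing

aggregated = np.concatenate(
    [
        conform(part_a, (0, 1), (), 1.0, -999.0, -999.0),
        conform(part_b, (1, 0), (1,), 1000.0, -1.0, -999.0),
    ],
    axis=0,
)
```

The subarrays are conformed only when the aggregated array is realised; until then each can remain in its original layout, which is the property that lets equivalent but differently stored arrays be aggregated.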
