Aggregation of CF fields


Version 1.2 (updated from version 1.1 (2015/01/15) for changes to the CF data model)
18 May 2016


These rules may be used for deciding whether or not two arbitrary CF field constructs may be aggregated into one, larger field construct. A field construct (hereafter a field) is as defined in the CF data model, as are all other terms written in bold on their first appearance.

Aggregation may be thought of as the combination of one field with another to create a new field that occupies a larger domain. In practice, this means combining two fields so that their data arrays are concatenated along exactly one axis, as are the coordinate arrays that span that axis, in such a way that the aggregated field conforms to the CF data model.

These rules are based solely on the fields' metadata. More than two fields may ultimately be aggregated along one or more axes by repeated aggregations between pairs of fields.
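The idea of repeated pairwise aggregation can be sketched in Python as follows. This is only an illustration under simplifying assumptions: each "field" is a hypothetical dictionary holding one 1D coordinate array and one 1D data array, and `aggregate_pair` is an invented helper, not part of any CF library. It assumes the aggregation rules have already been checked.

```python
from functools import reduce


def aggregate_pair(f, g):
    """Concatenate two toy 1D fields along their single axis.

    Both the data array and the coordinate array spanning the axis
    are concatenated, so the result occupies a larger domain.
    """
    return {
        "coord": f["coord"] + g["coord"],
        "data": f["data"] + g["data"],
    }


# Three toy fields (hypothetical values) sharing one aggregating axis
fields = [
    {"coord": [0.5, 1.5], "data": [280.0, 281.0]},
    {"coord": [2.5], "data": [282.0]},
    {"coord": [3.5, 4.5], "data": [283.0, 284.0]},
]

# More than two fields are combined by repeated pairwise aggregation
aggregated = reduce(aggregate_pair, fields)
print(aggregated["coord"])  # [0.5, 1.5, 2.5, 3.5, 4.5]
```

A real implementation would work on multidimensional arrays and would apply the rules below before each pairwise step.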


CF aggregation rules

If all of the following statements are true for two arbitrary fields, then the two fields may be aggregated into a new field.
  1. Both fields have identical standard name properties.

    Definition:
    A coordinate construct is either a dimension coordinate construct or an auxiliary coordinate construct.

    Definition:
    A pair of matching coordinate constructs is a coordinate construct from each field of the same type (dimension or auxiliary), with equivalent calendar properties (if present) and identical standard names.

  2. Both fields have the same number of coordinate constructs, all of which have a standard name property, and each coordinate construct's standard name is unique within its field. Each coordinate construct in one field forms a pair of matching coordinate constructs with a unique coordinate construct in the other field.

    Definition:
    A domain axis's associated coordinate constructs are those coordinate constructs which span the domain axis.

  3. Each domain axis in both fields has at least one associated 1D coordinate construct.

    Definition:
    A pair of matching axes is a domain axis from each field chosen such that the two domain axes have associated coordinate constructs that are all matching pairs.

  4. Each domain axis in one field forms a pair of matching axes with a unique domain axis in the other field.

  5. There is exactly one pair of matching axes for which one or more of the 1D matching coordinate constructs have different values for their coordinate arrays and, if present, their boundary coordinate arrays.

    Definition:
    The pair of matching axes for which one or more of the 1D matching coordinate constructs have different values for their coordinate arrays and, if present, their boundary coordinate arrays is called the pair of aggregating axes.

    Definition:
    A pair of matching cell measure constructs is a cell measure construct from each field with equivalent units and corresponding domain axes that are matching pairs.

  6. Both fields have the same number of cell measure constructs, all of which have a units property. Each cell measure construct in either field forms a pair of matching cell measure constructs with a unique cell measure construct in the other field.

  7. Each pair of matching coordinate constructs and matching cell measure constructs that do not span their aggregating axes have identical values for their coordinate arrays and, if present, their boundary coordinate arrays.

  8. If the pair of matching aggregating axes has a pair of associated dimension coordinate constructs, then there are no common values in their coordinate arrays. If the matching dimension coordinate constructs have boundary coordinate arrays then no cells from one dimension coordinate construct lie entirely within any cell of the other dimension coordinate construct.

  9. If one field has a cell methods construct then so does the other field, with the equivalent methods in the same order. Corresponding domain axes in each cell methods are matching pairs.

    Definition:
    A pair of matching domain ancillary constructs is a domain ancillary construct from each field for identical standard coordinate conversion terms and corresponding domain axes that are matching pairs.

  10. Both fields have the same number of domain ancillary constructs. Each domain ancillary construct in either field forms a pair of matching domain ancillary constructs with a unique domain ancillary construct in the other field.

    Definition:
    A pair of matching field ancillary constructs is a field ancillary construct from each field with identical standard names and corresponding domain axes that are matching pairs.

  11. Both fields have the same number of field ancillary constructs. Each field ancillary construct in either field forms a pair of matching field ancillary constructs with a unique field ancillary construct in the other field.

  12. Both fields have the same number of coordinate reference constructs. For each coordinate reference construct in one field there is a coordinate reference construct in the other field with identical name and the same set of terms, taking optional terms into account. Corresponding terms which are scalar or vector parameters are identical, taking into account equivalent units. Corresponding terms which are domain ancillary constructs form a pair of matching domain ancillary constructs.
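
As an illustration of rule 5, the following Python sketch looks for the pair of aggregating axes. It is a simplification under stated assumptions: rules 2 to 4 are taken to hold already, so both fields use the same (hypothetical) axis keys and standard names; boundary coordinate arrays are ignored; and `aggregating_axis` is an invented name.

```python
def aggregating_axis(coords_f, coords_g):
    """Return the key of the single axis whose 1D coordinate values differ.

    Each argument maps an axis key to {standard_name: coordinate values}
    for the 1D coordinate constructs spanning that axis. Returns None
    unless exactly one axis has differing coordinate values.
    """
    differing = [
        axis
        for axis, constructs in coords_f.items()
        if any(values != coords_g[axis][name] for name, values in constructs.items())
    ]
    return differing[0] if len(differing) == 1 else None


# Toy metadata (hypothetical values): only the time coordinates differ
field_f = {"time": {"time": [0.5, 1.5]}, "y": {"grid_latitude": [-1.0, 0.0, 1.0]}}
field_g = {"time": {"time": [2.5]}, "y": {"grid_latitude": [-1.0, 0.0, 1.0]}}

print(aggregating_axis(field_f, field_g))  # time
print(aggregating_axis(field_f, field_f))  # None
```

Returning `None` covers both failure cases of rule 5: no axis differs, or more than one does.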

Examples

In the following examples, CDL descriptions are given of two fields to be aggregated and their aggregated field, if it exists. It makes no difference to the aggregation which of the fields comes first. Differences and similarities between the two fields which either allow or disallow aggregation are highlighted with in-line comments. For brevity, data are only given for selected coordinate constructs.

Example 1

The following two fields are aggregatable along their time axis. Note that the matching coordinates for the aggregating axis comprise a CF-netCDF coordinate and a CF-netCDF scalar coordinate.

Field 1:

dimensions:
  t = 12 ;                                         // size 12 time dimension
  lat = 111 ;
  lon = 106 ;
  bounds = 2 ;
variables:
  double t(t) ;                                    // time dimension coordinate
    t:standard_name = "time" ;
    t:units = "hours since 2012-1-1" ;
    t:calendar = "standard" ;
    t:bounds = "t_bnds" ;
  double t_bnds(t, bounds) ;
  double lat(lat) ;
    lat:standard_name = "grid_latitude" ;
    lat:units = "degrees" ;
  double lon(lon) ;
    lon:standard_name = "grid_longitude" ;
    lon:units = "degrees" ;
  double latitude(lat, lon) ;
    latitude:standard_name = "latitude" ;
    latitude:units = "degrees_north" ;
  double longitude(lat, lon) ;
    longitude:standard_name = "longitude" ;
    longitude:units = "degrees_east" ;
  char rotated_lat_lon ;                          
    rotated_lat_lon:grid_mapping_name = "rotated_latitude_longitude" ;
    rotated_lat_lon:grid_north_pole_latitude = 38.f ;
    rotated_lat_lon:grid_north_pole_longitude = 190.f ;
  float tas(lon, lat, t) ;                     
    tas:standard_name = "air_temperature" ;
    tas:units = "K" ;
    tas:cell_methods = "t: mean (interval: 1.0 day)" ;
    tas:grid_mapping = "rotated_lat_lon" ; 
data:
  t = 0.5, 1.5, 2.5, 3.5, 4.5, 5.5,                // 12 time values (in hours)
    6.5, 7.5, 8.5, 9.5, 10.5, 11.5 ;

Field 2:

dimensions:                                        // no time dimension
  lat = 111 ;
  lon = 106 ;					 
  bounds = 2 ;					 
variables:						 
  double time ;                                    // time scalar coordinate
                                                   //   with a different netCDF
                                                   //   variable name
    time:standard_name = "time" ;                  // identical standard_name
    time:units = "days since 2011-12-1" ;          // equivalent units
    time:calendar = "gregorian" ;                  // equivalent calendar
    time:bounds = "time_bnds" ;                    // bounds present
  double time_bnds(bounds) ;
  double lat(lat) ;
    lat:standard_name = "grid_latitude" ;
    lat:units = "degrees" ;
  double lon(lon) ;
    lon:standard_name = "grid_longitude" ;
    lon:units = "degrees" ;
  double latitude(lon, lat) ;                      // different dimension order
    latitude:standard_name = "latitude" ;
    latitude:units = "degrees_north" ;
  double longitude(lat, lon) ;
    longitude:standard_name = "longitude" ;
    longitude:units = "degrees_east" ;
  char rotated_lat_lon ;                           // identical grid mapping
    rotated_lat_lon:grid_mapping_name = "rotated_latitude_longitude" ;
    rotated_lat_lon:grid_north_pole_latitude = 38.f ;
    rotated_lat_lon:grid_north_pole_longitude = 190.f ;
  float tas(lat, lon) ;                            // different dimension order,
                                                   //   no time dimension
    tas:standard_name = "air_temperature" ;        // identical standard_name
    tas:units = "degC" ;                           // equivalent units
    tas:cell_methods = "time: mean (interval: 24.0 hours)" ;
                                                   // equivalent cell methods
    tas:coordinates = "time" ;                     // scalar coordinate
    tas:grid_mapping = "rotated_lat_lon" ;         // grid mapping present
data:
  time = 31.52083333 ;                             // 1 time value (in days)

Aggregation:

dimensions:
  time = 13 ;                                      // size 13 time dimension
  lat = 111 ;
  lon = 106 ;
  bounds = 2 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "hours since 2012-1-1" ;
    time:calendar = "standard" ;
    time:bounds = "time_bnds" ;
  double time_bnds(time, bounds) ;
  double lat(lat) ;
    lat:standard_name = "grid_latitude" ;
    lat:units = "degrees" ;
  double lon(lon) ;
    lon:standard_name = "grid_longitude" ;
    lon:units = "degrees" ;
  double latitude(lat, lon) ;
    latitude:standard_name = "latitude" ;
    latitude:units = "degrees_north" ;
  double longitude(lat, lon) ;
    longitude:standard_name = "longitude" ;
    longitude:units = "degrees_east" ;
  char rotated_lat_lon ;
    rotated_lat_lon:grid_mapping_name = "rotated_latitude_longitude" ;
    rotated_lat_lon:grid_north_pole_latitude = 38.f ;
    rotated_lat_lon:grid_north_pole_longitude = 190.f ;
  float tas(lon, lat, time) ;
    tas:standard_name = "air_temperature" ;
    tas:units = "K" ;
    tas:cell_methods = "time: mean (interval: 1.0 day)" ;
    tas:grid_mapping = "rotated_lat_lon" ;
data:
  time = 0.5, 1.5, 2.5, 3.5, 4.5, 5.5,             // 13 time values (in hours)
    6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5 ;
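
The unit conversion behind the thirteenth time value can be checked with a short Python sketch. It uses only the standard library and assumes that Python's proleptic Gregorian calendar agrees with the "standard" and "gregorian" calendars for these dates; the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Field 2's scalar time: 31.52083333 days since 2011-12-1
field2_time = datetime(2011, 12, 1) + timedelta(days=31.52083333)

# Re-express it in Field 1's units: hours since 2012-1-1
hours = (field2_time - datetime(2012, 1, 1)) / timedelta(hours=1)
print(round(hours, 4))  # 12.5

# The cell method intervals are equivalent in the same way
print(timedelta(days=1.0) == timedelta(hours=24.0))  # True
```

The converted value, 12.5 hours, does not occur among Field 1's twelve time values, which is why it can be appended as the thirteenth element of the aggregated time coordinate.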

Example 2

The following two fields are aggregatable along their level dimension. Note that in this case, the pair of aggregating axes has two pairs of matching coordinates (atmosphere_hybrid_sigma_pressure_coordinate and model_level_number) and that the pair of matching coordinates for the non-aggregating time axis comprises a dimension coordinate and a scalar coordinate.

Field 1:

dimensions:                                        // no time dimension
  level = 9 ;
  lat = 145 ;
  lon = 192 ;
  bounds = 2 ;
variables:
  double time ;                                    // scalar coordinate
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
    time:bounds = "time_bounds" ;
  double time_bounds(bounds) ;
  double sigma(level) ;
    sigma:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
    sigma:units = "1" ;
    sigma:positive = "down" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:bounds = "lon_bounds" ;
  float lon_bounds(lon, bounds) ;
  int model_level_number(level) ;
    model_level_number:standard_name = "model_level_number" ;
    model_level_number:units = "1" ; 
  float eastward_wind(level, lat, lon) ;           // no time dimension
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
    eastward_wind:coordinates = "time model_level_number" ;
    eastward_wind:cell_methods = "time: point" ;
data:
  sigma = 0.2997, 0.2497, 0.1996, 0.14950, 0.0992, 0.0568, 0.02959,
    0.0147, 0.0046 ;
  model_level_number = 11, 12, 13, 14, 15, 16, 17, 18, 19 ;

Field 2:

dimensions:
  time = 1 ;                                     // size 1 time dimension
  level = 10 ;
  lat = 145 ;
  lon = 192 ;
  bounds = 2 ;
variables:
  double time(time) ;                            // coordinate
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
    time:bounds = "time_bounds" ;
  double time_bounds(time, bounds) ;
  double sigma(level) ;
    sigma:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
    sigma:units = "1" ;
    sigma:positive = "down" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:bounds = "lon_bounds" ;
  float lon_bounds(lon, bounds) ;
  int model_level_number(level) ;                // auxiliary coordinate
                                                 //   for level dimension
    model_level_number:standard_name = "model_level_number" ;
    model_level_number:units = "1" ;
  float eastward_wind(time, level, lat, lon) ;
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
    eastward_wind:coordinates = "time model_level_number" ;
    eastward_wind:cell_methods = "time: point" ;
data:
  sigma = 0.997, 0.9749, 0.9304, 0.8698, 0.7922, 0.6995, 0.5995,
    0.5045, 0.4221, 0.3546 ;
  model_level_number = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ;

Aggregation:

dimensions:
  level = 19 ;
  lat = 145 ;
  lon = 192 ;
  bounds = 2 ;
variables:
  double time ;
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
    time:bounds = "time_bounds" ;
  double time_bounds(bounds) ;
  double sigma(level) ;
    sigma:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
    sigma:units = "1" ;
    sigma:positive = "down" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:bounds = "lon_bounds" ;
  float lon_bounds(lon, bounds) ;
  int model_level_number(level) ;
    model_level_number:standard_name = "model_level_number" ;
    model_level_number:units = "1" ;
  float eastward_wind(level, lat, lon) ;
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
    eastward_wind:coordinates = "time model_level_number" ;
    eastward_wind:cell_methods = "time: point" ;
data:
  sigma = 0.997, 0.9749, 0.9304, 0.8698, 0.7922, 0.6995, 0.5995,
    0.5045, 0.4221, 0.3546, 0.2997, 0.2497, 0.1996, 0.14950, 0.0992,
    0.0568, 0.02959, 0.0147, 0.0046 ;
  model_level_number = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
    15, 16, 17, 18, 19 ;
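
The aggregated level axis above lists Field 2's levels before Field 1's, even though Field 1 was given first. One way to obtain an order-independent result is sketched below in Python. Ordering by decreasing sigma is an assumption made for this illustration, chosen because it reproduces the aggregated arrays shown above; `aggregate_levels` is an invented helper.

```python
def aggregate_levels(a, b):
    """Concatenate two lists of (sigma, model_level_number) pairs and
    order them by decreasing sigma, keeping each pair together."""
    return sorted(a + b, key=lambda pair: pair[0], reverse=True)


# Coordinate values copied from Field 1 and Field 2 of this example
field1 = list(zip(
    [0.2997, 0.2497, 0.1996, 0.14950, 0.0992, 0.0568, 0.02959, 0.0147, 0.0046],
    range(11, 20),
))
field2 = list(zip(
    [0.997, 0.9749, 0.9304, 0.8698, 0.7922, 0.6995, 0.5995, 0.5045, 0.4221, 0.3546],
    range(1, 11),
))

# The result does not depend on which field comes first
result = aggregate_levels(field1, field2)
print(result == aggregate_levels(field2, field1))  # True
print([level for _, level in result] == list(range(1, 20)))  # True
```

Sorting the pairs together keeps both coordinates that span the aggregating axis aligned with each other; the data array would need the same reordering.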

Example 3

The following two fields are aggregatable along their region dimension. Note that in this case, the aggregating axes have no associated dimension coordinate arrays, but they do have logically 1D auxiliary coordinates. Arrays which span the aggregating axis are simply concatenated, as auxiliary coordinate array values may be non-monotonic and non-unique.

Field 1:

dimensions:
  time = 12 ;
  region = 2 ;                                   // no region coordinate
  depth = 40 ;
  lat = 180 ;
  strlen = 14 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-1-1" ;
    time:calendar = "gregorian" ;
  double depth(depth) ;
    depth:standard_name = "depth" ;
    depth:units = "m" ;
    depth:positive = "down" ;
  double lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  char geo_region(region, strlen) ;
    geo_region:standard_name = "region" ;
    geo_region:long_name = "Ocean Basin" ;
  float stfmmc(time, region, depth, lat) ;
    stfmmc:standard_name = "ocean_meridional_overturning_streamfunction" ;
    stfmmc:units = "m3 s-1" ;
    stfmmc:coordinates = "geo_region" ;
data:
  geo_region =
    "atlantic_ocean",
    "indian_ocean  " ;

Field 2:

dimensions:
  time = 12 ;
  region = 2 ;
  depth = 40 ;
  lat = 180 ;
  strlen = 14 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-1-1" ;
    time:calendar = "gregorian" ;
  double depth(depth) ;
    depth:standard_name = "depth" ;
    depth:units = "m" ;
    depth:positive = "down" ;
  double lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  char geo_region(region, strlen) ;
    geo_region:standard_name = "region" ;
    geo_region:long_name = "Ocean Basin" ;
  float stfmmc(time, region, depth, lat) ;
    stfmmc:standard_name = "ocean_meridional_overturning_streamfunction" ;
    stfmmc:units = "m3 s-1" ;
    stfmmc:coordinates = "geo_region" ;
data:
  geo_region =
    "pacific_ocean ",
    "global_ocean  " ;

Aggregation:

dimensions:
  time = 12 ;
  region = 4 ;
  depth = 40 ;
  lat = 180 ;
  strlen = 14 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-1-1" ;
    time:calendar = "gregorian" ;
  double depth(depth) ;
    depth:standard_name = "depth" ;
    depth:units = "m" ;
    depth:positive = "down" ;
  double lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  char geo_region(region, strlen) ;
    geo_region:standard_name = "region" ;
    geo_region:long_name = "Ocean Basin" ;
  float stfmmc(time, region, depth, lat) ;
    stfmmc:standard_name = "ocean_meridional_overturning_streamfunction" ;
    stfmmc:units = "m3 s-1" ;
    stfmmc:coordinates = "geo_region" ;
data:
  geo_region =
    "atlantic_ocean",
    "indian_ocean  ",
    "pacific_ocean ",
    "global_ocean  " ;

Example 4

The following two fields are not aggregatable because the forecast_reference_time auxiliary coordinate is only present in one of the fields. If the first field did not have this auxiliary coordinate then the two fields would be aggregatable.

Field 1:

dimensions:   
  time = 12 ;
  lat = 145 ;
  lon = 192 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
  double ref_time(time) ;                     // auxiliary coordinate
    ref_time:standard_name = "forecast_reference_time" ;
    ref_time:units = "days since 1860-12-1" ;
    ref_time:calendar = "gregorian" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
  float eastward_wind(time, lat, lon) ;
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
    eastward_wind:coordinates = "ref_time" ;
data:
  time = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ;

Field 2:

dimensions:
  time = 12 ;
  lat = 145 ;
  lon = 192 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
  float eastward_wind(time, lat, lon) ;       // no auxiliary coordinate
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
data:
  time = 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 ;
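
The failure in this example can be detected from the coordinate standard names alone, as in rule 2. The Python sketch below is a minimal illustration that represents each field only by the set of its coordinate constructs' standard names; the variable names are illustrative.

```python
# Standard names of the coordinate constructs in each field of Example 4
field1_coords = {"time", "forecast_reference_time", "latitude", "longitude"}
field2_coords = {"time", "latitude", "longitude"}

# Names present in one field but not the other cannot form matching pairs
unmatched = field1_coords ^ field2_coords
print(unmatched)  # {'forecast_reference_time'}

aggregatable = not unmatched
print(aggregatable)  # False
```

A fuller check would also compare construct types (dimension or auxiliary) and calendars, as the definition of matching coordinate constructs requires.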

Example 5

The following two fields are not aggregatable because the coordinate arrays of the dimension coordinate constructs for the aggregating time axes share a value.

Field 1:

dimensions:   
  time = 12 ;
  lat = 145 ;
  lon = 192 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
  float eastward_wind(time, lat, lon) ;
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
data:
  time = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ;         // contains 11

Field 2:

dimensions:
  time = 12 ;
  lat = 145 ;
  lon = 192 ;
variables:
  double time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1860-12-1" ;
    time:calendar = "gregorian" ;
  float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
  float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
  float eastward_wind(time, lat, lon) ;
    eastward_wind:standard_name = "eastward_wind" ;
    eastward_wind:units = "m s-1" ;
data:
  time = 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22 ;  // contains 11
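
The first part of rule 8 amounts to a set intersection, as the Python sketch below illustrates. For brevity it uses all of Field 1's time values but only the first and last values of Field 2; `common_values` is an invented helper, and both arrays are assumed to be in the same units already.

```python
def common_values(a, b):
    """Return the sorted coordinate values that occur in both arrays."""
    return sorted(set(a) & set(b))


field1_time = list(range(0, 12))  # 0, 1, ..., 11
field2_sample = [11, 22]          # first and last values of Field 2

shared = common_values(field1_time, field2_sample)
print(shared)      # [11]
print(not shared)  # False: the fields may not be aggregated
```

With floating-point coordinates a tolerance-based comparison would usually be needed instead of exact set membership; that choice is outside the scope of this sketch.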


David Hassell and Jonathan Gregory