Dependency perl code

How it works

The dependency perl code (DPC) runs through six stages, described below.

1) Read in instructions

The DPC reads the dependency file (typically called dependencies.dat), which contains variable definitions and instruction lines. The instruction lines are the main part of the file. Each instruction line has three parts, separated by semicolons:

  1. the object(s) to be created (usually just one file per instruction line)
  2. the files that the object(s) depend on, known as the parent files. The object(s) are the children of these parent files.
  3. the instructions for creating the object(s). This is frequently a script that might run, say, a MATLAB script.

In addition to the dependency file, the DPC reads a list of problem files from the problems file (problem_files.dat) and a list of file types to be compressed from the compress file (compress.dat).
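
The three-part structure of an instruction line can be illustrated with a short Python sketch. This is not the DPC's own Perl code, and the exact syntax of dependencies.dat is not shown here, so the sketch assumes that file names within a part are separated by whitespace and that only the first two semicolons act as separators. The function name and the example file names are hypothetical.

```python
def parse_instruction_line(line):
    """Split one instruction line into objects, parents and the command."""
    # Assumption: only the first two semicolons separate the parts, so the
    # command itself may contain further semicolons.
    fields = line.split(";", 2)
    if len(fields) != 3:
        raise ValueError(f"expected three ';'-separated parts: {line!r}")
    objects, parents, command = (field.strip() for field in fields)
    return {
        # Assumption: several file names are separated by whitespace.
        "objects": objects.split(),
        "parents": parents.split(),
        "command": command,
    }


# Hypothetical instruction line: one object, two parents, one script.
example = "daily_mean.dat ; raw_a.dat raw_b.dat ; make_daily_mean.sh"
parsed = parse_instruction_line(example)
print(parsed["objects"], parsed["parents"], parsed["command"])
```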

The list of problem files is simply a list of all the files that the DPC should not try to create. Sometimes a file cannot be created, for example because exceptional circumstances prevent the creation of one of its parent files. Rather than have the DPC try to create such files every time it runs, it is better to add them to the list of problem files. The list also serves as a record of outstanding problems, which helps when solving them.

The compress file lists the types of file that should be compressed. Files are of the same type if the only difference in their names is a date component.

The configuration files need to be written by the user; see Configuration files for more information on all these files.

2) Determine level of file

Having read the dependency file, the DPC sorts all the files mentioned in the dependency file into levels.

The top level, level 1, is all the files that have no parents. Typically they'll be the raw data that starts the creation of subsequent files.

Object files whose parent files are all in level 1 are sorted into level 2. Object files whose parent files are all in level 2 are sorted into level 3 and so on. When parent files come from several levels, the object is placed one level below the lowest parent level (which is the highest level number). So an object with parents in levels 2, 3 and 4 will be placed in level 5.
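
The level rule above amounts to "one more than the highest level number among the parents". A minimal Python sketch of that rule follows. It illustrates the idea only and is not the DPC's Perl implementation; the function name, the dictionary layout and the file names are hypothetical, and the check for circular dependencies is an addition that the text above does not mention.

```python
def assign_levels(parents_of):
    """Map each file name to its level.

    parents_of maps an object name to the list of its parent names.
    Files that never appear as an object have no parents and get level 1.
    """
    levels = {}
    in_progress = set()

    def level_of(name):
        if name in levels:
            return levels[name]
        if name in in_progress:
            # Not described in the text above; added so the sketch terminates.
            raise ValueError(f"circular dependency involving {name!r}")
        in_progress.add(name)
        parents = parents_of.get(name, [])
        if parents:
            # One level below the lowest parent (the highest level number).
            level = 1 + max(level_of(parent) for parent in parents)
        else:
            level = 1
        in_progress.discard(name)
        levels[name] = level
        return level

    for name in parents_of:
        level_of(name)
    return levels


# "d" has parents in levels 2, 3 and 4, so it is placed in level 5.
levels = assign_levels({"a": ["raw"], "b": ["a"], "c": ["b"], "d": ["a", "b", "c"]})
print(levels["d"])  # prints 5
```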

It's necessary to put objects in levels, because the DPC will deal with one level at a time, starting with level 2 (level 1 files aren't created with the DPC). By ordering the objects in levels, the DPC ensures that all parent files are created before attempting to create their children.

3) Find parent files for a given level

The DPC can be run over all the levels greater than level 1, or the user can specify that it should run over one specific level (using the --level argument). Regardless of which option is taken, for each level the DPC will find all the parent files for the objects at this level.

4) Find which parent files are newer than their children

Having found the parent files, the DPC compares the age of each parent file with that of its children at the given level (children at other levels are dealt with when their own level is processed). Unless --update=new is selected, a child is put in the list of files that require updating if any of its parents is newer than it or if it doesn't exist yet. If --update=new is selected, only the child files that don't currently exist are put in this list.
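
The decision described above can be sketched in Python as follows. This is an illustration, not the DPC's Perl code. It assumes that "newer" means a later file modification time and that all parent files exist; the function and parameter names are hypothetical.

```python
import os


def needs_update(child, parents, update_new_only=False):
    """Return True if the child belongs in the list of files to update."""
    if not os.path.exists(child):
        # A missing child always needs creating, in both modes.
        return True
    if update_new_only:
        # Mirrors --update=new: an existing child is never rebuilt.
        return False
    # Assumption: age is judged by modification time, and parents exist.
    child_mtime = os.path.getmtime(child)
    return any(os.path.getmtime(parent) > child_mtime for parent in parents)
```

A caller would apply this to each object at the current level and collect those for which it returns True.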

5) Create files

The DPC loops through the objects in the list that require updating. Before carrying out the instructions for creating an object file, it uncompresses any compressed parent files of that object. The DPC then carries out the instruction for creating the object, and reports whether the object has been successfully created.

Sometimes several object files share both the same parents and the same instruction for their creation. To continue the family analogy, these files are known as siblings of one another. Once an attempt has been made to create one sibling, no further attempt is made to create the other siblings.
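
The sibling rule can be sketched in Python by remembering which combinations of parents and instruction have already been attempted. This is an illustration, not the DPC's Perl code; the function name, the tuple layout and the run_command callback are hypothetical.

```python
def create_objects(to_update, run_command):
    """Run each instruction once, however many siblings share it.

    to_update is a list of (object, parents, command) tuples.
    run_command is a callable that carries out one command.
    Returns the objects for which a command was actually run.
    """
    attempted = set()
    attempted_objects = []
    for obj, parents, command in to_update:
        # Siblings share both the same parents and the same instruction.
        key = (frozenset(parents), command)
        if key in attempted:
            continue  # a sibling has already triggered this instruction
        attempted.add(key)
        run_command(command)
        attempted_objects.append(obj)
    return attempted_objects


# Two siblings and one unrelated object: only two commands are run.
jobs = [
    ("plot_a.png", ["data.dat"], "make_plots.sh"),
    ("plot_b.png", ["data.dat"], "make_plots.sh"),
    ("table.txt", ["data.dat"], "make_table.sh"),
]
print(create_objects(jobs, run_command=lambda command: None))
```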

6) Compress files

Once all the levels have been processed, the final stage is to compress files. (The compress stage can also be run on its own, using the --level=compress argument.) The DPC examines all the files that the compress file says should be compressed, and compresses each of them whose children are all newer than it.
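
The compression test can be sketched in Python as follows. This is an illustration, not the DPC's Perl code. It assumes that "newer" means a later modification time and that a child which does not exist yet counts as not newer; the function name is hypothetical.

```python
import os


def ready_to_compress(path, children):
    """Return True if every child of path exists and is newer than path."""
    if not os.path.exists(path):
        return False
    mtime = os.path.getmtime(path)
    # Assumption: a missing child still has to be created from this file,
    # so the file must stay uncompressed. With no children at all, all()
    # is True; how the DPC treats that case is not stated here.
    return all(
        os.path.exists(child) and os.path.getmtime(child) > mtime
        for child in children
    )
```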

How is the DPC different to a Makefile?

The DPC works in a similar way to a makefile: objects are defined with the parent files they depend on, along with instructions for creating the objects. However, the DPC has some additional features:

  • it reports whether the creation of a file has been successful
  • it can determine dates from file names
  • it can loop across days of the month and other time periods
  • it makes it easier to loop across variables, such as measurement sites
  • it can define exceptions, in the problems file, that won't be created
  • it can compress only the files whose children have already been updated
  • it is accompanied by a web interface, which provides a visual aid for how all the files are updated
