Dependency perl code
Defining variables
Rather than typing out the same value on numerous occasions, it is often convenient in scripting to define a variable. This is especially true if the value may change from time to time, because it is much easier to alter it in one place than in many. The situation in the dependency file is no different.
Return to the example on the Dependency file page, where data is moved from the /tmp directory to a home directory (/home/data). If we want to define the home directory, the dependency file could look like:
    #--------------------------------------------
    # Define the directories
    #--------------------------------------------
    Define;homeDir;/home/data;
    #--------------------------------------------
    # Move data to home directory
    #--------------------------------------------
    $homeDir$/raw.dat;/tmp/raw.dat;mv /tmp/raw.dat $homeDir$/raw.dat;
where the first line that isn't a comment is a definition line. It defines the string $homeDir$ to represent the string /home/data. Notice that homeDir is sandwiched between two dollar ($) symbols, not preceded by a single one as is the case in most scripting languages.
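The substitution described above can be sketched in Python. This is only an illustration of the idea, not the DPC's own Perl implementation; the helper names parse_defines and substitute are hypothetical.

```python
import re

def parse_defines(lines):
    """Collect 'Define;name;value;' lines into a dict (hypothetical helper)."""
    defs = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip(";").split(";")
        if fields[0] == "Define" and len(fields) == 3:
            defs[fields[1]] = fields[2]
    return defs

def substitute(text, defs):
    """Replace every $name$ with its value; unknown names are left untouched."""
    return re.sub(r"\$([^$%]+)\$",
                  lambda m: defs.get(m.group(1), m.group(0)),
                  text)

dep_file = [
    "# Define the directories",
    "Define;homeDir;/home/data;",
    "$homeDir$/raw.dat;/tmp/raw.dat;mv /tmp/raw.dat $homeDir$/raw.dat;",
]
defs = parse_defines(dep_file)
resolved = substitute(dep_file[2], defs)
print(resolved)
# /home/data/raw.dat;/tmp/raw.dat;mv /tmp/raw.dat /home/data/raw.dat;
```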
Loop variables
Again from the world of scripting, it's convenient to loop across variables. Suppose we have three files containing pressure, temperature and salinity, called raw_pres.dat, raw_temp.dat and raw_sal.dat respectively, and these files need moving to the home directory. The dependency file might look like
    #--------------------------------------------
    # Define the directories
    #--------------------------------------------
    Define;homeDir;/home/data;
    #--------------------------------------------
    # Move data to home directory
    #--------------------------------------------
    $homeDir$/raw_pres.dat;/tmp/raw_pres.dat;mv /tmp/raw_pres.dat $homeDir$/raw_pres.dat;
    $homeDir$/raw_temp.dat;/tmp/raw_temp.dat;mv /tmp/raw_temp.dat $homeDir$/raw_temp.dat;
    $homeDir$/raw_sal.dat;/tmp/raw_sal.dat;mv /tmp/raw_sal.dat $homeDir$/raw_sal.dat;
but by using a loop variable, this could be written as
    #--------------------------------------------
    # Define the directories
    #--------------------------------------------
    Define;homeDir;/home/data;
    #--------------------------------------------
    # Define the measurement variables
    #--------------------------------------------
    Define;measVar;pres,temp,sal;
    #--------------------------------------------
    # Move data to home directory
    #--------------------------------------------
    $homeDir$/raw_%measVar%.dat;/tmp/raw_%measVar%.dat;mv /tmp/raw_%measVar%.dat $homeDir$/raw_%measVar%.dat;
where the instruction must stay on a single line, even if it wraps when displayed. Here %measVar% is a loop variable; notice that it is sandwiched between two percentage (%) symbols. With only three files the loop variable saves little typing in the dependency file, but with a longer list it would.
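To make the effect of a loop variable concrete, here is a minimal Python sketch that expands one instruction line into one line per list value. It is an assumption about the behaviour described above, not the DPC's actual code, and expand_loops is a hypothetical name.

```python
import re

def expand_loops(line, loops):
    """Return one copy of `line` per value of each %name% it contains."""
    # Unique loop-variable names, in order of first appearance.
    names = list(dict.fromkeys(re.findall(r"%([^%$]+)%", line)))
    lines = [line]
    for name in names:
        lines = [l.replace(f"%{name}%", value)
                 for l in lines
                 for value in loops[name]]
    return lines

loops = {"measVar": "pres,temp,sal".split(",")}
template = ("$homeDir$/raw_%measVar%.dat;/tmp/raw_%measVar%.dat;"
            "mv /tmp/raw_%measVar%.dat $homeDir$/raw_%measVar%.dat;")
expanded = expand_loops(template, loops)
for instruction in expanded:
    print(instruction)
# $homeDir$/raw_pres.dat;/tmp/raw_pres.dat;mv /tmp/raw_pres.dat $homeDir$/raw_pres.dat;
# $homeDir$/raw_temp.dat;/tmp/raw_temp.dat;mv /tmp/raw_temp.dat $homeDir$/raw_temp.dat;
# $homeDir$/raw_sal.dat;/tmp/raw_sal.dat;mv /tmp/raw_sal.dat $homeDir$/raw_sal.dat;
```

The three printed lines match the hand-written instructions in the longer version of the dependency file.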
Combining the standard and loop variables (the $ and % variables)
Sometimes it is convenient to combine the standard variables (sandwiched by $) and the loop variables (sandwiched by %). Suppose the pressure, temperature and salinity files are coming from a remote site where the variable names are in upper case, so the files are raw_PRES.dat, raw_TEMP.dat and raw_SAL.dat. To cope with this, the dependency file could be modified to look like
    #--------------------------------------------
    # Define the directories
    #--------------------------------------------
    Define;homeDir;/home/data;
    #--------------------------------------------
    # Define the measurement variables
    #--------------------------------------------
    Define;measVar;pres,temp,sal;
    #--------------------------------------------
    # Upper case version of measurement variables
    #--------------------------------------------
    Define;uc-pres;PRES;
    Define;uc-temp;TEMP;
    Define;uc-sal;SAL;
    #--------------------------------------------
    # Move data to home directory
    #--------------------------------------------
    $homeDir$/raw_%measVar%.dat;/tmp/raw_$uc-%measVar%$.dat;mv /tmp/raw_$uc-%measVar%$.dat $homeDir$/raw_%measVar%.dat;
where the instruction must stay on a single line, even if it wraps when displayed. The DPC resolves the loop variables before the standard variables. Hence /tmp/raw_$uc-%measVar%$.dat first becomes a loop over /tmp/raw_$uc-pres$.dat, /tmp/raw_$uc-temp$.dat and /tmp/raw_$uc-sal$.dat, and these are then resolved to /tmp/raw_PRES.dat, /tmp/raw_TEMP.dat and /tmp/raw_SAL.dat.
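The order of resolution (loop variables first, standard variables second) can be illustrated with a two-pass Python sketch. The function resolve and the dictionary layout are hypothetical and stand in for whatever the DPC does internally.

```python
import re

# Definitions from the dependency file above, held as plain strings.
defs = {
    "homeDir": "/home/data",
    "measVar": "pres,temp,sal",
    "uc-pres": "PRES",
    "uc-temp": "TEMP",
    "uc-sal": "SAL",
}

def resolve(line, defs):
    # Pass 1: expand each %name% loop into one line per list value.
    lines = [line]
    for name in dict.fromkeys(re.findall(r"%([^%$]+)%", line)):
        values = defs[name].split(",")
        lines = [l.replace(f"%{name}%", v) for l in lines for v in values]
    # Pass 2: only now substitute the $name$ variables, so that names
    # such as $uc-pres$ (built in pass 1) can be looked up.
    return [re.sub(r"\$([^$%]+)\$", lambda m: defs[m.group(1)], l)
            for l in lines]

template = ("$homeDir$/raw_%measVar%.dat;/tmp/raw_$uc-%measVar%$.dat;"
            "mv /tmp/raw_$uc-%measVar%$.dat $homeDir$/raw_%measVar%.dat;")
resolved = resolve(template, defs)
print(resolved[0])
# /home/data/raw_pres.dat;/tmp/raw_PRES.dat;mv /tmp/raw_PRES.dat /home/data/raw_pres.dat;
```

Swapping the two passes would fail in this sketch, because $uc-%measVar%$ is not itself a defined name.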
External and internal loops
So far on this page, the loops in the instruction line have appeared in all three parts: the object, the parent-files and the command. In the following example the loop (%measVar%) appears only in the parent-files and command parts, and not in the object part:
    #--------------------------------------------
    # Define the directories
    #--------------------------------------------
    Define;homeDir;/home/data;
    #--------------------------------------------
    # Define the measurement variables
    #--------------------------------------------
    Define;measVar;pres,temp,sal;
    #--------------------------------------------
    # Combine the raw data into one combined file in the home directory
    #--------------------------------------------
    $homeDir$/combined.dat;/tmp/raw_%measVar%.dat;cat /tmp/raw_%measVar%.dat > $homeDir$/combined.dat;
where the instruction must stay on a single line, even if it wraps when displayed. In this case, %measVar% is an internal loop: the object file /home/data/combined.dat depends on /tmp/raw_pres.dat, /tmp/raw_temp.dat and /tmp/raw_sal.dat, and the command to create it is
    cat /tmp/raw_pres.dat /tmp/raw_temp.dat /tmp/raw_sal.dat > /home/data/combined.dat
In this example the loop variable, %measVar%, has been used to loop across parent files within a single instruction line. In the earlier examples, the loop variable %measVar% was used to create, effectively, three different instruction lines; that is an external loop.
This works because the DPC matches the loops found in the object part with those found in the parent-files and command parts. If there is a match, it's an external loop and it increases the number of instructions. If there isn't a match, it's an internal loop, which is carried out within the parent-files and command parts.
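The matching rule can be sketched in Python as follows. This is a rough model under stated assumptions, not the DPC's implementation: the function names are hypothetical, $ variables are taken to be already resolved, and an internal loop is assumed to repeat the whitespace-separated word that contains the loop variable.

```python
import re

LOOP = re.compile(r"%([^%$]+)%")

def expand_words(text, loops):
    """Internal loop: repeat each word that still contains a %name%."""
    words = []
    for word in text.split():
        found = LOOP.search(word)
        if found:
            name = found.group(1)
            words.extend(word.replace(f"%{name}%", v) for v in loops[name])
        else:
            words.append(word)
    return words

def expand_instruction(line, loops):
    """Return a list of (object, parent_files, command) instructions."""
    obj, parents, command = line.rstrip(";").split(";")
    instructions = [(obj, parents, command)]
    # Loops that appear in the object part are external: one instruction per value.
    for name in dict.fromkeys(LOOP.findall(obj)):
        instructions = [
            tuple(part.replace(f"%{name}%", v) for part in inst)
            for inst in instructions
            for v in loops[name]
        ]
    # Any loop left over is internal to the parent-files and command parts.
    return [(o, expand_words(p, loops), " ".join(expand_words(c, loops)))
            for o, p, c in instructions]

loops = {"measVar": ["pres", "temp", "sal"]}
internal = expand_instruction(
    "/home/data/combined.dat;/tmp/raw_%measVar%.dat;"
    "cat /tmp/raw_%measVar%.dat > /home/data/combined.dat;", loops)
external = expand_instruction(
    "/home/data/raw_%measVar%.dat;/tmp/raw_%measVar%.dat;"
    "mv /tmp/raw_%measVar%.dat /home/data/raw_%measVar%.dat;", loops)
print(len(internal), len(external))  # 1 3
print(internal[0][2])
# cat /tmp/raw_pres.dat /tmp/raw_temp.dat /tmp/raw_sal.dat > /home/data/combined.dat
```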
The use of $ and % variables within the define command
The use of $ variables in the define command is the same as in an instruction line, whereas the use of % variables is slightly different. Taking $ variables first, for example:
    #--------------------------------------------
    # Define the sites
    #--------------------------------------------
    Define;siteList1;BATH,BRISTOL,CARDIFF;
    Define;siteList2;LEEDS,SHEFFIELD,YORK;
    Define;siteLists;$siteList1$,$siteList2$:
will define the variable $siteLists$ as the list: BATH,BRISTOL,CARDIFF,LEEDS,SHEFFIELD,YORK.
It does not make sense to have an external loop in a define command, because it only defines one variable (whereas one instruction line can define a number of instructions). However, the writer of the dependency file may want a loop within a $ variable, and under these circumstances a % variable is allowed as a loop inside a $ variable, which is different from its use described above. For example, the user may require a lower case list of the cities above, so the dependency code could be
    #--------------------------------------------
    # Define the sites
    #--------------------------------------------
    Define;siteList1;BATH,BRISTOL,CARDIFF;
    Define;siteList2;LEEDS,SHEFFIELD,YORK;
    #--------------------------------------------
    # Define a lower case version of each site
    #--------------------------------------------
    Define;lc-BATH;bath;
    Define;lc-BRISTOL;bristol;
    Define;lc-CARDIFF;cardiff;
    Define;lc-LEEDS;leeds;
    Define;lc-SHEFFIELD;sheffield;
    Define;lc-YORK;york;
    #--------------------------------------------
    # Create a lower case list of sites
    #--------------------------------------------
    Define;siteLists;$lc-%siteList1%$,$lc-%siteList2%$:
where $lc-%siteList1%$ becomes $lc-BATH$,$lc-BRISTOL$,$lc-CARDIFF$ before the $ variables are replaced with bath, bristol and cardiff respectively.
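As a hedged illustration of this two-step resolution inside a define value, the Python sketch below first expands each $...%list%...$ into a comma-separated list of $ variables and then substitutes them. The function resolve_define_value is hypothetical, and the sketch only handles the value field, not the line terminator.

```python
import re

defs = {
    "siteList1": "BATH,BRISTOL,CARDIFF",
    "siteList2": "LEEDS,SHEFFIELD,YORK",
    "lc-BATH": "bath", "lc-BRISTOL": "bristol", "lc-CARDIFF": "cardiff",
    "lc-LEEDS": "leeds", "lc-SHEFFIELD": "sheffield", "lc-YORK": "york",
}

def resolve_define_value(value, defs):
    # Step 1: $prefix%list%suffix$ becomes $prefixA$,$prefixB$,... (loop in a $ variable).
    def expand_loop(match):
        prefix, name, suffix = match.groups()
        return ",".join(f"${prefix}{item}{suffix}$"
                        for item in defs[name].split(","))
    value = re.sub(r"\$([^$%]*)%([^$%]+)%([^$%]*)\$", expand_loop, value)
    # Step 2: replace each remaining $name$ with its definition.
    return re.sub(r"\$([^$%]+)\$", lambda m: defs[m.group(1)], value)

lower = resolve_define_value("$lc-%siteList1%$,$lc-%siteList2%$", defs)
upper = resolve_define_value("$siteList1$,$siteList2$", defs)
print(lower)  # bath,bristol,cardiff,leeds,sheffield,york
print(upper)  # BATH,BRISTOL,CARDIFF,LEEDS,SHEFFIELD,YORK
```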