Analysing Dr Hook files

Reading Dr Hook files

As far as I'm aware, there's no specific software designed to read these files, so I've written my own in Perl. The Perl code can be found in ~mstringe/cgi-bin/drHook and can be run with the following command

perl ~mstringe/cgi-bin/drHook/drHook.pl --dir=<directory>
--nRoutines=<number of routines> --orderBy=<measure>:<stat> 
--html

where the command should all be entered on one line and where

<directory> is the directory containing the Dr Hook files, e.g. /data/cr/ukesm/mstringe/facec. This is the only argument which is always required.
<number of routines> is the number of routines that are listed in the output. Which routines are listed is determined by the criterion given with the --orderBy=<measure>:<stat> argument. The default is the top ten.
<measure> can be
- self: the self time - the time spent just in this routine
- percent: the percentage of time spent in this routine (100 * self / absolute total)
- total: the total time - time in routine and the routines which it calls
- cumul: the cumulative time - the time spent in this routine and the routines which call it
- numCalls: the number of calls to this routine
- selfPer: (self time) / (number of calls)
- totalPer: (total time) / (number of calls)
The default is `self'.
<stat> can be
- min: the shortest time/number found for any PE in the Dr Hook files
- mean: the mean of the times/numbers over all PEs
- max: the longest time/number found for any PE in the Dr Hook files
- diff: max - min. This indicates the most unbalanced routines, for which this number will be high compared to their mean time.
--html should only be included if you want the output in a form suitable for a web page, in which case HTML markup is added to the output.
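
As an illustration of how a <measure> and a <stat> combine, here is a minimal Python sketch. It is not taken from drHook.pl: the function name summarise and the per-PE numbers are made up, and it assumes that selfPer is evaluated for each PE before the statistics are taken.

```python
from statistics import mean


def summarise(values):
    """Return the four <stat> values for one measure across all PEs."""
    lo, hi = min(values), max(values)
    return {"min": lo, "mean": mean(values), "max": hi, "diff": hi - lo}


# Hypothetical self times (seconds) and call counts for one routine on four PEs.
self_times = [2.0, 2.5, 3.0, 4.5]
num_calls = [10, 10, 10, 10]

# Equivalent in spirit to --orderBy=self:<stat>.
self_stats = summarise(self_times)

# selfPer is (self time) / (number of calls); evaluated per PE here (an assumption).
self_per_stats = summarise([t / n for t, n in zip(self_times, num_calls)])

print(self_stats)
print(self_per_stats)
```

In this made-up case diff (2.5) is close to the mean (3.0), which is the kind of ratio that would suggest an unbalanced routine.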

Note that the PE numbering used in the statistics produced by drHook.pl runs from PE 0 to one less than the total number of PEs. This is in contrast to the drhook.prof* files, which are numbered from 1 upwards (a less standard convention), so PE 5 refers to the data in drhook.prof.6.
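
The off-by-one between the two numbering schemes can be sketched in Python as follows. The helper names prof_file_for_pe and pe_for_prof_file are hypothetical and are not part of drHook.pl; the file name pattern drhook.prof.<n> is the one described above.

```python
def prof_file_for_pe(pe):
    """Return the drhook.prof file that holds the data for a drHook.pl PE number."""
    if pe < 0:
        raise ValueError("drHook.pl PE numbers start at 0")
    return f"drhook.prof.{pe + 1}"


def pe_for_prof_file(filename):
    """Return the drHook.pl PE number for a file named drhook.prof.<n>."""
    suffix = filename.rsplit(".", 1)[1]
    return int(suffix) - 1


print(prof_file_for_pe(5))
print(pe_for_prof_file("drhook.prof.6"))
```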

The two most common commands

It's likely that you'll want to measure two things.

Time in each routine (self)

The following command (entered on one line)

perl ~mstringe/cgi-bin/drHook/drHook.pl --dir=<directory>
--nRoutines=9999 > selfTime

writes to the file selfTime the self time of each routine, with the heaviest users at the top. To find the most expensive routines, just read the top of the file; to find a particular routine, use grep.
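
If grep is not to hand, a case-insensitive search over the lines of the output does the same job. The sketch below makes no assumption about the layout of the drHook.pl output beyond the routine name appearing somewhere on its line; the function name find_routine and the sample lines are made up for illustration.

```python
def find_routine(lines, name):
    """Return the lines that mention the routine name, ignoring case."""
    needle = name.lower()
    return [line for line in lines if needle in line.lower()]


# Made-up stand-in lines, not real drHook.pl output.
sample = ["ROUTINE_A 12.5", "ROUTINE_B 7.1", "ANOTHER_ROUTINE_A 0.3"]
matches = find_routine(sample, "routine_a")
print(matches)
```

With a real file, the same function could be called as find_routine(open("selfTime"), "UM_SHELL").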

Time in each routine + all the routines it calls + the routines they call etc (total)

The following command (entered on one line)

perl ~mstringe/cgi-bin/drHook/drHook.pl --dir=<directory>
--nRoutines=9999 --orderBy=total > totalTime

writes to the file totalTime the time in each routine plus all the routines it spawns (the total time), again with the heaviest users at the top. Hence, UM_SHELL should be at the top, followed by UM_MODEL_4A.

Monsoon