The job used for this is u-ae884, run on broadwell nodes with 'safe' optimisation.
NMPPE*NMPPN (total tasks, nodes) | Model run length | Total time | Speed (model years per day) |
---|---|---|---|
24 * 24 (576 tasks, 16 nodes) | 10 days | 23:39 (1,419s) | 1.69 |
36 * 20 (720 tasks, 20 nodes) | 10 days | 24:10 (1,450s) | 1.66 |
36 * 24 (864 tasks, 24 nodes)* | 10 days | 20:31 (1,231s) | 1.95 |
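The speed column can be reproduced from the wallclock times. The sketch below assumes a 360-day model year, which is not stated here but is consistent with the figures in the table; the function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MODEL_YEAR = 360  # assumption: 360-day calendar, inferred from the table


def model_years_per_day(wallclock_seconds, model_days):
    """Convert the wallclock time of a short run into model years per day."""
    model_years = model_days / DAYS_PER_MODEL_YEAR
    wallclock_days = wallclock_seconds / SECONDS_PER_DAY
    return model_years / wallclock_days


# Wallclock seconds for the 10-day runs in the table above.
timings = {"24 * 24": 1419, "36 * 20": 1450, "36 * 24": 1231}
for decomposition, seconds in timings.items():
    speed = model_years_per_day(seconds, model_days=10)
    print(f"{decomposition}: {speed:.2f} model years/day")
```

With these inputs the rounded results agree with the speed column of the table.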
My previous fullChemN96 estimates were based on mi-ah651 at UM10.2, which was run on the haswell nodes.
Main features | NMPPE*NMPPN (total tasks, nodes) | Model run length | Total time | Speed (model years per day) |
---|---|---|---|---|
u-ae884, broadwell, safe | 24 * 24 (576 tasks, 16 broadwell nodes) | 10 days | 23:39 (1,419s) | 1.69 |
u-ae884, haswell, safe | 24 * 24 (576 tasks, 18 haswell nodes) | 10 days | 22:45 (1,365s) | 1.76 |
u-ae884, haswell, high | 24 * 24 (576 tasks, 18 haswell nodes) | 10 days | 20:56 (1,256s) | 1.91 |
mi-ah651, haswell, safe | 24 * 24 (576 tasks, 18 haswell nodes) | 10 days | 20:34 (1,234s) | 1.94 |
mi-ah651, haswell, high | 24 * 24 (576 tasks, 18 haswell nodes) | 10 days | 17:20 (1,040s) | 2.31 |
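The node counts in this table follow from the task count and the cores per node. The sketch below assumes 36 cores per broadwell node and 32 per haswell node, with one MPI task per core; these values are inferred from the table (576 tasks on 16 and 18 nodes respectively) rather than stated in these notes.

```python
import math

# Assumed cores per node, inferred from the task and node counts in the table.
CORES_PER_NODE = {"broadwell": 36, "haswell": 32}


def nodes_needed(nmppe, nmppn, node_type):
    """Whole nodes needed for an NMPPE * NMPPN decomposition, one task per core."""
    tasks = nmppe * nmppn
    return math.ceil(tasks / CORES_PER_NODE[node_type])


for node_type in ("broadwell", "haswell"):
    print(f"24 * 24 on {node_type}: {nodes_needed(24, 24, node_type)} nodes")
```

Because 576 is an exact multiple of both 36 and 32, the 24 * 24 decomposition fills whole nodes on either architecture under these assumptions.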
Clearly my previous full chemistry run at UM10.2 with optimisation 'high' is much quicker, by about 31%, than my current full chemistry run at UM10.4 with optimisation 'safe'.
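As a rough check on the 31% figure, the sketch below compares the two haswell rows of the table. That these are the rows being compared is an assumption; using the broadwell 'safe' time (1,419s) instead would give about 36%.

```python
def percent_slower(slow_seconds, fast_seconds):
    """Extra wallclock time of the slow run, as a percentage of the fast run."""
    return 100.0 * (slow_seconds / fast_seconds - 1.0)


um104_safe = 1365  # u-ae884, haswell, safe
um102_high = 1040  # mi-ah651, haswell, high

extra = percent_slower(um104_safe, um102_high)
print(f"UM10.4 'safe' takes {extra:.0f}% longer than UM10.2 'high'")
```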
Looking at the profiling, it seems that UM10.4 is slower than my previous UM10.2 runs for three reasons
I need to find a setup which runs at a reasonable speed, but still uses resources fairly efficiently. I've copied GA7.1 + StratTrop UM10.7, u-ak990, to u-am972 for these tests.
Nodes (ATM_PROCX*ATM_PROCY) | Threads | Time for one month | Core hours/model year | Speed (model years/day) |
---|---|---|---|---|
8 (18*16) | 1 | 1:55:28 (6,928s) | 6,646 | 1.04 |
12 (18*24)* | 1 | 1:23:19 (4,999s) | 7,200 | 1.44 |
12 (18*24) | 1 | 1:27:11 (5,231s) | 7,513 | 1.38 |
18 (36*18) | 1 | 1:11:35 (4,295s) | 9,257 | 1.68 |
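The cost and speed columns can be approximated from the node count and the time for one month. This is a minimal sketch assuming 36 cores per node (inferred from 18*16 = 288 tasks on 8 nodes), every core on a node being charged, and a model year of 12 equal months; the names are illustrative. It matches the speed column and comes within about 0.3% of the core-hour column, so the original figures were probably derived from slightly different timings.

```python
CORES_PER_NODE = 36          # assumption: inferred from 288 tasks on 8 nodes
MONTHS_PER_MODEL_YEAR = 12   # assumption: 12 equal model months
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86_400


def core_hours_per_model_year(nodes, seconds_per_month):
    """Core hours charged for one model year, assuming whole nodes are charged."""
    hours_per_year = seconds_per_month * MONTHS_PER_MODEL_YEAR / SECONDS_PER_HOUR
    return nodes * CORES_PER_NODE * hours_per_year


def model_years_per_day(seconds_per_month):
    """Model years completed per wallclock day."""
    return SECONDS_PER_DAY / (seconds_per_month * MONTHS_PER_MODEL_YEAR)


# (nodes, wallclock seconds for one model month) from the table above.
runs = [(8, 6928), (12, 4999), (12, 5231), (18, 4295)]
for nodes, seconds in runs:
    cost = core_hours_per_model_year(nodes, seconds)
    speed = model_years_per_day(seconds)
    print(f"{nodes:2d} nodes: {cost:,.0f} core hours/model year, {speed:.2f} model years/day")
```

Going from 12 to 18 nodes raises the cost per model year by roughly a quarter for a speed gain of well under 20%, which is the trade-off being weighed here.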
It looks like we can drop the speed and get more efficiency, but do we really want to run at less than about 1.4 model years/day? I suspect not, so I'll stick with 12 nodes for now and the following settings.