For the coupled run, I expect that we'll use about 3 nodes with attached XIOS for the ocean + MEDUSA. It's not currently clear whether we should underpopulate the Broadwell nodes, i.e. use fewer than the 36 cores available on each node.
The job I'm using for this is u-ae035.
Cores used per Broadwell node (total cores used) | NEMO_IPROC * NEMO_JPROC (CICE_BLKX * CICE_BLKY) | Times for first 6 months | Average time for 1 month | Speed (model yrs/day) |
---|---|---|---|---|
36 (full, 108) | 12*9 (30*37) | 29:13 (1,753s), 28:40 (1,720s), 28:29 (1,709s), 29:34 (1,774s), 28:33 (1,713s) & 29:02 (1,742s) | 28:55 (1,735s) | 4.15 |
32 (96) | 12*8 (30*42) | 30:12 (1,812s), 30:55 (1,855s), 30:09 (1,809s), 30:12 (1,812s), 30:05 (1,805s) & 30:44 (1,844s) | 30:23 (1,823s) | 3.95 |
28 (84) | 12*7 (30*48) | 31:53 (1,913s), 30:29 (1,829s), 29:58 (1,798s), 30:38 (1,838s), 30:23 (1,823s) & 30:01 (1,801s) | 30:34 (1,834s) | 3.93 |
24 (72) | 9*8 (40*42) | 32:02 (1,922s), 33:59 (2,039s), 34:53 (2,093s), 32:49 (1,969s), 32:59 (1,979s) & 32:17 (1,937s) | 33:10 (1,990s) | 3.62 |
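Hence, we can be fairly confident that using all the cores, with no underpopulation, is best: the fully populated run manages 4.15 model yrs/day, against 3.95 or less for every underpopulated run.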
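For reference, the speed column can be reproduced from the monthly timings. This is a minimal sketch, assuming 12 model months per model year and 86,400 wallclock seconds per day; those assumptions are mine, but they match every row in the table. The function names are just for illustration.

```python
SECONDS_PER_DAY = 86_400      # wallclock seconds in a day
MONTHS_PER_MODEL_YEAR = 12    # assumption: 12 model months per model year


def to_seconds(stamp):
    """Convert a 'MM:SS' or 'H:MM:SS' wallclock string to seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


def model_years_per_day(month_times):
    """Throughput in model years per wallclock day from per-month times (s)."""
    mean_month = sum(month_times) / len(month_times)
    return SECONDS_PER_DAY / (MONTHS_PER_MODEL_YEAR * mean_month)


# Fully populated run: 36 cores per node, 108 cores in total
full_node_times = [
    to_seconds(t)
    for t in ["29:13", "28:40", "28:29", "29:34", "28:33", "29:02"]
]
print(round(model_years_per_day(full_node_times), 2))  # 4.15
```

The same two functions also handle the `H:MM:SS` timings that appear once a month takes over an hour.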
We have a run time for 3 nodes from above (4.15 model yrs/day fully populated), which is a lot faster than our atmosphere will probably be. Maybe we can get away with fewer nodes for MEDUSA.
Following the timing tests above, I'm fully populating the nodes.
Nodes (cores) | NEMO_IPROC * NEMO_JPROC (CICE_BLKX * CICE_BLKY) | Times for first 6 months | Average time for 1 month | Speed (model yrs/day) |
---|---|---|---|---|
2 (72) | 9*8 (40*42) | 39:26 (2,366s), 40:35 (2,435s), 39:49 (2,389s), 41:20 (2,480s), 41:50 (2,510s) & 39:31 (2,371s) | 40:25 (2,425s) | 2.97 |
1 (36) | 9*4 (40*83) | 1:07:43 (4,063s), 1:10:54 (4,254s), 1:08:02 (4,082s), 1:08:31 (4,111s), 1:07:17 (4,037s) & 1:07:19 (4,039s) | 1:08:18 (4,098s) | 1.76 |
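To see how well the ocean scales with node count, the speeds from the two tables can be turned into a speed-up and a parallel efficiency relative to the 1-node run. This is only a rough sketch: the 3-node figure is the fully populated run from the first table, and the function name is hypothetical.

```python
# Speeds in model yrs/day for fully populated nodes, taken from the tables
speeds = {1: 1.76, 2: 2.97, 3: 4.15}


def scaling(speeds):
    """Return {nodes: (speed-up, efficiency)} relative to the smallest run."""
    base_nodes = min(speeds)
    base_speed = speeds[base_nodes]
    rows = {}
    for nodes, speed in sorted(speeds.items()):
        speedup = speed / base_speed
        efficiency = speedup / (nodes / base_nodes)
        rows[nodes] = (speedup, efficiency)
    return rows


for nodes, (speedup, efficiency) in scaling(speeds).items():
    print(f"{nodes} node(s): speed-up {speedup:.2f}, efficiency {efficiency:.0%}")
```

On these numbers the efficiency is roughly 84% on 2 nodes and 79% on 3 nodes, so the third node still buys a useful amount of speed.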
Given that the ocean at ORCA1 is so much cheaper than almost any atmosphere configuration at N96, we'll probably use 3 nodes for the ocean, even though 2 nodes is likely to be enough most of the time. Otherwise, we risk the odd random slow ocean run, which would leave all the nodes we're using for the atmosphere waiting.
It's obviously worth testing this in the coupled run.
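When trying other decompositions, the CICE block sizes in the tables above appear to follow from the NEMO decomposition by dividing the global grid by the processor counts and rounding up. The sketch below assumes a 360 x 330 global grid; that size is not stated anywhere above and is inferred from the table values, so check it against the actual configuration. The function name is hypothetical.

```python
# Assumption: global grid of 360 x 330 points, inferred from the table values
NX_GLOBAL = 360
NY_GLOBAL = 330


def cice_block_size(iproc, jproc, nx=NX_GLOBAL, ny=NY_GLOBAL):
    """Return (CICE_BLKX, CICE_BLKY) as the rounded-up points per processor."""
    blkx = -(-nx // iproc)  # integer ceiling division
    blky = -(-ny // jproc)
    return blkx, blky


# NEMO_IPROC, NEMO_JPROC pairs used in the tests above
for iproc, jproc in [(12, 9), (12, 8), (12, 7), (9, 8), (9, 4)]:
    blkx, blky = cice_block_size(iproc, jproc)
    print(f"{iproc}*{jproc} -> {blkx}*{blky}")
```

Under that assumption this reproduces all five NEMO_IPROC * NEMO_JPROC (CICE_BLKX * CICE_BLKY) pairs in the tables.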