Timing tests for UKESM0.4
The starting point
We now have a prototype UKESM0.4 which has
- full chemistry
- TRIFFID (but not with the extra PFTs)
- MEDUSA
so all the major components are present at N96-eORCA1 in u-ae697. Let's see
how fast it can run.
The current setup is
- ATMOS: 36*20 (720 tasks, 20 nodes)
- OCEAN: 12*9 (108 tasks, 3 nodes)
- Times for first 3 months: 1:30:53 (5,453s), 1:42:01 (6,121s)
and 1:34:10 (5,650s)
- Average time: 1:35:41 (5,741s)
- Average speed: 1.25 model years/day
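As a cross-check on the numbers above, here is a minimal Python sketch of how the monthly wallclock times turn into model years/day. It only assumes 12 model months per model year; the function names are made up for illustration and are not part of any suite.

```python
SECONDS_PER_DAY = 86_400
MONTHS_PER_MODEL_YEAR = 12  # assumption: throughput is quoted per 12 model months


def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' wallclock string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def model_years_per_day(seconds_per_model_month: float) -> float:
    """Model years completed per wallclock day at the given cost per month."""
    return SECONDS_PER_DAY / (seconds_per_model_month * MONTHS_PER_MODEL_YEAR)


# Wallclock times for the first three months, as listed above.
month_times = ["1:30:53", "1:42:01", "1:34:10"]
month_seconds = [hms_to_seconds(t) for t in month_times]
average_seconds = sum(month_seconds) / len(month_seconds)
speed = model_years_per_day(average_seconds)

print(month_seconds)
print(f"average {average_seconds:.0f}s, {speed:.2f} model years/day")
```

This reproduces the 5,741s average and the 1.25 model years/day quoted above.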
The toyatm.timers* files were being produced because $NLOGPRT had a
second argument of 3, which adds the OASIS timers, so I've removed
it. With that change the first month ran in
- 1:29:39 (5379s)
- 1.34 model years/day
Increasing the ocean nodes
Let's check that the ocean is not slowing this down by giving it more nodes
- ATMOS: 36*20 (720 tasks, 20 nodes)
- OCEAN: 24*12 (288 tasks, 8 nodes; CICE_BLK is 15*28)
- Times for first month: 1:31:08 (5468s)
- Speed: 1.32 model years/day
The extra ocean nodes gave no speed-up, which suggests that 3 nodes for the
ocean are perfectly fine.
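The comparison can be sketched in Python as follows. The 36 cores per node is taken from the 23 * 36 = 828 figure used elsewhere in these notes; the helper names are hypothetical.

```python
CORES_PER_NODE = 36  # from the 23 * 36 = 828 cores figure in these notes


def nodes_needed(nx: int, ny: int, cores_per_node: int = CORES_PER_NODE) -> int:
    """Whole nodes needed for an nx*ny decomposition, one task per core."""
    tasks = nx * ny
    return -(-tasks // cores_per_node)  # integer ceiling division


def model_years_per_day(seconds_per_model_month: float) -> float:
    """Model years per wallclock day, assuming 12 model months per year."""
    return 86_400 / (seconds_per_model_month * 12)


ocean_small = nodes_needed(12, 9)    # original ocean decomposition
ocean_large = nodes_needed(24, 12)   # enlarged ocean decomposition

speed_small = model_years_per_day(5379)  # first month with 3 ocean nodes
speed_large = model_years_per_day(5468)  # first month with 8 ocean nodes
relative_change = (speed_large - speed_small) / speed_small

print(ocean_small, ocean_large)
print(f"{speed_small:.2f} -> {speed_large:.2f} ({relative_change:+.1%})")
```

The change is under 2%, which is small next to the month-to-month spread already seen in the first test (5,453s to 6,121s), so a single month cannot really separate the two setups.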
ARCHER estimate
I've been asked for an ARCHER estimate, so here goes.
- Assume speed of 1.3 model years/day on 23 nodes, or
23 * 36 = 828 cores.
- Implies that one model year is completed in 18.5 hrs
- Implies 828 * 18.5 = 15,318 core-hours for one model year
- The last information we heard is that we need to multiply this by
2 for ARCHER, so 2 * 15,318 = 30,636 core-hours
- Hence one year needs 30,636 * 15 = 459,540 AUs or about 0.460 MAUs.
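The chain of steps above can be written as a small Python sketch. The factor of 2 for ARCHER and the 15 AUs per core-hour are simply the values used in the list above; the function and parameter names are made up for illustration.

```python
def archer_maus_per_model_year(
    model_years_per_day: float,
    cores: int,
    archer_factor: float = 2.0,       # multiplier for ARCHER, as quoted above
    aus_per_core_hour: float = 15.0,  # AU conversion, as used above
) -> float:
    """Estimated ARCHER cost in MAUs for one model year."""
    hours_per_model_year = 24 / model_years_per_day
    core_hours = cores * hours_per_model_year
    archer_core_hours = core_hours * archer_factor
    return archer_core_hours * aus_per_core_hour / 1e6


cores = 23 * 36  # 23 nodes of 36 cores
maus = archer_maus_per_model_year(1.3, cores)
print(f"{maus:.3f} MAUs per model year")
```

Without rounding the hours to 18.5 this gives about 0.459 MAUs rather than 0.460, so the rounding makes no practical difference.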
Estimate on 25/26 August 2016
Richard Betts has asked for an estimate. After the first month (70.5 mins),
the following months of u-af826 have all run for about 102 mins (6,120s).
- 1.18 model years/day
- 23 Broadwell nodes is 23 * 36 = 828 cores
- Implies that one model year is completed in 20.3 hours
- Implies 828 * 20.3 = 16,800 core-hours for one model year
Richard also wants the N216-ORCA025 estimate. We haven't run this
configuration yet, but if we assume the run is dominated by the full
chemistry (which it is) then
- my old full chemistry timings showed at 512 cores that the full
chemistry at N96 runs at 2.03 model years/day, while N216 runs at
0.332 model years/day.
- This suggests that the N216 run is about 6.11 times slower (more than
four times the grid points and a shorter timestep, so sounds about
right).
- The slowdown from ORCA1 to ORCA025 is assumed to be similar, and the
ocean represents a much smaller fraction of the nodes
- Implies 6.11 * 16,800 = about 103,000 core-hours for one model
year
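The scaling argument above can be sketched in Python. The speeds and the 16,800 core-hours are the figures quoted in these notes; the only thing added is the horizontal grid-point ratio, which assumes the point count scales with the square of the N number.

```python
# Full chemistry throughput at 512 cores, from the old timings quoted above.
n96_speed = 2.03     # model years/day
n216_speed = 0.332   # model years/day

slowdown = n96_speed / n216_speed

# N96-eORCA1 full chemistry cost estimated above.
n96_core_hours = 16_800
n216_core_hours = slowdown * n96_core_hours

# Assumption: horizontal grid points scale with N squared.
grid_ratio = (216 / 96) ** 2
other_factor = slowdown / grid_ratio  # what is left for timestep and the rest

print(f"slowdown {slowdown:.2f}, grid ratio {grid_ratio:.2f}, "
      f"remaining factor {other_factor:.2f}")
print(f"about {n216_core_hours:,.0f} core-hours per model year")
```

This gives roughly 102,700 core-hours, which rounds to the 103,000 above. It is only as good as the assumption that the atmosphere with full chemistry dominates the cost.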
Richard would also like an estimate for UKESM without full chemistry,
which is very similar to u-af755.
- The first 3 months are: 68:18 (4,098s), 60:59 (3,659s) and
62:27 (3,747s).
- An average of 3,835s (63:55)
- 1.88 model years/day
- Implies that one model year is completed in 12.8 hours
- Running on 8 nodes for ATMOS & 2 nodes for OCEAN, so 10 nodes total
or 360 cores
- Implies 360 * 12.8 = 4,608 core-hours for one model year
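For completeness, here is a minimal Python sketch that goes straight from the three monthly timings to core-hours per model year without rounding in between. It assumes 36 cores per node and 12 model months per year; the function name is hypothetical.

```python
CORES_PER_NODE = 36  # assumption, consistent with 10 nodes = 360 cores above


def core_hours_per_model_year(month_seconds: list[int], nodes: int) -> float:
    """Core-hours for 12 model months at the average cost of the given months."""
    average_seconds = sum(month_seconds) / len(month_seconds)
    hours_per_model_year = average_seconds * 12 / 3600
    return nodes * CORES_PER_NODE * hours_per_model_year


month_seconds = [4098, 3659, 3747]  # first three months, as listed above
nodes = 8 + 2                       # ATMOS + OCEAN

average_seconds = sum(month_seconds) / len(month_seconds)
hours_per_year = average_seconds * 12 / 3600
speed = 24 / hours_per_year
core_hours = core_hours_per_model_year(month_seconds, nodes)

print(f"{average_seconds:.0f}s per month, {speed:.2f} model years/day")
print(f"{hours_per_year:.2f} hours and {core_hours:,.0f} core-hours per model year")
```

Carrying the unrounded 12.78 hours through gives about 4,602 core-hours, against 4,608 with the rounded 12.8.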