Mark Richardson's work
The current optimisation work is all being done by Mark
Richardson at Leeds University.
- Break the code in UKCA_COAGWITHNUCL (I think) into small
segments to make better use of cache. This halves the time spent
in this routine.
- Add OpenMP. This also allows more OpenMP jobs than OpenMP PEs,
so the quickest PEs can go on to process 2nd, 3rd ... jobs.
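
The two ideas above can be sketched together in a few lines. This is
only an illustration in C, not the actual UKCA code (which is
Fortran) and not necessarily how the changes were implemented. The
names process_in_segments, box_work and seg_len are hypothetical, and
the arithmetic is a placeholder for the real aerosol calculation.
The array is cut into short segments so that both passes over a
segment touch data that is still in cache, and each segment is one
OpenMP job handed out dynamically, so a thread that finishes early
picks up the next unclaimed segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-box work; a stand-in for the real calculation. */
static double box_work(double x)
{
    return x * x + 1.0;
}

/* Process n boxes in segments of seg_len boxes (assumed name and
 * interface). Each segment is one job. With OpenMP enabled,
 * schedule(dynamic, 1) gives the next unclaimed segment to whichever
 * thread becomes free first, so there can be many more jobs than
 * threads. Without OpenMP the pragma is ignored and the loop runs
 * serially with the same result. Returns the sum of the outputs. */
double process_in_segments(const double *in, double *out,
                           size_t n, size_t seg_len)
{
    assert(seg_len > 0);
    long n_seg = (long)((n + seg_len - 1) / seg_len);
    double total = 0.0;

    #pragma omp parallel for schedule(dynamic, 1) reduction(+:total)
    for (long s = 0; s < n_seg; ++s) {
        size_t lo = (size_t)s * seg_len;
        size_t hi = (lo + seg_len < n) ? lo + seg_len : n;

        /* First pass over this segment only. */
        for (size_t i = lo; i < hi; ++i) {
            out[i] = box_work(in[i]);
        }
        /* Second pass reuses the same short segment, which should
         * still be resident in cache if seg_len is small enough. */
        for (size_t i = lo; i < hi; ++i) {
            out[i] *= 0.5;
            total += out[i];
        }
    }
    return total;
}
```

To try it with threads, compile with the compiler's OpenMP option
(for example -fopenmp with GCC). A suitable seg_len depends on the
cache size and on how much data each box carries, so it would need
to be tuned by measurement rather than taken from this sketch.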