A more detailed look at Python is available on-line at Johnny Lin's website.

Users outside the department will need Python with the NumPy, matplotlib, basemap and netCDF4 packages installed.

**import numpy as np** - scientific computing

**from scipy import interpolate** - interpolation

**import matplotlib.pyplot as plt** - plotting

**from mpl_toolkits.basemap import Basemap, shiftgrid, addcyclic** - mapping

**from subprocess import call** - calling Unix programs

**from netCDF4 import Dataset as ncfile** - netCDF I/O

**import cf** - David Hassell's cf data I/O

After importing a package, its functions can then be referenced - for example, np.mean(a) for the NumPy mean function.
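As a minimal sketch of the alias convention, the snippet below imports NumPy as np and calls np.mean through that alias. The array a is made up for illustration, and print() is written in function form so that the snippet also runs under Python 3.

```python
import numpy as np

# Illustrative data, not taken from the text above
a = np.array([1.0, 2.0, 3.0, 4.0])

# Functions are referenced through the package alias
mean_a = np.mean(a)
print(mean_a)
```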

Python integers are 32 bit and not 16 bit as in IDL. A long integer is specified by putting an L or l after the number. Long integers can grow as large as needed.

In Python variable names are case sensitive, while in IDL they are case insensitive. There is no need to declare variable types - just use them. As in IDL you can reassign variables to be different types.
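The short sketch below illustrates both points with made-up variable names: temp and Temp are distinct variables, and a name can be rebound to a value of another type without any declaration.

```python
temp = 1          # an integer
Temp = 2.5        # a different variable: names are case sensitive
names_differ = temp != Temp

temp = "now a string"   # the same name can be reassigned to another type
```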

# is the comment symbol. All text after a # is a comment.

Use print a,b,c to see the variables a,b,c. In IDL this would be print, a,b,c.

Array elements are accessed via [] brackets only. In IDL you can use both [] and (). Both Python and IDL indices start at zero - i.e. the first element is a[0].
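A small illustration of bracket indexing, using an invented three-element array. The negative index is an extra Python convenience that the text above does not mention.

```python
import numpy as np

a = np.array([10, 20, 30])   # illustrative values

first = a[0]    # indexing starts at zero
last = a[-1]    # negative indices count back from the end
```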

Indentation is the way of closing if then statements or loops in Python - there's no need for an endif or endfor as in IDL.

The rules of operator precedence are the same as in IDL, and mixed-type arithmetic is promoted in the same way - an integer multiplied by a float gives a float.
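A quick check of both behaviours, with arbitrary numbers chosen for illustration:

```python
product = 3 * 2.0      # integer times float gives a float
value = 2 + 3 * 4      # multiplication binds more tightly than addition

is_float = isinstance(product, float)
```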

Python has a C style data interface while IDL has a Fortran style. If we have read in a standard atmosphere grid, in IDL it will appear as longitude, latitude, height; in Python it will appear as height, latitude, longitude. Where in IDL you would use a[i,*], the equivalent in Python is a[:,i]. See the section on netCDF reading later.
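The axis ordering can be sketched with a NumPy array. The grid sizes below (22 heights, 73 latitudes, 96 longitudes) and the index i are assumptions chosen for illustration, not values read from a file.

```python
import numpy as np

# Hypothetical grid sizes: 22 heights, 73 latitudes, 96 longitudes
nz, ny, nx = 22, 73, 96
temp = np.zeros((nz, ny, nx))    # Python order: height, latitude, longitude

# All heights and latitudes at longitude index i
i = 5
section = temp[:, :, i]          # shape (nz, ny)

# 2-D case from the text: IDL a[i,*] corresponds to Python a[:, i]
a = np.arange(12).reshape(3, 4)
col = a[:, 1]                    # second column of a
```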

**for i in np.arange(4): print i**

0

1

2

3

Python is indentation sensitive so make sure your loop statements are indented. There is no end of loop statement in Python.
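A multi-line loop might look like the sketch below. The variable total is introduced only for illustration, and print() is written in function form so that the snippet also runs under Python 3.

```python
import numpy as np

total = 0
for i in np.arange(4):
    # Both indented lines belong to the loop body
    total = total + i
    print(i)

# Back at the original indentation level, so the loop has ended
print(total)
```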

Large loops in Python are very slow. The following example is ten times slower than in IDL for a loop of 100 million points. A standard UM grid 96x73x22 for 360 days has about 55 million points so this is not an unusual amount of data to be looking at.

First of all we create some test random data from -5 to 5. We will loop over the data and set the b array to be 1 for all points that are greater than zero.

**a=np.random.rand(100000000)*10-5
b=np.zeros(100000000,dtype=int)
pts=np.arange(0,100000000)
for i in pts:
    if a[i] > 0: b[i]=1**

This took 160 seconds for 100 million points.

Using NumPy where gives a much quicker answer:
**b=np.where(a > 0, 1, 0)** took 3.1 seconds

Using WHERE in IDL completed the same task in 0.16 seconds.
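The comparison can be tried on a smaller array with the sketch below. The size n is an assumption chosen so the loop finishes quickly, and the timings will differ from machine to machine. The np.where call sets 1 where a is greater than zero and 0 elsewhere, which matches what the loop does.

```python
import time
import numpy as np

n = 100000  # far fewer than the 100 million points discussed in the text
a = np.random.rand(n) * 10 - 5

# Explicit loop over every point
start = time.time()
b_loop = np.zeros(n, dtype=int)
for i in range(n):
    if a[i] > 0:
        b_loop[i] = 1
loop_seconds = time.time() - start

# Vectorised equivalent: 1 where a > 0, otherwise 0
start = time.time()
b_where = np.where(a > 0, 1, 0)
where_seconds = time.time() - start

print(loop_seconds, where_seconds)
same = np.array_equal(b_loop, b_where)
```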

The colon delimiter is used to indicate the beginning or end of a sequence. When used singly it means all values.
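A few slices on an illustrative ten-element array show the colon in use. Note that the end index of a slice is excluded.

```python
import numpy as np

a = np.arange(10)      # 0, 1, ..., 9

middle = a[2:5]        # elements at indices 2, 3 and 4
head = a[:3]           # from the start up to, but not including, index 3
tail = a[7:]           # from index 7 to the end
everything = a[:]      # a single colon selects all values
```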

if condition:
    statements
else:
    statements

Python is indentation sensitive so make sure your statements are indented. There is no end of if statement in Python.

One line if commands are as follows:

**if condition: do_something()
if condition: do_something(); do_something_else()**

The following is also valid but more difficult to scan.

**a = 1 if x > 15 else 2**
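For comparison, the sketch below places the conditional expression next to its multi-line equivalent. The value of x and the second variable b are chosen only for illustration.

```python
x = 20

# Conditional expression on one line
a = 1 if x > 15 else 2

# Equivalent multi-line form
if x > 15:
    b = 1
else:
    b = 2
```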

The SciPy routine list is at http://docs.scipy.org/doc/scipy/reference/index.html

For FFTs, regressions etc. look in the SciPy documentation.

**np.pi** - pi

**np.nan** - not a number

**np.inf** - infinity

To create a 3,5 array filled with the value 5, use **np.zeros((3,5))+5**

Averaging over axes:

**np.mean(temp, axis=2)** - note that axis=0 is the first axis

**np.sum(temp, axis=1)**

**np.ceil(a)** - round up

**np.floor(a)** - round down

**int(a)**, **float(a)** - convert to integer, float (np.int and np.float were aliases for these built-ins and have been removed from recent NumPy versions)

**np.arange(5)** - array([0, 1, 2, 3, 4])
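Several of the routines above can be tried together, as in the sketch below. The temp array is a small made-up array standing in for a height, latitude, longitude grid, and it is used only to show the axis argument.

```python
import numpy as np

fives = np.zeros((3, 5)) + 5           # 3x5 array filled with 5.0

# Hypothetical array with shape (2, 3, 4)
temp = np.arange(24, dtype=float).reshape(2, 3, 4)
mean_last = np.mean(temp, axis=2)      # average over the last axis, shape (2, 3)
sum_middle = np.sum(temp, axis=1)      # sum over the middle axis, shape (2, 4)

up = np.ceil(1.2)                      # round up
down = np.floor(1.8)                   # round down
seq = np.arange(5)                     # 0, 1, 2, 3, 4
```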

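from netcdftime import utime  # assumption: utime comes from the netcdftime module shipped with older netCDF4 releases

from datetime import datetime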

cdftime = utime('hours since 0001-01-01 00:00:00')

print cdftime.units,'since',cdftime.origin

print cdftime.calendar,'calendar'

d = datetime(2006,9,29,12)

t1 = cdftime.date2num(d)

print t1

d2 = cdftime.num2date(t1)

print d2

hours since 1-01-01 00:00:00

standard calendar

17582028.0

2006-09-29 12:00:00
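The same round trip can be sketched with the standard library alone. The helpers hours_since and date_from_hours are hypothetical names introduced here and are not part of any package. Python's datetime uses the proleptic Gregorian calendar rather than the mixed Julian/Gregorian "standard" calendar, so with an origin of 0001-01-01 the result should come out 48 hours smaller than the value printed above.

```python
from datetime import datetime, timedelta

def hours_since(date, origin):
    """Hours elapsed between origin and date (hypothetical helper)."""
    return (date - origin).total_seconds() / 3600.0

def date_from_hours(hours, origin):
    """Inverse of hours_since (hypothetical helper)."""
    return origin + timedelta(hours=hours)

origin = datetime(1, 1, 1)
d = datetime(2006, 9, 29, 12)

t1 = hours_since(d, origin)
d2 = date_from_hours(t1, origin)

print(t1)
print(d2)
```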

Define a routine with a def statement.

Don't forget to indent the lines after the procedure def line.

def myproc(min=min, max=max): return max-min

**myproc(min=1, max=4)** - 3

**myproc(max=4, min=1)** - 3

**myproc(1,4)** - 3 - this passes the arguments by position, so the min and max keywords can be left out.
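Written over several lines, the routine and the three calls might look like the sketch below. The numeric defaults of 0 are an assumption made here to keep the example self-contained; the original uses min=min and max=max.

```python
def myproc(min=0, max=0):
    # The indented body belongs to the routine
    return max - min

by_keyword = myproc(min=1, max=4)
reordered = myproc(max=4, min=1)    # keyword order does not matter
by_position = myproc(1, 4)          # first argument is min, second is max
```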

To return multiple values return them comma separated with the return command

def myproc(min=min, max=max): return max-min, min+(max-min)/2.0
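The returned values arrive as a tuple and can be unpacked into separate variables, as in this sketch. The defaults of 0 and the names rng and mid are assumptions made for illustration.

```python
def myproc(min=0, max=0):
    # Return the range and the midpoint, comma separated
    return max - min, min + (max - min) / 2.0

# Unpack the two returned values into separate variables
rng, mid = myproc(min=1, max=4)

# Or keep them together as a single tuple
both = myproc(min=1, max=4)
```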

: is a delimiter which here means all values. Python orders its data C style and IDL Fortran style. temp is a lon,lat,height grid and that's how it would appear in IDL. In Python it appears as height,lat,lon.

The data file - gdata.nc

do stuff

cfplot - The beginnings of a simple contour / vector plotting package.

Basemap is used here for the map plots.

ex35.py - sine and cosine curves and a legend

ex36.py data - lines with symbols, a non-numeric x-axis and minor tick marks

ex38.py data - one dataset but with two colours to represent different locations

ex40.py data - filling a curve with colour

ex41.py data - graph with two y axes using twinx

ex43.py - using cubic spline fitting from scipy

ex44.py - line fitting

ex45.py data - error plot

ex46.py data - bar plot. Horizontal lines behind bars were created using zorder=-1 to place them behind all the plot elements.

ex1.py data - colour contoured data on a cylindrical projection with a colour bar

ex2.py data - zoomed version of the above

ex3.py data - northern hemisphere stereographic projection

ex4.py data - unfilled contours with solid negative contour lines

ex5.py data - thick zero contour line

ex28.py - drawing lines, text and symbols on a plot

ex13.py data - linear pressure plot

ex14.py data - log pressure plot with both pressure and height labelled y axes

ex31.py data - vector plot with a key