Performant Python

SciPy

Authors: Neelofer Bangawala, Arno Proeme, Kevin Stratford, Andy Turner, EPCC, University of Edinburgh

Overview:

  • NumPy provides arrays and limited additional functionality
  • SciPy builds on NumPy and provides additional modules:
    • Linear Algebra and wrappers to LAPACK & BLAS scipy.linalg
    • Numerical Integration scipy.integrate
    • Interpolation scipy.interpolate
    • Optimisation scipy.optimize
    • Special functions scipy.special
    • Signal processing scipy.signal
    • Image Processing scipy.ndimage
    • Fourier transforms scipy.fftpack
    • Statistical functions stats
    • Spatial data structures and algorithms scipy.spatial
    • File I/O e.g. to read MATLAB files scipy.io

Linear algebra

  • Wider set of linear algebra operations than in Numpy

    • various decompositions (eigen, singular value)

    • matrix exponentials, trigonometric functions

    • particular matrix equations and special matrices

    • low-level LAPACK and BLAS routines

  • Routines also for sparse matrices
    • storage formats

    • iterative algorithms

Example: Matrix inverse

Consider:

$$ A = \left[ \begin{array} {rrr} 1 & 3 & 5 \\ 2 & 5 & 1 \\ 2 & 3 & 8 \\ \end{array} \right] $$

The inverse of $A$ is

$$ A^{-1} = \frac{1}{25} \left[ \begin{array} {rrr} -37 & 9 & 22\\ 14 & 2 & -9 \\ 4 & -3 & 1\\ \end{array} \right] \approx \left[ \begin{array} {rrr} -1.48 & 0.36 & 0.88\\ -0.56 & 0.08 & -0.36 \\ 0.16 & -0.12 & 0.04\\ \end{array} \right] $$

which may be confirmed by checking $A A^{-1} = I$ where $I$ is the identity.

Exercise Matrix inverse

Find inverse of matrix A (as defined above). Check the result by multiplying out $A A^{-1}$ , which should give identity matrix $I$

In [ ]:
# numpy has a function to produce the 2D identity matrix I
# query: ?np.eye

from scipy import linalg
A = ...

Solution Matrix inverse

In [ ]:
# Execute this cell to see a solution 
%load ./code/inverse.py

Integration scipy.integrate

  • Routines for numerical integration – single, double and triple integrals
  • Can solve Ordinary Differential Equations (ODEs) with initial conditions

Example : Double integral

Calculate $\pi$ using the double integral for the area of a circle with radius $r$:

$$ \int _{x_{min}} ^{x_{max}}\, dx \int _{g(x)} ^{h(x)} f(x,y) \, dy = \int _{-r} ^{r} \int _{-\sqrt(r^2-x^2)} ^{\sqrt(r^2-x^2)} 1 \, dx\, dy = \pi r^2 $$

We will solve this with scipy.integrate.dblquad()

http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.integrate.dblquad.html

In [ ]:
# numerically integrate using dblquad()

import numpy as np
from scipy.integrate import dblquad

# order of variables matters! y before x

def integrand(y, x):
    return 1

def xminlim(x, r):
    return -1*np.sqrt(r*r - x*x)

def xmaxlim(x, r):
    return np.sqrt(r*r - x*x)

# integral for the area of a circle with radius r
def integrate_to_pi(r): 
    (area,err) = dblquad(integrand, -1*r, r, 
                         lambda x: xminlim(x,r), 
                         lambda x: xmaxlim(x,r))
    return area/(r*r)

Integration : Check result

Calculate the result and compare with the standard numpy.pi

In [ ]:
# %load pi_integration_check.py
# calculate pi using numerical integration and check result against numpy constant np.pi

print(integrate_to_pi(1.0))

# compare with numpy pi
print(np.pi - integrate_to_pi(1.0))

# can try timing... (uncomment line below)
# %timeit integrate_to_pi(1.0) 

Double integral

Calculate the double integral

$$ \int_0^{\pi/2} dx \int_0^1 dy \quad f(x,y) $$

where $f(x,y) = y sin(x)$. The answer should be 1/2.

In [ ]:
# Use the same approach here as above

def integrand1(y,x):
    return y*np.sin(x)

Solution Double integral

In [ ]:
# Execute this cell to see a solution 
%load ./code/integration.py

Example Pendulum

Solve Ordinary Differential Equations (ODEs) with initial conditions, for example motion of simple pendulum.

A point mass, $m$, is attached to the end of a massless rigid rod of length $l$. The pendulum is acted on by gravity and friction. We can describe the resulting motion of the pendulum by angle, $\theta$, it makes with the vertical.

<img src="pendulum.png"; style="float: right; width: 40%; margin-right: 3%; margin-top: 0%; margin-bottom: -1%">

Assuming angle $\theta$ always remains small, we can write a second-order differential equation to describe the motion of the mass according to Newton's 2nd law of motion, $m\,a = F$, in terms of $\theta$:

$$ \ddot{\theta} = -\frac{g}{l}\,\theta - \frac{b}{m}\,\dot\theta $$

where $b$ is a constant of friction and $b \ll g$.

To use odeint, we rewrite the above equation as 2 first-order differential equations:

$ \dot{\theta} = \omega $

$ \dot{\omega}= -\frac{g}{l}\,\theta - \frac{b}{m}\,\omega $

Pendulum (cont.)

Define the ODE as a function and set up parameters and initial values.

In [ ]:
# ode as a function

# let y be vector  [theta, omega]
def pendulumNumerical(y, t, b, m, g, length):
    theta, omega = y
    dydt = [omega, -(b/m)*omega - (g/length)*(theta)]
    return dydt
In [ ]:
# Parameters and initial values
m = 1.0                # mass of bob
length = 1.0           # length of pendulum
b = 0.25               # friction constant
g = 9.81               # gravitational constant
theta0 = np.pi-0.01    # initial angle
w0 = 0.0               # initial omega

# create a vector with the initial angle and initial omega
y0 = [theta0, w0]
In [ ]:
# time interval (use more points for exact solution "tex")
stoptime = 10         # total number of seconds
numpoints = 51       # number of points interval

t = np.linspace(0, stoptime, numpoints)
tex = np.linspace(0, stoptime, 10*numpoints)
In [ ]:
# ODE solver parameters
abserr = 1.0e-3      # absolute error tolerance
relerr = 1.0e-1      # relative error tolerance

Pendulum (cont.)

Use odeint to numerically solve the ODE with initial conditions.

In [ ]:
# import odeint solver
from scipy.integrate import odeint
In [ ]:
# get solution. Note args are given as a tuple
solution = odeint(pendulumNumerical, y0, t, args=(b,m,g,length),\
                  atol=abserr, rtol=relerr)

The ODE can be solved analytically. The exact solutions for $\theta$ and $\omega$ are

In [ ]:
# Exact solution for theta
def pendulumTheta(t, theta0, b, m, g, length):
    root = np.sqrt( np.abs( b*b - 4.0*g*m*m/length ) )
    sol = theta0*np.exp(-b*t/2)*( np.cos( root*t/2 ) \
                                 + (b/root)*np.sin( root*t/2) )
    return sol
In [ ]:
# Exact solution for omega
def pendulumOmega(t, theta0, b, m, g, length):
    root = np.sqrt( np.abs( b*b - 4.0*g*m*m/length ) )
    sn = np.sin(root*t/2.0)
    cs = np.cos(root*t/2.0)
    sol = -(b/2)*theta0*np.exp(-b*t/2)*( cs + (b/root)*sn ) \
        + (theta0/2)*np.exp(-b*t/2)*( b*cs - root*sn )
    return sol

Exercise Pendulum

To see how good the numerical solutions for $\theta$ and $\omega$ are, plot the exact solutions against the numerical solutions for the appropriate range of $t$.

You should include a legend to label the different lines/points.

You should find that the numerical solution looks quite good. Can you adjust the parameters above (re-execute all the relevant cells) to make it better?

In [ ]:
%matplotlib inline

Solution Pendulum

In [ ]:
# Execute this cell to see a solution 
%load ./code/pendulum.py

Optimisation

  • Several classical optimisation algorithms
    • Least squares fitting
    • Quasi-Newton type optimisations
    • Simulated annealing
    • General purpose root finding

Least-squares fit

Use scipy.optimize.leastsq to fit some measured data, $\{x_i,\,y_i\}$, to a function:

$$ y\,=\,A\,\sin(2\pi k x \,+\, \theta) $$

where the parameters $A$, $k$, and $\theta$ are unknown. The residual vector, that will be squared and summed by leastsq to fit the data, is:

$$ e_i\,=\, ∣∣ \, y_i \,− \,A\sin(2\pi k x_i + \theta)∣∣ $$

By defining a function to compute the residuals, $e_i$, and, selecting appropriate starting values, leastsq can be used to find the best-fit parameters $\hat{A}$, $\hat{k}$, $\hat{\theta}$.

Least-squares fit

Create a sample of true values, and the "measured" (noisy) data. Define the residual function and initial values.

In [ ]:
# set up true function and "measured" data
x = np.arange(0, 6e-2, 6e-2 / 30)
A, k, theta = 10, 1.0 / 3e-2, np.pi / 6

y_true = A * np.sin(2.0*np.pi*k*x + theta)
y_meas = y_true + 2.0*np.random.randn(len(x))
In [ ]:
# Function to compute the residual
def residuals(p, y, x):
    A, k, theta = p
    err = y - A * np.sin(2 * np.pi * k * x + theta)
    return err

Least-squares fit

For easy evaluation of the model function parameters [A, K, theta], we define a function.

In [ ]:
def peval(x, p):
    return p[0]*np.sin(2.0*np.pi*p[1]*x + p[2])

# starting values of A, k and theta
p0 = [8, 1 / 2.3e-2, np.pi / 3]
print(np.array(p0))

Least-squares fit

Do least squares fitting and plot results

In [ ]:
# do least squares fitting
from scipy.optimize import leastsq

plsq = leastsq(residuals, p0, args=(y_meas, x))
print(plsq[0])
print(np.array([A, k, theta]))
In [ ]:
# and plot the true function, measured (noisy) data 
# and the model function with fitted parameters 
plt.plot(x, peval(x, plsq[0]), x, y_meas, 'o', x, y_true)

plt.title('Least-squares fit to noisy data')
plt.legend(['Fit', 'Noisy', 'True'])
plt.show()

Special functions

  • SciPy contains huge set of special functions
    • Bessel functions
    • Legendre functions
    • Gamma functions
    • Airy functions

We will see special functions used in the following sections.

Example: scipy.special

  • Many problems with circular or cylindrical symmetry have solutions involving Bessel functions
  • E.g., height of a oscillating drumhead related to $J_n(x)$

We will use

http://docs.scipy.org/doc/scipy-0.14.0/reference/special.html

In [ ]:
# drumhead example
from scipy import special

def drumhead_height(n, k, distance, angle, t):
    # kth zero is last element of returned array
    kth_zero = special.jn_zeros(n, k)[-1]
    return (np.cos(t) * np.cos(n*angle) * special.jn(n, distance*kth_zero))
In [ ]:
theta = np.r_[0:2*np.pi:50j]
radius = np.r_[0:1:50j]
print(theta)

x = np.array([r * np.cos(theta) for r in radius])
y = np.array([r * np.sin(theta) for r in radius])
z = np.array([drumhead_height(1, 1, r, theta, 0.5)
              for r in radius])
# contd...

Drumhead (cont.)

Plot the height of a drumhead using a 3-d axis set

In [ ]:
# ...contd
%matplotlib inline
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

fig = plt.figure(figsize=(8, 4))
ax = Axes3D(fig)
ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap=cm.jet)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()

Summary

SciPy has a wide range of useful functionality for scientific computing

In case it does not have what you need, there are other packages with specialised functionality.

Other packages

  • Pandas

    • Offers R-like statistical analysis of numerical tables and time series
  • SymPy

    • Python library for symbolic computing
  • scikit-image

    • Advanced image processing
  • scikit-learn

    • Package for machine learning
  • Sage

    • Open source replacement for Mathematica / Maple / Matlab (built using Python)