Introduction to Scipy: Interpolation and Integration

In this section, we will look at two other common sub-packages of Scipy: scipy.interpolate and scipy.integrate.

Interpolation

The simplest interpolation routine in scipy.interpolate is interp1d:

In [ ]:
from scipy.interpolate import interp1d

For a mock dataset:

In [ ]:
import numpy as np
x = np.array([0., 1., 3., 4.])
y = np.array([0., 4., 2.7, 2.08])

we can interpolate linearly between the data points by creating an interpolating function:

In [ ]:
f = interp1d(x, y)
In [ ]:
type(f)

and we can then interpolate to any value of x within the original bounds:

In [ ]:
f(0.5)
In [ ]:
f(3.3)

It is also possible to interpolate several values at the same time:

In [ ]:
f(np.array([0.5, 1.5, 2.5, 3.5]))

If the interpolating function is called outside the original range, an error is raised:

In [ ]:
f(-1.)

You can change this behavior by telling interp1d not to raise an error in this case, but instead to return a fixed fill value:

In [ ]:
f = interp1d(x, y, bounds_error=False, fill_value=-10.)
In [ ]:
f(-1.0)
In [ ]:
f(np.array([-1., 1., 3., 6.]))
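
Instead of a fixed fill value, interp1d can also extrapolate beyond the original range by passing fill_value='extrapolate'. As a minimal sketch:

In [ ]:
# minimal sketch: linear extrapolation outside the original data range
f_ext = interp1d(x, y, bounds_error=False, fill_value='extrapolate')
f_ext(np.array([-1., 5.]))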

By default, interp1d uses linear interpolation, but it is also possible to use e.g. cubic spline interpolation:

In [ ]:
f = interp1d(x, y, kind='cubic')
f(0.5)

Let's compare a few ways to interpolate:

In [ ]:
import matplotlib.pyplot as plt
%matplotlib inline
In [ ]:
xp = np.arange(0, 4, 0.1)
f = interp1d(x, y, kind=1) # equivalent to 'linear'
g = interp1d(x, y, kind=2) # equivalent to 'quadratic'
h = interp1d(x, y, kind=3)  # equivalent to 'cubic'

plt.scatter(x, y, label='data')
plt.plot(xp, f(xp), label='linear')
plt.plot(xp, g(xp), label='quadratic spline')
plt.plot(xp, h(xp), label='cubic spline')
plt.xlim(0, 6)
plt.legend()
In [ ]:
def orig(x):
    # cubic polynomial that passes exactly through the four data points above
    return 3.11 + 0.5*x - 3.11*(x-1)*(x-1) + 0.39*x*x*x

xp = np.arange(0, 4, 0.5)
print(xp)
print(orig(xp))
print(h(xp))

For more information, see the documentation of interp1d. There are also other interpolation functions available (for example for spline interpolation), which you can read about in the scipy.interpolate documentation.

Wikipedia also has some information on splines.
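
As a minimal sketch of these spline routines, the CubicSpline class from scipy.interpolate can be used like this (it also gives access to derivatives of the fitted spline):

In [ ]:
# minimal sketch: cubic spline interpolation via the CubicSpline class
from scipy.interpolate import CubicSpline
cs = CubicSpline(x, y)
print(cs(0.5))      # interpolated value at x = 0.5
print(cs(0.5, 1))   # first derivative of the spline at x = 0.5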

There are also interpolation methods that work on higher-dimensional data. The griddata function is of particular interest, and is shown here only to demonstrate some of SciPy's capabilities. You do not have to understand this straight away.

In [ ]:
from scipy.interpolate import griddata


def func(x, y):
    return x*(1-x)*np.cos(4*np.pi*x) * np.sin(4*np.pi*y**2)**2


grid_x, grid_y = np.mgrid[0:1:100j, 0:1:200j] # make a grid in the range 0-1 with 100x200 gridpoints

points = np.random.rand(1000, 2) # get 1000 random datapoints in 2 dimensions
values = func(points[:,0], points[:,1]) # evaluate the above function at the random points

interp = griddata(points, values, (grid_x, grid_y), method='linear') # try also 'nearest' or 'cubic'

# original data
plt.subplot(121)
plt.imshow(func(grid_x, grid_y).T, extent=(0,1,0,1), origin='lower')
plt.plot(points[:,0], points[:,1], 'k.', markersize=1)
plt.title('Original')

# interpolation
plt.subplot(122)
plt.imshow(interp.T, extent=(0,1,0,1), origin='lower')
plt.title('Interpolated')

# make plot a bit larger
plt.gcf().set_size_inches(10, 10)

Integration

The available integration functions can be found at scipy.integrate. See also the tutorial here. You will notice there are two kinds of functions - those that integrate Python functions, and those that integrate numerical data defined by Numpy arrays.

First, we can take a look at one of the functions that can integrate Python functions numerically. If we define a function:

In [ ]:
def simple_function(x):
    return 3.*x**2 + 2.*x + 1.

we can integrate it between limits using:

In [ ]:
from scipy.integrate import quad
print(quad(simple_function, 1., 2.))

As described in the documentation of quad, the first value returned is the integral, and the second is an estimate of the error on the integral. Solving the integral analytically gives 11, so the result is correct. The name quad stems from quadrature, the technique of working out the area under a curve by splitting it up into known sub-areas.
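
quad can also handle (semi-)infinite limits by passing np.inf. As a minimal sketch, the integral of exp(-x) from 0 to infinity is analytically 1, so the result should be very close to that:

In [ ]:
# minimal sketch: quad with a semi-infinite integration range
result, error = quad(lambda x: np.exp(-x), 0., np.inf)
print(result, error)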

We can also define data as Numpy arrays that are then integrated:

In [ ]:
x = np.linspace(1., 2., 1000)
y = 3.*x**2 + 2.*x + 1.

And in this case we can use for example the simps function (called simpson in newer SciPy versions) to integrate using Simpson's rule:

In [ ]:
from scipy.integrate import simps
print(simps(y, x=x))

This can be very useful in cases where one wants to integrate actual data that cannot be represented as a simple function or when the function is only available numerically, i.e. cannot be written down analytically.
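
For instance, the sample points do not have to be evenly spaced: simps uses the x array to account for the actual spacing. As a minimal sketch, integrating sin(x) over irregular sample points between 0 and π should give a result close to the analytical value of 2:

In [ ]:
# minimal sketch: Simpson's rule on unevenly spaced sample points
x_irr = np.sort(np.random.uniform(0., np.pi, 51))
x_irr[0], x_irr[-1] = 0., np.pi   # pin the endpoints to the integration limits
y_irr = np.sin(x_irr)
print(simps(y_irr, x=x_irr))      # should be close to 2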

Another often-used integration function is trapz (trapezoid in newer SciPy versions), which works similarly to simps but uses the trapezium rule:

In [ ]:
from scipy.integrate import trapz
print(trapz(y, x=x))

Exercise 1

a) Write a function that takes x, and the parameters for a Gaussian (mean and standard deviation) and returns the value of the Gaussian at x:

In [ ]:
# your solution ..

b) Use quad to compute the integral and compare to what you would expect.

In [ ]:
# your solution here



# Note the following EXAMPLE
# for the 'args' argument to pass to the function to be integrated
#
def func(x, a, b):
    return a*x + b   # example integrand; replace with your Gaussian

quad(func, 0, 1, args=(1, 1))

c) Now create two arrays x and y that contain the Gaussian for fixed values x, and compute the integral using simps.

In [ ]:
# your solution here

Compare this to what you found with quad and analytically.

Differential equations

An important task in scientific computing is to solve differential equations.

scipy.integrate also has odeint which we use here to solve the damped oscillation

$$\ddot{x} + 2 \gamma \dot{x} + k x = 0$$
In [ ]:
from scipy.integrate import odeint

For the odeint solver, the differential equation needs to be split up into a system of first-order equations:

(1) $\dot{x} = v$

(2) $\dot{v} = -k x - 2 \gamma v$

Such a coupled system of first-order differential equations is now "straightforward" to solve numerically given some initial conditions: we simply have to integrate both equations simultaneously. For example, the position and velocity at the next desired time $t_1 = t_0 + \Delta t$ are

(1b) $x_1 = x_0 + \int_{t_0}^{t_1}\,\dot{x} \, \mathrm{d}t$

(2b) $v_1 = v_0 + \int_{t_0}^{t_1}\,\dot{v} \, \mathrm{d}t$

and we only need to know the derivatives of $x$ and $v$ with respect to time $t$. These derivatives are given to odeint via a function (diffeq below), and odeint then uses an adaptive integration algorithm to advance the set of equations to the requested times, which are passed to the solver as a parameter (time in the example below).

In [ ]:
def diffeq(x, t, k, g):
    """ differential equation for damped oscillation,
        split into a system of first-order equations:
         x. = v
         v. = - k x - 2 gamma v
        x[0] holds the position, x[1] the velocity
    """
    dxdt = np.zeros(2)   # derivative vector (dx/dt, dv/dt)
    dxdt[0] = x[1]
    dxdt[1] = -k * x[0] - 2 * g * x[1]

    return dxdt

# constants
k = 0.1
g = 0.1

x    = np.array([2.0, -0.8])          # initial position and velocity
time = np.linspace(0.0, 100.0, 100)   # evaluation times

# calling the solver
solution = odeint(diffeq, x, time, args=(k,g))

This is the corresponding analytical solution for the underdamped case ($\gamma^2 < k$): an exponentially decaying oscillation with angular frequency $\omega = \sqrt{k - \gamma^2}$, whose amplitudes $A$ and $B$ are fixed by the initial position and velocity:

In [ ]:
def damped_oscillation(t, x, k, g):
    om = np.sqrt(k - g**2)
    
    A = (x[1] + g*x[0]) / om
    B = x[0]
    
    return (A*np.sin(om*t) + B*np.cos(om*t))*np.exp(-g*t)
In [ ]:
plt.plot(time, solution[:,0], ls='-', label='Numerical')
plt.plot(time, damped_oscillation(time,x,k,g), ls='--', label='Analytical')
plt.legend()

Exercise 2

a) Plot the motion of a critically damped oscillator, $\gamma = \sqrt{k}$

In [ ]:
# solution here

b) Extend the differential equation by a forcing term $f(t) = A_0 \cos (\omega_0 t)$ [choose $A_0$ and $\omega_0$]. Consider the initial behaviour of the system.

In [ ]:
# solution here