For the sake of smoothness :), we start by 'formalizing' the notion of definite integration. If you haven't seen my previous tutorial on 'Representation, Operations, and Semantics', be aware that this 'formalization' is an application of the concepts discussed in 'ROS'.
The idea is to take you through my own journey in Computational Physics. In this specific tutorial, you will see how I took the concepts from a Calculus 101 course, and turned it into a 'Computational Idea', which I then implemented in Haskell.
You do not have to do this in Haskell, but I recommend it. Even if you plan to work in 'C' or 'FORTRAN', I believe (that is the reason we are here) Haskell can teach you more about 'Computational Ideas' and their relationship to 'Mathematical Ideas' (and later, we will discuss 'Physical Ideas') than any other languages out there. Also, Haskell allows you to write efficient programs, which is important in Computational Physics. Furthermore, I will teach you (if you keep coming back) what optimization is, and how to do it smartly.
But enough blabla, let's dive in.
From Exact to Approximate Integration
Let us consider definite integrals.
integral f a b
How can we make sense of this concept in a 'computational' manner?
In mathematics, integral
is a function that takes a function as input, an interval, and returns the area under the curve.
In a computational context, integral
is likely to be an Operation
rather than a Representation
.
So, our task is to find a 'computational idea' where integral
is an operation.
Exercise: How would you solve this problem? How would you represent real numbers? How would you represent a function from the real to the real? How would you represent an interval?
Solution: Two for the price of One
These are two representations you might have come up with:
if we represent mathematical functions
f:: Real -> Real
as haskell functionsf::Double -> Double
, and every numberx::Real
by a haskell numberx::Double
, thenIntegral
is a function with the following type:
integral :: (Double -> Double) -> Double -> Double -> Double
By introducing type synonyms, we really start to get a handle on the fundamental notion here:
type Func = Double -> Double
integral :: Func -> Double -> Double -> Double
--If we want to emphasize the fact that a and b form an interval, then we can do this:
type Interval = (Double,Double)
integral :: Func -> Interval -> Double
integral f (a,b) = ...
So the integral is an Operation
which takes a function, an interval, and returns a real number.
if we represent a mathematical function
f
by the list of values which it takes on the interval (a,b), and we represent the independent variablex
by the values for which the function is evaluated, thenIntegral
is a function with the following type:
integral :: [Double] -> [Double] -> Double
integral fs xs = ...
--with the understanding that if f is the mathematical function, then
fs = map f xs
With a few type synonyms:
type Func = [Double]
type Var = [Double]
integral :: Func -> Var -> Double
Note about higher-order Functions
In mathematics, functions on functions are called operators (linear operators in cases like integral
). However, once you accept that functions aren't special, then the word operator
can be replaced by function
. In Haskell, there is nothing special about operators
; they are simply functions, and you define them the same way as functions on data. This single fact is responsible for much of Haskell's power, and many other languages have this feature.
Take-away from our first two tries
So you notice that these two notions of integrations are different. (I'm talking about the Solution to the first Exercise)
Clearly, the first integral really corresponds directly to the mathematical idea of an exact (definite) integral.
Given a function f
, and an interval (a,b)
, then there is really only one exact integral in math, and one exact integral in the first representation.
However, the second representation offers many representations of the same function. (you are free to choose how many point you use to represent your function)
for instance:
f x = x
a = 0
b = 1
fs1 = [0, 1] --representing f with 2 points on the interval (a,b)
fs2 = [0, 0.5, 1] --................... 3 ...........................
xs1 = [0, 1] -- the corresponding variables
xs2 = [0, 0.5, 1]
So what is essential here?
Let us suppose that we always evaluate points at constant increments.
Now, the only freedom we have is the number of points contained on our interval.
Thus, a more useful representation than 1) for approximate integration is
A third representation; Now, we talk about approximate integration.
type Func = Double -> Double
data Interval = I (Double,Double) Int
integral :: Func -> Interval -> Double
integral f (I (a,b) n) = ...
This is a representation of a discrete interval, which now makes the two representations (2 and 3) equivalent.
However, it is possible to define integration without the constraint of constant increment.
More: The second and third representation are compatible
In the case of non-constant increments, the 3rd rep can change to the following, and become equivalent to the 2nd:
type Func = Double -> Double
type Interval = [Double] --This is a Var in disguise, so let us be more explicit:
type Var = [Double]
integral :: Func -> Var -> Double
integral f xs = ...
Incidentally, if one restricts himself to the case of constant increments, then one can modify the 2nd representation to fit the original 3rd:
type Func = [Double]
data Interval = I (Double,Double) Int
integral :: Func -> Interval -> Double
integral fs (I (a,b) n) = ...
So we see that at its core, the approximate definite integral has a structure independent of representation.
Simple Rules
OK, now that we really really know what an integral is, let us look at various algorithms for integration.
For the moment (we will come back when we talk about Stochastic Methods), the mathematical form of algorithms to compute integrals has the following form:
integral f xs = sum [f x_i * w x_i | x_i <- xs]
--or
integral' fs xs = sum $ zipWith (*) ws fs
However, this form doesn't tell us about w
or ws
. Intuitively, you can think of an integral as a sum, where the ws
are the weights of each of the fs
.
Note: why we work with the 2nd rep. from now on.
By the way, the way to go from the 3rd rep. to the 2nd is rather obvious (fs = map f xs
), whereas the other way around seems troublesome. Therefore, we will work with the 2nd, and then, if we want to implement something for the 3rd, we will first convert the input to the 2nd rep, and then use our implementation for the 2nd.
Exercise: how to get from an Interval
to a Var
in the case of equal increments?
Solution:
data Interval = I (Double,Double) Int
toXs (I (a,b) n) =map (\x -> a + (b-a)/fromIntegral (n-1) * fromIntegral x) [0..n-1]
main = print toXs (I (1,2) 10)
Left rule
This is the simplest way to integrate;
- we split the interval (a,b) in (n-1) slices of equal length
h = (b - a)/(n-1)
, - we approximate the function on each of those intervals by a constant.
What is n
again?
n is the number of points in the interval. If n = 3
, a=0
, and b=1
,
then the three points are [0, 0.5, 1]
For this rule, the last point is never used, because we only need one point per interval to approximate a constant.
In the case of the Left rule, the first interval will be approximated by f a
, the second by f (a + h)
, and so forth. Thus, f b
is never evaluated.
Formally,
integral f (I (a,b) n) = let h = (b - a)/(n-1)
xs = map (\i -> a + i * h) [0..n-2]
in sum [h * f x_i | x_i <- xs]
To help us later, we want to define w
, the weight function.
wLeft h = h -- w is a constant, always h ... for the first n - 1 points.
This is not quite right; we would like to be explicit about the fact that if we were to include b
in xs
, then w
would be 0
.
--here, number is the position in the sum starting at 1.
wLeft h n number | (number == n) = 0
| (number >= 1 && number < n) = h
| otherwise = error "number is not in range with n"
Therefore, we can have something like:
integral f (I (a,b) n) = let h = (b - a)/(n-1)
xs = toXs (I (a,b) n)
fs = map f xs
ws = map (wLeft h n) [1..n]
in sum $ zipWith (*) fs ws
-- If we already have the list of function values from somewhere else.
integralOfList fs (I (a,b) n) = let h = (b - a)/(n-1)
ws = map (wLeft h n) [1..n]
in sum $ zipWith (*) fs ws
We see that the structure is rather independent of the specific choice of wLeft
.
Why bother with integralOfList
? (First Reason)
First, we could implement integral
in terms of integralOfList
. (3rd rep vs. 2nd)
Exercise: rewrite integral
in terms of integralOfList
:
data Interval = I (Double,Double) Int
toXs (I (a,b) n) =map (\x -> a + (b-a)/fromIntegral (n-1) * fromIntegral x) [0..n-1]
wLeft h n number | (number == n) = 0
| (number >= 1 && number < n) = h
| otherwise = error "number is not in range with n"
integralOfList fs (I (a,b) n) = let h = (b - a)/ fromIntegral (n-1)
ws = map (wLeft h n) [1..n]
in sum $ zipWith (*) fs ws
integral:: (Double -> Double) -> Interval -> Double
integral f (I (a,b) n) = FIXME
main = print $ integral (\x -> x**2) (I (-1,1) 100)
Why bother with integralOfList
? (Second Reason)
Sometimes, we don't have a f
; for instance, someone could give you a table containing fs
; maybe he did an experiment, or maybe he wrote a big program just to compute fs
. Now, your job is to integrate. Can you do it? Yes you can; with integralOfList
, you can.
And if you have f
, then use this (Solution to the previous exercise):
integral f (I (a,b) n) = let xs = toXs (I (a,b) n)
fs = map f xs
in integralOfList fs (I (a,b) n)
This method---although simple---is never used, because the error is proportional to 1/n
. Let's do better.
Trapezoidal rule
Before, we approximated the function locally as a constant. Now, we will do just a tiny bit better, but the error will be proportional to 1/n^2
(still bad, but much better than 1/n
).
The idea is that for each slice of x
, you approximate f
by a linear function.
This gives you (on a single interval)
integral f [x_0, x_1] = let h = x_1 - x_0
in h * (f x_0 + f x_1)/2
The name of the rule comes from the formula above (the area of each slice is that of a trapezoid).
Exercise: 1) Draw a trapezoid. 2) What shape corresponds to the Left rule?
If you carry this rule through for a whole interval (by summing over all slices), you get the following:
integral [f1,f2,..,fN] (I (a,b) n) = let h = (b - a)/ fromIntegral (n - 1)
in h*f1/2 + h*f2 + .. + h*fN/2
Why different weights for the edges?
When we add the contribution from each interval, every point gets a h/2
, but the points inside (not at the edge) get two contributions (one for each interval they belong to).
To see this more clearly, we could rewrite:
integral [f1,f2,..,fN] (I (a,b) n) = let h = (b - a)/ fromIntegral (n - 1)
in h * (f1 + f2)/2+ h * (f2 + f3)/2 + .. + h * (f(N-1) + fN)/2
So basically, we only need to define w
for the trapezoidal rule in accordance to this, and we are good.
--here, number is the position in the sum starting at 1.
wTrap h n number | (number == 1 || number == n) = h/2
| (number > 1 && number < n) = h
| otherwise = error "number is not in range with n"
Now, the full integral formula:
integralOfList fs (I (a,b) n) = let h = (b - a)/(n-1)
ws = map (wTrap h n) [1..n]
in sum $ zipWith (*) fs ws
The only difference between this one and the one for the Left rule is the weight function.
Common Patterns?
This gives us the idea to define a super-function (sorry) which takes the weight function as a parameter.
integralOfList w fs (I (a,b) n) = let h = (b - a)/(n-1)
ws = map (w h n) [1..n]
in sum $ zipWith (*) fs ws
intLeft = integralOfList wLeft
intTrap = integralOfList wTrap
If you do not understand partial application (last two lines), then rewrite intLeft
to be the same as integralOfListLeft
.
Simpson's Rule
This method only works if the number of points n is an odd number.
The idea is to approximate the function by a quadratic in each slice of three points; it amounts to this:
f x = f x_0 +
(x - x_0) * (f (x_0 + h) - f (x_0 - h))/ (2*h) +
(x - x_0)**2 / 2 * (f (x_0 + h) - 2 * f x_0 + f (x_0 - h))/ h**2 +
O (h**3)
--where ** means exponentiation;
-- and O means `Big-oh notation'.
-- It means that what is left,
-- after we expanded the first
-- Taylor terms, is proportional to `h**3`.
This gives you:
integral f [x_0, x_1, x_2] = let h = (x_2 - x_0) / 2
in h * (f x_0 + 4 * f x_1 + f x_2)/3
If you are skeptical, try to integrate f x
as approximated above with your knowledge of mathematical (exact) integration. Remember, a quadratic can be integrated exactly.
If you carry this rule through for a whole interval (by summing all the slices), you get the following:
integral [f1,f2,..,fN] (I (a,b) n) = let h = (b - a)/ fromIntegral (n - 1)
in (1/3) * h*f1 +
(4/3) * h*f2 +
(2/3) * h*f3 +
(4/3) * h*f4 +
.. +
(1/3) * h*fN
So basically, we only need to define w
for the Simpson's rule in accordance to this, and we are good.
--here, number is the position in the sum starting at 1.
wSimp h n number | (number == 1 || number == n) = (1/3)*h
| (number > 1 && number < n && even number) = (4/3)*h
| (number > 1 && number < n && odd number) = (2/3)*h
| otherwise = error "number is not in range with n"
The full integral formula is:
intSimp fs (I (a,b) n) = if odd n
then integralOfList wSimp fs (I (a,b) n)
else error "The Simpson's rule only works with an odd number of points; the Interval given had an even n."
Exercise: Use both the Simpson's rule and the Trapezoidal rule to create an integration which never fails due to even n
, and has best accuracy.
--You got it
intSuper fs (I (a,b) n) = if odd n
then integralOfList wSimp fs (I (a,b) n)
else integralOfList wTrap fs (I (a,b) n)
--You're a genius. Seriously, that was clever of you.
intGotcha fs (I (a,b) n) = if odd n
then integralOfList wSimp fs (I (a,b) n)
else let h = (b-a) / fromIntegral (n-1)
in integralOfList wSimp (init fs) (I (a,b-h) (n-1))
The Simpson's rule makes the error go as 1/n^4
.
A little Problem just for You
The error function is defined exactly by:
erf x = 2 / sqrt pi * integral (\y -> exp (-y**2)) (0,x)
Now, suppose we fix x=1
,
I am telling you that: erf 1 = 0.842700792949715
.
Your mission---should you choose to accept it---is to use the three methods we saw 1. Left 2. Trapezoidal 3. Simpson to investigate.
More precisely, for each of these methods, you can define the relative error as a function of n
:
--this is not actual code; part of your mission is to define relative error properly.
relError n = abs (methodAnswer n - exactAnswer) / exactAnswer
Then, you must:
Compute this function for all three methods for n = 3, 5, 9, 16, 32, 64, 128.
Make a table, or plot your results.
Compare the three methods.
(Very important) Have fun.
Hold your breath for the upcoming Gaussian Quadrature in 'Advanced Integration'.
Note on Performance.
As we will see later, the Gaussian quadrature is much better than Simpson's rule, but suppose we didn't know about such quadrature; how could we improve the performance of Simpson's rule?
So far, we used lists to represent our function. The compiler does an amazing job for you when you're working with lists, but (this is completely irrelevant now, and we will come back to it when it becomes relevant) lists are not the best representation for things of fixed length.
Because our integrator doesn't return a list (in fact, it doesn't have to take a list as input to begin with), we could replace lists by some similar representation.
A better representation, which has the same interface (has the same name for similar functions) and operations, is the Vector representation. If you are a performance junky, I challenge you to implement Simpson's rule without using any lists; instead, use Vector
from Data.Vector.Unboxed
. You can click on the green word for more info on Data.Vector.
A few thoughts on Optimization (we will come back to it)
When building complex programs, it is always better to get something simple working quickly. However, Computational Physics contains many (most) problems where your code needs to be super-fast in order to be useful.
My take on the subject is the following: 1. build the simplest program you can think of, with crystal-clear semantics,
- make sure it does what it's supposed to,
- explain (using meaningful names in your program, and comments) what it does, and how it does it.
- use abstraction and observation to identify the key features of your program. (more on that later)
- use optimized representations (Data.Vector is an optimized representation of Data.List; it does the same things, has a few more utilities purely for performance), and optimized libraries written by someone else. However, you must not change the semantics developed at step 1 unless you have an excellent reason (you found even clearer semantics; the original semantics turn out to be incomplete, which means you can't do certain things you wanted to do; the new semantics are as clear, as powerful, and allow for a dramatic performance increase).
- learn about optimization.
- optimize without changing the semantics. At this point, you should have a copy of your code which has the same semantics, but isn't optimized, you can think of it as a backup. Here is why:
- to show other people; say something like "this code isn't my actual code, but you can write your program which uses my code AS IF this were the actual code."
- to test whether your 'optimized' code is 1) actually much faster 2) actually equivalent in terms of meaning (semantics)
- use parallelism, GPUs, FFI.
- (if you are good at this step, this step should be inserted between 5. and 6.) Go back to the drawing board, and find a new algorithm, or read the literature for similar problems, or, and this is a sad alternative, understand that your problem can't be solved on today's computers (have you tried super-computers?). At this point, it is time to change your problem of study. If you are emotionally attached to the problem, it might be worthwhile to try setting it up differently.
You might be surprised at how rare you have to go past step 5 (not so rare in Computational Physics, but still rare). However, you sometimes have to go all the way to step 9. Later, we will learn to predict, at step 1, up to where a certain program is likely to take us. Even later, we might learn to predict to where we'll need to go without writing any code (just by looking at the problem).
Anyway, this is my take on optimization in a nutshell. If you stick to this course, we will eventually revisit this hard and crucial subject, but not until we actually need it.