This is a brief introduction to methods for finding numerical solutions to initial value problems in ordinary differential equations. The piece is biased towards my main application of interest, namely computing streamlines of vector fields.

Consider the vector field below.

The problem we are going to solve is that of computing *integral
lines*: curves that are everywhere tangent to the vector field. More
concretely, we are given a vector field $v: \mathbb{R}^2 \to \mathbb{R}^2$ and are
looking for a parametric curve given by two functions of $t$, $x(t),
y(t)$, such that the initial value is given, $x(0)=x_0$, $y(0) = y_0$,
and the tangent vector of the curve is equal to the vector
field: at all values of $t$,
$v(x(t),y(t)) = (x'(t), y'(t))$ (we're using $f'$ to denote the
derivative of $f$ with respect to $t$ specifically).

This particular vector field has an analytical definition, $v(x, y) =
(\cos(x + 2y), \sin(x - 2y))$. That we can write down a simple
equation for it does not help very much in this case, because
generally speaking, it will be hopeless to find a closed-form solution
for all but the simplest initial value problems. So what we look for
instead is a *numerical approximation* of the continuous curve $x(t),
y(t)$ by a finite set of samples.

Euler integration is by far the simplest method for solving ODEs numerically. The idea behind it is very simple: if we know the value of the vector field at a given point, and the vector field is continuous, then we expect it not to change very much in a neighborhood of that point. More drastically, we assume it's locally constant, so we can take a step in the direction of the vector field. That lands us in a new spot in the plane, from which we sample the vector field again, and so on. The only free parameter is the size of the step we take, which we call $h$ here:

```
def euler_integration(vector_field, initial_x, initial_y, h):
    t = 0
    px = initial_x
    py = initial_y
    while True:
        report_point(px, py, t)
        (vx, vy) = vector_field(px, py)
        px += vx * h
        py += vy * h
        t += h
```

(In practice, of course, we'd stop the process with some criterion.) At first sight, this seems a reasonable way to compute approximations: as $h$ gets smaller, we take smaller steps, and if the vector field is continuous, then we should expect Euler integration to get progressively better. Here's an example. Drag the black circle around to see different curves approximated with Euler integration:

Things seem ok, right? But look at what happens when we try this much simpler vector field:

This is the vector field $v(x,y) = (-y, x)$. The integral lines of
this vector field are circles (this is easy to see from $\cos' =
-\sin, \sin' = \cos$), but the approximation lines we get from Euler
integration are spirals. Intuitively, this is relatively easy to
understand: a finite step along the vector field always increases the
distance from the origin. More importantly, the *fraction* by which
the distance from the origin increases is the same at every step (by a simple
argument of similar triangles). This means that, as a function of
$t$, our approximation $\tilde{x}(t), \tilde{y}(t)$ is such that the
distance from the origin increases *exponentially* as $t$ increases.
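For the circular field, it's easy to check this numerically: each Euler step multiplies the distance from the origin by exactly $\sqrt{1+h^2}$, the hypotenuse of the right triangle formed by the (scaled) position vector and the step. A quick sketch (the function names here are my own):

```python
import math

def euler_step(px, py, h):
    # One Euler step along v(x, y) = (-y, x).
    vx, vy = -py, px
    return px + h * vx, py + h * vy

h = 0.1
px, py = 1.0, 0.0
for _ in range(100):
    px, py = euler_step(px, py, h)

# Every step multiplies the distance from the origin by the same
# factor sqrt(1 + h^2), so the radius grows exponentially in t:
radius = math.hypot(px, py)
predicted = math.sqrt(1 + h * h) ** 100
```

After 100 steps, `radius` and `predicted` agree to floating-point accuracy, and both are well above the true value of 1.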

The same thing happens with more complicated, real-world vector fields. Consider the Lotka-Volterra equations, a pair of non-linear differential equations which model predator-prey relationships: $u' = u(2-v), v' = v(u - 1)$.

The integral lines of this model should be cycles, and yet,

This is a serious problem. But how do we solve it?
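We can also quantify the drift without pictures. This Lotka-Volterra system conserves the quantity $V(u,v) = u - \ln u + v - 2\ln v$ along exact solutions (differentiating $V$ and substituting the expressions for $u'$ and $v'$ gives zero), a fact not used elsewhere in this piece, so any change in $V$ is pure integration error. Under Euler integration, $V$ drifts upward:

```python
import math

def lotka_volterra(u, v):
    # u' = u(2 - v), v' = v(u - 1)
    return u * (2 - v), v * (u - 1)

def conserved(u, v):
    # V is constant along exact solutions: dV/dt = 0.
    return u - math.log(u) + v - 2 * math.log(v)

u, v, h = 1.0, 1.0, 0.01
v0 = conserved(u, v)
for _ in range(2000):
    du, dv = lotka_volterra(u, v)
    u += h * du
    v += h * dv
drift = conserved(u, v) - v0  # positive: the "cycle" spirals outward
```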

One way to make the solution more accurate is to reduce the size of
each step. However, reducing the step size by half doubles the number
of points needed to cover a given interval in $t$. Note, also, in
the Lotka-Volterra example above, that the lengths of
the steps can change quite a bit. So if we reduce the parameter
$h$ until the largest distance between each pair of sampled values
is small, then the *smallest* distance will be *very* small, which means
that covering a wide range of $t$ will take a huge number of steps.

Instead, we could try to estimate the error we're accruing at every step. One estimate of this error is one half of the length of the difference between the two consecutive vectors we sampled along the vector field, multiplied by $h$ (this comes from a Taylor series expansion of the approximation). So, if we set our parameter to be, instead of $h$, a maximum error tolerance $T$, then we can make the steps bigger every time our error estimate says our integration is conservative, and make the steps smaller every time our error estimate says our integration is too aggressive:
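A sketch of this adaptive scheme in the style of the code above; the retry-on-failure policy and the particular halving/doubling thresholds are my own choices, not the only reasonable ones:

```python
import math

def adaptive_euler(vector_field, x, y, tol, h=0.1, t_max=10.0):
    # Euler integration with an adaptive step size. The local error
    # estimate is h/2 times the length of the difference between the
    # two consecutive sampled vectors.
    t = 0.0
    points = [(x, y, t)]
    while t < t_max:
        vx, vy = vector_field(x, y)
        vx2, vy2 = vector_field(x + h * vx, y + h * vy)
        err = 0.5 * h * math.hypot(vx2 - vx, vy2 - vy)
        if err > tol:
            h *= 0.5   # too aggressive: retry this step with a smaller h
            continue
        x += h * vx
        y += h * vy
        t += h
        points.append((x, y, t))
        if err < tol / 4:
            h *= 2.0   # conservative: allow a bigger step next time
    return points
```

On the circular field, `adaptive_euler(lambda x, y: (-y, x), 1.0, 0.0, tol=1e-3)` quickly settles on a roughly constant step size, since the error estimate there depends only on $h$ and the distance from the origin.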

Notice how the points are now more evenly spaced along the curve, which means that as we make the steps smaller, we use our samples more efficiently. Still, no matter how small we make the error tolerance, the error keeps compounding fairly quickly. We need a different approach.

The first algorithm that is good enough to be used in practice is so
simple that it's a bit magical. The idea is as follows. At each time
step, we evaluate the vector field at the current point, which gives a
tentative future position via the Euler integration rule. Now, instead of
simply jumping over to that position, we evaluate the vector field
*there* as well, and take the *average* of the two vectors as the
direction to step along (this is known as Heun's method, or the
improved Euler method).

```
def euler_integration_pc(vector_field, initial_x, initial_y, h):
    t = 0
    px = initial_x
    py = initial_y
    while True:
        report_point(px, py, t)
        (vx, vy) = vector_field(px, py)
        (vx2, vy2) = vector_field(px + h * vx, py + h * vy)
        px += ((vx + vx2) / 2) * h
        py += ((vy + vy2) / 2) * h
        t += h
```

This seems like a minor change, but notice this:

Even with a fairly large step size, the accumulated error has
effectively disappeared! What gives? To get an intuition for the
situation, let's consider a different algorithm. Imagine that, instead
of taking the average of `(vx, vy)` and `(vx2, vy2)`, we just took
`(vx2, vy2)` as the value:

```
def weird_euler_integration(vector_field, initial_x, initial_y, h):
    t = 0
    px = initial_x
    py = initial_y
    while True:
        report_point(px, py, t)
        (vx, vy) = vector_field(px, py)
        (vx2, vy2) = vector_field(px + h * vx, py + h * vy)
        px += vx2 * h
        py += vy2 * h
        t += h
```

What happens then?

Now, instead of spiraling away from the center of the cyclic behavior,
the approximation spirals *into* it. This is intuitively clear in the
case of the circular vector field: using the vector from the "future" position
of Euler integration makes the curve "turn too early". Conversely, using
the vector field from the present position makes the curve "turn too
late" (since that's the problem with Euler integration to begin with).
So it must be the case that *some* average of the two
approaches cancels the error out. What's not so clear is that the
correct weight is 1) independent of the actual vector field, and 2)
equal to $1/2$. While we won't go into the derivation here, the
easiest way to arrive at it is to first check that the averaging rule
works when the vector field is a linear function of its
coordinates, and then generalize by taking appropriate Taylor series:
you'll find that the error of the method is essentially proportional
to $h^2$ times the dominating *quadratic* term of the Taylor series
fit.
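We can check all three behaviors side by side on the circular field. The parameterization by a pair of weights below is my own framing; the text only considers the weights $(1, 0)$ (plain Euler), $(0, 1)$ (the "weird" variant), and $(1/2, 1/2)$ (the averaging rule):

```python
import math

def step(px, py, h, weights):
    # One step of a generic two-stage rule on v(x, y) = (-y, x):
    # take a trial Euler step, then mix the two sampled vectors
    # with the given weights (w1 for "present", w2 for "future").
    w1, w2 = weights
    vx, vy = -py, px
    vx2, vy2 = -(py + h * vy), (px + h * vx)
    return px + h * (w1 * vx + w2 * vx2), py + h * (w1 * vy + w2 * vy2)

def final_radius(weights, h=0.05, steps=500):
    px, py = 1.0, 0.0
    for _ in range(steps):
        px, py = step(px, py, h, weights)
    return math.hypot(px, py)
```

Here `final_radius((1.0, 0.0))` comes out well above 1 (the Euler spiral-out), `final_radius((0.0, 1.0))` well below 1 (the spiral-in), and `final_radius((0.5, 0.5))` stays within a fraction of a percent of 1.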

What if we want a method that converges well even when the vector field has significant quadratic terms? We use more sophisticated versions of the above idea, probing neighboring values of the vector field and averaging them appropriately. The most common such method is known as RK4, the fourth-order Runge-Kutta method.
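A sketch of the classical RK4 step, in the same style as the code above but returning a list of samples instead of calling `report_point`; the weights $1/6, 2/6, 2/6, 1/6$ and the two midpoint evaluations are the standard RK4 tableau:

```python
import math

def rk4_integration(vector_field, initial_x, initial_y, h, steps):
    # Classical fourth-order Runge-Kutta: sample the vector field four
    # times per step (start, two midpoints, end) and combine the
    # samples with weights 1/6, 2/6, 2/6, 1/6.
    t, px, py = 0.0, initial_x, initial_y
    points = [(px, py, t)]
    for _ in range(steps):
        k1x, k1y = vector_field(px, py)
        k2x, k2y = vector_field(px + h / 2 * k1x, py + h / 2 * k1y)
        k3x, k3y = vector_field(px + h / 2 * k2x, py + h / 2 * k2y)
        k4x, k4y = vector_field(px + h * k3x, py + h * k3y)
        px += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        py += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        t += h
        points.append((px, py, t))
    return points

# Roughly ten revolutions around the circular field v(x, y) = (-y, x):
points = rk4_integration(lambda x, y: (-y, x), 1.0, 0.0, 0.1, 628)
final_radius = math.hypot(points[-1][0], points[-1][1])
```

Even at step size $h = 0.1$, after ten full revolutions the radius drift is far below anything the two-stage method achieves.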

TBF.

TBF.

TBF.

- Butcher, J. C., *Numerical Methods for Ordinary Differential Equations*. Wiley, 2nd ed., 2008.