(Brief intro to using linear function spaces and reconstruction kernels to represent continuous functions in a computer. Heavily geared to data visualization, of the specific “scientific” variety.)

In data visualization (and more generally, in the subfields of data
analysis that tend to care about spatial data), we often want to
use computers to store, manipulate, and represent
*functions*. Consider, for example, a map of surface temperature over
a region. For any point in the region, we can in principle read off
the temperature. There are, then, infinitely many points (uncountably
many, in fact). This presents a challenge for representing these
functions as arrays of data in a computer. At the same time, we know
it’s possible to write computer procedures that produce answers
for an infinity of possible inputs:

```
def f(x):
    return x * x

def g(x):
    if x <= -1/2: return 0
    if x >= 1/2: return 0
    return 1
```

Functions like `f` and `g` are easy to write, but we want to avoid
having to write new functions for every new data observation that we
have. Consider the difference between a temperature map in the summer
and one in the winter. Ideally, we would not have to change the way we
write our functions very much to switch from a representation of the
summer map to the winter map.

Our solution is to define a *space* of possible functions that is
created by linear combinations of simpler functions. Of course, you’ve
seen this kind of thing before. The space of all linear functions on
the reals $R \to R$ can be created by giving different $a$ and $b$
values to `line`:

```
def constant(x):
    return 1

def identity(x):
    return x

def line(a, b):
    def f(x):
        return a * constant(x) + b * identity(x)
    return f

# line(1, 3)(5) == 1 * 1 + 3 * 5 == 16
```

Each different function in this space, then, is completely characterized by the array of weights we use, making the space very convenient for storage and representation on computers.
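For instance (a minimal sketch; `evaluate`, `BASIS`, and the seasonal weight arrays below are our own illustrative names, not anything from a library), switching between the summer and winter maps amounts to swapping weight arrays while the evaluation code stays fixed:

```
def constant(x):
    return 1

def identity(x):
    return x

BASIS = [constant, identity]

def evaluate(weights, x):
    # a function in the space is a weighted sum of the basis functions
    return sum(w * b(x) for w, b in zip(weights, BASIS))

summer = [1, 3]   # the line(1, 3) function: 1 + 3x
winter = [2, -1]  # a different function, same code, different weights

print(evaluate(summer, 5))  # 1 + 3 * 5 = 16
print(evaluate(winter, 5))  # 2 - 1 * 5 = -3
```

Storing a function is then just storing its weight array.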

One standard way of creating new basis functions (so that our basis gets richer and we can create more interesting linear function spaces) is to take powers of existing functions. For example, taking linear combinations of non-negative integer powers of the identity function yields the space of all polynomials. This is a very rich space, but it is not very good for approximation when your samples are equally spaced. Let’s see what happens when we try to approximate the Runge function, $f(x) = 1 / (1 + 25x^2)$, using polynomials of progressively larger degree (depicted by darker, thicker lines), chosen so that the polynomial always matches the observed values of the function:

As you can see, the higher-degree polynomial fits match the sampled data at more points, but they start to “wiggle” more, such that in between the observed points, the values oscillate wildly.
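To make that concrete, here is a small sketch in pure Python (the helper names `lagrange` and `max_error` are ours) that fits the Runge function at equally spaced nodes with Lagrange interpolation and measures the worst-case error on a dense grid; raising the degree makes things worse, not better:

```
def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange(xs, ys):
    # the unique polynomial of degree len(xs) - 1 through the points (xs, ys)
    def p(x):
        total = 0.0
        for i, xi in enumerate(xs):
            term = ys[i]
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

def max_error(degree):
    # interpolate runge at degree + 1 equally spaced nodes on [-1, 1],
    # then measure the worst error on a dense grid
    xs = [-1.0 + 2.0 * i / degree for i in range(degree + 1)]
    p = lagrange(xs, [runge(x) for x in xs])
    grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
    return max(abs(p(x) - runge(x)) for x in grid)

# the higher-degree fit has a *larger* worst-case error than the lower one
print(max_error(5), max_error(10))
```

This blow-up near the interval edges is the classic Runge phenomenon.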

Instead, we’ll build our space by taking simple functions that have finite support, and shifting them around:

```
def square(x):
    if x <= -1/2: return 0
    if x >= 1/2: return 0
    return 1

def shifted_square(i):
    def f(x):
        return square(x - i)
    return f
```

Now linear function spaces of (progressively more) shifted squares do something more interesting:

When we try to approximate the same function as before, the sampled values (and the rectangles) track the function progressively better and better, and the approximation function does not oscillate. Of course, we lost something in this exchange. The polynomial fits we obtained were smooth: this means that we had access to function derivatives, which are useful in a variety of settings. The function fits we got aren’t even continuous; that’s not ideal.
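We can check the no-oscillation claim directly. In this sketch (helper names ours), a shifted-square reconstruction of Runge-function samples reproduces each sample exactly and never overshoots the largest observed value, because at any point at most one square is “on”. (As written, `square` is zero exactly at the box edges, so the reconstruction can dip to zero at midpoints between samples.)

```
def square(x):
    if x <= -1/2: return 0
    if x >= 1/2: return 0
    return 1

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def box_reconstruct(xs, ys, h):
    # ys[i] was observed at xs[i]; each sample fills a box of width h
    def f(x):
        return sum(y * square((x - xi) / h) for xi, y in zip(xs, ys))
    return f

h = 0.1
xs = [-1.0 + i * h for i in range(21)]   # 21 equally spaced samples on [-1, 1]
ys = [runge(x) for x in xs]
f = box_reconstruct(xs, ys, h)

grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
# bounded: no overshoot beyond the largest observed value
assert all(0.0 <= f(x) <= max(ys) for x in grid)
# and the reconstruction matches the data at the sample points
assert abs(f(xs[10]) - ys[10]) < 1e-12
```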

Fortunately, we can easily fix this by creating better “simple
functions”. From now on, we will call these simple functions
*reconstruction kernels*, or *kernels* for short.
(Notice that the term kernel is used in statistics to denote a
completely different notion. The confusion is most often avoided by
noticing that reconstruction kernels take a single parameter while
similarity kernels from statistics are always two-parameter functions
that compare values.)

```
def b0(x):
    # box kernel (zeroth-degree B-spline): discontinuous
    if x <= -1/2: return 0
    if x >= 1/2: return 0
    return 1

def b1(x):
    # tent kernel (first-degree B-spline): continuous, piecewise linear
    if x <= -1: return 0
    if x <= 0: return 1 + x
    if x <= 1: return 1 - x
    return 0

def b2(x):
    # quadratic B-spline: continuously differentiable
    # ...
```

So, how do we create function spaces that aren’t prone to oscillation, but allow smooth functions? We use kernels that are themselves continuous, or smooth. In the examples below, we are creating functions by giving weights $[0.4, 0.3, 0.5, 0.7, 0.6]$ to each of the basis functions in order. Because the basis functions are different, the reconstructed function is itself different.
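As a sketch of what “giving weights to shifted copies” means in code (helper names ours), here is the reconstruction using the tent kernel `b1` from above and those weights; with `b1`, the reconstruction hits each weight exactly at the integers and interpolates linearly in between:

```
def b1(x):
    # tent kernel (first-degree B-spline)
    if x <= -1: return 0
    if x <= 0: return 1 + x
    if x <= 1: return 1 - x
    return 0

weights = [0.4, 0.3, 0.5, 0.7, 0.6]

def reconstruct(x):
    # weighted sum of copies of b1 shifted to the integers 0..4
    return sum(w * b1(x - i) for i, w in enumerate(weights))

print(reconstruct(2.0))  # equals weights[2] == 0.5
print(reconstruct(1.5))  # (0.3 + 0.5) / 2 == 0.4
```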

A proper discussion of reconstruction kernels is far beyond the scope of this (and any one) piece; see this piece instead.

Note, though, that as the reconstruction kernels get smoother, so do
the reconstructions. This happens because of a trivial, but very
important property of linear function spaces: since these are linear
spaces, function operations that are themselves linear will factor
through to the basis functions. For example, the derivative is a
linear operator. As a result, the derivative of a function from a
linear function space is necessarily a member of a *different* linear
function space, whose basis vectors are the derivatives of the basis
vectors of the original function space:

The same thing happens for integrals, expectations, convolutions, etc. This makes linear function spaces very computationally convenient.
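A sketch of this with the tent kernel (helper names ours): the derivative of a tent-kernel reconstruction is a reconstruction with the *same* weights over the tent’s derivative, a step function, which we can check against a finite difference:

```
def b1(x):
    # tent kernel (first-degree B-spline)
    if x <= -1: return 0
    if x <= 0: return 1 + x
    if x <= 1: return 1 - x
    return 0

def db1(x):
    # derivative of the tent kernel (away from its kinks at -1, 0, 1)
    if x <= -1: return 0
    if x <= 0: return 1
    if x <= 1: return -1
    return 0

weights = [0.4, 0.3, 0.5, 0.7, 0.6]

def f(x):
    return sum(w * b1(x - i) for i, w in enumerate(weights))

def df(x):
    # same weights, differentiated basis: the derivative factors through
    return sum(w * db1(x - i) for i, w in enumerate(weights))

x, eps = 1.5, 1e-6
finite_difference = (f(x + eps) - f(x - eps)) / (2 * eps)
print(df(x), finite_difference)  # both approximately weights[2] - weights[1]
```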

So far, we’ve seen function spaces whose domain is the real numbers (a one-dimensional space). It is straightforward to extend the notion to multidimensional functions: we just change the domain of the function to operate on $R^n$ instead of $R$. The only significant change is that our reconstruction kernels need to be multidimensional themselves.

Do note here a bit of terminological ambiguity: in this context, “multi-dimensional” refers to the dimension of the function’s domain, not to the dimension of the function space itself (as a vector space).

The most common way to create multi-dimensional reconstruction kernels
is to build them one dimension at a time. The space of all possible
multidimensional reconstruction kernels is very large, and one natural
solution is to look for kernels that *factor* into products of
one-dimensional kernels, $k(x, y) = k_1(x)\,k_2(y)$:

Using B-Splines as the separable filters is very common, and very convenient.
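A minimal sketch of the factoring idea (helper names ours): a separable 2D kernel is just the product of two 1D kernel evaluations, here using the tent B-spline in both dimensions:

```
def b1(x):
    # tent kernel (first-degree B-spline)
    if x <= -1: return 0
    if x <= 0: return 1 + x
    if x <= 1: return 1 - x
    return 0

def tensor_kernel(k):
    # build a 2D kernel that factors as k(x) * k(y)
    def k2(x, y):
        return k(x) * k(y)
    return k2

k2 = tensor_kernel(b1)
print(k2(0.0, 0.0))   # 1.0 at the center
print(k2(0.5, 0.5))   # 0.25: each factor contributes 0.5
print(k2(2.0, 0.0))   # 0: outside the support in x
```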

TODO: Add illustrations of a simple 2D function using $\beta_0$, $\beta_1$, $\beta_2$ as the separable filters.

- Mitchell and Netravali, “Reconstruction Filters in Computer Graphics”, SIGGRAPH ’88. A classic in the area.
- Möller et al., “Design of Accurate and Smooth Filters for Function and Derivative Reconstruction”, VolVis 1998. How to design filters in order to get good spatial properties out of them.