I recently went off on a tangent trying to figure out how white noise works, and I found that there is a lot of strangeness to it that may not be apparent at first glance. The content in this post is primarily from:

TLDR: We can’t just define a continuous-time white noise process as an $\mathbb{R}$-indexed collection of uncorrelated normal random variables, because such a collection does not exist.

The Problem With White Noise

Let’s start with a few simple definitions. In the following we will assume we are working over the well-behaved probability space $([0,1], \mathcal{B}, \lambda)$, where $\lambda$ is the Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}$ on $[0,1]$.

A real-valued stochastic process is a random-variable-valued function $X$ on $\mathbb{R}$ such that each $X(t)$ is a real-valued random variable, that is, a measurable function from $\Omega$ to $\mathbb{R}$. We can think of $t$ as representing time, but this does not need to be the case.

A real-valued stochastic process $X$ is stationary when its unconditional joint probability distribution does not change when shifted in $t$. That is, for any $\tau \in \mathbb{R}$ and $t_1, \dots, t_n \in \mathbb{R}$ we have that the joint distributions of the sets of random variables $\{X(t_1), \dots, X(t_n)\}$ and $\{X(t_1 + \tau), \dots, X(t_n + \tau)\}$ are the same.

Continuous-time white noise is often defined as a stationary real-valued stochastic process $W$ where all $W(t) \sim \mathcal{N}(0, \sigma^2)$ and for all $t, s \in \mathbb{R}$ we have that $\operatorname{Cov}(W(t), W(s))$ is $\sigma^2$ when $t = s$ and $0$ otherwise. That is, for all $t \neq s$, the random variables $W(t)$ and $W(s)$ are uncorrelated normal random variables with variance $\sigma^2$.
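The discrete-time analogue of this definition is unproblematic and easy to simulate. A quick numerical sketch (using numpy, my choice of tool) shows the covariance structure the continuous-time definition is asking for: the empirical covariance matrix of independent $\mathcal{N}(0, \sigma^2)$ draws at a handful of time points is approximately $\sigma^2 I$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
n_samples, n_times = 200_000, 4

# Each row is one realization of discrete-time white noise at 4 time points:
# independent N(0, sigma^2) draws, so the covariance matrix should be
# approximately sigma^2 * I.
W = rng.normal(0.0, sigma, size=(n_samples, n_times))

cov = np.cov(W, rowvar=False)  # empirical 4x4 covariance matrix
print(np.round(cov, 2))
```

The trouble described next is not with this finite (or even countable) picture, but with demanding it simultaneously at every $t \in \mathbb{R}$.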

However, such a collection cannot exist! To see this, let’s consider the collection of random variables $\{W(t) : t \in \mathbb{R}\}$. Each $W(t)$ is square integrable, and therefore lies in $L^2(\Omega)$. Moreover, since the $W(t)$ have mean zero and are pairwise uncorrelated, they are pairwise orthogonal in $L^2(\Omega)$. However, $L^2(\Omega)$ is separable, and can therefore contain only countably many mutually orthogonal elements of nonzero norm. This implies that the uncountably many $W(t)$ cannot all be mutually orthogonal, a contradiction.

Working around the Problem

To resolve this, we need to use some pretty beefy mathematical machinery. Basically, while we can’t define continuous-time white noise to be a random-variable-valued function on $\mathbb{R}$, we can define it as a random-variable-valued generalized function.

To start, let’s define the Brownian motion process to be a stochastic process $B$ that satisfies:

  • If $t_0 < t_1 < \dots < t_n$ then the random variables $B(t_{i+1}) - B(t_i)$ for $0 \le i < n$ are independent.
  • For each $t$ and each $h > 0$ the random variable $B(t+h) - B(t)$ has distribution $\mathcal{N}(0, h)$.
  • For almost all $\omega \in \Omega$, the function $t \mapsto B(t)(\omega)$ is everywhere continuous in $t$.
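The conditions above translate directly into a simulation: a minimal sketch (numpy assumed, grid size my choice) builds Brownian paths on $[0, 1]$ as cumulative sums of independent $\mathcal{N}(0, \Delta t)$ increments, and checks the second condition at the endpoint, where $B(1) - B(0)$ should have variance $1$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 1000
dt = T / n

# One Brownian path on [0, T]: cumulative sum of independent N(0, dt)
# increments, with B(0) = 0 by convention.
increments = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate([[0.0], np.cumsum(increments)])

# Check the distributional condition over many paths:
# B(T) - B(0) ~ N(0, T), so Var[B(T)] should be close to T = 1.
paths = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(50_000, n)), axis=1)
print(np.var(paths[:, -1]))  # ≈ 1.0
```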

The formal derivative in $t$ of $B$ is the continuous-time white noise process. It isn’t too hard to see why this should be the case: by the conditions above, the random variables formed from the increments of Brownian motion are independent and normally distributed. The differentiation process just continuous-ifies this. This suggests that we could reasonably hand-wave white noise to be the derivative in $t$ of the Brownian motion process. Of course, things are more complex than this. In fact, for almost every $\omega \in \Omega$ the function $t \mapsto B(t)(\omega)$ is nowhere differentiable in $t$.
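One way to see the trouble with the naive derivative is to look at the variance of the difference quotient. Since $B(t+h) - B(t) \sim \mathcal{N}(0, h)$, the quotient $(B(t+h) - B(t))/h$ has distribution $\mathcal{N}(0, 1/h)$, which diverges as $h \to 0$. A small numerical check (numpy, my choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths = 100_000

# The difference quotient (B(t + h) - B(t)) / h is distributed N(0, 1/h),
# so its variance blows up as h -> 0: the pointwise derivative of Brownian
# motion cannot be an ordinary random variable.
variances = {}
for h in [1.0, 0.1, 0.01]:
    increments = rng.normal(0.0, np.sqrt(h), size=n_paths)  # B(t+h) - B(t)
    variances[h] = np.var(increments / h)
    print(f"h = {h}: empirical variance {variances[h]:.2f}, theory {1 / h:.2f}")
```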

In order to resolve this, we need to switch from talking about functions to talking about generalized functions. A generalized function is a “linear functional on a space of test functions”. This is a mouthful, but it’s essentially just a linear mapping from a set of smooth functions of compact support (the test functions) into $\mathbb{R}$. We can think of a generalized function as behaving somewhat like a probability measure over the set of test functions (although a true mathematician might crucify me for saying this…).
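To make this concrete, here is a small sketch (in Python, my choice; the function names are mine) using the standard bump function as a test function, and viewing an ordinary continuous function $f$ as the linear functional $\phi \mapsto \int f(t)\,\phi(t)\,dt$, approximated by a Riemann sum:

```python
import numpy as np

# The standard bump function: smooth, and nonzero only on (-1, 1),
# so it is a valid test function.
def bump(t):
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    inside = np.abs(t) < 1
    out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    return out

t = np.linspace(-1.0, 1.0, 10_001)
dt = t[1] - t[0]

# View an ordinary continuous function f as the generalized function
# phi -> integral of f(t) * phi(t) dt (approximated by a Riemann sum).
def as_functional(f, phi):
    return float(np.sum(f(t) * phi(t)) * dt)

a = as_functional(np.sin, bump)
b = as_functional(np.cos, bump)
combined = as_functional(lambda t: 2 * np.sin(t) + 3 * np.cos(t), bump)
print(abs(combined - (2 * a + 3 * b)) < 1e-9)  # linearity holds: True
```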

We can view any continuous function as a generalized function. For example, if we write the application of the generalized function corresponding to Brownian motion to the test function $\phi$ as $\langle B, \phi \rangle$, then we have:

$$\langle B, \phi \rangle = \int_{\mathbb{R}} B(t)\,\phi(t)\,dt$$

Note that $\langle B, \phi \rangle$ is itself a random variable that maps $\omega$ to $\int_{\mathbb{R}} B(t)(\omega)\,\phi(t)\,dt$. Now we define the derivative of the generalized function $F$ to be the generalized function $F'$ such that $\langle F', \phi \rangle = -\langle F, \phi' \rangle$. Therefore, the derivative of the generalized function corresponding to Brownian motion is the following random-variable-valued generalized function, which we can think of as a more formal definition of continuous-time white noise $W$:

$$\langle W, \phi \rangle = -\langle B, \phi' \rangle = -\int_{\mathbb{R}} B(t)\,\phi'(t)\,dt$$
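We can sanity-check this definition numerically. Integrating by parts, $-\int B\,\phi'\,dt = \int \phi\,dB$ whenever the boundary terms vanish, and the Itô isometry then says $\langle W, \phi \rangle$ is a mean-zero normal random variable with variance $\int \phi(t)^2\,dt$. A sketch (numpy assumed; $\phi(t) = \sin(\pi t)$ on $[0,1]$ is my choice of test function, since it vanishes at both endpoints so the boundary terms drop out, and $\int_0^1 \sin^2(\pi t)\,dt = 1/2$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_paths = 1000, 20_000
t = np.linspace(0.0, 1.0, n + 1)
dt = 1.0 / n

# Test function phi(t) = sin(pi t), vanishing at both endpoints of [0, 1];
# its derivative is phi'(t) = pi * cos(pi t).
phi_prime = np.pi * np.cos(np.pi * t)

# Many Brownian paths on [0, 1], each starting at B(0) = 0.
B = np.concatenate(
    [np.zeros((n_paths, 1)),
     np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n)), axis=1)],
    axis=1,
)

# One draw of <W, phi> := -<B, phi'> per path, via a Riemann sum.
W_phi = -np.sum(B * phi_prime, axis=1) * dt

# The Ito isometry predicts mean 0 and Var[<W, phi>] = 1/2 here.
print(np.mean(W_phi), np.var(W_phi))
```

The sample variance landing near $\int \phi^2 = 1/2$ is exactly the “white” covariance structure, recovered without ever needing $W(t)$ to exist pointwise.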