Web Sfog - Absorbin Everything interesting: Probability 2

Second post in the series.

Conditional expectation

Given a random variable \(X\) on a probability space \((\Omega, \mathcal{F}, P)\) with \(\mathbb{E}[|X|]<\infty\), we define the conditional expectation \( Y=\mathbb{E} [X|\mathcal{F}_1] \), where \(\mathcal{F}_1 \subset \mathcal{F}\) is a sub-\(\sigma\)-algebra, as a random variable satisfying the following two conditions:

It is a random variable measurable with respect to \(\mathcal{F}_1\).
For any \(A\in\mathcal{F}_1\), the following holds: \(\mathbb{E}[Y \mathbb{1}_{A}]=\mathbb{E}[X\mathbb{1}_{A}]\).

What do these conditions imply?

Let’s begin with the first condition. Measurability on \(\mathcal{F}_1\) is the requirement that, for every \(\alpha \in \mathbb{R}\), the level set \(\{ \omega \mid \mathbb{E} [X|\mathcal{F}_1](\omega) \leq \alpha\}\) of the random variable in question is in \(\mathcal{F}_1\).

To get a feeling for what’s going on, I think of \( \mathbb{E} [X|\mathcal{F}_1] \) as a step function over sets in \(\mathcal{F}_1\). The justification comes from the fact that any measurable function can be approximated as the limit of a sequence of step functions. When I say “step function” I refer to a finite sum \(\sum_{A} \alpha_{A} \mathbb{1}_A\) over sets \(A\) of the \(\sigma\)-algebra in question (here \(\mathcal{F}_1\)), usually called a “simple function” (that notion should be familiar from Lebesgue integration theory).

To build the intuition for conditional expectation I will define the notion of minimal sets in \(\mathcal{F}_1\). A minimal set is a nonempty set that can’t be further divided into smaller sets inside \(\mathcal{F}_1\). Look at a very simple case, where \(\mathcal{F}_1\) is generated by \(A\) and its complement \(A^{C}\). Since \(A\) and \(A^{C}\) are the only sets in \(\mathcal{F}_1\) (apart from \(\varnothing\) and the whole \(\Omega\)), they are obviously the minimal sets there. A random variable \(Y\) measurable on \(\mathcal{F}_1\) can change only on the boundaries of the minimal sets. It is easy to see why: if \(Y\) did change inside a minimal set, that would create level sets which are not in \(\mathcal{F}_1\), contradicting \(Y\) being measurable on \(\mathcal{F}_1\). So the minimal sets dictate the best resolution of the random variables that live on \(\mathcal{F}_1\).
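The level-set argument can be tried out on a finite sample space. The sketch below is only an illustration under assumptions that are not in the text above: \(\Omega=\{1,\dots,6\}\), \(\mathcal{F}_1\) is generated by a two-set partition, and the helper names `generated_sigma_algebra` and `is_measurable` are made up for this example.

```python
from itertools import combinations


def generated_sigma_algebra(atoms):
    """All unions of atoms: the sigma-algebra generated by a finite partition."""
    atoms = [frozenset(a) for a in atoms]
    sets = set()
    for r in range(len(atoms) + 1):
        for combo in combinations(atoms, r):
            sets.add(frozenset().union(*combo))
    return sets


def is_measurable(f, omega, sigma_algebra):
    """Check that every level set {w : f(w) <= alpha} lies in the sigma-algebra.

    On a finite space it is enough to test alpha at the values f takes:
    any other alpha gives either the empty set or one of these level sets.
    """
    for alpha in {f[w] for w in omega}:
        level_set = frozenset(w for w in omega if f[w] <= alpha)
        if level_set not in sigma_algebra:
            return False
    return True


# Hypothetical example: the minimal sets are A = {1, 2, 3} and its complement.
omega = [1, 2, 3, 4, 5, 6]
F1 = generated_sigma_algebra([{1, 2, 3}, {4, 5, 6}])

constant_on_atoms = {1: 2.0, 2: 2.0, 3: 2.0, 4: 5.0, 5: 5.0, 6: 5.0}
changes_inside = {w: float(w) for w in omega}

print(is_measurable(constant_on_atoms, omega, F1))  # constant on each minimal set
print(is_measurable(changes_inside, omega, F1))     # varies inside a minimal set
```

The second function fails the check because, for instance, its level set \(\{1\}\) is strictly inside a minimal set and so is not in \(\mathcal{F}_1\).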

Now I will turn my attention to the second condition. It requires that the expectations of \(X\) and \(Y\) agree when calculated over sets in \(\mathcal{F}_1\).

This means that the conditional expectation \(Y\) is a coarse estimate of the original random variable \(X\). First note that the minimal sets in \(\mathcal{F}_1\) are not in general minimal in \(\mathcal{F}\). It is possible to further divide them to get still smaller sets in \(\mathcal{F}\). Consequently the resolution of a random variable on \(\mathcal{F}\) is greater than on \(\mathcal{F}_1\). Thus a variable on \(\mathcal{F}\) may appear smooth while a variable on \(\mathcal{F}_1\) may appear coarse in comparison.

Now think of \(X\) as a step function defined on a structure of \(\mathcal{F}\) so fine that it just appears as a smooth function over \(\Omega\), while \(Y\) is confined to the very coarse structure of \(\mathcal{F}_1\subset \mathcal{F}\).

In order to satisfy condition 2, the averages of \(X\) and \(Y\) over sets in \(\mathcal{F}_1\) should agree. \(Y\) is constant on each minimal set \(B\) of \(\mathcal{F}_1\), so on \(B\) it has to be equal to the average of \(X\) over \(B\), that is \(\mathbb{E}[X\mathbb{1}_{B}]/P(B)\) when \(P(B)>0\). If you still follow me, you should see by now why \(Y\) is a sort of coarse version of \(X\), limited by the coarser resolution of \(\mathcal{F}_1\).
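The averaging recipe is easy to check numerically on a finite space. What follows is a minimal sketch, assuming a uniform six-point \(\Omega\), the variable \(X(\omega)=\omega^2\), and a partition into three minimal sets; all of these choices, and the function names, are hypothetical and exist only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def conditional_expectation(X, P, atoms):
    """Return Y with Y(w) = E[X 1_B] / P(B) for the minimal set B containing w."""
    Y = {}
    for B in atoms:
        p_B = sum(P[w] for w in B)  # assumed to be positive
        avg = sum(X[w] * P[w] for w in B) / p_B
        for w in B:
            Y[w] = avg
    return Y


def expectation_on(Z, P, A):
    """E[Z 1_A] on a finite sample space."""
    return sum(Z[w] * P[w] for w in A)


# Hypothetical setup: uniform P, X(w) = w^2, three minimal sets of F_1.
omega = [0, 1, 2, 3, 4, 5]
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}
atoms = [{0, 1}, {2, 3}, {4, 5}]

Y = conditional_expectation(X, P, atoms)

# Condition 2: E[Y 1_A] == E[X 1_A] for every A in F_1 (every union of atoms).
all_agree = True
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        A = set().union(*combo)
        if expectation_on(Y, P, A) != expectation_on(X, P, A):
            all_agree = False

print(Y)
print(all_agree)
```

Exact fractions are used instead of floats so that the equality in condition 2 can be tested without rounding tolerance.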

What is the relation to the “undergraduate” notion of conditional expectation?

In the introductory probability course we defined conditional expectation differently. Given a random variable \(X\) and an event \(A\) with \(P(A)>0\), the conditional expectation \(\mathbb{E}[X|A]\) is \(\frac{1}{P(A)}\int_{A} X\,dP\). So how is this definition related to the “graduate” conditional expectation?

First of all, while \(\mathbb{E}[X|A]\) is a real number, the “graduate” definition talks about a random variable.

All the rest is pretty straightforward. Let’s define \(\mathcal{F}_1\) to be \(\{A,A^{C},\varnothing,\Omega\}\); then \( Y=\mathbb{E} [X|\mathcal{F}_1] \) equals:

\[\mathbb{E} [X|\mathcal{F}_1](\omega)=\begin{cases}\mathbb{E}[X|A],&\quad \omega \in A\\ \mathbb{E}[X|A^{C}],&\quad \omega \in A^{C}\end{cases}\]
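As a concrete instance, take a fair die with \(X\) the face value and \(A\) the event “the face is even”. This example is an assumption added for illustration, and it takes the undergraduate conditional expectation to be \(\mathbb{E}[X\mathbb{1}_{A}]/P(A)\). A rough sketch:

```python
from fractions import Fraction

# Hypothetical example: a fair die, X is the face value, A = {face is even}.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w) for w in omega}
A = {2, 4, 6}
A_c = set(omega) - A


def cond_exp_given_event(X, P, event):
    """Undergraduate E[X | event] = E[X 1_event] / P(event): a single number."""
    return sum(X[w] * P[w] for w in event) / sum(P[w] for w in event)


e_A = cond_exp_given_event(X, P, A)
e_Ac = cond_exp_given_event(X, P, A_c)

# The "graduate" object is a random variable: it takes the value E[X|A] on A
# and E[X|A^C] on the complement.
Y = {w: (e_A if w in A else e_Ac) for w in omega}

mean_X = sum(X[w] * P[w] for w in omega)
mean_Y = sum(Y[w] * P[w] for w in omega)

print(e_A, e_Ac)       # 4 3
print(mean_X, mean_Y)  # 7/2 7/2
```

Taking the set in condition 2 to be \(\Omega\) itself gives \(\mathbb{E}[Y]=\mathbb{E}[X]\), which is why the two means printed at the end coincide.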

That’s it!