2 Transformations and Expectations

2.1 Distributions of Functions of a Random Variable

If X is a random variable with cdf \(F_X(x)\), then any function of X, say g(X), is also a random variable. If we set Y = g(X), then for any set A,

\[P(Y \in A) = P(g(X) \in A)\]

Formally, if we write y = g(x), the function g(x) defines a mapping from the original sample space of X, \(\mathcal{X}\), to a new sample space, \(\mathcal{Y}\), the sample space of the random variable Y. That is,

\[g(x): \mathcal{X} \to \mathcal{Y}\]

We associate with g an inverse mapping, denoted by \(g^{-1}\),

\[g^{-1}(A) = \{ x \in \mathcal{X}: g(x) \in A\}\]

If the random variable Y is now defined by Y = g(X), we can write for any set \(A \subset \mathcal{Y}\),

\[P(Y \in A) = P(g(X) \in A) = P(\{x\in\mathcal{X}: g(x) \in A\}) = P(X \in g^{-1}(A))\]

If Y is a discrete random variable, the pmf for Y is

\[f_Y(y) = P(Y=y) = \sum_{x \in g^{-1}(y)}P(X=x) =\sum_{x \in g^{-1}(y)}f_X(x),\text{ for }y \in \mathcal{Y}\]

It is easiest to deal with functions g(x) that are monotone, that is, those that are either increasing or decreasing. If the transformation \(x \to g(x)\) is monotone, then it is one-to-one and onto from \(\mathcal{X}\) to \(\mathcal{Y}\).

Example 2.1.1 Binomial transformation
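
A sketch of this example: let \(X \sim \text{binomial}(n,p)\) and define \(Y = g(X) = n - X\), so that \(g^{-1}(y) = n - y\). Applying the pmf formula above,

\[f_Y(y) = f_X(n-y) = \binom{n}{n-y}p^{n-y}(1-p)^{y} = \binom{n}{y}(1-p)^{y}p^{n-y},\]

so \(Y \sim \text{binomial}(n, 1-p)\): counting failures instead of successes swaps the roles of p and 1-p.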

Example 2.1.2 Uniform transformation

Theorem 2.1.3

Let X have cdf \(F_X(x)\), let Y = g(X), and let \(\mathcal{X} = \{ x: f_X(x) > 0\}\), \(\mathcal{Y} = \{y: y = g(x)\text{ for some }x \in \mathcal{X}\}\).

  • If g is an increasing function on \(\mathcal{X}\), \(F_Y(y) = F_X(g^{-1}(y))\text{ for }y \in \mathcal{Y}\)
  • If g is a decreasing function on \(\mathcal{X}\) and X is a continuous random variable, \(F_Y(y) = 1-F_X(g^{-1}(y))\text{ for }y \in \mathcal{Y}\)
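
Both parts follow by unwinding the cdf: for increasing g,

\[F_Y(y) = P(g(X) \leq y) = P(X \leq g^{-1}(y)) = F_X(g^{-1}(y)),\]

while for decreasing g the inequality flips, so \(P(g(X) \leq y) = P(X \geq g^{-1}(y)) = 1 - F_X(g^{-1}(y))\), with continuity of X ensuring \(P(X = g^{-1}(y)) = 0\).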

Example 2.1.4 Uniform-exponential relationship-I
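
A sketch of this example: let \(X \sim \text{uniform}(0,1)\) and \(Y = g(X) = -\log X\), a decreasing function with \(g^{-1}(y) = e^{-y}\). Since \(F_X(x) = x\) on (0,1), Theorem 2.1.3 gives

\[F_Y(y) = 1 - F_X(g^{-1}(y)) = 1 - e^{-y}, \quad 0 < y < \infty,\]

which is the exponential(1) cdf.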

Theorem 2.1.5

Let X have pdf \(f_X(x)\) and let \(Y=g(X)\), where g is a monotone function. Let \(\mathcal{X} = \{ x: f_X(x) > 0\}\), \(\mathcal{Y} = \{y: y = g(x)\text{ for some }x \in \mathcal{X}\}\). Suppose that \(f_X(x)\) is continuous on \(\mathcal{X}\) and that \(g^{-1}(y)\) has a continuous derivative on \(\mathcal{Y}\). Then the pdf of Y is given by

\[f_Y(y)= \begin{cases}f_X(g^{-1}(y))|\frac{d}{dy}g^{-1}(y)| & \quad y \in \mathcal{Y} \\ 0 & \quad \text{otherwise}\end{cases}\]
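
As a quick check, apply this to the transformation of Example 2.1.4: with \(f_X(x) = 1\) on (0,1) and \(g^{-1}(y) = e^{-y}\),

\[f_Y(y) = f_X(e^{-y})\left|\frac{d}{dy}e^{-y}\right| = e^{-y}, \quad 0 < y < \infty,\]

the exponential(1) pdf, agreeing with the cdf computed above.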

Example 2.1.6 Inverted gamma pdf
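
A sketch under one common parametrization: let X have the gamma pdf \(f_X(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}x^{\alpha-1}e^{-x/\beta}\) for \(x > 0\), and let \(Y = 1/X\). Then \(g^{-1}(y) = 1/y\) and \(|\frac{d}{dy}g^{-1}(y)| = 1/y^2\), so Theorem 2.1.5 gives

\[f_Y(y) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\left(\frac{1}{y}\right)^{\alpha+1}e^{-1/(\beta y)}, \quad 0 < y < \infty,\]

the inverted gamma pdf.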

Example 2.1.7 Square transformation
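
A sketch of this example: \(y = x^2\) is not monotone on the whole real line, so work with the cdf directly. For \(y > 0\),

\[F_Y(y) = P(X^2 \leq y) = P(-\sqrt{y} \leq X \leq \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y}),\]

and differentiating (when X is continuous) gives

\[f_Y(y) = \frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y}) + f_X(-\sqrt{y})\right).\]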

Example 2.1.9 Normal-chi squared relationship
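
A sketch of this example with X standardized: let \(X \sim n(0,1)\) and \(Y = X^2\). Using the square-transformation formula above with \(f_X(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\),

\[f_Y(y) = \frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y}) + f_X(-\sqrt{y})\right) = \frac{1}{\sqrt{2\pi y}}e^{-y/2}, \quad 0 < y < \infty,\]

which is the chi squared pdf with 1 degree of freedom.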

Theorem 2.1.10 Probability integral transformation

Let X have continuous cdf \(F_X(x)\) and define the random variable Y as Y = \(F_X(X)\). Then Y is uniformly distributed on (0,1), that is, \(P(Y \leq y) = y, 0<y<1\).
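
A one-line justification when \(F_X\) is strictly increasing: \(P(F_X(X) \leq y) = P(X \leq F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y\). Below is a minimal numerical check of this fact (a sketch assuming NumPy is available; the exponential cdf used here is just one convenient choice):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# X ~ exponential(1), whose cdf is F_X(x) = 1 - exp(-x)
x = rng.exponential(scale=1.0, size=100_000)

# Probability integral transformation: Y = F_X(X)
y = 1.0 - np.exp(-x)

# If Y ~ uniform(0,1), then P(Y <= q) should be close to q for every q
for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"P(Y <= {q}) ≈ {(y <= q).mean():.3f}")
```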

2.2 Expected Values

Definition 2.2.1

The expected value or mean of a random variable g(X), denoted by E(g(X)), is

\[E(g(X)) = \int_{-\infty}^{\infty} g(x)f_X(x)\,dx,\text{ if }X\text{ is continuous}\]

\[E(g(X)) = \sum_{x \in \mathcal{X}} g(x)f_X(x) = \sum_{x \in \mathcal{X}} g(x)P(X=x),\text{ if }X\text{ is discrete,}\]

provided that the integral or sum exists.
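
For example, if \(X \sim \text{uniform}(0,1)\) with \(f_X(x) = 1\) on (0,1), then

\[E(X) = \int_{0}^{1} x \, dx = \frac{1}{2}, \qquad E(X^2) = \int_{0}^{1} x^2 \, dx = \frac{1}{3}.\]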

Note: The proof of the expectation of the binomial distribution can be found here.

Theorem 2.2.5

Let X be a random variable and let a, b and c be constants. Then for any functions \(g_1(x)\) and \(g_2(x)\) whose expectations exist,

  • \(E(ag_1(X) + bg_2(X) + c) = aE(g_1(X))+bE(g_2(X)) + c\)
  • If \(g_1(x) \geq 0\) for all x, then \(E(g_1(X)) \geq 0\)
  • If \(g_1(x) \geq g_2(x)\) for all x, then \(E(g_1(X)) \geq E(g_2(X))\)
  • If \(a \leq g_1(x) \leq b\) for all x, then \(a \leq E(g_1(X)) \leq b\)
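
As a quick check of the first property with the uniform(0,1) example above: \(E(2X + 3) = 2E(X) + 3 = 2\cdot\tfrac{1}{2} + 3 = 4\).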

2.3 Moments and Moment Generating Functions

Definition 2.3.1

For each integer n, the nth moment of X (or of \(F_X(x)\)) is \(\mu_n' = E(X^{n})\).

The nth central moment of X is \(\mu_n = E((X-\mu)^{n})\), where \(\mu = \mu_1' = E(X)\).

Definition 2.3.2

The variance of a random variable X is its second central moment, \(Var(X) = E((X-E(X))^2)\). The positive square root of Var(X) is the standard deviation of X.
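
Expanding the square and applying Theorem 2.2.5 gives the usual computing formula: with \(\mu = E(X)\),

\[Var(X) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - (E(X))^2.\]

For the uniform(0,1) example, \(Var(X) = \tfrac{1}{3} - \tfrac{1}{4} = \tfrac{1}{12}\).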

Theorem 2.3.4

If X is a random variable with finite variance, then for any constants a and b,

\[Var(aX + b) = a^2 Var(X)\]
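
The proof is a one-line computation: since \(E(aX+b) = aE(X)+b\),

\[Var(aX + b) = E\left((aX+b-aE(X)-b)^2\right) = E\left(a^2(X-E(X))^2\right) = a^2 Var(X).\]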

Note: The proof of the variance of the binomial distribution can be found here.

Definition 2.3.6

Let X be a random variable with cdf \(F_X\). The moment generating function (mgf) of X (or of \(F_X\)), denoted by \(M_X(t)\), is

\[M_X(t) = E(e^{tX})\]

provided that the expectation exists for t in some neighborhood of 0.
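
For example, if \(X \sim \text{exponential}(1)\) with \(f_X(x) = e^{-x}\) on \((0,\infty)\), then for \(t < 1\)

\[M_X(t) = \int_{0}^{\infty} e^{tx}e^{-x}\,dx = \int_{0}^{\infty} e^{-(1-t)x}\,dx = \frac{1}{1-t},\]

so the expectation exists in the neighborhood \(t < 1\) of 0.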

Theorem 2.3.7

If X has mgf \(M_X(t)\), then

\[E(X^n) = M_X^{(n)}(0)\]

where we define

\[M_X^{(n)}(0) = \left.\frac{d^n}{dt^n}M_X(t)\right|_{t=0}\]

That is, the nth moment is equal to the nth derivative of \(M_X(t)\) evaluated at t=0.
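
Continuing the exponential(1) example: \(M_X(t) = (1-t)^{-1}\), so \(M_X'(t) = (1-t)^{-2}\) and \(M_X''(t) = 2(1-t)^{-3}\), giving

\[E(X) = M_X'(0) = 1, \qquad E(X^2) = M_X''(0) = 2, \qquad Var(X) = 2 - 1^2 = 1.\]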

Kernel: the kernel of a function is the main part of the function, the part that remains when multiplicative constants not involving the variable are disregarded. For example, the kernel of the normal pdf in x is \(e^{-(x-\mu)^2/(2\sigma^2)}\).

2.4 Differentiating Under an Integral Sign

2.5 Miscellanea