4 Multiple Random Variables
4.1 Joint and Marginal Distributions
Probability models that involve only one random variable are called univariate models; models that involve more than one random variable are called multivariate models. Models involving exactly two random variables are called bivariate models.
Definition 4.1.1 An n-dimensional random vector is a function from a sample space S into \(R^n\), n-dimensional Euclidean space.
Definition 4.1.3 Let (X,Y) be a discrete bivariate random vector. Then the function \(f(x,y)\) from \(R^2\) into R defined by \(f(x,y) = P(X=x, Y=y)\) is called the joint probability mass function or joint pmf of (X,Y). If it is necessary to stress the fact that \(f\) is the joint pmf of the vector (X,Y) rather than some other vector, the notation \(f_{X,Y}(x,y)\) will be used.
Theorem 4.1.6 Let (X,Y) be a discrete bivariate random vector with joint pmf \(f_{X,Y}(x,y)\). Then the marginal pmfs of X and Y, \(f_X(x) = P(X=x)\) and \(f_Y(y)=P(Y=y)\), are given by
\[f_X(x) = \sum_{y \in R} f_{X,Y}(x,y)\text{ and }f_Y(y) = \sum_{x \in R} f_{X,Y}(x,y)\]
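As a quick illustration of Theorem 4.1.6, the sketch below computes marginal pmfs by summing a joint pmf table over the other variable; the table values are hypothetical and chosen only for illustration.

```python
# Marginal pmfs from a hypothetical joint pmf table (Theorem 4.1.6).
# Rows index values of X, columns index values of Y.
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],   # f(x=0, y=0..2)
                  [0.05, 0.30, 0.25]])  # f(x=1, y=0..2)

f_X = joint.sum(axis=1)   # sum over y -> marginal pmf of X
f_Y = joint.sum(axis=0)   # sum over x -> marginal pmf of Y

print(f_X)            # [0.4 0.6]
print(f_Y)            # [0.15 0.5 0.35]
print(joint.sum())    # 1.0 (a valid joint pmf sums to 1)
```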
Definition 4.1.10 A function \(f(x,y)\) from \(R^2\) into R is called a joint probability density function or joint pdf of the continuous bivariate random vector (X,Y) if, for every \(A \subset R^2\),
\[P((X,Y) \in A) = \iint_A f(x,y)\,dx\,dy\]
The marginal probability density functions of X and Y are given by
\[f_X(x) = \int_{-\infty}^{\infty}f(x,y)dy,\text{ } -\infty < x < \infty\]
\[f_Y(y) = \int_{-\infty}^{\infty}f(x,y)dx,\text{ }-\infty< y< \infty\]
The joint cdf (cumulative distribution function) of (X,Y) is the function F(x,y) defined by
\[F(x,y) = P(X \leq x, Y \leq y)\]
For a continuous bivariate random vector, it can be computed from the joint pdf as
\[F(x,y) = \int_{-\infty}^x\int_{-\infty}^y f(s,t)\, dt\, ds\]
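A symbolic sketch of the marginal-pdf and joint-cdf formulas, assuming the illustrative joint pdf \(f(x,y)=x+y\) on the unit square (an assumption chosen for illustration, not taken from these notes):

```python
# Marginal pdf and joint cdf for the illustrative joint pdf f(x,y) = x + y
# on 0 < x, y < 1, computed symbolically.
import sympy as sp

x, y, s, t = sp.symbols('x y s t', positive=True)
f = x + y                                   # joint pdf on the unit square

f_X = sp.integrate(f, (y, 0, 1))            # marginal pdf of X
F = sp.integrate(sp.integrate(s + t, (t, 0, y)), (s, 0, x))   # joint cdf inside the square

print(f_X)                # x + 1/2
print(sp.simplify(F))     # x*y*(x + y)/2
```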
4.2 Conditional Distributions and Independence
Definition 4.2.1 Let (X,Y) be a discrete bivariate random vector with joint pmf f(x,y) and marginal pmfs \(f_X(x)\) and \(f_Y(y)\). For any x such that \(P(X=x)=f_X(x) > 0\), the conditional pmf of Y given that X=x is the function of y denoted by f(y|x) and defined by
\[f(y|x) = P(Y=y|X=x) = \frac{f(x,y)}{f_X(x)}\]
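A minimal numerical sketch of Definition 4.2.1, reusing the hypothetical joint pmf table from above: each conditional pmf \(f(y|x)\) is a row of the table divided by its row sum \(f_X(x)\).

```python
# Conditional pmfs f(y|x) from the hypothetical joint pmf table used earlier.
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.30, 0.25]])
f_X = joint.sum(axis=1)

cond_Y_given_X = joint / f_X[:, None]   # row i holds f(y | X = x_i)
print(cond_Y_given_X)
print(cond_Y_given_X.sum(axis=1))       # each conditional pmf sums to 1
```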
Definition 4.2.3 Let (X,Y) be a continuous bivariate random vector with joint pdf f(x,y) and marginal pdfs \(f_X(x)\) and \(f_Y(y)\). For any x such that \(f_X(x) > 0\), the conditional pdf of Y given that X=x is the function of y denoted by f(y|x) and defined by
\[f(y|x) = \frac{f(x,y)}{f_X(x)}\]
Definition 4.2.5 Let (X,Y) be a bivariate random vector with joint pdf or pmf f(x,y) and marginal pdfs or pmfs \(f_X(x)\) and \(f_Y(y)\). Then X and Y are called independent random variables if, for every \(x \in R\text{ and }y \in R\),
\[f(x,y) = f_X(x)f_Y(y)\]
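Definition 4.2.5 can be checked numerically: independence holds exactly when the joint pmf table equals the outer product of its marginals. The table below is the same hypothetical one as before.

```python
# Independence check: does the joint pmf factor into the product of marginals?
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.30, 0.25]])
f_X, f_Y = joint.sum(axis=1), joint.sum(axis=0)

print(np.allclose(joint, np.outer(f_X, f_Y)))   # False: X and Y are dependent

indep = np.outer(f_X, f_Y)                      # a joint pmf built to be independent
print(np.allclose(indep, np.outer(indep.sum(axis=1), indep.sum(axis=0))))  # True
```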
Theorem 4.2.10 Let X and Y be independent random variables.
- For any \(A \subset R\text{ and } B \subset R\), \(P(X \in A, Y \in B) = P(X \in A) P(Y \in B)\); that is, the events \(\{X \in A\}\) and \(\{Y \in B\}\) are independent events.
- Let g(x) be a function only of x and h(y) be a function only of y. Then \(E(g(X)h(Y)) = E(g(X))E(h(Y))\).
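A Monte Carlo sketch of the second part of Theorem 4.2.10, under the illustrative assumptions \(X \sim n(0,1)\), \(Y \sim\) exponential(1), \(g(x)=x^2\), and \(h(y)=e^{-y}\):

```python
# Check E(g(X)h(Y)) = E(g(X)) E(h(Y)) for independent X and Y by simulation.
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
X = rng.standard_normal(n)          # X ~ N(0,1)
Y = rng.exponential(1.0, n)         # Y ~ exponential(1), independent of X

lhs = np.mean(X**2 * np.exp(-Y))
rhs = np.mean(X**2) * np.mean(np.exp(-Y))
print(lhs, rhs)   # both close to 1 * 0.5 = 0.5
```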
Theorem 4.2.12 Let X and Y be independent random variables with moment generating functions \(M_X(t)\) and \(M_Y(t)\). Then the moment generating function of the random variable Z = X + Y is given by \(M_Z(t) = M_X(t)M_Y(t)\).
Theorem 4.2.14 Let \(X \sim n(\mu, \sigma^2)\text{ and }Y \sim n(\gamma, \tau^2)\) be independent normal random variables. Then the random variable Z = X + Y has a \(n(\mu+\gamma, \sigma^2 + \tau^2)\) distribution.
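A quick simulation check of Theorems 4.2.12 and 4.2.14, with arbitrary illustrative parameters \(\mu=1\), \(\sigma=2\), \(\gamma=-3\), \(\tau=0.5\):

```python
# The sum of independent normals is normal, with means and variances adding.
import numpy as np

rng = np.random.default_rng(1)
n = 10**6
Z = rng.normal(1.0, 2.0, n) + rng.normal(-3.0, 0.5, n)

print(Z.mean())   # approx 1 + (-3) = -2
print(Z.var())    # approx 2**2 + 0.5**2 = 4.25
```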
4.3 Bivariate Transformations
Let \((X,Y)\) be a bivariate random vector with a known probability distribution. Now consider a new bivariate random vector \((U,V)\) defined by \(U=g_1(X,Y)\) and \(V=g_2(X,Y)\), where \(g_1(x,y)\) and \(g_2(x,y)\) are some specified functions.
Theorem 4.3.2 If \(X \sim Poisson(\theta)\) and \(Y \sim Poisson(\lambda)\), and X and Y are independent, then \(X+Y \sim Poisson(\theta+\lambda)\)
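The Poisson addition result can be verified numerically by convolving the two pmfs; \(\theta=2\) and \(\lambda=3\) below are arbitrary choices.

```python
# Convolution of Poisson(theta) and Poisson(lam) pmfs vs. Poisson(theta+lam).
import numpy as np
from scipy.stats import poisson

theta, lam = 2.0, 3.0
k = np.arange(0, 60)
conv = np.convolve(poisson.pmf(k, theta), poisson.pmf(k, lam))[:k.size]

print(np.allclose(conv, poisson.pmf(k, theta + lam)))   # True
```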
Suppose \((X,Y)\) is a continuous random vector with joint pdf \(f_{X,Y}(x,y)\), and assume the transformation \(u=g_1(x,y)\), \(v=g_2(x,y)\) is one-to-one. Denote the inverse transformation by \(x=h_1(u,v)\) and \(y=h_2(u,v)\). Then the joint pdf of (U,V) is given by
Equation 4.3.2 \(f_{U,V}(u,v)=f_{X,Y}(h_1(u,v), h_2(u,v))|J|\)
where J is the determinant of the matrix of partial derivatives (the Jacobian of the transformation)
\[J=\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}=\frac{\partial x}{\partial u} \frac{\partial y}{\partial v}-\frac{\partial x}{\partial v} \frac{\partial y}{\partial u}\]
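A sketch of Equation 4.3.2 in sympy for the illustrative transformation \(U=X+Y\), \(V=X-Y\) applied to independent standard normals; the inverse is \(x=(u+v)/2\), \(y=(u-v)/2\), so \(|J|=1/2\).

```python
# Bivariate change of variables (Equation 4.3.2) for U = X + Y, V = X - Y.
import sympy as sp

u, v = sp.symbols('u v')
x = (u + v) / 2      # h1(u, v)
y = (u - v) / 2      # h2(u, v)

J = sp.Matrix([[sp.diff(x, u), sp.diff(x, v)],
               [sp.diff(y, u), sp.diff(y, v)]]).det()
print(J)             # -1/2, so |J| = 1/2

# For independent N(0,1) X and Y, plugging into (4.3.2) gives the joint pdf of (U, V):
f_XY = sp.exp(-(x**2 + y**2) / 2) / (2 * sp.pi)
f_UV = sp.simplify(f_XY * sp.Abs(J))
print(f_UV)          # exp(-u**2/4 - v**2/4)/(4*pi): U and V are independent N(0, 2)
```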
Theorem 4.3.5 Let X and Y be independent random variables. Let \(g(x)\) be a function only of x and \(h(y)\) be a function only of y. Then the random variables \(U=g(X)\) and \(V=h(Y)\) are independent.
Example 4.3.6 (Distribution of the ratio of normal variables) The ratio of two independent standard normal random variables is a Cauchy random variable.
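A Monte Carlo sketch of Example 4.3.6; since the Cauchy distribution has no moments, the comparison is made through quantiles rather than means.

```python
# Ratio of two independent standard normals vs. the standard Cauchy quantiles.
import numpy as np
from scipy.stats import cauchy

rng = np.random.default_rng(2)
n = 10**6
Z = rng.standard_normal(n) / rng.standard_normal(n)

for q in (0.25, 0.5, 0.75, 0.9):
    print(q, np.quantile(Z, q), cauchy.ppf(q))
# e.g. both 0.75 quantiles are close to tan(pi/4) = 1
```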
4.4 Hierarchical Models and Mixture Distributions
Example 4.4.1 (Binomial-Poisson hierarchy)
Complicated processes may be modeled by a sequence of relatively simple models placed in a hierarchy.
Dealing with a hierarchy means dealing with conditional and marginal distributions.
Theorem 4.4.3 If X and Y are any two random variables, then
\[EX=E(E(X|Y))\]
provided that the expectations exist.
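A simulation sketch of Theorem 4.4.3 using the binomial-Poisson hierarchy of Example 4.4.1, \(X|Y \sim binomial(Y, p)\) with \(Y \sim Poisson(\lambda)\); the values \(p=0.3\) and \(\lambda=10\) are arbitrary. Here \(E(X)=E(E(X|Y))=E(pY)=p\lambda\).

```python
# EX = E(E(X|Y)) for the binomial-Poisson hierarchy: EX = p * lam.
import numpy as np

rng = np.random.default_rng(3)
p, lam, n = 0.3, 10.0, 10**6

Y = rng.poisson(lam, n)
X = rng.binomial(Y, p)          # draws X_i | Y_i ~ binomial(Y_i, p)

print(X.mean())                 # approx p * lam = 3.0
```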
A mixture distribution is a distribution arising from a hierarchical structure.
Definition 4.4.4 A random variable X is said to have a mixture distribution if the distribution of X depends on a quantity that also has a distribution.
The hierarchies above have two stages; the same ideas extend to multistage hierarchies.
Noncentral chi-squared distribution: the pdf for p degrees of freedom and noncentrality parameter \(\lambda\) is given by
\[f(x|\lambda, p)=\sum_{k=0}^{\infty}\frac{x^{p/2+k-1}e^{-x/2}}{\Gamma(p/2+k)2^{p/2+k}}\frac{\lambda ^ke^{-\lambda}}{k!}\]
This can be seen as a mixture distribution, made up of central chi-squared densities mixed with Poisson(\(\lambda\)) weights. That is,
\[X|K \sim \chi^2_{p+2K}\]
\[K \sim Poisson(\lambda)\]
So
\[EX = E(E(X|K))=E(p+2K)=p+2\lambda\]
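The mixture representation can be checked by simulation; \(p=4\) and \(\lambda=2.5\) below are arbitrary, and note that scipy parameterizes the noncentral chi-squared by the noncentrality \(nc = 2\lambda\).

```python
# Noncentral chi-squared as a Poisson mixture of central chi-squared densities.
import numpy as np
from scipy.stats import ncx2

rng = np.random.default_rng(4)
p, lam, n = 4, 2.5, 10**6

K = rng.poisson(lam, n)
X = rng.chisquare(p + 2 * K)    # central chi-squared with random df p + 2K

print(X.mean())                 # approx p + 2*lam = 9.0
print(ncx2.mean(p, 2 * lam))    # scipy's noncentral chi-squared (nc = 2*lam): 9.0
```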
Theorem 4.4.7 (Conditional variance identity) For any two random variables X and Y,
\[Var X = E(Var(X|Y))+Var(E(X|Y))\]
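A simulation sketch of the identity for the same binomial-Poisson hierarchy, where \(E(Var(X|Y)) = p(1-p)\lambda\) and \(Var(E(X|Y)) = Var(pY) = p^2\lambda\):

```python
# Var X = E(Var(X|Y)) + Var(E(X|Y)) for X|Y ~ binomial(Y, p), Y ~ Poisson(lam).
import numpy as np

rng = np.random.default_rng(5)
p, lam, n = 0.3, 10.0, 10**6

Y = rng.poisson(lam, n)
X = rng.binomial(Y, p)

print(X.var())                          # simulated Var X
print(p * (1 - p) * lam + p**2 * lam)   # identity's prediction: p*lam = 3.0
```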
4.5 Covariance and Correlation
Covariance and correlation are two numerical measures of the strength of the relationship between two random variables.
Notation: \(EX=\mu_X\), \(EY=\mu_Y\), \(Var X=\sigma^2_X\), and \(Var Y=\sigma^2_Y\)
Definition 4.5.1 The covariance of X and Y is the number defined by
\[Cov(X,Y)=E((X-\mu_X)(Y-\mu_Y))\]
Definition 4.5.2 The correlation of X and Y is the number defined by
\[\rho_{XY}=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}\]
where \(\rho_{XY}\) is also called the correlation coefficient.
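A minimal numerical illustration of Definitions 4.5.1 and 4.5.2 on simulated data; the construction \(Y = X + \text{noise}\) is an assumption chosen only so that the covariance is positive and the correlation lies strictly between 0 and 1.

```python
# Sample covariance and correlation for Y = X + noise with X, noise ~ N(0,1).
import numpy as np

rng = np.random.default_rng(6)
n = 10**6
X = rng.standard_normal(n)
Y = X + rng.standard_normal(n)          # Cov(X,Y) = 1 and Var Y = 2 in theory

cov = np.mean((X - X.mean()) * (Y - Y.mean()))
rho = cov / (X.std() * Y.std())

print(cov)   # approx 1
print(rho)   # approx 1/sqrt(2) ~= 0.707
```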