Week 7: Poisson Processes, Joint Distributions, and Covariance

0. Logistical Info

  • Section date: 11/1
  • Associated lectures: 10/24, 10/26
  • Associated pset: Pset 7, due 11/3
  • Office hours on 11/1 from 7-9pm at Quincy Dining Hall
  • Remember to fill out the attendance form

0.1 Summary + Practice Problem PDFs

Summary + Practice Problems PDF

Practice Problem Solutions PDF

1. Moment Generating Functions (MGFs)

For a random variable $X$, the $\mathbf{n^{th}}$ moment is $E(X^n)$.

For a random variable $X$, the moment generating function (MGF) is $$M_X(t) = E(e^{tX})$$ for $t \in \mathbb{R}$. If the MGF exists (is finite on an open interval around $0$), then \begin{align*} M_X(0) &= 1,\\ M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0} &= E(X^n). \end{align*} You should sanity-check that $M_X(0) = 1$ whenever you calculate an MGF.

Useful MGF results:

  • If $X, Y$ are independent random variables with MGFs $M_X, M_Y$, then $M_{X+Y}(t) = M_X(t) M_Y(t)$.
  • For random variable $X$ and scalars $a, b$, \begin{align*} M_{a+bX}(t) = e^{at} M_X(bt) \end{align*} since $M_{a+bX}(t) = E(e^{t(a+bX)}) = e^{at} E(e^{btX})$.
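
For concreteness, here is a minimal sympy sketch (assuming sympy is available) that checks these facts for $X \sim \mathrm{Expo}(\lambda)$, whose MGF is $M_X(t) = \lambda/(\lambda - t)$ for $t < \lambda$; the derivatives at $t = 0$ should recover $E(X^n) = n!/\lambda^n$, and $M_X(0)$ should be $1$.

```python
import sympy as sp

t, lam = sp.symbols("t lambda", positive=True)

# MGF of X ~ Expo(lambda): M_X(t) = lambda / (lambda - t), valid for t < lambda
M = lam / (lam - t)

# Sanity check: M_X(0) should equal 1
assert sp.simplify(M.subs(t, 0)) == 1

# The n-th derivative at t = 0 should recover the n-th moment, E(X^n) = n! / lambda^n
for n in range(1, 4):
    moment = sp.simplify(sp.diff(M, t, n).subs(t, 0))
    print(f"E(X^{n}) =", moment)   # 1/lambda, 2/lambda**2, 6/lambda**3
```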

A distribution is uniquely determined by any of the following:

  • PMF (common for discrete),
  • PDF (common for continuous),
  • CDF,
  • MGF, or
  • matching to a named distribution (common).

2. Poisson Processes

Consider a setup similar to Blissville/Blotchville, where $T_1, T_2, \ldots$ are the arrival times of buses (the amount of time from when we start waiting until each bus arrives). The bus arrival process is a Poisson process with rate $\lambda$ if it satisfies the following two conditions:

  1. For any interval in time of length $t > 0$, the number of arrivals in that interval is distributed $\mathrm{Pois}(\lambda t)$.
  2. For any non-overlapping (disjoint) intervals of time, the numbers of arrivals in those intervals are independent.
The same definition applies to any “arrival process” in which $T_1, T_2, \ldots$ are arrival times, not just bus arrivals.
Pay attention to units: $\lambda$ is a rate, so if $\lambda$ has units of arrivals per hour, then $t$ should be measured in hours.

Results:

  • Inter-arrival times: In a Poisson process with rate $\lambda$, the inter-arrival times (the time until the first arrival, $T_1$, and the times between consecutive arrivals, $T_2-T_1, T_3-T_2, \ldots$) are independent and identically distributed: \begin{align*} T_1, T_2-T_1, T_3-T_2, \ldots \stackrel{i.i.d.}{\sim} \mathrm{Expo}(\lambda). \end{align*}
Note that the arrival times $T_2, T_3, \ldots$ themselves are not exponentially distributed. In fact, they follow Gamma distributions (which we will introduce soon): $T_n \sim \mathrm{Gamma}(n, \lambda)$.
  • Count-time duality: Fix a time $t > 0$. Let $N_t$ be the number of arrivals in the time interval $[0, t]$, and let $T_n$ be the time of the $n$-th arrival. Then \begin{align*} (T_n > t) = (N_t < n), \end{align*} i.e., these are the same event: the $n$-th arrival comes after time $t$ exactly when fewer than $n$ arrivals occur in $[0, t]$.
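
To see these results in action, here is a minimal simulation sketch (assuming numpy; the rate, window length, and number of runs are arbitrary choices): generate i.i.d. $\mathrm{Expo}(\lambda)$ inter-arrival times, check that the count of arrivals in $[0, t]$ has mean and variance close to $\lambda t$, and check the count-time duality run by run.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, n_sims = 2.0, 3.0, 20_000   # rate (arrivals per hour), window length (hours), number of runs

counts = np.empty(n_sims, dtype=int)
duality_holds = True
for i in range(n_sims):
    # Inter-arrival times are i.i.d. Expo(lam); their cumulative sums are the arrival times T_1, T_2, ...
    inter_arrivals = rng.exponential(scale=1 / lam, size=50)   # 50 arrivals is plenty for t = 3, lam = 2
    T = np.cumsum(inter_arrivals)
    N_t = int(np.sum(T <= t))          # number of arrivals in [0, t]
    counts[i] = N_t
    # Count-time duality with n = 5: the event (T_5 > t) should coincide with (N_t < 5)
    duality_holds &= (T[4] > t) == (N_t < 5)

print("mean of counts:", counts.mean(), " vs lambda*t =", lam * t)   # both ~6
print("var of counts: ", counts.var(), " vs lambda*t =", lam * t)    # Pois(lambda*t): mean = variance
print("duality held in every run:", bool(duality_holds))
```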

3. Marginal, Conditional, and Joint Distributions


Consider two random variables $X, Y$.

| | Joint | Marginal | Conditional |
| --- | --- | --- | --- |
| Distribution | $(X, Y)$ | $X$ | $X \vert Y = y$ |
| PMF | $P(X = x, Y = y)$ | $P(X = x)$ | $P(X = x \vert Y = y)$ |
| CDF | $P(X \le x, Y \le y)$ | $P(X \le x)$ | $P(X \le x \vert Y = y)$ |

For example, reading off the table, $P(X = x \vert Y = y)$ is the conditional PMF of $X$ given $Y = y$. All of these apply with $X$ and $Y$ swapped, and PDFs follow analogously from PMFs.

  • Marginalization: If we know the joint distribution of random variables $(X, Y)$, then we can find the marginal distribution of $X$ (and analogously, $Y$) by LOTP: \begin{align*} P(X = x) &= \sum_y P(X = x, Y = y), & \text{$X,Y$ discrete}.\\ f_X(x) &= \int_{-\infty}^\infty f_{X,Y}(x, y)\,dy, & \text{$X,Y$ continuous}. \end{align*}
Note that marginal distributions of $X$ and $Y$ are not sufficient (not enough information) to find the joint distribution of $X, Y$.
  • Joint from marginal and conditional: If we know the marginal distribution of $X$ and the conditional distribution of $Y$ given $X = x$ for every $x$, then we can find the joint distribution of $(X, Y)$ by factoring it into a marginal times a conditional: \begin{align*} P(X = x, Y = y) &= P(X = x) P(Y = y | X = x), & \text{$X, Y$ discrete.}\\ f_{X, Y} (x, y) &= f_{X}(x) f_{Y|X=x} (y), & \text{$X, Y$ continuous.} \end{align*}
  • Independence of random variables: Random variables $X, Y$ are independent if, for all $x$ and $y$, any of the following equivalent conditions holds (each implies the others, whenever they apply): \begin{align*} F_{X, Y} (x, y) = P(X \le x, Y \le y) &= P(X \le x) P(Y \le y) = F_X(x)F_Y(y), & \text{CDFs for any $X, Y$.}\\ P(X = x, Y = y) &= P(X = x) P(Y = y), & \text{PMFs for discrete $X, Y$.}\\ f_{X, Y} (x, y) &= f_X(x) f_Y(y), &\text{PDFs for continuous $X, Y$.} \end{align*}
  • 2D LOTUS: Let $X, Y$ be random variables with known joint distribution. For $g: \mathrm{support}(X) \times \mathrm{support}(Y) \to \mathbb R$, LOTUS extends to two dimensions (and analogously to higher dimensions): \begin{align*} E(g(X, Y)) &= \begin{cases} \sum_x \sum_y g(x, y) P(X = x, Y = y), & \text{$X, Y$ discrete},\\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X, Y}(x, y)\,dx\,dy, & \text{$X, Y$ continuous}. \end{cases} \end{align*}
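
A small numeric sketch of marginalization, conditioning, and 2D LOTUS for a made-up discrete joint PMF (assuming numpy; the table of probabilities below is hypothetical, chosen only so that it sums to 1):

```python
import numpy as np

# Hypothetical joint PMF P(X = x, Y = y): rows are x in {0, 1, 2}, columns are y in {0, 1}
x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])
joint = np.array([[0.10, 0.20],
                  [0.25, 0.15],
                  [0.05, 0.25]])
assert np.isclose(joint.sum(), 1.0)

# Marginalization (LOTP): sum the joint PMF over the other variable
p_X = joint.sum(axis=1)    # P(X = x)
p_Y = joint.sum(axis=0)    # P(Y = y)

# Conditional PMF of X given Y = 1: joint column divided by the marginal P(Y = 1)
p_X_given_Y1 = joint[:, 1] / p_Y[1]

# Joint from marginal and conditional: P(X = x, Y = 1) = P(Y = 1) * P(X = x | Y = 1)
assert np.allclose(joint[:, 1], p_Y[1] * p_X_given_Y1)

# 2D LOTUS with g(x, y) = x*y: E(XY) = sum_x sum_y x*y*P(X = x, Y = y)
E_XY = sum(x * y * joint[i, j]
           for i, x in enumerate(x_vals)
           for j, y in enumerate(y_vals))
print("P(X = x):", p_X)
print("P(X = x | Y = 1):", p_X_given_Y1)
print("E(XY):", round(E_XY, 3))
```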

4. Covariance and Correlation


$\newcommand{\cov}{\mathrm{Cov}}\newcommand{\corr}{\mathrm{Corr}}\newcommand{\var}{\mathrm{Var}}\newcommand{\sd}{\mathrm{SD}}$ The covariance of random variables $X, Y$ is \begin{align*} \cov(X, Y) &= E\left( \left[X - EX \right] \left[Y - EY\right]\right) \end{align*} where $EX$ is shorthand for $E(X)$. Equivalently, \begin{align*} \cov(X, Y) &= E(XY) - E(X) E(Y). \end{align*}

The correlation of random variables $X, Y$ is \begin{align*} \corr(X, Y) &= \frac{\cov(X, Y)}{\sqrt{\var(X) \var(Y)}} = \frac{\cov(X, Y)}{\sd(X)\sd(Y)}, \end{align*} where $\sd(X) = \sqrt{\var(X)}$ is the standard deviation of $X$. Equivalently, we first standardize $X$ and $Y$, then find their covariance: \begin{align*} \corr(X, Y) &= \cov\left( \frac{X - E(X)}{\sd(X)}, \frac{Y - E(Y)}{\sd (Y)} \right). \end{align*}

$X$ and $Y$ are

  • positively correlated if $\corr(X, Y) > 0$,
  • negatively correlated if $\corr(X, Y) < 0$,
  • uncorrelated if $\corr(X, Y) = 0$.

Since correlation and covariance have the same sign, this also applies for positive/negative/zero covariance.

Properties of covariance (see page 327 of Blitzstein & Hwang for the full list): Let $X, Y, W, Z$ and $X_1, X_2, \ldots$ be random variables.

  • If $X, Y$ are independent, then $\cov(X, Y) = 0$ (so $X, Y$ are uncorrelated).
  • $\cov(X, X) = \var(X)$.
  • $\var(\sum_i X_i) = \sum_i \var(X_i) + \sum_{i<j} 2 \cov(X_i, X_j)$.
    • This can be especially useful for finding the variance of a sum of indicators.
  • \begin{align*}\cov(X+Y, W+Z) &= \cov(X, W) + \cov(X, Z)\\ &+ \cov(Y, W) + \cov(Y, Z).\end{align*}
  • $\cov(aX, bY) = ab \cov(X, Y)$.

The last two properties are referred to as bilinearity.
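
As a quick sanity check of the variance-of-a-sum identity above, here is a minimal simulation sketch (assuming numpy; the particular dependent pair is an arbitrary choice) verifying that $\var(X + Y) = \var(X) + \var(Y) + 2\cov(X, Y)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# A dependent pair (arbitrary choice): X ~ N(0, 1) and Y = 0.5*X + independent N(0, 1) noise,
# so Var(X) = 1, Var(Y) = 1.25, and Cov(X, Y) = 0.5.
X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)

lhs = np.var(X + Y)
rhs = np.var(X) + np.var(Y) + 2 * np.cov(X, Y)[0, 1]
print(lhs, "≈", rhs)   # both should be close to 1 + 1.25 + 2*0.5 = 3.25
```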

Properties of correlation: Let $X, Y$ be random variables.

  • If $X, Y$ are independent, then $\corr(X, Y) = 0$ (so $X, Y$ are uncorrelated).
  • $-1 \le \corr(X, Y) \le 1$.
Uncorrelated does NOT imply independent: the previous two results say that independent random variables have zero covariance and zero correlation. The converse does not hold: uncorrelated random variables are not necessarily independent. For example, if $X \sim \mathcal{N}(0, 1)$ and $Y = X^2$, then $\cov(X, Y) = E(X^3) - E(X)E(X^2) = 0$, so $X$ and $Y$ are uncorrelated, even though $Y$ is a function of $X$ and the two are clearly dependent.
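
To see this numerically, here is a minimal simulation sketch (assuming numpy) of the example above, with $X \sim \mathcal{N}(0, 1)$ and $Y = X^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=1_000_000)
Y = X ** 2                      # Y is completely determined by X, so they are certainly not independent

print("Corr(X, Y) ≈", np.corrcoef(X, Y)[0, 1])   # close to 0: uncorrelated
# Yet knowing X tells us a lot about Y, e.g. P(Y > 4 given |X| > 2) = 1 while P(Y > 4) is small:
print("P(Y > 4) ≈", (Y > 4).mean(), "   P(Y > 4 given |X| > 2) =", (Y[np.abs(X) > 2] > 4).mean())
```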