Week 7: Poisson Processes, Joint Distributions, and Covariance

0. Logistical Info

  • Section date: 11/1
  • Associated lectures: 10/24, 10/26
  • Associated pset: Pset 7, due 11/3
  • Office hours on 11/1 from 7-9pm at Quincy Dining Hall
  • Remember to fill out the attendance form

0.1 Summary + Practice Problem PDFs

Summary + Practice Problems PDF

Practice Problem Solutions PDF

1. Moment Generating Functions (MGFs)

For a random variable $X$, the $\mathbf{n^{th}}$ moment is $E(X^n)$.

For a random variable $X$, the moment generating function (MGF) is $$M_X(t) = E(e^{tX})$$ for $t \in \mathbb{R}$. If the MGF exists (is finite on an open interval around $0$), then \begin{align*} M_X(0) &= 1,\\ M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0} &= E(X^n). \end{align*} You should sanity-check that $M_X(0) = 1$ whenever you calculate an MGF.

Useful MGF results:

  • If $X, Y$ are independent random variables with MGFs $M_X, M_Y$, then $M_{X+Y}(t) = M_X(t) M_Y(t)$.
  • For random variable $X$ and scalars $a, b$, \begin{align*} M_{a+bX}(t) = e^{at} M_X(bt) \end{align*} since $M_{a+bX}(t) = E(e^{t(a+bX)}) = e^{at} E(e^{btX})$.
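
For concreteness, here is a minimal sympy sketch (assuming sympy is available) that checks these facts for $X \sim \mathrm{Expo}(\lambda)$, whose MGF is $M_X(t) = \lambda/(\lambda - t)$ for $t < \lambda$; the derivatives at $t = 0$ should recover $E(X^n) = n!/\lambda^n$, and $M_X(0)$ should be $1$.

```python
import sympy as sp

t, lam = sp.symbols("t lambda", positive=True)

# MGF of X ~ Expo(lambda): M_X(t) = lambda / (lambda - t), valid for t < lambda
M = lam / (lam - t)

# Sanity check: M_X(0) should equal 1
assert sp.simplify(M.subs(t, 0)) == 1

# The n-th derivative at t = 0 should recover the n-th moment, E(X^n) = n! / lambda^n
for n in range(1, 4):
    moment = sp.simplify(sp.diff(M, t, n).subs(t, 0))
    print(f"E(X^{n}) =", moment)   # 1/lambda, 2/lambda**2, 6/lambda**3
```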

A distribution is uniquely determined by any of the following:

  • PMF (common for discrete),
  • PDF (common for continuous),
  • CDF,
  • MGF, or
  • matching to a named distribution (common).

2. Poisson Processes

Consider a setup similar to Blissville/Blotchville, where $T_1, T_2, \ldots$ are the arrival times of buses (the amount of time from when we start waiting until each bus arrives). The bus arrival process is a Poisson process with rate $\lambda$ if it satisfies the following two conditions:

  1. For any interval in time of length $t > 0$, the number of arrivals in that interval is distributed $\mathrm{Pois}(\lambda t)$.
  2. For any non-overlapping (disjoint) intervals of time, the numbers of arrivals in those intervals are independent.
The same definition applies to any “arrival process” in which $T_1, T_2, \ldots$ are arrival times, not just bus arrivals.
Pay attention to units: $\lambda$ is a rate, so if $\lambda$ has units of arrivals per hour, then $t$ should be measured in hours.

Results:

  • Inter-arrival times: In a Poisson process with rate $\lambda$, the inter-arrival times (the time until the first arrival, $T_1$, and the times between consecutive arrivals, $T_2-T_1, T_3-T_2, \ldots$) are independent and identically distributed: \begin{align*} T_1, T_2-T_1, T_3-T_2, \ldots \stackrel{i.i.d.}{\sim} \mathrm{Expo}(\lambda). \end{align*}
Note that the arrival times $T_2, T_3, \ldots$ themselves are not exponentially distributed. In fact, they follow Gamma distributions (which we will introduce soon): $T_n \sim \mathrm{Gamma}(n, \lambda)$.
  • Count-time duality: Fix a time $t > 0$. Let $N_t$ be the number of arrivals in the time interval $[0, t]$, and let $T_n$ be the time of the $n$-th arrival. Then \begin{align*} (T_n > t) = (N_t < n), \end{align*} i.e., these are the same event: the $n$-th arrival comes after time $t$ exactly when fewer than $n$ arrivals occur in $[0, t]$.
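
To see these results in action, here is a minimal simulation sketch (assuming numpy; the rate, window length, and number of runs are arbitrary choices): generate i.i.d. $\mathrm{Expo}(\lambda)$ inter-arrival times, check that the count of arrivals in $[0, t]$ has mean and variance close to $\lambda t$, and check the count-time duality run by run.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, n_sims = 2.0, 3.0, 20_000   # rate (arrivals per hour), window length (hours), number of runs

counts = np.empty(n_sims, dtype=int)
duality_holds = True
for i in range(n_sims):
    # Inter-arrival times are i.i.d. Expo(lam); their cumulative sums are the arrival times T_1, T_2, ...
    inter_arrivals = rng.exponential(scale=1 / lam, size=50)   # 50 arrivals is plenty for t = 3, lam = 2
    T = np.cumsum(inter_arrivals)
    N_t = int(np.sum(T <= t))          # number of arrivals in [0, t]
    counts[i] = N_t
    # Count-time duality with n = 5: the event (T_5 > t) should coincide with (N_t < 5)
    duality_holds &= (T[4] > t) == (N_t < 5)

print("mean of counts:", counts.mean(), " vs lambda*t =", lam * t)   # both ~6
print("var of counts: ", counts.var(), " vs lambda*t =", lam * t)    # Pois(lambda*t): mean = variance
print("duality held in every run:", bool(duality_holds))
```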

3. Marginal, Conditional, and Joint Distributions


Consider two random variables $X, Y$.

| | Joint | Marginal | Conditional |
| --- | --- | --- | --- |
| Distribution | $(X, Y)$ | $X$ | $X \vert Y = y$ |
| PMF | $P(X = x, Y = y)$ | $P(X = x)$ | $P(X = x \vert Y = y)$ |
| CDF | $P(X \le x, Y \le y)$ | $P(X \le x)$ | $P(X \le x \vert Y = y)$ |

For example, reading off the table, $P(X = x \vert Y = y)$ is the conditional PMF of $X$ given $Y = y$. All of these apply with $X$ and $Y$ swapped, and PDFs follow analogously from PMFs.

  • Marginalization: If we know the joint distribution of random variables $(X, Y)$, then we can find the marginal distribution of $X$ (and analogously, $Y$) by LOTP: \begin{align*} P(X = x) &= \sum_y P(X = x, Y = y), & \text{$X,Y$ discrete}.\\ f_X(x) &= \int_{-\infty}^\infty f_{X,Y}(x, y)\,dy, & \text{$X,Y$ continuous}. \end{align*}
Note that marginal distributions of $X$ and $Y$ are not sufficient (not enough information) to find the joint distribution of $X, Y$.
  • Joint from marginal and conditional: If we know the marginal distribution of $X$ and the conditional distribution of $Y$ given $X = x$ for every $x$, then we can find the joint distribution of $(X, Y)$ by factoring it into a marginal times a conditional: \begin{align*} P(X = x, Y = y) &= P(X = x) P(Y = y | X = x), & \text{$X, Y$ discrete.}\\ f_{X, Y} (x, y) &= f_{X}(x) f_{Y|X=x} (y), & \text{$X, Y$ continuous.} \end{align*}
  • Independence of random variables: Random variables $X, Y$ are independent if, for all $x$ and $y$, any of the following equivalent conditions holds (each implies the others, whenever they apply): \begin{align*} F_{X, Y} (x, y) = P(X \le x, Y \le y) &= P(X \le x) P(Y \le y) = F_X(x)F_Y(y), & \text{CDFs for any $X, Y$.}\\ P(X = x, Y = y) &= P(X = x) P(Y = y), & \text{PMFs for discrete $X, Y$.}\\ f_{X, Y} (x, y) &= f_X(x) f_Y(y), &\text{PDFs for continuous $X, Y$.} \end{align*}
  • 2D LOTUS: Let $X, Y$ be random variables with known joint distribution. For $g: \mathrm{support}(X) \times \mathrm{support}(Y) \to \mathbb R$, LOTUS extends to two dimensions (and analogously to higher dimensions): \begin{align*} E(g(X, Y)) &= \begin{cases} \sum_x \sum_y g(x, y) P(X = x, Y = y), & \text{$X, Y$ discrete},\\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X, Y}(x, y)\,dx\,dy, & \text{$X, Y$ continuous}. \end{cases} \end{align*}
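
A small numeric sketch of marginalization, conditioning, and 2D LOTUS for a made-up discrete joint PMF (assuming numpy; the table of probabilities below is hypothetical, chosen only so that it sums to 1):

```python
import numpy as np

# Hypothetical joint PMF P(X = x, Y = y): rows are x in {0, 1, 2}, columns are y in {0, 1}
x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])
joint = np.array([[0.10, 0.20],
                  [0.25, 0.15],
                  [0.05, 0.25]])
assert np.isclose(joint.sum(), 1.0)

# Marginalization (LOTP): sum the joint PMF over the other variable
p_X = joint.sum(axis=1)    # P(X = x)
p_Y = joint.sum(axis=0)    # P(Y = y)

# Conditional PMF of X given Y = 1: joint column divided by the marginal P(Y = 1)
p_X_given_Y1 = joint[:, 1] / p_Y[1]

# Joint from marginal and conditional: P(X = x, Y = 1) = P(Y = 1) * P(X = x | Y = 1)
assert np.allclose(joint[:, 1], p_Y[1] * p_X_given_Y1)

# 2D LOTUS with g(x, y) = x*y: E(XY) = sum_x sum_y x*y*P(X = x, Y = y)
E_XY = sum(x * y * joint[i, j]
           for i, x in enumerate(x_vals)
           for j, y in enumerate(y_vals))
print("P(X = x):", p_X)
print("P(X = x | Y = 1):", p_X_given_Y1)
print("E(XY):", round(E_XY, 3))
```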

4. Covariance and Correlation


$\newcommand{\cov}{\mathrm{Cov}}\newcommand{\corr}{\mathrm{Corr}}\newcommand{\var}{\mathrm{Var}}\newcommand{\sd}{\mathrm{SD}}$ The covariance of random variables $X, Y$ is \begin{align*} \cov(X, Y) &= E\left( \left[X - EX \right] \left[Y - EY\right]\right) \end{align*} where $EX$ is shorthand for $E(X)$. Equivalently, \begin{align*} \cov(X, Y) &= E(XY) - E(X) E(Y). \end{align*}

The correlation of random variables $X, Y$ is \begin{align*} \corr(X, Y) &= \frac{\cov(X, Y)}{\sqrt{\var(X) \var(Y)}} = \frac{\cov(X, Y)}{\sd(X)\sd(Y)}, \end{align*} where $\sd(X) = \sqrt{\var(X)}$ is the standard deviation of $X$. Equivalently, we first standardize $X$ and $Y$, then find their covariance: \begin{align*} \corr(X, Y) &= \cov\left( \frac{X - E(X)}{\sd(X)}, \frac{Y - E(Y)}{\sd (Y)} \right). \end{align*}

$X$ and $Y$ are

  • positively correlated if $\corr(X, Y) > 0$,
  • negatively correlated if $\corr(X, Y) < 0$,
  • uncorrelated if $\corr(X, Y) = 0$.

Since correlation and covariance have the same sign, this also applies for positive/negative/zero covariance.

Properties of covariance (see page 327 of Blitzstein & Hwang for the full list): Let $X, Y, W, Z$ and $X_1, X_2, \ldots$ be random variables.

  • If $X, Y$ are independent, then $\cov(X, Y) = 0$ (so $X, Y$ are uncorrelated).
  • $\cov(X, X) = \var(X)$.
  • $\var(\sum_i X_i) = \sum_i \var(X_i) + \sum_{i<j} 2 \cov(X_i, X_j)$.
    • This can be especially useful for finding the variance of a sum of indicators.
  • \begin{align*}\cov(X+Y, W+Z) &= \cov(X, W) + \cov(X, Z)\\ &+ \cov(Y, W) + \cov(Y, Z).\end{align*}
  • $\cov(aX, bY) = ab \cov(X, Y)$.

The last two properties are referred to as bilinearity.
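
As a quick sanity check of the variance-of-a-sum identity above, here is a minimal simulation sketch (assuming numpy; the particular dependent pair is an arbitrary choice) verifying that $\var(X + Y) = \var(X) + \var(Y) + 2\cov(X, Y)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# A dependent pair (arbitrary choice): X ~ N(0, 1) and Y = 0.5*X + independent N(0, 1) noise,
# so Var(X) = 1, Var(Y) = 1.25, and Cov(X, Y) = 0.5.
X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)

lhs = np.var(X + Y)
rhs = np.var(X) + np.var(Y) + 2 * np.cov(X, Y)[0, 1]
print(lhs, "≈", rhs)   # both should be close to 1 + 1.25 + 2*0.5 = 3.25
```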

Properties of correlation: Let $X, Y$ be random variables.

  • If $X, Y$ are independent, then $\corr(X, Y) = 0$ (so $X, Y$ are uncorrelated).
  • $-1 \le \corr(X, Y) \le 1$.
Uncorrelated does NOT imply independent: the previous two results say that independent random variables have zero covariance and zero correlation. The converse does not hold: uncorrelated random variables are not necessarily independent. For example, if $X \sim \mathcal{N}(0, 1)$ and $Y = X^2$, then $\cov(X, Y) = E(X^3) - E(X)E(X^2) = 0$, so $X$ and $Y$ are uncorrelated, even though $Y$ is a function of $X$ and the two are clearly dependent.
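
To see this numerically, here is a minimal simulation sketch (assuming numpy) of the example above, with $X \sim \mathcal{N}(0, 1)$ and $Y = X^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=1_000_000)
Y = X ** 2                      # Y is completely determined by X, so they are certainly not independent

print("Corr(X, Y) ≈", np.corrcoef(X, Y)[0, 1])   # close to 0: uncorrelated
# Yet knowing X tells us a lot about Y, e.g. P(Y > 4 given |X| > 2) = 1 while P(Y > 4) is small:
print("P(Y > 4) ≈", (Y > 4).mean(), "   P(Y > 4 given |X| > 2) =", (Y[np.abs(X) > 2] > 4).mean())
```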