Exponential Distribution and Poisson Distribution
In their paper, Fader and Hardie [1] review the Pareto/NBD framework for counting and forecasting the number of customers in period $t$, and propose the BG/NBD framework as an alternative. One of the core assumptions in both frameworks is that the interval between purchases follows an exponential distribution, which is usually less intuitive than the Poisson distribution. I therefore try to work it out here (which is what prompted me to start this blog). This note is mostly based on two blog posts [2] [3].
First, I need to figure out where the Poisson distribution comes from [4] [5].
Suppose each individual (consumer) makes a purchase with probability $p$ on each of $n$ shopping trips, independently across trips. The number of purchases $x$ then follows a binomial distribution $Bin(n,p)$. The Poisson distribution is the limiting form of the binomial distribution as $n \to \infty$ with $\lambda = np$ held fixed.
Let us look at how to prove that. Let $\lambda = np$ be the expected number of purchases over the $n$ trips, and consider the $p.m.f$ of the binomial distribution,
$$\begin{array}{c}
Bin(x ; n, p) = \frac{n !}{(n-x) ! x !} p^{x}(1-p)^{n-x}
\end{array}$$
Substituting $p = \lambda / n$, this can be rewritten as,
$$\begin{array}{c}
\frac{\lambda^{x}}{x !} \frac{n !}{(n-x) ! n^{x}}\left(1-\frac{\lambda}{n}\right)^{n}\left(1-\frac{\lambda}{n}\right)^{-x}
\end{array}$$
When $n\to\infty$,
$$\begin{align*}
&\lim_{n \to \infty} \frac{n !}{(n-x) ! \, n^{x}}\\
&= \lim_{n \to \infty}\left\{\frac{n(n-1) \cdots(n-x+1)}{n^{x}}\right\}\\
&= 1
\end{align*}$$
$$\begin{align*}
\lim_{n \to \infty}\left(1-\frac{\lambda}{n}\right)^{n} &= e^{-\lambda}
\end{align*}$$
$$\begin{align*}
\lim_{n \to \infty}\left(1-\frac{\lambda}{n}\right)^{-x} &= 1
\end{align*}$$
Substituting these three limits into the rewritten expression gives the limiting form of the binomial distribution,
$$\begin{align*}
\lim_{n \to \infty} Bin(x ; n, p)=\frac{\lambda^{x} e^{-\lambda}}{x !}
\end{align*}$$
which is a Poisson distribution $Pois(x;\lambda)$.
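The limit can also be checked numerically. The following is a minimal sketch, not part of the cited derivations: the helper names `binom_pmf` and `poisson_pmf` and the values $\lambda = 3$, $x = 2$ are assumptions chosen only for illustration. It holds $\lambda = np$ fixed while $n$ grows and compares the two probabilities.

```python
import math

def binom_pmf(x, n, p):
    """Bin(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Pois(x; lam) = lam^x * e^(-lam) / x!."""
    return lam**x * math.exp(-lam) / math.factorial(x)

lam = 3.0   # assumed expected number of purchases, lam = n * p
x = 2       # assumed number of purchases to evaluate

for n in (10, 100, 1_000, 10_000):
    p = lam / n   # keep n * p fixed while n grows
    gap = abs(binom_pmf(x, n, p) - poisson_pmf(x, lam))
    print(f"n={n:>6}  binomial={binom_pmf(x, n, p):.6f}  "
          f"poisson={poisson_pmf(x, lam):.6f}  gap={gap:.2e}")
```

The gap between the two columns should shrink as $n$ increases, which is what the limit argument predicts.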
So we usually assume that the number of transactions made by a customer follows a Poisson distribution with transaction rate $\lambda$.
Fader and Hardie mention in their paper that the assumption above is equivalent to assuming that the time interval between transactions is exponentially distributed.
Before we prove the relationship between the Poisson and exponential distributions, the transaction rate $\lambda$ already gives us a hint of it [3].
However, when we model the time interval between transactions, we tend to speak in terms of time instead of rate. For example, we say a computer runs for 10 years without failure (instead of 0.1 failures/year, which is a rate), a customer arrives every 10 minutes, or a major hurricane comes every 7 years. When you see the term “mean” of the exponential distribution, $\frac{1}{\lambda}$ is what it refers to.
So we work with the mean waiting time $\frac{1}{\lambda}$, i.e. how long it takes on average until the next transaction occurs, rather than with the rate $\lambda$ itself, although the $\lambda$ in the exponential distribution is exactly the same as the one in the Poisson process.
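To make the rate versus mean distinction concrete, here is a small simulation sketch. It assumes the computer example with $\lambda = 0.1$ failures per year; the sample size and seed are arbitrary choices. It uses `random.expovariate` from the Python standard library, which is parameterised by the rate $\lambda$ and not by the mean.

```python
import random

lam = 0.1            # assumed rate: 0.1 failures per year
n_samples = 100_000  # arbitrary sample size
rng = random.Random(0)  # fixed seed so the run is repeatable

# expovariate takes the rate lambda; the resulting mean is 1 / lambda
waits = [rng.expovariate(lam) for _ in range(n_samples)]
mean_wait = sum(waits) / n_samples

print(f"rate lambda      = {lam} per year")
print(f"theoretical mean = {1 / lam:.2f} years")
print(f"simulated mean   = {mean_wait:.2f} years")
```

The simulated mean should land close to $1/\lambda = 10$ years, even though the distribution was specified through the rate.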
If you think about it, waiting for some time interval until the next transaction occurs means that not a single transaction has happened during the waiting period.
This is, in other words, $Pois(x=0)$.
$$\begin{align*}
Pois(x = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}
\end{align*}$$
When $k = 0$,
$$\begin{align*}
Pois(x = 0) = \frac{\lambda^{0} e^{-\lambda}}{0!} = e^{-\lambda}
\end{align*}$$
What about the probability that “nothing happens during a time duration $t$”? (The Poisson process assumes that transactions occur independently of each other, so the counts in non-overlapping time units are independent.)
$$\begin{align*}
&P(T>t)\\
&= P(\text{nothing happens during } t \text{ time units})\\
&= P(X=0 \text{ in the first time unit})\times P(X=0 \text{ in the second time unit})\\
&\quad\times\cdots \times P(X=0 \text{ in the } t\text{-th time unit})\\
&= e^{-\lambda} \times e^{-\lambda}\times \cdots \times e^{-\lambda}\\
&= e^{-\lambda t}
\end{align*}$$
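The product above counts whole time units, so it implicitly treats $t$ as an integer. A rough numerical sketch of why the same formula is plausible for any $t \ge 0$: slice $[0, t]$ into $m$ tiny slots, assume each slot independently contains a transaction with probability $\lambda t / m$, and let $m$ grow. The function name `no_event_prob` and the values of $\lambda$ and $t$ are assumptions made for this illustration only.

```python
import math

def no_event_prob(lam, t, m):
    """Probability of no transaction in any of m independent slots,
    each with transaction probability lam * t / m."""
    return (1 - lam * t / m) ** m

lam = 0.5   # assumed transaction rate per time unit
t = 2.5     # assumed, deliberately non-integer, waiting time

for m in (10, 1_000, 1_000_000):
    print(f"m={m:>9}  product={no_event_prob(lam, t, m):.6f}")
print(f"exp(-lam * t)      = {math.exp(-lam * t):.6f}")
```

As $m$ increases, the product approaches $e^{-\lambda t}$, mirroring the binomial-to-Poisson limit used earlier.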
Then the probability that a transaction occurs by time $t$ is
$$\begin{align*}
P(T \le t) = 1 - P(T>t) = 1 - e^{-\lambda t}
\end{align*}$$
Differentiating with respect to $t$ gives the $p.d.f$ of the waiting time $T$,
$$\begin{align*}
f(t) = \frac{d}{dt}P(T \le t)= \frac{d}{dt}(1 - e^{-\lambda t} )= \lambda e^{-\lambda t}
\end{align*}$$
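As a quick sanity check on the differentiation step, the sketch below compares a central finite difference of the c.d.f. $1 - e^{-\lambda t}$ with the closed form $\lambda e^{-\lambda t}$. The function names, the step size `h`, and the value of $\lambda$ are assumptions picked for illustration.

```python
import math

def cdf(t, lam):
    """P(T <= t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)

def pdf(t, lam):
    """f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def numeric_pdf(t, lam, h=1e-5):
    """Central finite difference of the c.d.f. around t."""
    return (cdf(t + h, lam) - cdf(t - h, lam)) / (2 * h)

lam = 0.5   # assumed transaction rate
for t in (0.5, 1.0, 4.0):
    print(f"t={t}: finite difference={numeric_pdf(t, lam):.8f}  "
          f"closed form={pdf(t, lam):.8f}")
```

The two columns should agree to several decimal places at each $t$.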
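Because transactions occur independently, the waiting time starts afresh after each transaction. Given that the previous transaction occurred at $t_{j-1}$, the density of the time $t_j$ of the next transaction therefore depends only on the gap $t_j - t_{j-1}$,
$$\begin{align*}
f(t_j \mid t_{j-1};\lambda) &= f(t_j - t_{j-1};\lambda)\\
&= \lambda e^{-\lambda (t_j - t_{j-1})}, \quad t_j > t_{j-1} \ge 0
\end{align*}$$
so the interval between transactions follows an exponential distribution $Exp(t;\lambda)$.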
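The equivalence can also be illustrated in the other direction with a simulation. This is a sketch under assumed parameters ($\lambda = 2$, a horizon of 50,000 time units, a fixed seed), with a hypothetical helper `simulate_counts`: it draws exponential gaps between transactions, counts the transactions falling in each unit interval, and compares the counts with what a Poisson distribution with rate $\lambda$ predicts (mean $\lambda$, variance $\lambda$, and $P(X=0) = e^{-\lambda}$).

```python
import math
import random

def simulate_counts(lam, horizon, seed=0):
    """Generate transaction times with exponential gaps and return the
    number of transactions in each unit interval [k, k + 1)."""
    rng = random.Random(seed)
    counts = [0] * horizon
    t = rng.expovariate(lam)          # time of the first transaction
    while t < horizon:
        counts[int(t)] += 1           # bucket by unit interval
        t += rng.expovariate(lam)     # add the next exponential gap
    return counts

lam = 2.0   # assumed transaction rate per time unit
counts = simulate_counts(lam, horizon=50_000)

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
share_zero = counts.count(0) / len(counts)

print(f"mean count     = {mean:.3f}  (Poisson mean lam = {lam})")
print(f"variance       = {var:.3f}  (Poisson variance lam = {lam})")
print(f"share of zeros = {share_zero:.4f}  (exp(-lam) = {math.exp(-lam):.4f})")
```

If the interval model and the count model describe the same process, all three simulated quantities should sit close to their Poisson counterparts.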
1. Fader, P. S., Hardie, B. G., & Lee, K. L. (2005). “Counting your customers” the easy way: An alternative to the Pareto/NBD model. Marketing Science, 24(2), 275-284.
2. Poisson Distribution — Intuition, Examples, and Derivation
3. Exponential Distribution — Intuition, Derivation, and Applications
4. Relationship between Poisson Distribution and Binomial Distribution
5. Proof - Limiting form of Binomial distribution