This is a derivation of the cumulative distribution function, characteristic function, moment generating function, first moment (expected value), second moment, and variance of the exponential distribution, given its probability density function.

The probability density function of the exponential distribution with rate \lambda > 0 is:

f_X(x) = \lambda e^{-\lambda x}, \quad x \geq 0

Thus the cumulative distribution function is:

F_X(a) = P(X \leq a) = \displaystyle\int_{0}^{a} f_X(x) dx = \int_0^a \lambda e^{-\lambda x} dx = -e^{-\lambda x} \Big |_0^a = 1 - e^{-\lambda a}
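As a quick sanity check on the closed form above, here is a minimal sketch that compares 1 - e^{-\lambda a} with the fraction of simulated draws that fall at or below a. The function names, sample size, and seed are illustrative choices, not part of the derivation.

```python
import math
import random


def exponential_cdf(a, lam):
    """Closed-form CDF 1 - exp(-lam * a) for a >= 0, and 0 for a < 0."""
    return 1.0 - math.exp(-lam * a) if a >= 0 else 0.0


def empirical_cdf(a, lam, n=200_000, seed=0):
    """Fraction of n simulated exponential draws that are <= a."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    hits = sum(1 for _ in range(n) if rng.expovariate(lam) <= a)
    return hits / n


if __name__ == "__main__":
    lam, a = 1.5, 0.8
    print("closed form:", exponential_cdf(a, lam))
    print("simulated:  ", empirical_cdf(a, lam))
```

The two printed values should agree to roughly two decimal places; the gap shrinks as n grows.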

And the characteristic function:

\displaystyle \phi_X(u) = E[e^{i u X}]
\displaystyle = \int_{0}^{\infty} e^{i u x} f_X(x) dx
\displaystyle = \int_0^{\infty} e^{i u x} \lambda e^{- \lambda x} dx
\displaystyle = \lambda \int_0^{\infty} e^{-(\lambda - iu) x} dx
\displaystyle = - \frac{\lambda}{\lambda-iu} e^{-(\lambda-iu)x} \Big |_0^{\infty}
\displaystyle = \frac{\lambda}{\lambda - iu}

Now the moment generating function, which is obtained from the characteristic function by evaluating it at u = -it (valid for t < \lambda):

M_X(t) = \phi_X(-it) = \displaystyle\frac{\lambda}{\lambda - (i(-it))} = \frac{\lambda}{\lambda - t}
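The characteristic function and the substitution u = -it can both be checked numerically. The sketch below approximates the defining integral with a simple midpoint rule on a truncated interval and compares it with \lambda / (\lambda - iu). The truncation point, step count, and function names are assumptions made for illustration.

```python
import cmath


def char_fn_closed(u, lam):
    """Closed form lam / (lam - i*u); u may be complex."""
    return lam / (lam - 1j * u)


def mgf_closed(t, lam):
    """Closed form lam / (lam - t), which only exists for t < lam."""
    if t >= lam:
        raise ValueError("the MGF is only finite for t < lam")
    return lam / (lam - t)


def char_fn_numeric(u, lam, upper=40.0, steps=200_000):
    """Midpoint-rule approximation of the integral of
    exp(i*u*x) * lam * exp(-lam*x) over [0, upper]."""
    h = upper / steps
    total = 0j
    for k in range(steps):
        x = (k + 0.5) * h
        total += cmath.exp((1j * u - lam) * x)
    return lam * total * h


if __name__ == "__main__":
    lam, u, t = 2.0, 1.0, 0.5
    print(char_fn_numeric(u, lam), char_fn_closed(u, lam))
    # Evaluating the characteristic function at u = -i*t gives the MGF.
    print(char_fn_closed(-1j * t, lam), mgf_closed(t, lam))
```

The truncation at upper = 40 is harmless here because the integrand decays like e^{-\lambda x}; for a much smaller \lambda the upper limit would need to grow accordingly.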

Now the first moment:

M_X'(t) = \displaystyle\frac{d}{dt} \lambda(\lambda-t)^{-1} = -\lambda (\lambda -t)^{-2} \frac{d}{dt} (\lambda - t) = \frac{\lambda}{(\lambda - t)^2}


E[X] = M_X'(0) = \displaystyle\frac{1}{\lambda}

And the second moment:

M_X''(t) = \displaystyle\frac{d}{dt} M_X'(t) = \frac{d}{dt} \lambda (\lambda - t)^{-2} = -2 \lambda ( \lambda - t)^{-3} \frac{d}{dt} (\lambda - t) = \frac{2 \lambda}{(\lambda - t)^3}


E[X^2] = M_X''(0) = \displaystyle\frac{2}{\lambda^2}

And finally:

\displaystyle Var(X) = E[X^2] - E[X]^2 = \frac{2}{\lambda^2} - \Big(\frac{1}{\lambda}\Big)^2 = \frac{1}{\lambda^2}
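The moments above come from differentiating M_X(t) at t = 0, so one way to double-check them is to replace the derivatives with central finite differences. This is only a rough numerical sketch; the step size h and the function names are arbitrary choices.

```python
def mgf(t, lam):
    """Moment generating function lam / (lam - t), for t < lam."""
    return lam / (lam - t)


def moments_from_mgf(lam, h=1e-3):
    """Approximate M'(0) and M''(0) by central differences,
    then combine them into the variance."""
    first = (mgf(h, lam) - mgf(-h, lam)) / (2 * h)
    second = (mgf(h, lam) - 2 * mgf(0.0, lam) + mgf(-h, lam)) / (h * h)
    variance = second - first ** 2
    return first, second, variance


if __name__ == "__main__":
    lam = 2.0
    first, second, variance = moments_from_mgf(lam)
    print("E[X]   ~", first, " exact:", 1 / lam)
    print("E[X^2] ~", second, " exact:", 2 / lam ** 2)
    print("Var(X) ~", variance, " exact:", 1 / lam ** 2)
```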

Of course, the expected value and the variance can also be computed directly, by evaluating the following integrals (by parts) and combining them as Var(X) = E[X^2] - E[X]^2:

\displaystyle E[X] =  \int_0^\infty x \lambda e^{-\lambda x} dx


\displaystyle E[X^2]  = \int_0^\infty x^2 \lambda e^{-\lambda x} dx
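These two integrals can also be evaluated numerically instead of by parts. The following sketch uses a midpoint rule on a truncated interval; the truncation at 50 / \lambda and the step count are assumptions chosen so that the neglected tail is negligible.

```python
import math


def exponential_moment(k, lam, upper_multiple=50.0, steps=200_000):
    """Midpoint-rule approximation of the integral of
    x**k * lam * exp(-lam * x) over [0, upper_multiple / lam]."""
    upper = upper_multiple / lam
    h = upper / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += x ** k * lam * math.exp(-lam * x)
    return total * h


if __name__ == "__main__":
    lam = 2.0
    m1 = exponential_moment(1, lam)
    m2 = exponential_moment(2, lam)
    print("E[X]   ~", m1)
    print("E[X^2] ~", m2)
    print("Var(X) ~", m2 - m1 ** 2)
```

For \lambda = 2 the results should be close to 1/2, 1/2, and 1/4, matching the values obtained from the moment generating function.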

Next we’ll show that the exponential distribution is memoryless, that is, for all s, t \geq 0:

\displaystyle P(X > t+s|X>t) = P(X > s)

First apply Bayes’ Rule, noting that P(X>t|X>t+s) = 1 because X > t+s implies X > t:

\displaystyle P(X>t +s|X>t) = \displaystyle \frac{P(X>t|X>t+s)P(X>t+s)}{P(X>t)}
= \displaystyle \frac{1 \cdot P(X>t+s)}{P(X>t)}
= \displaystyle \frac{1-(1-e^{-\lambda (t+s)})}{1-(1-e^{-\lambda t})}
= \displaystyle \frac{e^{-\lambda (t+s)}}{e^{-\lambda t}}
= \displaystyle e^{-\lambda s}
= P(X>s)
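Memorylessness is easy to see in simulation: among the draws that exceed t, the fraction that also exceed t + s should match e^{-\lambda s}. The sketch below does exactly that; the sample size, seed, and function name are illustrative assumptions.

```python
import math
import random


def conditional_survival(t, s, lam, n=400_000, seed=1):
    """Estimate P(X > t + s | X > t) from n simulated draws."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    survived_t = 0
    survived_t_plus_s = 0
    for _ in range(n):
        x = rng.expovariate(lam)
        if x > t:
            survived_t += 1
            if x > t + s:
                survived_t_plus_s += 1
    return survived_t_plus_s / survived_t


if __name__ == "__main__":
    lam, t, s = 1.0, 0.5, 1.0
    print("P(X > t+s | X > t) ~", conditional_survival(t, s, lam))
    print("P(X > s) exact     =", math.exp(-lam * s))
```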

Next, let X_1, X_2 be independent exponential random variables with parameters \lambda_1 and \lambda_2 respectively, then:

\displaystyle P(X_1 < X_2) = \displaystyle \int_0^\infty P(X_1<X_2|X_1=x) f_{X_1}(x) dx
= \displaystyle \int_0^\infty P(X_1<X_2|X_1=x)\lambda_1 e^{-\lambda_1 x} dx
= \displaystyle \int_0^\infty P(x<X_2)\lambda_1 e^{-\lambda_1 x} dx
= \displaystyle \int_0^\infty P(X_2 \geq x)\lambda_1 e^{-\lambda_1 x} dx
= \displaystyle \int_0^\infty e^{-\lambda_2 x} \lambda_1 e^{-\lambda_1 x} dx
= \displaystyle \lambda_1 \int_0^\infty e^{-(\lambda_1 + \lambda_2)x} dx
= \displaystyle \lambda_1 \Big( -\frac{1}{\lambda_1+\lambda_2} e^{-(\lambda_1 + \lambda_2)x} \Big) \Big|_0^\infty
= \displaystyle \frac{\lambda_1}{\lambda_1+\lambda_2}
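The result \lambda_1 / (\lambda_1 + \lambda_2) can be checked by simulating many independent pairs and counting how often the first draw is the smaller one. This is a minimal sketch; the sample size, seed, and function name are assumptions.

```python
import random


def prob_first_smaller(lam1, lam2, n=200_000, seed=2):
    """Estimate P(X1 < X2) for independent exponentials
    with rates lam1 and lam2."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    wins = 0
    for _ in range(n):
        x1 = rng.expovariate(lam1)
        x2 = rng.expovariate(lam2)
        if x1 < x2:
            wins += 1
    return wins / n


if __name__ == "__main__":
    lam1, lam2 = 1.0, 3.0
    print("simulated:", prob_first_smaller(lam1, lam2))
    print("exact:    ", lam1 / (lam1 + lam2))
```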


Ross, Sheldon M. Introduction to Probability Models, 9th edition. Academic Press. 2007.
Bremaud, Pierre. An Introduction to Probabilistic Modeling, 3rd printing. Springer. 1997.