## Sunday, February 24, 2013

### The Expected Draws to Sum over One.

Q: You have a random number generator that creates random numbers exponentially between $$[0,1]$$. You draw from this generator and keep adding the results. What is the expected number of draws needed for the sum to exceed 1?

Fifty Challenging Problems in Probability with Solutions (Dover Books on Mathematics)

A: Before getting into the solution, I'll go over an established theorem about Poisson distributions.

If two independent random variables $$X_1$$ and $$X_2$$ follow Poisson distributions with parameters $$\lambda_1$$ and $$\lambda_2$$, then the distribution of their sum is given by the convolution of the two probability mass functions. This is shown as

$$P(z = X_1 + X_2) = f_{z}(z) = \sum_{x=0}^{z}f_{X_{1}}(x) f_{X_{2}}(z - x)$$

The probability mass function of a Poisson distribution with rate parameter $$\lambda$$ is given as

$$f(k,\lambda) = \frac{\lambda^{k}e^{-\lambda}}{k!}$$

Plugging this into the convolution formula gives us

$$f_{z}(z) = \sum_{x=0}^{z}\frac{\lambda_{1}^{x}}{x!}e^{-\lambda_{1}}\times \frac{\lambda_{2}^{z-x}}{(z-x)!}e^{-\lambda_2}$$

With some rearrangement of the terms the above simplifies to (hint: multiply & divide by $$z!$$ and use the binomial expansion of $$(\lambda_1+\lambda_2)^{z}$$ )

$$\frac{(\lambda_1 + \lambda_2)^{z}e^{-(\lambda_1 + \lambda_2)}}{z!}$$

which is the probability mass function of Poisson($$\lambda_1 + \lambda_2$$). Coming back to the problem, we want to find $$n$$ such that $$\lambda_1 + \lambda_2 + \ldots + \lambda_n = 1$$, and since we draw the sum just once we can set $$k = 1$$ in the probability mass function. This results in

$$P(k=1,\lambda_1 + \lambda_2 + \ldots + \lambda_n=1) = \frac{1^1 e^{-1}}{1!} = \frac{1}{e}$$

This in turn implies that the expected number of draws needed for the sum to exceed 1 is $$e$$. This fundamental number surfaces again!
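The two numerical facts used above can be checked with a few lines of Python. This is only a rough sketch: the helper names `poisson_pmf` and `convolve_poisson` and the rates 0.3 and 0.7 are made up for illustration and are not part of the original argument. It compares the convolution of two Poisson mass functions against Poisson($$\lambda_1 + \lambda_2$$) and evaluates the mass function at $$k = 1$$, $$\lambda = 1$$.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass function: lam^k * e^(-lam) / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def convolve_poisson(z, lam1, lam2):
    """Discrete convolution of two Poisson mass functions, evaluated at z."""
    return sum(
        poisson_pmf(x, lam1) * poisson_pmf(z - x, lam2)
        for x in range(z + 1)
    )


# Illustrative rates only (an assumption); any positive values would do.
lam1, lam2 = 0.3, 0.7

# Largest gap between the convolution and Poisson(lam1 + lam2) over z = 0..19.
max_gap = max(
    abs(convolve_poisson(z, lam1, lam2) - poisson_pmf(z, lam1 + lam2))
    for z in range(20)
)

# The mass function at k = 1 with total rate 1, to compare against 1/e.
p_one = poisson_pmf(1, 1.0)

print("largest gap:", max_gap)
print("P(k=1, lambda=1):", p_one, "1/e:", 1 / math.e)
```

The gap should be at the level of floating-point rounding, and the last line should print two matching values.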

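1. numero = zeros(10000000, 1);  % number of draws needed in each trial
   for j = 1:10000000
       suma = 0;  % running sum for this trial
       i = 0;     % number of draws so far
       while (suma < 1)
           i = i + 1;
           suma = suma + rand;  % rand draws uniformly from (0,1)
       end
       numero(j) = i;
   end
   mean(numero) - exp(1)

   ≈ 1e-4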

2. Thanks for fixing the type of distribution. Do you know what the result would be if it were in fact a uniform distribution, or is there no difference? I followed Enrique in testing this programmatically and got the same result, but I thought that random number generators in most languages pick numbers uniformly. Admittedly I've made zero effort to answer my own questions; this was just a passing thought.

3. It's the same if it were uniform. The simplex way is the popular "textbook" approach out there. I think something can be done for the uniform distribution case by observing that the negative natural log of a uniform random variable is exponentially distributed, but I haven't given it much thought yet.
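
The simplex argument mentioned in the last comment can be sketched as follows, assuming the draws $$U_1, U_2, \ldots$$ are independent and uniform on $$[0,1]$$. The symbols $$U_i$$, $$S_n$$ and $$N$$ are introduced here for illustration only. Let $$S_n = U_1 + \ldots + U_n$$ and let $$N$$ be the number of draws needed for the sum to exceed 1. More than $$n$$ draws are needed exactly when $$S_n \le 1$$, and that probability is the volume of the standard simplex in $$n$$ dimensions:

$$P(N > n) = P(S_n \le 1) = \frac{1}{n!}$$

Summing the tail probabilities then gives the expectation:

$$E[N] = \sum_{n=0}^{\infty} P(N > n) = \sum_{n=0}^{\infty} \frac{1}{n!} = e$$

This agrees with the simulation in the first comment, which draws with `rand` and lands within about 1e-4 of $$e$$.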