Counting, Part 1

I am tquackperson.

What if you could make a mistake while counting?

It’s an unfortunate reality. We aren’t perfect counters.

Suppose you start counting at 1. You keep counting until you make a mistake. When you make a mistake, you write down the last number you got right. This is your score. Then, you start over.

You repeat this 76 times before getting bored. The scores you wrote down were:

249 331 17 36 29 104 118 301 47 11
88 461 8 12 51 12 201 13 9 97
383 1 3 6 100 194 6 78 5 56
33 67 9 5 12 3 30 1 4 150
430 12 1 1 248 13 27 7 5 239
456 17 43 12 5 2 3 10 1 17
2 15 13 2 1 147 81 3 1 186
29 60 95 66 90 118

Being mathematically minded, you wonder how you can model this. Assume that you have some fixed chance $p$ of making a mistake each time you count. Each time you play this counting game, let the random variable $C$ denote your score for that game. Then, $C \sim \mathrm{NB}(1, p)$, meaning that $C$ follows the negative binomial distribution: it counts the number of correct counts before the first mistake, so $P(C = k) = p (1 - p)^k$. Great! Now we just need to find a good value for $p$ based on the score data.
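To get a feel for this model, here is a minimal simulation sketch in Python (the function name `play_counting_game` is my own, not from the post): each count is wrong independently with probability $p$, and the score is the number of correct counts before the first mistake.

```python
import random

def play_counting_game(p):
    """Play one game: count upward until the first mistake.

    Each count is wrong independently with probability p; the score
    is the last number counted correctly (0 if the first count fails).
    """
    score = 0
    while random.random() >= p:  # with probability 1 - p, this count is correct
        score += 1
    return score
```

For small $p$, scores like the ones above (a few huge runs mixed with many tiny ones) are exactly what this process produces.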

We’ll use maximum likelihood estimation for this task. Given our observations $C_1$, $C_2$, $\ldots$, $C_n$ (where $n = 76$ in this case), what is the value $\hat{p}$ that maximizes the likelihood that the $C_i$ were all drawn from $\mathrm{NB}(1, \hat{p})$? This likelihood, which we’ll denote $L(p)$, is:

\[L(p) = \prod_{i=1}^n P(C = C_i) = \prod_{i=1}^n p (1 - p)^{C_i}\]

We wish to find:

\[\hat{p} = \argmax_p L(p)\]

Since the $\log$ function is monotonically increasing,

\[\argmax_p L(p) = \argmax_p \log L(p)\]

It will be far easier to maximize the log-likelihood. Working through the math,

\[\begin{aligned} \log L(p) &= \sum_{i=1}^n \log \left[ p (1 - p)^{C_i} \right] \\ &= \sum_{i=1}^n \left[ \log p + C_i \log (1 - p) \right] \\ &= n \log p + \log (1 - p) \sum_{i=1}^n C_i \\ &= n \log p + c \log (1 - p) \end{aligned}\]

where $c = \sum C_i$.
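This simplified form translates directly into code. A small sketch (the helper name `log_likelihood` is illustrative):

```python
import math

def log_likelihood(p, scores):
    """Compute log L(p) = n log p + c log(1 - p),
    where n = len(scores) and c = sum(scores)."""
    n, c = len(scores), sum(scores)
    return n * math.log(p) + c * math.log(1 - p)
```

This agrees term by term with summing $\log \left[ p (1 - p)^{C_i} \right]$ over the observations directly, but needs only $n$ and $c$.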

Since $\log L(p)$ is concave in $p$ on $(0, 1)$, $L(p)$ is maximized where $\frac{\partial}{\partial p} \log L(p) = 0$:

\[\begin{aligned} \frac{\partial}{\partial p} \log L(p) = \frac{n}{p} - \frac{c}{1-p} &= 0 \\ \therefore\ n (1 - p) - p c &= 0 \\ \therefore\ n &= p (n + c) \\ \therefore\ p &= \frac{n}{n + c} \end{aligned}\]

Thus, $\hat{p} = \frac{n}{n + c}$.

Remarkably, the individual scores themselves have no bearing on the probability estimate. Only the number of games played, $n$, and the total score, $c$, matter.

For our case, we can compute $\hat{p} = \frac{76}{76 + 5799} \approx 0.012936$. We have a 1.3% chance of failure!
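As a sanity check, here is a short sketch that recomputes $\hat{p}$ from the score data and confirms it beats a few nearby values of $p$ in log-likelihood (all names here are illustrative):

```python
import math

# The 76 scores recorded above.
scores = [
    249, 331, 17, 36, 29, 104, 118, 301, 47, 11,
    88, 461, 8, 12, 51, 12, 201, 13, 9, 97,
    383, 1, 3, 6, 100, 194, 6, 78, 5, 56,
    33, 67, 9, 5, 12, 3, 30, 1, 4, 150,
    430, 12, 1, 1, 248, 13, 27, 7, 5, 239,
    456, 17, 43, 12, 5, 2, 3, 10, 1, 17,
    2, 15, 13, 2, 1, 147, 81, 3, 1, 186,
    29, 60, 95, 66, 90, 118,
]

n, c = len(scores), sum(scores)
p_hat = n / (n + c)
print(n, c, round(p_hat, 6))  # 76 5799 0.012936

def log_L(p):
    """log L(p) = n log p + c log(1 - p)."""
    return n * math.log(p) + c * math.log(1 - p)

# The MLE should score at least as well as any nearby candidate.
assert all(log_L(p_hat) >= log_L(p) for p in (0.005, 0.01, 0.02, 0.05))
```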

