I’ve played around a bit with ChatGPT but only recently found a particularly interesting and very niche use case. I needed to do a reliability calculation but couldn’t remember the name of the correct probability distribution to use. I typically[1] get there by recalling just enough grad school statistics to think of a distribution that’s close enough, looking it up on Wikipedia, and then reading up on similar distributions until I find the right one. This time, I asked the entire internet, as distilled through a generative pre-trained transformer.
My problem: some hardware has a per-use reliability of X. How many uses can I expect before it fails? In probability speak, what is the expected value of the discrete probability distribution that models a sequence of successful trials terminating in a failure? I’d normally start on the page for discrete probability distributions or perhaps remember that coin flipping is a Bernoulli trial modeled by a binomial distribution. Instead, I asked the robot:
What’s the expected value for the number of uses for something with a per-use reliability of 90%?
That’s a super clumsy phrasing, but I’m trying to get a job done here and didn’t think I’d write about the sausage making. ChatGPT’s response:
The expected value for the number of successful uses of something with a per-use reliability of 90% can be calculated using the formula for the expected value of a binomial distribution.
Let’s denote the number of uses as n and the probability of success (i.e., a use being successful) as p = 0.9. Then, the expected value for the number of successful uses is: E(X) = np. This means that on average, we expect the hardware to function without failure for 9 out of 10 uses. Note that this is the expected value and not the guaranteed value – there is still a chance that the hardware will fail before reaching 9 successful uses.
Impressive. It named a distribution, gave the formula for the expected value, plugged in my example number, did the calculation, and provided a caveat to ensure I knew this was the expected value and not a limit or guarantee. I know it’s a textbook application of a very simple distribution that’s probably synthesized from a thousand online undergraduate exams, but still.
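ChatGPT’s binomial formula is easy enough to sanity-check with a quick simulation. This is just a sketch of my own, not something the robot produced: it counts successes in n = 10 trials at p = 0.9 and averages over many runs, which should land near E(X) = np = 9.

```python
import random

# Monte Carlo check of the binomial expectation E(X) = n*p:
# count successes in n = 10 trials at p = 0.9, averaged over many runs.
random.seed(0)
n, p, runs = 10, 0.9, 100_000

total = sum(
    sum(1 for _ in range(n) if random.random() < p)  # successes in one run
    for _ in range(runs)
)
print(total / runs)  # ≈ 9.0, matching E(X) = np = 10 * 0.9
```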
Unfortunately, I forgot a key detail. In the situation I want to model, a failure is catastrophic, so I’m not looking for the expected number of successes in a fixed number of trials; I want to know, on average, how many successes I can expect before a failure destroys the item. So I elaborated:
Yes, but what if the hardware is destroyed after an unsuccessful use?
If the hardware is destroyed after an unsuccessful use, then we can model the situation using a geometric distribution. The geometric distribution gives the probability of the number of independent trials required to achieve the first success, and it has the following probability mass function: P(X=k) = p (1-p)^(k-1), where X is the random variable representing the number of trials required to achieve the first success, p is the probability of success (the per-use reliability), and k is a positive integer (1, 2, 3, …) representing the number of trials required to achieve the first success.
In this scenario, the expected value for the number of successful uses will be slightly lower than in the previous scenario where the hardware was not destroyed after an unsuccessful use. Let’s denote the probability of success (i.e., a hardware use being successful) as p = 0.9, and let’s assume that the hardware is destroyed after an unsuccessful use. Then, the expected value for the number of successful uses is: E(X) = 1/p
In this case, the expected value for the number of successful uses is: E(X) = 1/0.9 ≈ 1.11. This means that on average, we expect the hardware to function without failure for only 1.11 uses before it fails and is destroyed.
Again, impressive, starting with the carry-over of context from the previous question. (I know conversational memory is a key feature of ChatGPT, but it’s still a pleasant surprise given my daily interactions with Alexa the Electronic Goldfish.) It again named a distribution (which I independently verified on Wikipedia was the one I wanted), gave a new example, and contrasted this case with the previous response. The example inverted the problem a bit, since what I really want to know is how many successes to expect until the first failure, but it was trivial to invert the formulation by defining the geometric distribution’s “success” probability as 1 − p.
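The inverted version can be simulated, too. This is my own sketch, not ChatGPT’s: treat each failure (probability 1 − p = 0.1) as the geometric distribution’s “success” and count uses up to and including the destructive one. Theory says the mean is 1/(1 − p) = 10 uses, i.e., 9 successful uses on average before the one that destroys the hardware.

```python
import random

# Geometric distribution with roles inverted: per-use reliability p = 0.9,
# so each use fails (destroying the hardware) with probability 1 - p = 0.1.
# Expected uses up to and including the first failure: 1 / (1 - p) = 10.
random.seed(0)
p, runs = 0.9, 100_000

def uses_until_failure(p):
    """Count trials up to and including the first destructive failure."""
    k = 1
    while random.random() < p:  # success: the hardware survives this use
        k += 1
    return k

mean_uses = sum(uses_until_failure(p) for _ in range(runs)) / runs
print(mean_uses)  # ≈ 10 uses, i.e. ≈ 9 successful uses on average
```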
Perhaps I shouldn’t be impressed, as I did ask a pretty textbook question. On the other hand, my textbook was at home and this was way faster—and I’m super into that.
[1] Yes, this is something I do not do often per se, but often enough that I really do have a typical approach.