I. Introduction to Probability Questions, Problems, Paradoxes
II. Monty (Monte Hall) Paradox and the TV Game Shows: How to Win?
III. Theory of Probability Applied to Monty Paradox
IV. Addenda to the Ion Saliu Paradox of N Trials
V. Another Paradox: Classical Occupancy Problem or Coupon Collector's Problem
VI. The Most Precise Method of Calculating Probabilities, Software, Simulation
VII. Resources in Theory of Probability, Mathematics, Statistics, Combinatorics, Software, Paradoxes
Mathematical software accompanying this article:
RandomDigits
OccupancySaliuParadox.
Axiomatic one, it no longer comes as a surprise to me receiving emails on various probability questions, and problems, and paradoxes. Ever since I published the best introduction to theory of probability, I made myself available to the public as the Doctor in Occult Science of Randomness. In addition to emails, people try also to get me to participate in debates in newsgroups and other public forums. I know it because I see questions repeated in the newsgroups or forums I visit. I also receive direct questions by email on those repeated problems or paradoxes.
I believe that my site offers answers to every possible probability question or problem. The probability page offers the best theoretical template in solving just about every kind of probability question. Furthermore, my website also offers the best software to deal with the problem. My software will do probability, statistics, and combinatorics calculations. Calculate and also generate all possible cases of a problem — and then count the favorable cases. That's what probability is all about: Rapport the favorable cases to total cases.
This article is about a famous paradox, which is constantly raised in public forums as probability question or 3-door problem. I am referring to the Monty Paradox, or the Monty Hall Problem or Monte Hall Paradox. So many names for a very simple probability problem! We'll make it here really complex. We'll generalize from 3 doors to N doors or prizes! I also offer a few more details on Ion Saliu Paradox (Problem) of N Trials; it responds to several questions in newsgroups.
The host, Monty Hall (a.k.a. Monte Hall), offers the player the opportunity to win what is behind one of three doors. Typically there was a really nice prize (i.e. a car) behind one of the doors and a not-so-nice prize (i.e. a goat) behind the other two doors. After selecting a door, Monty would then proceed to open one of the doors you didn't select. It is important to note here that Monty would NOT open the door that concealed the car. Thus, the host always eliminates one of the losing cases. At this point, Monty would then ask you if you wanted to switch to the other door before revealing what you had won.
In September of 1991, a reader of Marilyn Vos Savant's Sunday Parade column wrote in and asked the following question:
"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the other doors, opens another door, say No. 3, which has a goat. He then says to you, 'Do you want to pick door No. 2?' Is it to your advantage to take the switch?"
This problem was given the name The Monty Hall Paradox in honor of the long time host of the television game show Let's Make a Deal. Articles about the controversy appeared in The New York Times and other papers around the country. Marilyn's answer was that the contestant should switch doors and she received nearly 10,000 responses from readers, most of them disagreeing with her. Several were from mathematicians and scientists whose responses ranged from hostility to disappointment at the nation's lack of mathematical skills!
Why so many people were convinced that Marilyn Vos Savant was wrong? They had all decided that it did not matter if the contestant switched or did not switch. There may be a reason so many disagreed with her. Omitting one phrase in the statement of this problem changes the answer completely and this might explain why many people have the wrong intuition about the solution. If the Host (Monty Hall) does not know where the car is behind the other two doors, then the answer to the question is IT DOESN'T MATTER IF THE CONTESTANT SWITCHES. The change in the statement of the problem is so slight that this might be the reason this problem is such a "paradox."
The original game: N = 3 (three doors or choices)
Stay Change
---- ------
1 2 3 W L
1 3 2 W L
2 1 3 L W
2 3 1 L W
3 1 2 L W
3 2 1 L W
The digit in position #1 represents the Player's choice. The Player always selects first. Then, the Host eliminates a losing possibility. The Player is greatly assisted by having to make one of two choices. That's what makes many people believe that the Player only faces a p = 1/2 choice. But they ignore what the total number of possibilities is, and what the number of favorable cases is.
Surprisingly, the picture is still NOT clear to a large percentage of readers! Let's remember that the 1st digit in the 6-element set represents the very 1st choice of the game: The Player chooses a door — Win or Lose. That's a given. Monty cannot erase the door in that position.
Next, the Host eliminates one losing door from the remaining 2 doors (position 2 or 3).
Let's update the 6-element set following the Host's action. Monte Hall cannot eliminate door 1 — because we designated it as the winning choice; Monty may only eliminate a losing choice in step #2; the losing choices are 2 and 3 in the 6-element set.
The original game: N = 3 (three doors or choices)
Stay Change
---- ------
1 x 3 W L
1 x 2 W L
2 1 x L W
2 x 1 L W
3 1 x L W
3 x 1 L W
No matter what losing door Monte Hall “erases” (the x in the graphic), the Player will always have 2 of 6 winning choices (p = 1/3) on Stand and 4 of 6 winning choices on Change (p = 2/3). You can see that there are 4 W remaining in the Change column as opposed to 2 W in the Stay column.
Things would be very different IF the Host eliminated a remaining door randomly (a winning door as well as a losing door — like blindfolded). Let's update the 6-element set following the Host's action — IF Monty Hall eliminated a remaining door randomly.
The original game: N = 3 (three doors or choices)
Stay Change
---- ------
1 x 3 W L
1 x 2 W L
2 1 x L W
2 3 x L L
3 1 x L W
3 2 x L L
Evidently, the Player will always have 2 of 6 winning choices (1/3) on Stand and 2 of 6 winning choices on Change (1/3). But, where are 1/3 of the probabilities? That's . . . Dealer's tip! In one 3rd of the situations, the Player would just mildly shout to the Host: “Just don't tell me WHAT door you eliminated!”
Those are the situations when the winning door is removed from play. There are 2 in 6 cases: 2-3 and 3-2. The Player selected a losing door (goat) and, if Change is selected, the only choice available is the other “goat”. It doesn't matter if the Player stays or changes: It's a lose-lose situation. But the Monte Hall game was NOT played that way, or it would have had no success at all. The attraction of TV game shows is the higher possibility of winning.
There is no doubt that the total number of possibilities is 6, or 3! (3 factorial or factorial of 3). No matter what, the Player has always 2 favorable STAY situations, those starting with digit 1. It doesn't matter what door number that is. Digit 1 in position #1 means that the Player guessed correctly the door number where the prize (car) hides. This situation occurs 2 out of 6 times, or p = 2/6 = 1/3. Of course, if the Player stays (sticks) with her choice, the Player wins. In other words, the Player loses if she changes (switches) her first selection.
In 4 out of 6 cases, the digit in the first position is NOT 1; therefore, the player makes a losing selection at the beginning of the game. The complementary situation is CHANGE the selection. In other words, the Player WINS if she changes (switches) her first selection. The winning situation occurs 4 out of 6 times, or p = 4/6 = 2/3.
This is a no-brainer, really. The Player is better served if changing his first choice. That was a TV game, NOT a casino game. No casino in the world would offer such a favorable-to-the-player game! If allowed to switch, I would bet on Dealer's hand on most occasions…
We can generalize this game. The probability to win in STAY situations is p = (3-1)! / 3! = (1 x 2) / 6 = 1/3. For N choices, the STAY probability is:
PS = (N – 1)! / N! = 1 / N
The probability to win in CHANGE situations is the complementary probability:
PC = {1 – [(N – 1)! / N!] = 1 – (1 / N) = (N - 1) / N
If N tends to infinity, the probability to win in STAY cases tends to zero; the probability to win in CHANGE cases tends to 1 (certainty).
Let's analyze the N = 4 case in our quest to find the general solution. In this case, there are four doors. The prize is behind one of the doors. The other three doors represent a losing choice. Again, 1 represents the prize; the digit in the first position represents the Player's first selection. Here is what my program generated.
N = 4; four doors (choices); one prize (favorable case)
Stay Change
---- ------
1 2 3 4 W LL
1 2 4 3 W LL
1 3 2 4 W LL
1 3 4 2 W LL
1 4 2 3 W LL
1 4 3 2 W LL
Stay Again Change Again
========== ============
2 1 3 4 L WL W L
2 1 4 3 L WL W L
2 3 1 4 L WL L W
2 3 4 1 L WL L W
2 4 1 3 L WL L W
2 4 3 1 L WL L W
3 1 2 4 L LW W L
3 1 4 2 L LW W L
3 2 1 4 L WL L W
3 2 4 1 L WL L W
3 4 1 2 L WL L W
3 4 2 1 L WL L W
4 1 2 3 L LW W L
4 1 3 2 L LW W L
4 2 1 3 L WL L W
4 2 3 1 L WL L W
4 3 1 2 L WL L W
4 3 2 1 L WL L W
Total cases: 4! = 1 x 2 x 3 x 4 = 24.
The favorable cases are W; a total of 6 wins for STAY; that is, (4 - 1)! = 3! = 6; 6 / 24 = 1 / 4.
Number of cases that qualifies as CHANGE AGAIN is 18; that is, 4! – 3! = 18; 18 / 24 = 3 / 4.
Of course, this situation is more complex. The host can choose to eliminate all the losing situations but one — after the Player makes his first selection. In such a case, we are exactly as in N = 3. The probability to win IF STAY is 1/N; the probability to win IF CHANGE is (N - 1) / N.
If the host eliminates only one losing door, the player has more nested choices to make. That's why the program analyzes also Stay Again and Change Again situations. Still — the winning probability for STAY is ALWAYS 6/18 = 1/3. It is the same probability as in the original problem with 3 doors (choices). No-brainer: The player should always change. The winning probability for the CHANGE situation is always 1 – 1/3 = 2/3.
If N > 4, we record the same situation for change, and change, and change, etc. The probability to win — IF STAY — is always 3!/(4! – 3!) = 6/18 = 1/3.
Instead of applying the numerical sets permutations, we will make use of arrangements. In many systems of education, the arrangements are also known as m Permutations n. In this 3-element case, 3 Permutations 2 = Arrangements of 3 taken 2 at a time = 3 x 2 = 6. So, we deal with 6 sets again, but consisting of 2 elements per line.
Stay Change
---- ------
1 2 W L
1 3 W L
2 1 L W
2 3 L L
3 1 L W
3 2 L L
We kept 1 as the winning door as in the previous case. The Player can choose one door from the remaining two; i.e. the Host did not eliminate any. If Stay, Player still has 2 W and 4 L out of 6 as in the original form of the TV game. This time, however, the Change situation has only 2 W out of 6; since Host didn't eliminate a losing door, there are 2 extra L for a total of 4 out of 6.
And thus the winning chances for the player are:
Clearly now, it does NOT matter if Player Stays or Switches to another door. That's what confused the vast majority of people when the game was first analyzed publicly. They overlooked this very important condition: One losing door was eliminated in the real TV game show.
I understand this is a question for me now. Not because the question was repeated in newsgroups, but because I was asked directly to give an answer to it.
The answer to your question can be found in a recently discovered paradox. I named it Ion Saliu's Problem (Paradox) of N Trials.
“The limit of the degree of certainty DC is {1 — (1/e)} when N tends to infinity, for an event of probability p = 1/N and a number of trials equal to N. e represents the base of the natural logarithm and equals approximately 2.718281828... The limit 1 — (1/e) is approximately 0.63212055...”
Read: Theory of Probability: Best introduction, formulae, algorithms, software.
Read also: Fundamental Formula of Gambling: Theory of Probability, Mathematics, Chance.
4.1. First, a random sequence of digits (0-9) can be in the Birthday Paradox format; i.e. we draw 10 digits IN ONE DRAW. Simultaneous is the keyword here. It does make a difference if the numbers are drawn simultaneously, or one at a time. There are some shrewd (or screwed?) gamblers who apply the Birthday Paradox to roulette. But their calculations are wrong. Roulette draws one number at a time. The Birthday Paradox requires at least TWO elements simultaneously (in the same sequence, or drawing). Read: The Birthday Paradox: Combinatorics, Probability, Software, Pick 3 Lottery, Roulette (in the Resources section).
If you draw 10 digits in the same draw, the Birthday Paradox shows that the probability of the 10 digits being unique is 0.04%. That means that in 99.96% of the situations, the 'draw' consists of at least two duplicate digits.
4.2. You probably mean drawing ONE DIGIT AT A TIME. The probability of each digit is 1/10. We always need a 3rd element: the degree of certainty. Truth is an independent relation from any human. Truth has nothing to do with human confidence. Confident or not be you, o incredulous one, the truth has nothing to do with that. What matters is the degree of certainty that a relation exists. Our spite or joy seeing a relation is irrelevant to the relation.
The tool of choice to solve your problem is the Fundamental Formula of Gambling (FFG). One program that incorporates this extraordinary formula is my probability and statistical software SuperFormula. It calculates easily many probability and statistics parameters, even for gigantic values.
4.3. What is the degree of certainty that all 10 digits will appear in 10 draws? Select option C in SuperFormula, then 2 (the program calculates the probability p). Entries: p = 1/10, number of trials N = 10. The result: 65.13%. You draw a digit from 0 to 9. In most cases, you'll see that 6 or 7 DISTINCT (unique) digits come up.
You can use another piece of mathematical software I wrote: PermuteCombine. You can generate RANDOMLY 1000 pick-3 combinations (arrangements of 3 elements from 1 to 10 each). You'll notice most of the time the 1 – 1/e limit. There will be around 632 unique pick-3 combinations among the 1000 random combinations.
Axiomatic one, I remember in the 1990s a group of Australian investors bought tickets in Virginia, USA, Lottery. A lotto game had a huge jackpot, exceeding US$100 million. The investors bought a number of tickets equal to total possible combinations in that lotto game (I forget now what game it was, probably 6/49). The Aussies bought quick-picks: Random combinations generated by the computer of the lottery commission. They thought they had covered every possible combination. They hadn't! Was it a paradox? Luckily, they won the jackpot. They were lucky, indeed. They hadn't covered around 37% of the combinations. So, they faced a scary probability to lose: 37%! I think they had bought over US$13-million worth of tickets!
4.4. You can rephrase your question. How many lottery drawings will it take to get ALL 10 digits with a degree of certainty of 90%? Use option F in SuperFormula. Answer: 22 draws. How many draws will it take to get ALL 10 digits with a degree of certainty of 99%? Answer: 44 draws.
There have been several attempts to offer the most precise method of calculating the probability of N elements to appear with a degree of certainty. I believe every method is an approximation. The answer I have just given above — 44 draws for 10 digits — is also an approximation. I found some interesting new facts. It is undoubtedly accurate to say that a digit will appear within 44 trials with a degree of certainty of 99%. But that calculation refers to just one of the digits. There will be situations when more than one digit will skip more than 44 trials (drawings).
I wrote software to simulate The Classical Occupancy Problem or paradox, which calculates the most precise results:
RandomDigits — part of Scientia, integrated scientific software.
The program simulates a wide range of games, from coin tossing, dice throwing, 10 digits, 3-digit combinations of the pick-3 lottery, etc. The user can select how many times to run the program. For example, the 10-digit case, drawing one element at a time, from 0 to 9. The user can run the program 10,000 times.
I did it many times, with just seconds per run. If the output is displayed on screen, the user can see also the individual elements generated randomly. When all 10 digits have been generated, the software ends the current run, and starts the next. If the output is directed to disk, the execution is much faster and more comprehensive. The program saves to file total trials for each run. The runs are sorted in ascending order for the purpose of calculating the median. Two sample runs follow.
~ 10 digits; lower bound = 0; upper bound = 9; total runs: 10,000
Minimum number of trials: 10
Maximum number of trials: 164
Median number of trials: 30
~ one die; lower bound = 1; upper bound = 6; total runs: 10,000
Minimum number of trials: 6
Maximum number of trials: 91
Median number of trials: 15
You can run for the pick-3 game: lower bound = 0; upper bound = 999. It takes much longer to generate all 1000 pick-3 straight combinations. I saw runs of over 9,000 drawings (trials) before all pick-3 sets came out!
The median number of trials is a useful parameter. It represents the number of trials for a degree of certainty equal to 50%. SuperFormula has a special function: C = Calculate probability from median. The number of trials in the output files generated by RandomDigits represents, in fact, the skips. The digits will skip various numbers of draws before ALL of them appear. In the die throwing case, such probability for a median equal to 15 is: p = 1/23. In the 10-digit case, such probability for a median equal to 30 is: p = 1/44. These values get closer and closer to the reverse of N calculated by the Fundamental Formula of Gambling. It is option F in SuperFormula. FFG calculates N as:
N = log(1 — DC) / log(1 — p)
It appears that the probability of the Occupancy Problem is approximated by the formula 1/N:
poc = log(1 — p) / log(1 — DC),
for a DC = 0.99 (99%).
The fast computers of the today and the near future make possible simulations for longer and longer runs of RandomDigits — also part of Scientia, integrated scientific software. Therefore much more precise relations will be found.
I wrote also software to simulate Ion Saliu's Paradox of N Trials:
OccupancySaliuParadox.
This program supersedes OccupancySaliuParadox — it does everything RandomDigits does, plus simulates Ion Saliu's Paradox.
The player selects, for example, the 10 numbers case; the number of trials N = 10 in this case. The program generates randomly 10 numbers from 1 to 10. Each run of 10 trials will show also how many numbers are missing (numbers NOT drawn in a run of 10 trials). At the end, the program calculates the average of unique numbers per run. That value always tends to {1 — 1/e}.
We can generalize to N elements randomly drawn N at a time (e.g. rolling 6 dice at the same time). The probability of all N elements being drawn is equal to permutations over exponents. A precise formula reads:
I created another type of probability software that randomly generates unique numbers from N elements. I even offer the source code (totally free), plus the algorithms of random number generation (algorithm #2). For most situations, only the computer software can generate N random elements from a set of N unique or distinct items. Also, only the software can generate Ion Saliu's Sets (Exponents) when N is larger than even 5. Caveat: today's computers are not capable of handling very large Saliusian sets!
PermuteCombine can generate any possible set of numbers, including the almighty exponents (Saliusian Sets). Another program can take the output of PermuteCombine and count the favorable cases. For example: rolling the die 7 times. Count all occurrences of triple digits or two repeat-digits. That means that exactly 5 of the 6 point-faces were generated. Such case will calculate the probability that exactly one of the six digits is missing (equivalent to the probability of exactly 5 of 6 in 7 trials). This method of calculating the probability is absolutely precise. The random simulation is approximate, given a degree of certainty.
OccupancySaliuParadox.
This program supersedes RandomDigits — it does everything RandomDigits does, plus simulates Ion Saliu's Paradox.
The player selects, for example, the 10-number case; the number of trials N = 10 in this case. The probability program generates randomly 10 numbers from 1 to 10. Each run of 10 trials will show also how many numbers are missing (numbers NOT drawn in a run of 10 trials). At the end, the program calculates the average of unique numbers per run. That value always tends to 1 — 1/e. For double-zero roulette, the average of unique numbers in 38 spins is between 24 and 25 for the majority of cases.
Read Ion Saliu's first book in print: Probability Theory, Live!
~ Founded on valuable mathematical discoveries, including probability theory applied to various probability paradoxes or problems: Monty Paradox (Monte Hall Paradox), Ion Saliu's Paradox, Classical Occupancy Problem.
Resources in Theory of Probability, Mathematics, Statistics, Combinatorics, Software
See a comprehensive directory of the pages and materials on the subject of theory of probability, mathematics, statistics, combinatorics, plus software.