Bayes Theorem, Conditional Probabilities, Simulation, Surveys, Polling; Relation to Ion Saliu's Paradox

I. Formulas to Calculate Conditional Probabilities

• The foundation of Bayes' Theorem in a nutshell: determine the conditional probabilities from statistical data. It is not a bad idea and it led to the invention of simulation and special-purpose software. It also established the rule of multiplication in probability theory.
• One simple example is rolling a die. The result is known to the conductor of the experiment. The statistical data reveals that the outcome was an odd number. What is the probability that the result was the 5-point face? The die faces are either odd or even, so the probability of an odd result is p = 1/2. There are 3 odd point-faces, so the probability of the 5-point face, given an odd result, is p = 1/3. The formula for the conditional probability P of B given A is:
P(B|A) = P(A and B) / P(A)

• The and operator means multiplication. For independent events, P(A and B) = P(A) * P(B); in general, P(A and B) = P(A) * P(B|A).

In our example: P(B|A) = [1/2 * 1/3] / [1/2] = [1/6] / [1/2] = 2/6 = 1/3.
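The die example can be checked by counting outcomes directly. The following is a minimal sketch in Python; the helper name `conditional` and the two predicates are illustrative choices, not part of any library. It assumes all six faces are equally likely.

```python
from fractions import Fraction

def conditional(outcomes, event_b, event_a):
    """P(B|A) for equally likely outcomes: count(A and B) / count(A)."""
    in_a = [x for x in outcomes if event_a(x)]
    in_both = [x for x in in_a if event_b(x)]
    return Fraction(len(in_both), len(in_a))

faces = range(1, 7)

def is_odd(face):
    return face % 2 == 1

def is_five(face):
    return face == 5

p_five_given_odd = conditional(faces, is_five, is_odd)
print(p_five_given_odd)  # 1/3
```

Counting avoids any question about which probabilities to multiply: of the three odd faces, exactly one is the 5.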

Let's take a brute-force example: Elections in the United States. Let's say specialized organizations conducted surveys or exit polls. The figures that follow only attempt to be in step with reality.

They compiled the following statistics expressed as percentages. 48% of the voters are men; 52% are women. 60% of the polled citizens are white, 40% are minorities. 60% vote(d) Democrat, 40% vote(d) Republican.
• We can calculate what I call (and followed by many others in my name-calling!) degree of certainty (DC).
• Let's calculate the degree of certainty (or chance, or likelihood) of a minority AND woman. 40% * 52% = 20.8%.
• What is the degree of certainty for a man to vote Democrat? 48% * 60% = 28.8%. What is the degree of certainty for a minority to vote Republican? 40% * 40% = 16%.
• We can go one step further and calculate the probability for 3 parameters at once. What is the chance for a voter to be a woman AND a minority AND to vote Democrat? 52% * 40% * 60% = 12.48%.
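The products in the bullets above can be reproduced with a few lines of Python. This is only a sketch: the `share` table holds the illustrative survey figures from the text, the helper names `joint` and `as_percent` are made up for this example, and multiplying the shares assumes the traits are independent of one another.

```python
from fractions import Fraction
from math import prod

# Marginal shares from the text's illustrative survey (not real data).
share = {
    "man": Fraction(48, 100), "woman": Fraction(52, 100),
    "white": Fraction(60, 100), "minority": Fraction(40, 100),
    "democrat": Fraction(60, 100), "republican": Fraction(40, 100),
}

def joint(*traits):
    """Joint probability of several traits, ASSUMING they are independent."""
    return prod(share[t] for t in traits)

def as_percent(p):
    return f"{float(p * 100):.2f}%"

print(as_percent(joint("minority", "woman")))              # 20.80%
print(as_percent(joint("man", "democrat")))                # 28.80%
print(as_percent(joint("minority", "republican")))         # 16.00%
print(as_percent(joint("woman", "minority", "democrat")))  # 12.48%
```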

The above figures can be referred to as probabilities of simultaneous events. We can now try to calculate conditional probabilities. When we know the voter is a man, what is the probability that he votes Democrat?

P(B|A) = P(A and B) / P(A) = 48% * 60% / 48% = 28.8% / 48% = 60%.

Let's see the other side. We know the voter is a man; what is the probability that he votes Republican?

P(B|A) = P(A and B) / P(A) = 48% * 40% / 48% = 19.2% / 48% = 40%.
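A short sketch of the two calculations above, using exact fractions. The variable names are illustrative. The joint share is taken as the plain product of the two marginal shares, as in the text; under that assumption the condition cancels out and the result equals the unconditional share.

```python
from fractions import Fraction

p_man = Fraction(48, 100)
p_democrat = Fraction(60, 100)
p_republican = Fraction(40, 100)

def conditional(p_a_and_b, p_a):
    """P(B|A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a

# Assumption carried over from the text: the joint share is the plain product.
p_dem_given_man = conditional(p_man * p_democrat, p_man)
p_rep_given_man = conditional(p_man * p_republican, p_man)

print(p_dem_given_man, p_rep_given_man)  # 3/5 2/5
```

If a poll reported the joint share directly (for instance, the share of all voters who are men voting Democrat), that figure would replace the product and the conditional result would generally differ from 60%.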

"Much ado about nothing," right? We already knew that the Democrat would get 60%, while the Republican would get 40%. Many find the conditional step to be futile. Why would I want somebody else to tell me what suit a card was and then guess the value? The real probability means guessing in advance the color and value! Then, if I know what exact card was drawn, I can calculate the new probability.

There is a lot of confusion surrounding this mathematical rule. Some even apply it to mutually exclusive phenomena, such as coin tossing. As if there could be such a simultaneous event as Heads AND Tails! Or, Man AND Woman simultaneously (hermaphrodite, anyone?)

Bayes' Theorem applies only to non-mutually exclusive events, such as determining probabilities from surveys or polls. Surveys and polls are the only means of calculating such probabilities, as social events do NOT have predetermined formulas. Such events consist of two phases. Others give examples such as drawing a card (a Jack) from a deck: the experiment conductor tells the subject that the result was a face card; what is the probability that the picture card was a Jack? That experiment does NOT have two phases, and it should NOT be given as an example of applications of the Bayes Theorem. Lottery is in the same category.

The best application of the Bayes Theorem is polling. It is widely applied in political campaigns in the United States. Polling is the most typical case of a probabilistic event in two phases. Polls show quite accurate data on how various segments of the population vote. But nobody can predict the voter turnout.

• If the turnout is high in USA (close to 60%), the Democrat Party is heavily favored to win. There was an amazing fact in the 2012 presidential elections. The TV pundits determined that Obama (a Democrat) won the state of Pennsylvania — after only 10% of the votes were counted. The exit polls showed a large number of minorities and women voting, especially in the city of Philadelphia. Those two groups vote overwhelmingly Democrat. The Bayes' Theorem indicated, beyond statistical doubt, that Obama would be the winner — and he was by a substantial margin.
• Turnout in the 2012 Presidential Election was 58.2% (down from 61.6% in 2008, but still high by United States standards). Women represented 53% of voters, men 47%. The Democrat candidate, Barack Obama, received 60% of women's vote, while the Republican candidate, Mitt Romney, received 55% of men's vote. The results were quite close to polling data.
• If the turnout is close to the 2012 figure, the 2016 electoral results, based on similar gender and ideological bias, would favor the Democrat Party as the presumptive winner. These are the Bayes Theorem calculations: 53% * 60% (women) + 47% * 45% (men) = 31.8% + 21.15% ≈ 53% Democrat (47% Republican). It is likely that the Democrats will win the women's vote, while the Republicans are favored to win over the American men. It all depends on the percentage points, in turn dependent on voter turnout. The minorities also vote preponderantly Democrat, but their turnout is significantly lower in non-presidential (midterm) elections.
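The weighted sum in the last bullet (a form of the law of total probability) can be sketched as follows. The `groups` table simply restates the text's figures, and the function name `overall_share` is an illustrative choice. Changing the weights shows how sensitive the result is to turnout by group.

```python
# Figures from the text: (share of voters, Democrat share within that group).
groups = {
    "women": (0.53, 0.60),
    "men": (0.47, 0.45),
}

def overall_share(groups):
    """Weighted sum over groups: sum of P(group) * P(Democrat | group)."""
    return sum(weight * rate for weight, rate in groups.values())

democrat = overall_share(groups)
republican = 1 - democrat
print(f"Democrat {democrat:.4f}, Republican {republican:.4f}")
```

For instance, swapping the gender weights to 47% women and 53% men (a purely hypothetical turnout) lowers the Democrat share to about 52%.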

More controversial chances are compiled by insurance companies to determine the risk. For example, the risk probability for a potential customer to suffer from cancer, when they know the would-be customer is a man. The Bayes Theorem simply amplifies the numbers and gives the impression of wider discrepancies. Blind faith in statistical data can lead to discrimination or racial profiling. Statistics are faulty numerous times…

II. Simulation and Real Probabilities

Indubitably axiomatic: surveys and exit polls are forms of simulation. The problem with simulation: it is not a 100%-accuracy method. Let's keep in mind that the Bayes Theorem was conceived in the 18th century… way before computers were invented! The famed Ion Saliu's Paradox, with much help from computer programming, proves that only about 63% of all N possible elements will appear when N elements are generated at random. Around 37% of the elements will be missing. Since some elements are NOT in the statistical data, the Bayesians will calculate the probability p as 0 (zero) for those missing elements. That's wrong. The 1,2,3,4,5,6 lotto combination has not been drawn in any lotto game worldwide. That combination, however, has a probability greater than 0. In fact, that particular p is equal to the probability of any other lotto combination (e.g. 1 in 13+ million in a 6/49 lotto game).
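The 63% figure is easy to observe with a small random experiment. The sketch below draws N times with replacement from N possible elements and reports the share of distinct elements seen. The function names and the seed are arbitrary choices for illustration. The closed form 1 - (1 - 1/N)^N tends to 1 - 1/e, about 0.632, as N grows.

```python
import random

def unique_fraction(n_elements, rng):
    """Draw n_elements times with replacement; return the share of distinct elements seen."""
    seen = {rng.randrange(n_elements) for _ in range(n_elements)}
    return len(seen) / n_elements

def expected_unique_fraction(n_elements):
    """Expected share of distinct elements: 1 - (1 - 1/N)^N."""
    return 1 - (1 - 1 / n_elements) ** n_elements

rng = random.Random(2024)  # fixed seed, chosen arbitrarily for repeatability
simulated = unique_fraction(100_000, rng)
expected = expected_unique_fraction(100_000)
print(f"simulated: {simulated:.3f}, expected: {expected:.3f}")
```

An element missing from such a run still has probability 1/N, which is the point the paragraph above makes about observed frequencies of zero.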

Even the simplest cases, where the probability p = 1/2 (e.g. coin tossing) lead to… misleading simulations. If we toss the coin twice (number of trials N = 2), we expect the heads to appear once. NOT!! The degree of certainty that heads will appear is only 75%. We simulate 2 coin tosses and tails appears twice. Heads did not show up — is it probability 0 (zero)? NOT!!
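The 75% figure follows from the complement rule: the chance that heads never appears in N tosses is (1 - p)^N, so the chance that it appears at least once is 1 - (1 - p)^N. A minimal sketch, with an illustrative function name:

```python
def degree_of_certainty(p, n_trials):
    """Chance that an event of probability p shows up at least once in n_trials."""
    return 1 - (1 - p) ** n_trials

print(degree_of_certainty(0.5, 2))      # 0.75
print(1 - degree_of_certainty(0.5, 2))  # 0.25, the chance of tails twice
```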

• Only generating ALL elements in lexicographic order offers the 100% accurate method of calculating the probability.

In the case of tossing a coin, the lexicographical generating will look like this:
Heads
Tails
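For more than one toss, the full enumeration can be generated mechanically. The sketch below uses Python's `itertools.product`, which yields tuples in lexicographic order of the input sequence, to list every outcome of two tosses and count those containing heads. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = ("Heads", "Tails")  # already in lexicographic order

# Every possible outcome of two tosses, generated in lexicographic order.
outcomes = list(product(faces, repeat=2))
for outcome in outcomes:
    print(outcome)

with_heads = [o for o in outcomes if "Heads" in o]
p_heads = Fraction(len(with_heads), len(outcomes))
print(p_heads)  # 3/4
```

Because every outcome is listed exactly once, the count gives the exact 75% with no sampling error.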

Take the example of elections. There is no way to generate all possible outcomes in lexicographical order. Only polling can give an idea of outcome probabilities. There are, indeed, situations when lexicographic order generating is not possible, as the real probability isn't known. In such cases, an extremely large number of elements is necessary in random generation. An intermediary probability p1 can be determined. Then, that value p1 should be multiplied by 0.63 to get a value closer to the real probability p.

In the case of roulette, in a random simulation of 37 spins (one for each possibility on the roulette wheel), only about 24 unique numbers will appear. The Ion Saliu Paradox limit for roulette is 0.65. We divide 24 by 0.65 and the result is 36.9. The 1/36.9 figure is very close to the real roulette probability of 1/37.
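The roulette arithmetic can be sketched as below. The function names are illustrative. The first function reproduces the division in the text. The second computes the expected share of distinct numbers from the closed form 1 - (1 - 1/N)^N, which for N = 37 comes out a little under the rounded 0.65 used above.

```python
def estimated_total(unique_seen, limit):
    """Estimate the number of possible elements from the distinct ones observed."""
    return unique_seen / limit

def paradox_limit(n_elements):
    """Expected share of distinct elements in N draws from N: 1 - (1 - 1/N)^N."""
    return 1 - (1 - 1 / n_elements) ** n_elements

print(round(estimated_total(24, 0.65), 1))  # 36.9, the text's figure
print(round(paradox_limit(37), 3))          # 0.637
```

A single run of 37 spins will rarely show exactly the expected count, so an estimate from one short run should be treated as rough.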

The famed Fundamental Formula of Gambling (FFG) offers the best and most accurate method of determining the real probability from a randomly generated statistical series. We only need to generate as many random outcomes as we can think of — the more, the better.

The key point is the repeat of elements in random events. The Ion Saliu's Paradox is, in fact, derived from the Fundamental Formula of Gambling (FFG). Half of the elements will repeat after a number of trials N less than or equal to the FFG median.

Let's say I generate some 2000 pick-3 lottery sets and track how many trials it took for every set to repeat. We have another series of statistical data consisting of the number of trials N. My lottery software does it very easily. The software then sorts the N data series and finds the median of the series. I call it the FFG median: the value in the middle of the N data series. That value is close to 690. Keep in mind, several pick 3 sets will not come out in 2000 “drawings” (sets generated). Some sets will show up after 3000, or 4000, even after 5000 trials. The Bayes Theorem will conclude that those particular sets do not exist (as their probability appears to be 0)!

Next, I run the great piece of probability software SuperFormula (by Ion Saliu, of course!). The function we need is P = Probability for FFG Median and Number of Trials N. I enter as degree of certainty DC the value of 50 (for 50%, which is the middle). For Number of Trials N I type 690 (the number in the middle of the N series). The result: 1 in 995.96, very close to the real probability of 1 in 1000 for a pick 3 lottery game.
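The same number can be obtained by hand, assuming the software function inverts the relation DC = 1 - (1 - p)^N. That is an assumption about its internals, which are not shown here, though it does reproduce the quoted result. Solving for p gives p = 1 - (1 - DC)^(1/N). A minimal sketch with an illustrative function name:

```python
def probability_from_median(dc, n_trials):
    """Solve DC = 1 - (1 - p)^N for p, given DC and N."""
    return 1 - (1 - dc) ** (1 / n_trials)

p = probability_from_median(0.50, 690)
print(f"1 in {1 / p:.2f}")  # 1 in 995.96
```

With a median of 693 instead of 690, the same formula returns a value even closer to 1 in 1000.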

Ion Saliu's Paradox also determines an acceptable value of the famous large numbers or long run in stochastic events. If the event has a probability p expressed as 1/N, then generating N * 50 elements randomly satisfies the long run requirement. Today's computers show that every possible element will appear IF generating N * 50 random elements. In the case of a 6 of 49 lotto game, that means 13,983,816 * 50 = 699,190,800 random combinations.
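A rough check of the N * 50 rule, as a sketch under stated assumptions: each draw is uniform and independent, and the expected number of never-drawn elements is approximated by N * exp(-draws / N). The function names are illustrative.

```python
from math import comb, exp

def long_run_size(n_elements, factor=50):
    """Number of random draws suggested by the N * 50 rule."""
    return n_elements * factor

def expected_missing(n_elements, n_draws):
    """Approximate expected count of elements never drawn: N * exp(-draws / N)."""
    return n_elements * exp(-n_draws / n_elements)

total = comb(49, 6)          # number of 6/49 lotto combinations
draws = long_run_size(total)
print(total, draws)          # 13983816 699190800
print(expected_missing(total, draws))
```

The expected number of missing combinations after that many draws is far below one, which is consistent with the claim that every element appears.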

III. Probability Rules of Multiplication and Addition

1) Bayes' Theorem deals with conditional probabilities and simultaneous events. The calculation for the probability of simultaneous events involves the rule of multiplication. In blackjack, for example, the house edge is determined by the simultaneous bust for player and dealer. Since the bust probability (following the dealer's rules) is 33%, the chance for simultaneous busts is 33% * 33% ≈ 11%. The player gets a few bonuses that may reduce the house advantage to some 7% (the best case scenario for players!)

2) There is also another important probability rule: Addition. It applies to mutually exclusive and non-mutually exclusive events.

2.1) In the case of two mutually exclusive events A, B, the probability that A or B will occur is the sum of the probability of each event.

P(A or B) = P(A) + P(B)

In our political example, the probability for a voter to be a man or a woman is 48% + 52% = 100%. Evidently! The probability to get point-face 4 or point-face 6 when casting a die is 1/6 + 1/6 = 2/6 = 1/3.

2.2) In the case of two non-mutually exclusive events A, B, the probability that A or B will occur is the sum of the probability of each event, minus the probability of the simultaneous event.

P(A or B) = P(A) + P(B) - P(A and B)

In our political example, we have a more complicated case: The probability to be white or to vote Democrat. This is a non-mutually exclusive event; it is not an either-or situation, as in coin-tossing. This specific chance is calculated as: P(A or B) = 60% + 60% - 36% = 84%.
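Both forms of the addition rule fit in one small function, since the mutually exclusive case is simply the one where P(A and B) = 0. This is a sketch with an illustrative function name. The joint share for the political example is taken as the product 60% * 60% = 36%, as in the text.

```python
from fractions import Fraction

def p_or(p_a, p_b, p_a_and_b=0):
    """P(A or B) = P(A) + P(B) - P(A and B); the last term is 0 for mutually exclusive events."""
    return p_a + p_b - p_a_and_b

# Mutually exclusive: point-face 4 or point-face 6 on one die.
die_result = p_or(Fraction(1, 6), Fraction(1, 6))
print(die_result)  # 1/3

# Non-mutually exclusive: white or Democrat, joint share taken as the product.
p_white = Fraction(60, 100)
p_democrat = Fraction(60, 100)
poll_result = p_or(p_white, p_democrat, p_white * p_democrat)
print(poll_result)  # 21/25
```

The fraction 21/25 equals the 84% in the text.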

These three essays deal with probability topics, with a lot of help from computers and software: