"The most important questions of life are, for the most part, really only problems of probability."
(Pierre Simon de Laplace, "Théorie Analytique des Probabilités")
I view the best introduction as necessarily the best treatise. Science often goes into so much detail that it sounds like jargon and resembles a game, something like a theory of word puzzles: a lot of jargon for little practical purpose. The theory of probability undoubtedly has its jargon. It also has a huge number of formulas and equations with seemingly no practical purpose but to torment sleepless students before exams.
I get my share of questions regarding various aspects of probability. I also see plenty of probability problems in public forums. Who can answer all those questions? What I notice, however, is a deficiency in how the theory of probability is introduced. The essentials are probably skipped too fast in order to cover all those insomnia-causing topics.
I previously wrote a few pages dedicated to probability and odds. Since I am unable to respond to most private questions and requests, I now try to put together the essential introduction to the theory of probability. We must start at the start: the mother of all probability formulas; the formula that gives birth to many other formulas. We must also formulate fundamental algorithms for analyzing a wide variety of probability problems. Then we must put on the table the most efficient instrument for answering (almost) all questions on probability.
It is the Fundamental Formula of Probability (FFPr), p = n / N (the number of favorable cases n over the total number of possible cases N), and everything in the theory of probability is derived from it.
A simple example of sets and favorable elements: there are 10 balls in a jar; 5 of the 10 balls are red; what is the probability to extract one red ball from the jar? There are 5 favorable cases in a total of 10 elements; the probability is 5/10 = 1/2 = 0.5.
The easiest case is coin tossing. What is the probability of heads in one coin toss? It is 1/2 = 0.5.
The probability can also be understood as the expected number of successes in one trial. That formulation makes it easier to understand why probability can never be higher than 1: no event can have more than one success in one try! The flip side is that phenomena with only one side do not exist. The Everything does have at least two sides. In fact, The Everything is a unity of two opposites: the all-encompassing Flipping Coin paradigm.
The term odds is often used in the theory of probability, especially in the branch dealing with gambling. The most correct usage of odds points to the degree of difficulty for an event to occur. The odds are N to n or (N - n) to n. This form is widely used in horse racing: the odds for Horse X to win are 5 to 2. The N to n case represents the odds against.
Another form of odds is used now even more widely than the term probability or probabilities. It is the expression favorable odds. The favorable odds are calculated and expressed as n in N, most frequently as 1 in N1. If probability is most often expressed as a floating point number between 0 and 1, the favorable odds tend to be expressed as 1 in Number_Greater_Than_1, usually an integer. The odds of winning exactly 3 of 6 in a 6/49 lotto game are 1 in 57.
Personally, I prefer expressing the probability as two terms: n/N and 1 in N1. I never use the expression N to n.
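The three notations discussed above (n/N, "1 in N1", and "(N - n) to n") can be converted into one another mechanically. The following is a minimal Python sketch; the function names are hypothetical, and reading the horse-racing example "5 to 2" as n = 2 favorable cases out of N = 7 is an assumption made only for illustration.

```python
from fractions import Fraction
from math import gcd

def probability(n: int, N: int) -> Fraction:
    """FFPr: favorable cases n over total possible cases N."""
    if N <= 0 or not 0 <= n <= N:
        raise ValueError("need N > 0 and 0 <= n <= N")
    return Fraction(n, N)

def one_in(p: Fraction) -> float:
    """Favorable odds: express p as '1 in N1' (p must be non-zero)."""
    return float(1 / p)

def odds_against(n: int, N: int) -> tuple:
    """Odds against in the form '(N - n) to n', reduced to lowest terms."""
    g = gcd(N - n, n)
    return ((N - n) // g, n // g)

# 5 red balls out of 10 in the jar
p = probability(5, 10)
print(p, float(p), f"1 in {one_in(p):g}")   # 1/2 0.5 1 in 2
# Assumed reading of '5 to 2': 2 favorable cases out of 7
print(odds_against(2, 7))                   # (5, 2)
```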
In general, the total number of possible cases is calculated by combinatorics, a branch of mathematics. Still at this Web site, you'll find the best presentation of combinatorics. All four numeric sets are clearly presented. Moreover, specialized software is provided to generate all possible types of sets and also calculate total possible elements in the sets: Permute Combine. In other cases, calculating the total number of cases is a matter of enumeration. How many sides does the coin have? Two: that's the number of total cases. A die has six faces: that's the total of possible cases.
Determining the number of favorable cases is more difficult in most cases. In a very simple case like betting on heads (not over head!) in coin tossing it is easy. There is one favorable case out of two. Betting on face six of a die is also easy: One favorable case (out of six).
Probability events are either inseparable or separable, and based on this attribute we calculate the probability differently. In general, in the inseparable case (e.g. coin tossing or dice rolling) we apply the very first formula of probability (FFPr = n/N) or the binomial distribution (a more general and encompassing case than the first formula). In the separable case of probabilistic events (e.g. lotto drawings), we calculate the probability by applying the hypergeometric distribution.
The questions can get more and more complicated. That's how the so-called probability problems come to life. What is the probability to get heads five times in a row? What is the probability to roll face six of a die exactly 4 times in 10 throws? How about the probability to get at most 4 heads in 15 coin tosses? Or, what is the probability of getting at least 4 winners out of 6 numbers drawn in a 6/49 lotto game? To answer such questions we need to apply more sophisticated and complicated formulas and algorithms and even computer programs.
$$\mathrm{BDF} = \frac{N!}{(N - M)!\,M!}\; p^{M}\,(1 - p)^{N - M}$$
Or, using a simpler equation of combinations C(N, M):
$$\mathrm{BDF} = C(N, M)\; p^{M}\,(1 - p)^{N - M}$$
BDF = the binomial distribution probability
p = the individual probability of the phenomenon (e.g. p = 0.5 to get tails in coin tossing)
M = the exact number of successes (e.g. exactly 5 tails in 10 coin tosses)
N = the number of trials (e.g. the 10 coin tosses in "exactly 5 tails in 10 coin tosses")
C(N, M) = combinations of N elements taken M at a time.
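The questions raised earlier (heads five times in a row, face six exactly 4 times in 10 throws, at most 4 heads in 15 tosses) can all be answered with the binomial formula. Below is a minimal Python sketch using the same symbols M, N and p; the function names are hypothetical, and the "at most" and "at least" variants are simply sums of the "exactly" case.

```python
from math import comb

def binomial_exact(M: int, N: int, p: float) -> float:
    """Probability of exactly M successes in N trials: C(N, M) * p^M * (1 - p)^(N - M)."""
    return comb(N, M) * p**M * (1 - p)**(N - M)

def binomial_at_most(M: int, N: int, p: float) -> float:
    """Probability of 0, 1, ..., M successes in N trials."""
    return sum(binomial_exact(k, N, p) for k in range(M + 1))

def binomial_at_least(M: int, N: int, p: float) -> float:
    """Probability of M, M + 1, ..., N successes in N trials."""
    return sum(binomial_exact(k, N, p) for k in range(M, N + 1))

print(binomial_exact(5, 5, 0.5))      # heads five times in a row
print(binomial_exact(4, 10, 1 / 6))   # face six exactly 4 times in 10 throws
print(binomial_at_most(4, 15, 0.5))   # at most 4 heads in 15 tosses
print(binomial_exact(5, 10, 0.5))     # exactly 5 tails in 10 tosses
```

Run as written, the first three lines print roughly 0.03125, 0.0543 and 0.0592.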
The hypergeometric distribution probability formula has certain restrictions. They are nasty, especially for a computer programmer trying to implement probability algorithms. Some cases are logically impossible; e.g. 1 of 6 in 10 from 10.
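To make the restrictions concrete, here is a minimal Python sketch of the standard hypergeometric formula, written in the "k of m in n from N" form (exactly k hits among m drawn, from a pool of n favorites, out of N total). The function names and the choice to return 0 for logically impossible cases are assumptions for illustration, not a description of how any particular program handles them.

```python
from math import comb

def hypergeometric_exact(k: int, m: int, n: int, N: int) -> float:
    """Probability of exactly k of m drawn items falling in a pool of n, from N total."""
    if N <= 0 or not (0 <= n <= N and 0 <= m <= N):
        raise ValueError("need N > 0, 0 <= n <= N and 0 <= m <= N")
    # Logically impossible cases: more hits than drawn items or pool items,
    # or more misses than there are items outside the pool.
    if k < 0 or k > m or k > n or (m - k) > (N - n):
        return 0.0
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)

def hypergeometric_at_least(k: int, m: int, n: int, N: int) -> float:
    """Probability of at least k of m in n from N."""
    return sum(hypergeometric_exact(i, m, n, N) for i in range(max(k, 0), m + 1))

print(hypergeometric_exact(1, 6, 10, 10))     # impossible case: 0.0
print(hypergeometric_exact(3, 6, 6, 49))      # exactly 3 of 6 in a 6/49 game
print(hypergeometric_at_least(4, 6, 6, 49))   # at least 4 of 6 in a 6/49 game
```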
~ A problem like this one shows up in forums and newsgroups and emails. Let's throw four dice. What is the probability to get all faces equal (e.g. 1-1-1-1)? This is simple actually. No hypergeometric or binomial is necessary. Total number of cases: 6^{4} = 1296. There is a total of 6 favorable cases, from:
1-1-1-1
to
6-6-6-6.
Clearly, the probability is (n / N): p = 6/1296 = 0.004629 (1 in 216).
The case can be further complicated by asking: What is the probability to get at least one pair (e.g. 1-1 or 3-3)? The total number of cases is the same as above. The 6 previously favorable cases can be broken down into cases of at least two faces being equal. The probability for at least or at most cases can be calculated using my probability programs Formula (16-bit software) and especially Super Formula (powerful 32-bit). First, we can calculate the probability for each of the six faces. To get at least two of the same face when throwing four dice, there are 6 possibilities. Since we have 6 pair possibilities, the number of favorable cases becomes 6 * 6 = 36. Finally, the probability to get at least two of the same point-face when throwing four dice is: 36 / 1296 = 0.0278 (1 in 36).
Wrong! It isn't a separable event, remember? It is a type of the well-known Birthday Paradox (or probability of duplication, or odds of collisions).
In the probability problem of four dice, the Birthday Paradox parameters are: lower bound = 1, upper bound = 6, total elements (number of dice) = 4. The probability to get at least two dice showing the same point face when throwing four dice is: 0.7222 or 1 in 1.385. It is easy to verify without software. Throw four dice. In almost three out of four rolls, at least two dice show the same face.
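The four-dice figure can also be checked with a few lines of code. The sketch below is a minimal Python illustration with hypothetical function names: it computes the probability of at least one duplicate as one minus the probability that all dice differ, then confirms the count by enumerating all 6^4 = 1296 rolls.

```python
from itertools import product
from math import perm

def prob_all_distinct(items: int, faces: int) -> float:
    """Probability that `items` random values out of `faces` possibilities are all different."""
    if items > faces:
        return 0.0
    return perm(faces, items) / faces**items

def prob_at_least_one_duplicate(items: int, faces: int) -> float:
    """Birthday Paradox probability: at least two of the values coincide."""
    return 1 - prob_all_distinct(items, faces)

def count_rolls_with_duplicate(items: int, faces: int) -> int:
    """Brute-force check: enumerate every roll and count those with a repeated face."""
    return sum(1 for roll in product(range(1, faces + 1), repeat=items)
               if len(set(roll)) < items)

print(prob_at_least_one_duplicate(4, 6))   # about 0.7222
print(count_rolls_with_duplicate(4, 6))    # 936 of 1296 rolls
```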
Also, the probability for the four dice to show the same point face is precisely calculated by using the exponential sets. A die always has six faces! More odds: the probability to get exactly 1-1-1-1 = 1/1296; the probability to get exactly 6-6-6-6 = 1/1296; the probability to get exactly 1-2-3-4 = 1/1296.
The pick 3 and pick 4 lottery games should be considered forms of dice rolling, and therefore inseparable phenomena. A drawing machine is a 10-face die. The pick 3 game is like casting three 10-faceted dice. The slot machines, by extension, are the equivalent of casting multi-faceted dice (usually three dice).
~ Another probability problem that pops up in forums and newsgroups and emails. A jar contains 7 red balls, 6 black balls, 5 green balls, and 3 white balls. We can construct a huge variety of probability problems with the 21 balls. For example, the probability to draw exactly 5 balls with this exact composition: 2 red, 2 black, 1 white.
My quick response was: Apply the hypergeometric distribution for each color.
- Exactly 2 red of 5 drawn in 7 red from a total of 21 balls: 0.375645 (1 in 2.662)
- Exactly 2 black of 5 drawn in 6 black from a total of 21 balls: 0.335397 (1 in 2.982)
- Exactly 1 white of 5 drawn in 3 white from a total of 21 balls: 0.4511278 (1 in 2.217)
Then, I combined the 3 floating-point results to get the simultaneous probability: 0.056838 or 1 in 17.6.
Software to the rescue! I am fortunate to be a very skilled programmer in these fields of mathematics and science. As a matter of truth, my software is still unique in the combinatorial, probabilistic, statistical fields. Thusly, generating combinations or permutations or arrangements from various groups of numbers, I noticed some results differed from my calculations above.
I came back to the previous probability problem of drawing EXACTLY a number of balls of various colors and pools.
- Exactly 2 red from a total of 7 red balls: Combinations of 7 taken 2 at a time = 21 cases;
- Exactly 2 black from a total of 6 black balls: C(6, 2) = 15 elements;
- Exactly 1 white from a total of 3 white balls: C(3, 1) = 3 cases.
We combine the elements of the 3 groups of differently colored balls (or different numbers, for that matter): 21 * 15 * 3 = 945.
Total number of combinations given by 21 balls taken 5 at a time: 20349. The simultaneous probability is 945 / 20349 = 0.04644 or 1 in 21.5.
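The counting argument above translates directly into code. Below is a minimal Python sketch (the function name and the dictionary layout are hypothetical): multiply the combinations within each color group, then divide by the combinations of all 21 balls taken 5 at a time.

```python
from math import comb

def composition_counts(pools: dict, picks: dict) -> tuple:
    """Return (favorable, total) for drawing exactly picks[color] balls of each color.

    pools maps each color to the number of balls of that color in the jar;
    picks maps a color to how many of that color must be drawn (missing colors mean 0).
    """
    total_balls = sum(pools.values())
    drawn = sum(picks.values())
    favorable = 1
    for color, size in pools.items():
        favorable *= comb(size, picks.get(color, 0))
    return favorable, comb(total_balls, drawn)

jar = {"red": 7, "black": 6, "green": 5, "white": 3}
favorable, total = composition_counts(jar, {"red": 2, "black": 2, "white": 1})
print(favorable, total, favorable / total)   # 945 20349 0.0464...
```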
My software will generate precisely the amount of combinations from those three groups. We substitute the colored balls by groups of numbers such as lotto decades, or frequency groups, etc. One such program is named SkipDecaFreq; it works with decades (e.g. 1-9, 10-19, etc.), lottery frequency groups, odd, even, low, high lotto numbers, skips of lottery numbers (i.e. misses between hits).
How about the probability to draw 5 balls and get at least one ball of each color? Applying now the W option of SuperFormula (Win at least Lotto, Powerball):
- At least 1 red of 5 drawn in 7 red from a total of 21 balls: 0.9016 (1 in 1.109)
- At least 1 black of 5 drawn in 6 black from a total of 21 balls: 0.8524 (1 in 1.173)
- At least 1 green of 5 drawn in 5 green from a total of 21 balls: 0.7853 (1 in 1.273)
- At least 1 white of 5 drawn in 3 white from a total of 21 balls: 0.5789 (1 in 1.727)
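The product of the four is 0.3494 or 1 in 2.86. As in the previous problem, however, the four events are not independent, so the product is not the exact answer. Counting combinations directly, one color must contribute 2 balls and each of the other three colors 1 ball: 1890 + 1575 + 1260 + 630 = 5355 favorable combinations out of 20349. The correct combined probability is 5355 / 20349 = 0.2632 or 1 in 3.8.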
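A direct count over color compositions is one way to check a figure like this. The minimal Python sketch below (hypothetical function names) computes two quantities side by side: the product of the four separate "at least 1" probabilities, and the exact count of 5-ball combinations that contain at least one ball of every color. The two differ because the four events are not independent.

```python
from itertools import product
from math import comb, prod

POOLS = (7, 6, 5, 3)   # red, black, green, white
DRAWN = 5

def product_of_marginals(pools, drawn):
    """Multiply the separate 'at least 1 of this color' probabilities."""
    total = comb(sum(pools), drawn)
    return prod(1 - comb(sum(pools) - size, drawn) / total for size in pools)

def count_at_least_one_each(pools, drawn):
    """Count combinations holding at least one ball of every color."""
    favorable = 0
    for picks in product(*(range(1, size + 1) for size in pools)):
        if sum(picks) == drawn:
            favorable += prod(comb(size, k) for size, k in zip(pools, picks))
    return favorable

total = comb(sum(POOLS), DRAWN)
exact = count_at_least_one_each(POOLS, DRAWN)
print(product_of_marginals(POOLS, DRAWN))   # about 0.3494
print(exact, total, exact / total)          # 5355 20349, about 0.2632
```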
The degree of certainty can be viewed as a probability of probability strongly connected to a number of trials. The master formula that calculates the number of trials N for an event of probability p to appear with a degree of certainty DC is known as the Fundamental Formula of Gambling. We may also call it the Fundamental Formula of The Universe or the Formula of The Everything.
If p = 1 / N, we can discover an interesting relation between the degree of certainty DC and the number of trials N. The degree of certainty has a limit, when N tends to infinity. Let's analyze a few particular cases.
Rolling the unbiased dice; actually just one die. The probability to get any one of the point faces is p = 1/6. The degree of certainty DC to get any one of point faces in 6 throws is 66.5%.
Spinning the roulette wheel. The probability to get any one of the 38 numbers is p = 1/38. The degree of certainty DC to get any one of the numbers in 38 spins is 63.7%.
Let's look at a case with a very large number of possibilities, therefore a very low probability: a lotto 6/49 game. The total number of possible combinations in a 6/49 lotto game is 13,983,816. The probability to get any one of the combinations is p = 1/13,983,816. The degree of certainty DC to get any one of the combinations in 13,983,816 drawings is 63.212057% (0.632120571967238...).
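The three cases above all use the same relation between p, the number of trials N, and the degree of certainty: DC = 1 - (1 - p)^N. The Python sketch below is a minimal illustration with hypothetical function names; the inverse function, N = log(1 - DC) / log(1 - p), is obtained by solving that relation for N and is offered here as an assumption about the form of the master formula. The `log1p`/`expm1` calls are only a numerical-stability choice for very small p.

```python
from math import ceil, expm1, log1p

def degree_of_certainty(p: float, trials: int) -> float:
    """DC = 1 - (1 - p)^trials, computed stably for very small p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return -expm1(trials * log1p(-p))

def trials_needed(p: float, dc: float) -> int:
    """Smallest whole N with 1 - (1 - p)^N >= dc, i.e. N = log(1 - dc) / log(1 - p)."""
    if not (0 < p < 1 and 0 <= dc < 1):
        raise ValueError("need 0 < p < 1 and 0 <= dc < 1")
    return ceil(log1p(-dc) / log1p(-p))

print(degree_of_certainty(1 / 6, 6))                  # one die, 6 throws: about 0.665
print(degree_of_certainty(1 / 38, 38))                # roulette, 38 spins: about 0.637
print(degree_of_certainty(1 / 13983816, 13983816))    # lotto 6/49: about 0.63212057
print(trials_needed(1 / 6, 0.5))                      # throws for a 50% degree of certainty
```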
I noticed a mathematical limit. I saw clearly: $\lim_{N \to \infty} (1 - 1/N)^{N}$ is equal to 1 / e (e represents the base of the natural logarithm or approximately 2.71828182845904...). Therefore:
The limit of the degree of certainty is $1 - (1 / e)$, equal to approximately 0.632120558828558...
I tested for N = 100,000,000, N = 500,000,000, and N = 1,000,000,000 (one billion) trials. The results ever so slightly decrease, approaching the limit but never surpassing the limit!
When N = 100,000,000, then DC = .632120560667764...
When N = 1,000,000,000, then DC = .63212055901829...
(Calculations performed by Super Formula, option C = Degree of Certainty (DC), then option 1 = Degree of Certainty (DC), then option 2 = The program calculates p.)
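The same test can be repeated independently of any particular program. Below is a minimal Python sketch (hypothetical function name) that evaluates DC = 1 - (1 - 1/N)^N for the three values of N and compares each result with 1 - 1/e. Because ordinary floating-point arithmetic is used, the trailing digits may differ slightly from those quoted above.

```python
from math import e, expm1, log1p

def dc_one_in_n(N: int) -> float:
    """Degree of certainty for p = 1/N over N trials: 1 - (1 - 1/N)^N."""
    return -expm1(N * log1p(-1.0 / N))

LIMIT = 1 - 1 / e   # approximately 0.632120558828558

values = [dc_one_in_n(N) for N in (100_000_000, 500_000_000, 1_000_000_000)]
for N, dc in zip((100_000_000, 500_000_000, 1_000_000_000), values):
    print(f"N = {N:>13,}  DC = {dc:.15f}  DC - limit = {dc - LIMIT:.3e}")
```

Each value is slightly above the limit, and the gap shrinks as N grows.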
You can see the mathematical proof right here, for the first time. I created a PDF file with nicely formatted equations:
The general formula of probability p = n / N (favorable cases over total possibilities) does not always lead to 1 / {Integer} cases (e.g. {1 in N}). For example, play 3 roulette numbers at a time. In this example, n = 3 and N = 38. The division 38/3 does not result in an integer. The Ion Saliu Paradox is not limited to 1 / Integer cases. In the roulette case here, playing 3 roulette numbers in 38 spins, the paradox leads to this result:
$1 - (1 / e)^{3} = 1 - 0.05 \approx 0.95$
The generalized Ion Saliu Paradox for p = n / N and N trials:
The degree of certainty DC tends to $1 - (1 / e)^{n} = 1 - e^{-n}$, when N tends to infinity, regardless of probability p.
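The generalized case can be illustrated numerically as well. The following minimal Python sketch (hypothetical function names) computes DC = 1 - (1 - n/N)^N for the roulette example and for a much larger N with the same n, next to the limiting value 1 - e^(-n).

```python
from math import expm1, log1p

def dc_n_of_N(n: int, N: int) -> float:
    """Degree of certainty for p = n/N over N trials: 1 - (1 - n/N)^N."""
    if not 0 < n < N:
        raise ValueError("need 0 < n < N")
    return -expm1(N * log1p(-n / N))

def dc_limit(n: int) -> float:
    """Limiting value 1 - e^(-n) as N tends to infinity."""
    return -expm1(-n)

print(dc_n_of_N(3, 38))          # 3 roulette numbers, 38 spins: about 0.956
print(dc_n_of_N(3, 38_000_000))  # same n, far larger N
print(dc_limit(3))               # 1 - e^(-3): about 0.9502
```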
There are people, indeed with training in mathematics and probability theory, who don't accept the idea that a mathematical LIMIT can be reached both from the left (incrementally) and from the right (decreasingly). The limit of this probability paradox is reached decreasingly, whereas e is typically viewed as an incremental limit. Run my great program Super Formula, option C = Degree of Certainty DC.
In the renowned Problem of Coincidences or Couple, Spouse Swapping we had encountered a very similar limit: 1 / e. Very interesting how everything in life, indeed in the Universe is...coupled!
Is it the same thing to gamble one way or the other? For example: play one lotto ticket in N consecutive drawings, OR N tickets in one drawing? Something is very clear now. If you play one roulette number for 38 spins, you are not guaranteed to win! You have only a 63.7% chance to win. On the other hand, if you play all 38 numbers, you are guaranteed to win! You, who have ears to hear, don't bet it all on one spin or number. Play more numbers or tickets at once. The probability is significantly lower for two or more numbers or combinations to have simultaneously long losing streaks.
The mathematics of Ion Saliu's Paradox (Problem of N Trials) is presented in detail on the Mathematical Foundation of Fundamental Formula of Gambling page.
I wrote also software to simulate Ion Saliu's Paradox of N Trials:
Occupancy Saliu Paradox, special probability software.
Let me ask you another question, an axiomatic one. What is the probability to randomly generate N elements at a time and have ALL N elements be unique?
Let's say we have 6 dice (since a die has 6 faces or elements); we throw all 6 dice at the same time (a perfectly random draw). What is the probability that all 6 faces will be unique (i.e. from 1 to 6 in any order)? The total number of possible cases is calculated by the Saliusian sets (or exponents): 6^{6} (6 to the power of 6) or 46656. The total number of favorable cases is represented by permutations. The permutations are calculated by the factorial: 6! = 720. We calculate the probability of 6 unique point-faces on all 6 dice by dividing permutations by exponents: 720 / 46656 = 1 in 64.8.
We can generalize to N elements randomly drawn N at a time. The probability that all N elements are unique is equal to permutations over exponents. A precise formula reads: $P = N! / N^{N}$.
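"Permutations over exponents" is short enough to compute exactly. Here is a minimal Python sketch (the function name is hypothetical) that returns N! / N^N as an exact fraction and reproduces the six-dice figure.

```python
from fractions import Fraction
from math import factorial

def prob_all_unique(N: int) -> Fraction:
    """Probability that N values drawn at random from N possibilities are all different."""
    if N < 1:
        raise ValueError("N must be at least 1")
    return Fraction(factorial(N), N**N)

p = prob_all_unique(6)
print(p, float(p), f"1 in {float(1 / p):.1f}")   # 5/324 0.0154... 1 in 64.8
```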
This situation is related to another famous probability case: the Classical Occupancy Problem. This problem deals with the number of trials necessary for all N elements to appear in a random drawing, IF drawing 1 element at a time. That is, how many drawings (trials) would it take to have all 10 digits appear? I believe I offered the best solution to the Classical Occupancy Problem, including precise software. The blunt answer to that problem is: Infinity. That is a mathematical absurdity! We must correlate the Classical Occupancy Problem to a degree of certainty. For example: any one given digit will be randomly drawn within 44 trials with a degree of certainty equal to 99% (1 - 0.9^{44} is approximately 0.990); requiring all 10 digits to have appeared within the same 44 trials lowers the figure to about 90.5% by inclusion-exclusion.
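Two different quantities can be computed for the 10-digit example: the degree of certainty that one given digit appears within T trials, and the probability that all 10 digits have appeared within T trials. The minimal Python sketch below uses the textbook inclusion-exclusion count for the second quantity; this is a standard technique and not necessarily the method used in the software mentioned above, and the function names are hypothetical.

```python
from math import comb

def dc_single_element(elements: int, trials: int) -> float:
    """Degree of certainty that one given element appears at least once in `trials` draws."""
    return 1 - (1 - 1 / elements) ** trials

def prob_all_seen(elements: int, trials: int) -> float:
    """Probability that every element has appeared at least once in `trials` draws.

    Inclusion-exclusion over the set of elements that never appear,
    done in exact integer arithmetic before the final division.
    """
    favorable = sum((-1) ** k * comb(elements, k) * (elements - k) ** trials
                    for k in range(elements + 1))
    return favorable / elements ** trials

for trials in (10, 44, 66):
    print(trials, dc_single_element(10, trials), prob_all_seen(10, trials))
```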
I created another type of probability software that randomly generates unique numbers from N elements. I even offer the source code (totally free), plus the algorithms of random number generation (algorithm #2). For most situations, only the computer software can generate N random elements from a set of N unique or distinct items. Also, only the software can generate Ion Saliu's sets (exponents) when N is larger than even 5. Caveat: today's computers are not capable of handling very large Saliusian sets!
This is such an important step. We can make lots of mistakes doing the calculations by hand or even using calculators. The more complicated the formulas the higher the probability to make errors. The computer programs are of invaluable help, really. I know it firsthand. Moreover, good computer programs can also generate all the elements of various sets for various probability cases. It is similar to the case above involving four categories of balls.
You can use Permute Combine, the Word option for Combinations sets. You first need to write a simple text file consisting of 21 lines. Each line consists of one letter (the beginning of each color), from:
R
R
to
W
W
W
If you select the Lexicographical generation, there are 20,349 combinations. It would be hard to count by hand all configurations such as RRBBW, WRBBR, etc. It would be easier to generate just 100 random combinations. The results should be close to the theoretical probabilities. I tested both ways, using also in-house software. This is THE most accurate method of determining the real probability: generate ALL possible elements of the set (in our case, 5-letter combinations RRBBW, WRBBR, etc.). Then, count FAVORABLE occurrences (e.g. Red-White-Black-Red-Black, White-Black-Red-Red-Black, etc.).
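The generate-then-count method can be sketched with the Python standard library alone. This is a minimal illustration of the idea, not a description of how Permute Combine works internally; the function name is hypothetical, and each combination is reduced to a sorted string of color letters so that RRBBW and WRBBR count as the same configuration.

```python
from collections import Counter
from itertools import combinations

# 21 balls: 7 red, 6 black, 5 green, 3 white (one letter per ball)
BALLS = "R" * 7 + "B" * 6 + "G" * 5 + "W" * 3

def count_configurations(balls: str, drawn: int) -> Counter:
    """Generate every combination of ball positions and tally the color configurations."""
    counts = Counter()
    for combo in combinations(range(len(balls)), drawn):
        # Sort the letters so the order of drawing does not matter
        key = "".join(sorted(balls[i] for i in combo))
        counts[key] += 1
    return counts

counts = count_configurations(BALLS, 5)
total = sum(counts.values())
print(total)                                      # 20349 combinations
print(counts["BBRRW"], counts["BBRRW"] / total)   # 2 red, 2 black, 1 white
```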
More probability, mathematics, scientific software that I created and you can use with no problem (if you register):
SuperFormula: The best probability software, period. Among many other functions, the program does a multitude of probability calculations: exactly, at least, and at most. Super Formula also calculates the relations between the probability p, the degree of certainty DC and the number of trials N.
FORMULA. This program does the calculations for the Fundamental Formula of Gambling as presented on the Fundamental Formula of Gambling page. The program can perform plenty of probability calculations. 16-bit software superseded by SuperFormula.
OddsCalc calculates the odds of any lotto game, including Powerball, Mega Millions, Euromillions and Keno. If the game draws 6 winning numbers, the program calculates the odds from 0 of 6 to 6 of 6 (the jackpot case).
ODDS calculates the lotto odds using the hypergeometric distribution probability. The odds (probabilities) are calculated as k of m in n from N. More clearly, let's suppose a lotto 6/49 game. The lottery draws 6 winning numbers. The player must play exactly 6 numbers per ticket. But the player can choose to play a pool of 10 favorite numbers. What is the probability to get 4 of 6 in 10 from 49? The odds: 1 in 90.
PermuteCombine, the universal permutations, arrangements and combinations calculator and generator for any numbers and words.
LexicographicSets, the universal permutations, arrangements and combinations lexicographic indexing (ranking).
OccupancySaliuParadox simulates Ion Saliu's Paradox of N Trials.
BirthdayParadox is based on the popular probability problem known as the Birthday Paradox. It is well presented by Warren Weaver in his famous book Lady Luck (page 132).
Collisions: The Birthday Paradox is one tiny particular case derived from the mathematical sets named EXPONENTS or Saliusian sets. The Saliusian sets (or Ion Saliu sets) are the best tools to calculate a wide variety of probability problems.
Collisions deals specifically with the probability of COLLISIONS or duplication/duplicates or coincidences. What is the probability of generating N random numbers with at least TWO of the numbers being exactly the same (duplicates)?
Birthday Paradox works best with birthday cases; i.e. smaller numbers, 1 to 365.
Collisions works best with larger numbers, such as genetic code sequences, lotto combinations, social security numbers, etc. Collisions is less accurate with small numbers, such as birthday cases; e.g. inaccurate for birthdays of 200 persons in the room.
Read Ion Saliu's first book in print: Probability Theory, Live!
~ Founded on valuable mathematical discoveries with a wide range of scientific applications, including probability theory applied to gambling, lottery, software, life, Universe.