TABLE OF CONTENTS
・The word “Chance”
・Axioms of Probability
Mathematics articles that help in reading this article
・Numerical Computation:Ep. 1, Ep. 5, Ep. 14, Ep. 15
The word “Chance”
The chance of winning a lottery can be calculated if you know the total number of tickets and how many of
them are winners.
More precisely, the chance of winning is given by the ratio of winning tickets to the total number of
tickets.
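As a minimal sketch of this ratio, the following Python snippet computes the chance of winning as an exact fraction. The function name `chance_of_winning` and the ticket counts are hypothetical, chosen only for illustration, and the calculation assumes every ticket is equally likely to be drawn.

```python
from fractions import Fraction


def chance_of_winning(winning_tickets: int, total_tickets: int) -> Fraction:
    """Return the chance of winning as the exact ratio winners / total."""
    if total_tickets <= 0:
        raise ValueError("total_tickets must be positive")
    if not 0 <= winning_tickets <= total_tickets:
        raise ValueError("winning_tickets must lie between 0 and total_tickets")
    return Fraction(winning_tickets, total_tickets)


# Hypothetical lottery: 5 winning tickets among 200 tickets in total.
chance = chance_of_winning(5, 200)
print(chance)         # 1/40
print(float(chance))  # 0.025
```

Using `Fraction` keeps the ratio exact, which makes it easy to compare against hand calculations.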
The chance of rain in a weather forecast is not something we can calculate as easily as a lottery. Even so, you don’t need to know how the chance of rain is calculated in order to use it as a guide—when the chance is high, you take an umbrella with you. In other words, when thinking about chance, there are situations where knowing the calculation method is not essential.
Even if we don’t know how it is calculated, we still understand that the numerical value of a chance is constrained to some extent. For example, if a weather forecast said “a 200% chance of rain,” most people would immediately find that strange. Because the chance of rain must fall between 0% and 100%, we can see that a “chance” cannot take just any value.
Finally, there are many words that describe how often something happens, such as “probably,” “likely,” “almost,” or “rarely.” When people use these expressions, they are usually describing how often something occurs based on past experience. However, if you were suddenly asked to specify exactly which occurrences happened on which trials—say, “the 3rd and 7th times out of 20”—most people would not be able to answer. This means that, unconsciously, people smooth out and summarize their past experiences to estimate how often events occur. This, too, can be regarded as a kind of probabilistic way of thinking.
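The idea of summarizing past experience can be sketched in a few lines of Python. The two records below are hypothetical; they differ in which trials the event occurred on, yet they give the same relative frequency, which is the only thing the summary retains.

```python
def relative_frequency(outcomes):
    """Fraction of trials on which the event occurred."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(1 for occurred in outcomes if occurred) / len(outcomes)


# Hypothetical record of 20 trials: the event occurred on the 3rd and 7th trials.
record = [False] * 20
record[2] = True   # 3rd trial
record[6] = True   # 7th trial

# Another hypothetical record: same number of occurrences, different trials.
other = [False] * 20
other[0] = True    # 1st trial
other[19] = True   # 20th trial

print(relative_frequency(record))  # 0.1
print(relative_frequency(other))   # 0.1
```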
Axioms of Probability
Proposition: A statement or equation that is definitely either true or false. Particularly important propositions in a discussion are also called theorems.
Axiom: A proposition that is accepted without proof and is used as a starting point for proving other propositions.
Just as you prepare ingredients like flour and sugar before baking a cake, in mathematics you first lay down a
collection of axioms for the subject you want to discuss, and then prove various propositions from
them.
\[ \text{Axioms of Probability}\]
\( \Omega \) is a set consisting of elements called elementary events, and this set is referred
to as the sample space. The set \( \mathfrak{F} \), known as the event space, is a
collection of subsets of \( \Omega \), and its elements are called events. A triplet \( \left(
\Omega, \mathfrak{F}, P \right) \) that satisfies the following axioms of probability [A1]–[A6]
is called a probability space.
Axioms of Probability:
[A1] The union, difference, and intersection of any two elements in \( \mathfrak{F} \) are also contained in \( \mathfrak{F} \).
[A2] \( \Omega \in \mathfrak{F} \).
[A3] For any element \( A \in \mathfrak{F} \), a non-negative real number \( P(A) \), called the probability of the event \( A \), is assigned.
[A4] \( P(\Omega) = 1 \).
[A5] If two elements \( A \) and \( B \) in \( \mathfrak{F} \) are disjoint, that is, \( A \cap B = \varnothing \), then \[ P \left( A \cup B \right) = P \left( A \right) + P \left( B \right) \] Disjoint events \( A \) and \( B \) are said to be mutually exclusive.
[A6] For any decreasing sequence of elements in \( \mathfrak{F} \), \[ A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots \] if \[ \bigcap _{i=1} ^{\infty} A_i = \varnothing \] then \[ \lim _{i \to \infty} P \left( A_i \right) = 0 \]
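To make the axioms concrete, here is a small Python sketch that checks [A1] to [A5] by brute force on one finite example. The example itself is an assumption for illustration: a fair six-sided die, with the event space taken to be all subsets of the sample space and `P` defined by counting. Axiom [A6] is not checked; on a finite sample space a decreasing sequence with empty intersection must eventually be empty, so it holds automatically once \( P(\varnothing) = 0 \).

```python
from fractions import Fraction
from itertools import combinations

# Sample space of a hypothetical fair die.
omega = frozenset(range(1, 7))

# Event space: here, every subset of omega (the power set, 64 events).
events = {
    frozenset(subset)
    for size in range(len(omega) + 1)
    for subset in combinations(sorted(omega), size)
}


def P(event):
    """Probability of an event, assuming all six faces are equally likely."""
    return Fraction(len(event), len(omega))


# [A1] closed under union, difference and intersection
a1 = all(
    (a | b) in events and (a - b) in events and (a & b) in events
    for a in events
    for b in events
)
# [A2] the sample space itself is an event
a2 = omega in events
# [A3] probabilities are non-negative
a3 = all(P(a) >= 0 for a in events)
# [A4] the whole sample space has probability 1
a4 = P(omega) == 1
# [A5] additivity for disjoint events
a5 = all(
    P(a | b) == P(a) + P(b)
    for a in events
    for b in events
    if not (a & b)
)

print(a1, a2, a3, a4, a5)  # True True True True True
```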
\( \mathfrak F \) is the letter F written in Fraktur script.
Fraktur is a style of lettering that was used in Germany for printing up until around the time of World War
II, and it is still used in mathematics today, where a large number of distinct symbols are needed.
Next, we will prove the following six propositions from the axioms.
From the axioms of probability, the following six propositions hold:
[1] If \( A \subset B \), then \( P(A) \leq P(B) \).
[2] \( P \left( \varnothing \right) = 0 \)
[3] \( P \left( A^c \right) = P \left( \Omega - A \right) = 1 - P \left( A \right) \)
Here, \( A^c \) is called the complementary event.
[4] \( 0 \leq P \left( A \right) \leq 1 \)
[5] \( P \left( A \cup B \right) = P \left( A \right) + P \left( B \right) - P \left( A \cap B \right) \)
[6] If the sets \( A_1, A_2, \dots, A_n, \dots \) in \( \mathfrak{F} \) are pairwise mutually exclusive and their union also belongs to \( \mathfrak{F} \), then \[ P \left( \bigcup _{i=1} ^{\infty} A_i \right) = \sum _{i=1} ^{\infty} P \left( A_i \right).\]
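Before the proofs, statements [1] to [5] can be sanity-checked numerically. The sketch below does this exhaustively on one assumed example (a fair six-sided die with every subset as an event); it is an illustration on a single finite space, not a proof, and statement [6] is left out because it concerns infinite sequences.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: fair die, every subset of omega is an event.
omega = frozenset(range(1, 7))
events = [
    frozenset(s)
    for k in range(len(omega) + 1)
    for s in combinations(sorted(omega), k)
]


def P(event):
    """Probability of an event, assuming equally likely faces."""
    return Fraction(len(event), len(omega))


checks = {
    # [1] monotonicity: A subset of B implies P(A) <= P(B)
    "[1]": all(P(a) <= P(b) for a in events for b in events if a <= b),
    # [2] the empty event has probability 0
    "[2]": P(frozenset()) == 0,
    # [3] complement rule
    "[3]": all(P(omega - a) == 1 - P(a) for a in events),
    # [4] probabilities lie between 0 and 1
    "[4]": all(0 <= P(a) <= 1 for a in events),
    # [5] inclusion-exclusion for two events
    "[5]": all(
        P(a | b) == P(a) + P(b) - P(a & b) for a in events for b in events
    ),
}
print(checks)

# One concrete instance of [5]: A = "even face", B = "face is at least 4".
A = frozenset({2, 4, 6})
B = frozenset({4, 5, 6})
print(P(A | B))                 # 2/3
print(P(A) + P(B) - P(A & B))   # 2/3
```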
Proof of [1]
Let \( A, B \in \mathfrak{F} \) and suppose \( A \subset B \). By Axiom A1, \( B - A \in \mathfrak{F} \), and \( A \) and \( B - A \) are disjoint. Since \( B = A \cup (B - A) \), by Axioms A3 and A5 we have \[ P(B) = P(A) + P(B - A) \ge P(A). \] Therefore, statement [1] holds.
Proof of [2]
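Let \( A \in \mathfrak{F} \). By Axiom A1, \( \varnothing = A - A \in \mathfrak{F} \), and \( A \) and \( \varnothing \) are disjoint. Since \( A = A \cup \varnothing \), by Axiom A5 we obtain \[ \begin{aligned} P(A) &= P(A) + P(\varnothing) \\\\ 0 &= P(\varnothing) \end{aligned} \] Therefore, statement [2] holds.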
Proof of [3]
Let \( A \in \mathfrak F \). Since \( \Omega = A \cup \left( \Omega - A \right) \) and the sets \( A \) and \( \Omega - A \) are disjoint, by Axioms A4 and A5 we have \[ \begin{aligned} P \left( \Omega \right) &= P \left( A \right) + P \left( \Omega - A \right) = 1 \\\\ P \left( \Omega - A \right) &= 1 - P \left( A \right) \end{aligned}\] Therefore, statement [3] holds.
Proof of [4]
Let \( A \in \mathfrak F .\) Since \( A \subset \Omega, \) by Axiom A3, Axiom A4, and statement [1], we have \[ 0 \leq P \left( A \right) \leq P \left( \Omega \right) = 1\] Therefore, statement [4] holds.
Proof of [5]
Let \( A,B \in \mathfrak F \). Since \( A \cup B = A \cup \left( B - A \right) \) and the sets \( A \) and \( B - A \) are disjoint, by Axiom A5 we have \[ P \left( A \cup B \right) = P \left( A \right) + P \left( B - A \right) \ \ \ldots (1) \] Also, since \( B = \left( A \cap B \right) \cup \left( B - A \right) \) and the sets \( A \cap B \) and \( B - A \) are disjoint, again by Axiom A5 we obtain \[ P \left( B \right) = P \left( A \cap B \right) + P \left( B - A \right) \ \ \ldots (2) \] Subtracting \( (2) \) from \( (1) \), we get \[ \begin{aligned} P \left( A \cup B \right) - P \left( B \right) &= P \left( A \right) - P \left( A \cap B \right) \\\\ P \left( A \cup B \right) &= P \left( A \right) + P \left( B \right) - P \left( A \cap B \right) \end{aligned}\] Therefore, statement [5] holds.
Proof of [6]
Let \( \mathbb{N} \) denote the set of all natural numbers. Let \( \left\{ A_i \right\} _{i \in \mathbb N} \) be a sequence of pairwise mutually exclusive events in \( \mathfrak{F} \), and assume that their union \( \bigcup_{i=1}^{\infty} A_i \) also belongs to \( \mathfrak{F} \) (Axiom A1 guarantees this only for finite unions). Define \[ R_n = \bigcup _{i=n} ^{\infty} A_i \quad \left( n \in \mathbb N \right)\] Since \( R_n = R_1 - \left( A_1 \cup \cdots \cup A_{n-1} \right) \) for \( n \ge 2 \), each \( R_n \) belongs to \( \mathfrak{F} \) by Axiom A1. Moreover, \( \left\{ R_n \right\} \) is a decreasing sequence, and \[ \bigcap_{n=1}^{\infty} R_n = \varnothing . \] Indeed, suppose there exists an element \[ x \in \bigcap_{n=1}^{\infty} R_n . \] Since \[ x \in R_1 = \bigcup_{i=1}^{\infty} A_i , \] there exists some \( k \) such that \( x \in A_k \). Because the family \( \left\{ A_i \right\} _{i \in \mathbb N} \) is mutually exclusive, \[ x \notin A_i \quad (i \ge k+1). \] Hence \( x \notin R_{k+1} \), which contradicts \( x \in \bigcap_{n=1}^{\infty} R_n \subset R_{k+1} \). Therefore no such \( x \) exists, and \[ \bigcap_{n=1}^{\infty} R_n = \varnothing . \]
By Axiom A6, \[ \lim_{n \to \infty} P(R_n) = 0. \] On the other hand, for every \( n \), \[ \bigcup _{i=1} ^{\infty} A_i = A_1 \cup A_2 \cup \cdots \cup A_n \cup R_{n+1} , \] and the \( n+1 \) sets on the right-hand side are pairwise disjoint. Applying Axiom A5 repeatedly, we obtain \[ P \left( \bigcup _{i=1} ^{\infty} A_i \right) = \sum _{i=1} ^n P \left( A_i \right) + P \left( R_{n+1} \right) \] The left-hand side does not depend on \( n \). Taking the limit as \( n \to \infty \) and using \( \lim_{n \to \infty} P \left( R_{n+1} \right) = 0 \) yields \[ P \left( \bigcup _{i=1} ^{\infty} A_i \right) = \sum _{i=1} ^{\infty} P \left( A_i \right) \] Therefore, statement [6] holds.
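The role of the tail sets \( R_n \) can be illustrated numerically. The sketch below assumes a hypothetical example that is not part of the text above: the sample space is the natural numbers, \( A_i = \{ i \} \), and \( P(A_i) = 1/2^i \), so that \( P(R_n) = 1/2^{n-1} \) by the geometric series. For each \( n \), the first \( n \) probabilities plus the tail probability \( P(R_{n+1}) \) add up to the probability of the whole union, while the tail shrinks toward 0.

```python
from fractions import Fraction


def p_single(i):
    """P(A_i) for A_i = {i}; assumed to be 1/2**i (hypothetical choice)."""
    return Fraction(1, 2 ** i)


def p_tail(n):
    """P(R_n) for R_n = A_n ∪ A_{n+1} ∪ ...; geometric series gives 1/2**(n-1)."""
    return Fraction(1, 2 ** (n - 1))


for n in (1, 5, 10, 20):
    partial = sum(p_single(i) for i in range(1, n + 1))
    # Decomposition of the union: A_1, ..., A_n and the tail R_{n+1}.
    total = partial + p_tail(n + 1)
    print(n, float(partial), float(p_tail(n + 1)), total)
```

In every row the last column equals 1, the probability of the whole union in this example, and the tail column tends to 0 as \( n \) grows.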
Reference:
[1] 宮西正宜 and 24 others, 高等学校 数学A 改訂版, 新興出版社啓林館, December 10, 2008
[2] A. N. Kolmogorov, translated by Nathan Morrison, Foundations of the Theory of Probability, Chelsea Publishing Company, New York, 1950
[3] Wikipedia Fraktur, https://en.wikipedia.org/wiki/Fraktur, March 9, 2026