Set theory
Statistics relies extensively on probability theory. In turn, probability theory relies extensively upon set theory - indeed, the most commonly accepted axiomatic foundations of probability are defined using set theory.
1. Experiments, sample spaces, and events
- Experiment: Any process (real or hypothetical) in which the possible outcomes can be identified ahead of time.
- Sample space or population: The set of all possible outcomes that could occur.
- Event: Any collection of possible outcomes in the experiment; any subset of the sample space.
Consider the result of rolling a numbered six-sided die. The act of rolling the die is an experiment; the value on the side of the die facing upward is an outcome; and the set of outcomes in which the side labeled 1, 2, 3, or 4 faces upward is an example of an event: { 1, 2, 3, 4 }.
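The die example can be sketched in R, the language used for the examples later on this page. The names omega, event, and outcome are illustrative choices, not notation from the text.

```r
# Sample space for one roll of a six-sided die
omega <- 1:6

# Event: the side facing upward is labeled 1, 2, 3, or 4
event <- c(1, 2, 3, 4)

# An event is a subset of the sample space, so every element of
# the event must also be an element of the sample space
all(event %in% omega) # Returns: TRUE

# Run the experiment once; the outcome is random, so whether the
# event occurred can differ from run to run
outcome <- sample(omega, size = 1)
outcome %in% event
```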
Note: The definition of an experiment is intentionally flexible, allowing almost any process to be labeled as an experiment.
References:
- Luce, R. D. (1986). Response times: Their role in inferring elementary mental organization. Oxford University Press.
2. Set theory
Set theory provides a formal mathematical way of describing a collection of elements, thereby providing a rigorous framework to characterize the possible outcomes of an experiment.
- Set: A collection of objects (called elements or members) regarded as a single object.
- The sides of a die (1 to 6) are an example of a set.
- A specific side (e.g., the side labeled “1”) is an element/member of the set.
Consider the set Ω = { 1, 2, 3, 4, 5, 6 }. There are several symbols we can use to indicate whether elements belong to a set, or whether one set is contained in another:
- 1 ∈ Ω - the number 1 is an element of the set Ω.
- 7 ∉ Ω - the number 7 is not an element of the set Ω.
- { 1, 3, 5 } ⊂ Ω - the set containing the numbers 1, 3, and 5 is a subset of the larger set Ω.
- { 1, 2, 3, 4, 5, 6 } ⊆ Ω - the set containing the numbers from 1 to 6 is a subset of, or equal to, the set Ω.
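The membership and subset relations above can be checked in base R. This is a minimal sketch: is_subset and is_proper_subset are hypothetical helper names defined here, not base R functions.

```r
omega <- c(1, 2, 3, 4, 5, 6)

# Membership: 1 is an element of omega, 7 is not
1 %in% omega    # Returns: TRUE
!(7 %in% omega) # Returns: TRUE

# Hypothetical helpers (not part of base R)
# a is a subset of b when every element of a is also in b
is_subset <- function(a, b) all(a %in% b)
# a is a proper subset of b when it is a subset but not equal to b
is_proper_subset <- function(a, b) is_subset(a, b) && !setequal(a, b)

is_subset(c(1, 3, 5), omega)        # Returns: TRUE
is_proper_subset(c(1, 3, 5), omega) # Returns: TRUE
is_subset(1:6, omega)               # Returns: TRUE
is_proper_subset(1:6, omega)        # Returns: FALSE
```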
There are several special sets worthy of note:
- The empty set, or ∅ - a set with no elements or members (as an event, the empty set represents something that cannot occur).
- Real numbers, or ℝ - the set of all rational and irrational numbers, whether positive, negative, or zero.
- Natural numbers, or ℕ - the set of counting numbers (starting from either 0 or 1, depending on the field).
- Integers, or ℤ - the set of positive whole numbers, negative whole numbers, and zero.
Sets can be categorized as being…
- Finite: sets with a finite number of elements.
- Infinite: sets with an infinite number of elements. Furthermore, an infinite set can be:
- Countable (when there is a one-to-one correspondence between elements of the set and the set of natural numbers ℕ).
- Uncountable (sets that are neither finite nor countable).
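To make countability concrete, one standard correspondence between ℕ (taken here to start from 0) and ℤ alternates between the negative and positive integers. The function name nat_to_int is hypothetical, and the sketch only inspects a finite prefix, so it illustrates the idea rather than proving it.

```r
# Map the natural numbers 0, 1, 2, 3, ... onto the integers
# 0, -1, 1, -2, 2, ... (even n go to n/2, odd n go to -(n+1)/2)
nat_to_int <- function(n) ifelse(n %% 2 == 0, n / 2, -(n + 1) / 2)

nat_to_int(0:6) # Returns: 0 -1 1 -2 2 -3 3

# On this finite prefix, no integer is produced twice
anyDuplicated(nat_to_int(0:1000)) == 0 # Returns: TRUE
```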
The sample space, then, is a type of set, and an event is any subset of the sample space, allowing us to use all of the tools of set theory to describe the outcomes for the experiment of interest.
References:
- DeGroot, M. H., & Schervish, M. J. (2012). Probability and statistics (4th ed.). Boston, MA: Addison-Wesley.
3. Elementary set operations
Three key operations that allow us to describe how sets do or don’t overlap are:
- The complement of a set:
- All elements of the sample space that are not in the set
- Written as A’ or Aᶜ
- The union of two sets:
- For sets A and B, their union refers to the set of elements contained in either A or B, irrespective of overlap
- Written as A ∪ B
- The intersection of two sets:
- For sets A and B, their intersection refers to the set of elements contained in both A and B, that is, only the overlapping elements
- Written as A ∩ B
Furthermore, sets A and B are mutually exclusive or disjoint when they have no elements in common; that is, when A ∩ B = ∅.
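As a brief sketch of disjointness, the check below uses base R set functions on a die-style sample space. The helper name complement and the particular sets A and B are assumptions made for illustration.

```r
omega <- 1:6
A <- c(1, 2)
B <- c(5, 6)

# Hypothetical helper: complement of x relative to a sample space
complement <- function(x, space) setdiff(space, x)

# A and B are disjoint because their intersection is empty
length(intersect(A, B)) == 0 # Returns: TRUE

# A set and its complement are always disjoint, and together
# they recover the whole sample space
A_c <- complement(A, omega)    # 3 4 5 6
length(intersect(A, A_c)) == 0 # Returns: TRUE
setequal(union(A, A_c), omega) # Returns: TRUE
```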
With care, one can treat set operations in a similar manner to addition or multiplication. For example, given any three events, A, B, and C, which are subsets of sample space Ω, we have:
- Commutativity:
- A ∪ B = B ∪ A
- A ∩ B = B ∩ A
- Associativity:
- A ∪ (B ∪ C) = (A ∪ B) ∪ C
- A ∩ (B ∩ C) = (A ∩ B) ∩ C
- Distributive Laws:
- A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
- A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
- De Morgan’s Laws:
- (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
- (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
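The identities above can be spot-checked on a small example in R. This sketch assumes a six-element sample space and three arbitrarily chosen events; comp is a hypothetical helper for the complement. Passing on one example is a sanity check, not a proof.

```r
omega <- 1:6
A <- c(1, 2, 3)
B <- c(1, 3, 5)
C <- c(3, 4)

# Hypothetical helper: complement relative to omega
comp <- function(x) setdiff(omega, x)

# Distributive laws
setequal(intersect(A, union(B, C)),
         union(intersect(A, B), intersect(A, C))) # Returns: TRUE
setequal(union(A, intersect(B, C)),
         intersect(union(A, B), union(A, C)))     # Returns: TRUE

# De Morgan's laws
setequal(comp(union(A, B)), intersect(comp(A), comp(B))) # Returns: TRUE
setequal(comp(intersect(A, B)), union(comp(A), comp(B))) # Returns: TRUE
```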
The operations of union and intersection can be extended to a collection of sets, up to an infinite number of sets:
\[\bigcup_{i=1}^{\infty} E_i = \{ x \in \Omega : x \in E_i \text{ for some } i \},\]
\[\bigcap_{i=1}^{\infty} E_i = \{ x \in \Omega : x \in E_i \text{ for all } i \}.\]
The collection of events E₁, E₂, . . ., are pairwise disjoint if Eᵢ ∩ Eⱼ = ∅ for all i ≠ j.
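For a finite collection of sets, these definitions can be sketched in R by folding union and intersect over a list. An infinite collection cannot be enumerated this way, so the example covers only the finite case. The list E and the helper pairwise_disjoint are hypothetical names introduced for illustration.

```r
# A finite collection of events from a die roll
E <- list(c(1, 2), c(3, 4), c(5, 6))

# Union: elements that appear in at least one set
Reduce(union, E) # Returns: 1 2 3 4 5 6

# Intersection: elements that appear in every set (none here)
length(Reduce(intersect, E)) == 0 # Returns: TRUE

# Hypothetical helper: TRUE when every pair of sets has an
# empty intersection
pairwise_disjoint <- function(sets) {
  if (length(sets) < 2) return(TRUE)
  pairs <- combn(length(sets), 2) # each column is one pair (i, j)
  all(apply(pairs, 2, function(ij) {
    length(intersect(sets[[ij[1]]], sets[[ij[2]]])) == 0
  }))
}

pairwise_disjoint(E)                               # Returns: TRUE
pairwise_disjoint(list(c(1, 2), c(2, 3), c(4, 5))) # Returns: FALSE
```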
References:
- Casella, G., & Berger, R. L. (2002). Statistical inference (2nd ed.). Pacific Grove, CA: Thomson Learning.
4. Examples in R
# Example sample space (the six sides of a die)
S <- c( 1, 2, 3, 4, 5, 6 )
# Subsets/Events
A <- c( 1, 2, 3 )
B <- c( 1, 3, 5 )
# Test if element is in set
is.element( 1, S ) # Returns: TRUE
is.element( 7, S ) # Returns: FALSE
# Elementary set operations
union( A, B ) # Returns: 1 2 3 5
intersect( A, B ) # Returns: 1 3
# Complement of A (relative to S)
setdiff( S, A ) # Returns: 4 5 6
# Complement of B (relative to S)
setdiff( S, B ) # Returns: 2 4 6