This session provides an introduction to the fundamental concepts of probabilities, such as events, probability spaces, probability laws, and random variables. The goal of this instruction is to give students a solid understanding of the principles underlying probability calculations while helping them develop practical skills to apply these principles to real-world situations. We will also cover advanced concepts such as the laws of large numbers and the central limit theorem, which are essential for understanding the behavior of samples in surveys and research in the humanities and social sciences.
We have chosen, which is not usual, to place this instruction before those on statistical inference, hypothesis testing, regression, and other tests related to two-way statistical tables, in order to give the reader an opportunity to fully understand the concepts of probability, random variables, and combinatorial analysis, which greatly contribute to making the subsequent teachings more comprehensible.
In the second part of the instruction, we will focus on combinatorial analysis, which will help us count the possible configurations in complex situations (probabilities). We will explore basic combinatorial techniques, such as permutations, combinations, arrangements, and partitions. These techniques are particularly useful in probability theory, as they allow us to calculate the number of possible outcomes of a random experiment and thus determine the corresponding probabilities.
To infer results in quantitative analysis, we calculate the probability that a result is attributable to chance. If the probability that the result is due to chance is too high, the result should be viewed with caution: the risk of making an error by accepting it is too great. One can also accept a sample result by granting oneself a given probability of not being mistaken (Laflamme & Zhou, 2014, p. 85).
In conclusion, this session aims not only to develop students' analytical skills in probability and combinatorics but also to raise their awareness of the importance of these tools for analyzing and understanding human and social phenomena. The knowledge gained will help you better understand the uncertainties and variabilities inherent in the phenomena studied in the humanities and social sciences, while preparing you for the use of quantitative methods in your future research.
This session on probabilities and statistical inference procedures aims to achieve the following objectives:
This course serves as an introduction and review of the main concepts related to the vocabulary and calculation of probabilities in a finite universe. This overview, which we consider essential, is the key that will help students better grasp the logic of inference, hypothesis testing, and other topics related to bivariate analysis;
To grasp the scope of bivariate analysis and statistical inference, we found it useful to include a brief section on combinatorial analysis, which mainly covers the concepts of arrangements, permutations, and combinations. Combinatorial analysis is fundamental to the analysis of survey data and random variables, so, through this section, the course also aims at a better understanding of random experiments and their role in statistical inference;
Another objective is to deepen knowledge of probability and combinatorial analysis and to apply it to the concepts of estimation and confidence intervals, in order to appreciate the depth of the various statistical tests that will be detailed in the upcoming session;
This session also aims to prepare you for using quantitative data analysis software in your research work. We will present a case study and discuss the options, constraints, and conditions that researchers face when conducting their surveys, involving work with Python and more.
This session covers the fundamentals of probabilities and combinatorial analysis, based on the following concepts: probability spaces, events, conditional probability, independence, random variables, probability laws, expected value, variance and standard deviation, law of large numbers, central limit theorem, permutations, combinations, arrangements, partitions.
In this section, we will define the basic concepts of probability on a finite universe. Sampling and inference procedures largely rely on reasoning in terms of probabilities, which aims to model random experiments.
A random experiment is an experiment whose outcome cannot be known a priori. The result of this experiment is called an outcome, and the set of all possible outcomes is called the Universe of the random experiment, generally denoted as \(\Omega\).
The Universe is said to be finite when it contains only a finite number of outcomes. An event is a subset of the Universe; an event is realized if one of the outcomes composing it is realized.
From the above definition and note, the following elements can be derived:
Definition: Let \(A\) and \(B\) be two events in a Universe \(\Omega\). The union \(A \cup B\) is the event realized when at least one of \(A\) or \(B\) is realized; the intersection \(A \cap B\) is the event realized when both \(A\) and \(B\) are realized; the complement \(\overline{A}\) is the event realized when \(A\) is not realized.
The following figures illustrate the operations of intersection, union, and complement:
Note:
\(\overline{\Omega} = \varnothing\) and \(\overline{\varnothing} = \Omega\); more generally, we write \(\overline{\overline{A}} = A\);
\(A\) being an event: \(A \cup \overline{A} = \Omega\) and \(A \cap \overline{A} = \varnothing\) (if \(A\) is distinct from \(\varnothing\) and \(\Omega\), we say that the set \(\{A, \overline{A}\}\) is a partition of the universe).
This section provides a brief explanation of the principles of probability calculation on a finite universe (considering \(\Omega\) as non-empty and finite).
The probability of an event \(A\) in a finite space \(\Omega\) is a measure of the certainty or uncertainty of the occurrence of that event. When all outcomes are equally likely, it is defined as the ratio between the number of favorable cases for event \(A\) and the total number of possible cases in the universe \(\Omega\): \( P(A) = \frac{\mathrm{card}(A)}{\mathrm{card}(\Omega)} \). More generally, a probability law on a finite universe assigns to each outcome \(\omega_i\) a number \(p_i \in [0, 1]\) such that \(p_1 + p_2 + \dots + p_n = 1\):
Outcome \(\omega\) | \(\omega_{1}\) | \(\omega_{2}\) | \(\omega_{3}\) | \(\omega_{4}\) | \(...\) | \(\omega_{n}\) |
---|---|---|---|---|---|---|
\(P \left\{\omega\right\} = \) | \( p_{1}\) | \( p_{2}\) | \( p_{3}\) | \( p_{4}\) | \(...\) | \( p_{n}\) |
Note:
Example and Explanation
Suppose we roll a six-sided die. We want to calculate the probability of getting a 3: \( P(\text{getting a 3}) = \frac{1}{6} \approx 0.17 \), i.e., about 17%.
Conditional Probability: The conditional probability of event A given B is the probability that A occurs given that B has occurred. Formula: P(A|B) = P(A ∩ B) / P(B)
Example and Explanation: Suppose we have a bag containing 3 red balls and 2 blue balls. If we draw a ball at random, the probability of drawing a red ball (A) given that the ball drawn is blue (B) is 0, as there is no intersection between the event of drawing a red ball and the event of drawing a blue ball. Formally, this is written as: P(A|B) = P(A ∩ B) / P(B) = 0 / (2/5) = 0
Sum Rule: For two events A and B, the probability that A or B occurs is given by: Formula: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Example and Explanation
Suppose we have a six-sided die. Event A is "rolling an even number" and event B is "rolling a number greater than 4". The even numbers are {2, 4, 6} and the numbers greater than 4 are {5, 6}, so A ∩ B = {6}; hence P(A) = 3/6 = 1/2, P(B) = 2/6 = 1/3, and P(A ∩ B) = 1/6.
Applying the sum rule, we have: P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/2 + 1/3 − 1/6 = 3/6 + 2/6 − 1/6 = 4/6 = 2/3.
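The short sketch below checks this result numerically by enumerating the outcomes of the die; it is an illustration, not part of the course code.

```python
from fractions import Fraction

# Outcomes of one roll of a fair six-sided die
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "rolling an even number"
B = {5, 6}      # "rolling a number greater than 4"

def prob(event, universe=omega):
    # Equally likely outcomes: P(E) = card(E) / card(Omega)
    return Fraction(len(event), len(universe))

p_union = prob(A) + prob(B) - prob(A & B)   # sum rule
print(p_union)                 # 2/3
print(prob(A | B) == p_union)  # True: the direct count agrees with the sum rule
```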
Product Rule: For two independent events A and B, the probability that A and B both occur is the product of their individual probabilities. Formula: P(A ∩ B) = P(A) ⋅ P(B)
Example and Explanation: Suppose we roll two six-sided dice. Event A is "rolling a 3 on the first die" and event B is "rolling a 5 on the second die". The probability of A is P(A) = 1/6 and the probability of B is P(B) = 1/6. Since the two dice are independent, applying the product rule gives: P(A ∩ B) = P(A) ⋅ P(B) = (1/6) ⋅ (1/6) = 1/36.
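As an optional check, the sketch below enumerates the 36 equally likely pairs of results and compares the direct count with the product rule.

```python
from itertools import product

# All 36 equally likely outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))
favorable = [(d1, d2) for d1, d2 in outcomes if d1 == 3 and d2 == 5]

p_direct = len(favorable) / len(outcomes)   # counting outcomes directly
p_product = (1 / 6) * (1 / 6)               # product rule for independent events
print(p_direct, p_product)                  # both ≈ 0.0278, i.e. 1/36
```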
Bayes' Theorem: Bayes' Theorem allows us to find the probability of an event A given another event B. Formula: P(A ∣ B) = P(B ∣ A) ⋅ P(A) / P(B)
Example and Explanation
Suppose a disease affects 1% of the population. A screening test for this disease has a sensitivity of 99% (the probability of testing positive if one is sick) and a specificity of 95% (the probability of testing negative if one is not sick).
Let \(A\) be the event "the person is sick" and \(B\) the event "the person tests positive". The data give \(P(A) = 0.01\), \(P(B \mid A) = 0.99\) (sensitivity), and \(P(B \mid \overline{A}) = 1 - 0.95 = 0.05\) (the complement of the specificity).
P(B) can be calculated as follows, using the total probability over the partition \(\{A, \overline{A}\}\):
\( P(B) = P(B \mid A)\,P(A) + P(B \mid \overline{A})\,P(\overline{A}) = 0.99 \times 0.01 + 0.05 \times 0.99 = 0.0099 + 0.0495 = 0.0594 \)
Applying Bayes' Theorem, we have:
\( P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{0.0099}{0.0594} \approx 0.1667 \)
Therefore, if a person tests positive, the probability that they are actually sick is about 16.67%.
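The same calculation can be reproduced with a few lines of Python, using only the figures given in the example; this is a minimal sketch for checking the arithmetic.

```python
# Figures from the screening-test example above
p_sick = 0.01                  # prevalence, P(A)
p_pos_given_sick = 0.99        # sensitivity, P(B | A)
p_neg_given_healthy = 0.95     # specificity
p_pos_given_healthy = 1 - p_neg_given_healthy  # P(B | not A) = 0.05

# Total probability of testing positive
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(round(p_pos, 4))             # 0.0594
print(round(p_sick_given_pos, 4))  # 0.1667, i.e. about 16.67%
```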
Total Probability: The law of total probability is used to find the probability of an event by decomposing the sample space into a partition of events B1, B2, ..., Bn. Formula: P(A) = ∑i P(A ∣ Bi) ⋅ P(Bi)
Example and Explanation: Suppose there are three factories (F1, F2, F3) producing parts. The probability that a randomly chosen part comes from each factory is as follows:
The probability that a part is defective given it came from each factory is:
Using the total probability formula, we have:
Therefore, the probability that a part is defective is 0.019, or 1.9%.
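Since the individual figures are not reproduced above, the sketch below uses hypothetical production shares and defect rates, chosen only so that the total matches the stated 1.9%; it illustrates how the law of total probability is applied in Python.

```python
# Hypothetical values (illustrative only, not the original figures):
p_factory = {"F1": 0.5, "F2": 0.3, "F3": 0.2}       # P(Bi): part comes from factory i
p_def_given = {"F1": 0.01, "F2": 0.02, "F3": 0.04}  # P(A | Bi): defect rate per factory

# Law of total probability: P(A) = sum_i P(A | Bi) * P(Bi)
p_defective = sum(p_def_given[f] * p_factory[f] for f in p_factory)
print(round(p_defective, 3))  # 0.019, i.e. 1.9%
```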
Probability calculations are partly based on random variables.
To calculate probabilities, we use the standard normal curve, also known as the Z curve.
The standard normal curve is a graphical representation of the standard normal distribution, where any normal random variable is transformed into a random variable \(Z\) with a mean of \(0\) and a standard deviation of \(1\). The transformation is done using the formula:
\( Z = \frac{X - \mu}{\sigma} \)
Where: \(X\) is the value of the normal random variable, \(\mu\) is its mean, and \(\sigma\) is its standard deviation.
The standard normal curve is essential in probability calculations because it allows the standardization of different normal distributions, thereby simplifying the calculation of associated probabilities. By using the Z curve, one can determine the probability that a random variable falls within a certain range using standard normal distribution tables.
Suppose we have a random variable \(X\) that follows a normal distribution with a mean \(\mu = 100\) and a standard deviation \(\sigma = 15\). We want to calculate the probability that \(X\) is less than 115.
First, we transform \(X\) into \(Z\) using the formula \(Z = (X - \mu) / \sigma\): \(Z = (115 - 100) / 15 = 1\).
Next, we use the standard normal distribution table to find the probability that \(Z\) is less than 1 (this refers to Table 2 in the Appendix of Statistical Tables).
The standard normal distribution table gives us the cumulative probability up to a certain value of \(Z\). For \(Z = 1\), the table indicates that \(P(Z < 1) \approx 0.8413\), which means that the probability that \(X\) is less than 115 is approximately \(84.13\%\).
z | Probability P(Z < z) |
---|---|
0.9 | 0.8159 |
1.0 | 0.8413 |
1.1 | 0.8643 |
Thus, the value of \(Z = 1\) corresponds to a probability of \(0.8413\), which means there is an \(84.13\%\) chance that the variable \(X\) is less than 115 in this normal distribution.
The chart below represents the standard normal curve. The shaded area under the curve corresponds to the probability that \(Z\) is less than 1.
The transformation of \(X\) into the \(Z\) score allows us to bring the problem into a standardized situation where the mean is 0 and the standard deviation is 1. This standardization process enables the comparison of different normal distributions and their analysis using the same distribution table, regardless of their specific means and standard deviations.
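The same probability can be obtained directly in Python, assuming SciPy is available; this minimal sketch reproduces the standardization and the table lookup.

```python
from scipy.stats import norm  # assumes SciPy is installed

mu, sigma = 100, 15
x = 115

# Standardization: Z = (X - mu) / sigma
z = (x - mu) / sigma
print(z)  # 1.0

# P(X < 115) = P(Z < 1), the value read from the standard normal table
print(norm.cdf(z))                       # ~0.8413, standard N(0, 1)
print(norm.cdf(x, loc=mu, scale=sigma))  # same result without standardizing by hand
```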
The following code allows you to test your knowledge of the probability concepts covered so far, replacing the Sheets that usually appear at the end of each session's content.
The code is designed to generate an unlimited number of probability exercises on topics such as conditional probability, the sum rule, and the product rule. Each exercise comes with a solution and an explanation that the user can reveal by clicking the appropriate buttons.
How it works: The code uses functions to randomly generate exercises based on predefined templates.
Click "Generate Exercise" to start.
Combinatorial analysis is a branch of mathematics that studies methods for counting, arranging, and combining objects. It allows us to determine the number of different ways a set of elements can be organized while respecting certain rules and constraints.
The basic principles of combinatorial analysis include the addition rule and the multiplication rule.
A permutation is an arrangement of all the elements of a set in a specific order. The formula for calculating the number of permutations of a set of \( n \) elements is given by:
\( n! = n \times (n-1) \times (n-2) \times ... \times 1 \)
Where \( n! \) is the "factorial" of \( n \).
Permutations without repetition involve arrangements of elements where each element is unique and appears only once in each arrangement.
Example:
For a set of 3 elements {A, B, C}, the permutations without repetition are: ABC, ACB, BAC, BCA, CAB, CBA.
The total number of permutations is \( 3! = 6 \).
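This count can be checked in Python with the standard library; a minimal sketch:

```python
import math
from itertools import permutations

elements = ["A", "B", "C"]
perms = ["".join(p) for p in permutations(elements)]  # all orderings of the 3 elements
print(perms)                       # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(perms), math.factorial(3))  # 6 6
```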
Permutations with repetition involve arrangements of elements where some elements may appear more than once.
The formula for permutations with repetition is:
\( \frac{n!}{n_1! \times n_2! \times ... \times n_k!} \)
Where \( n \) is the total number of elements, and \( n_1, n_2, ..., n_k \) are the frequencies of the repeated elements.
Example:
For a set of 3 elements {A, A, B}, the distinct permutations are: AAB, ABA, BAA.
The total number of permutations is \( \frac{3!}{2!} = 3 \).
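The following sketch verifies the formula by removing duplicate orderings of the repeated element:

```python
import math
from itertools import permutations

elements = ["A", "A", "B"]
distinct = sorted(set(permutations(elements)))  # duplicates removed
print(["".join(p) for p in distinct])           # ['AAB', 'ABA', 'BAA']
print(len(distinct), math.factorial(3) // math.factorial(2))  # 3 3
```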
An arrangement is a selection of \( k \) elements from \( n \) elements of a set, where order matters. The formula for calculating the number of arrangements of \( n \) elements taken \( k \) at a time is given by:
\( A(n, k) = \frac{n!}{(n-k)!} \)
Where \( n! \) is the "factorial" of \( n \) and \( (n-k)! \) is the "factorial" of \( n-k \).
Arrangements without repetition involve selections of elements where each element is unique and appears only once in each arrangement.
Example:
For a set of 3 elements {A, B, C} taken 2 at a time, the arrangements without repetition are: AB, AC, BA, BC, CA, CB.
The total number of arrangements is \( A(3, 2) = \frac{3!}{(3-2)!} = 6 \).
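A quick check in Python, using ordered selections of size 2:

```python
import math
from itertools import permutations

elements = ["A", "B", "C"]
arrangements = ["".join(p) for p in permutations(elements, 2)]  # order matters
print(arrangements)                        # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']
print(len(arrangements), math.perm(3, 2))  # 6 6 (math.perm requires Python 3.8+)
```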
Arrangements with repetition involve selections of elements where some elements may appear more than once.
The formula for arrangements with repetition is:
\( n^k \)
Where \( n \) is the total number of elements and \( k \) is the number of elements taken at a time.
Example:
For a set of 3 elements {A, B, C} taken 2 at a time, the arrangements with repetition are: AA, AB, AC, BA, BB, BC, CA, CB, CC.
The total number of arrangements is \( 3^2 = 9 \).
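In Python, arrangements with repetition correspond to the Cartesian product of the set with itself; a minimal sketch:

```python
from itertools import product

elements = ["A", "B", "C"]
arrangements = ["".join(p) for p in product(elements, repeat=2)]  # order matters, repeats allowed
print(arrangements)               # ['AA', 'AB', 'AC', 'BA', 'BB', 'BC', 'CA', 'CB', 'CC']
print(len(arrangements), 3 ** 2)  # 9 9
```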
A combination is a selection of \( k \) elements from \( n \) elements of a set, where order does not matter. The formula for calculating the number of combinations of \( n \) elements taken \( k \) at a time is given by:
\( C(n, k) = \frac{n!}{k! \times (n-k)!} \)
Where \( n! \) is the "factorial" of \( n \), \( k! \) is the "factorial" of \( k \), and \( (n-k)! \) is the "factorial" of \( n-k \).
Combinations without repetition involve selections of elements where each element is unique and appears only once in each selection.
Example:
For a set of 3 elements {A, B, C} taken 2 at a time, the combinations without repetition are: AB, AC, BC.
The total number of combinations is \( C(3, 2) = \frac{3!}{2! \times 1!} = 3 \).
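The corresponding check in Python, where order does not matter:

```python
import math
from itertools import combinations

elements = ["A", "B", "C"]
combos = ["".join(c) for c in combinations(elements, 2)]  # unordered selections of 2
print(combos)                        # ['AB', 'AC', 'BC']
print(len(combos), math.comb(3, 2))  # 3 3
```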
Combinations with repetition involve selections of elements where some elements may appear more than once.
The formula for combinations with repetition is:
\( C(n+k-1, k) = \frac{(n+k-1)!}{k! \times (n-1)!} \)
Where \( n \) is the total number of elements and \( k \) is the number of elements taken at a time.
Example:
For a set of 3 elements {A, B, C} taken 2 at a time, the combinations with repetition are: AA, AB, AC, BB, BC, CC.
The total number of combinations is \( C(3+2-1, 2) = \frac{4!}{2! \times 2!} = 6 \).
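A final sketch confirms the formula \( C(n+k-1, k) \) against a direct enumeration:

```python
import math
from itertools import combinations_with_replacement

elements = ["A", "B", "C"]
combos = ["".join(c) for c in combinations_with_replacement(elements, 2)]
print(combos)                                # ['AA', 'AB', 'AC', 'BB', 'BC', 'CC']
print(len(combos), math.comb(3 + 2 - 1, 2))  # 6 6
```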
The following code allows you to test your knowledge on combinatorial analysis concepts. Each exercise is accompanied by a solution and an explanation.
Click "Generate Exercise" to start.
In this lesson, we covered the fundamental concepts of probability theory, combinatorial analysis, and probability calculations. The session aims to help understand and apply these concepts in various contexts.
During this session, we defined probabilities, emphasizing their major trait of studying random phenomena and measuring the likelihood of events. We also introduced the theme of combinatorial analysis as the study of the different ways to organize or select objects from a set. Here is a review of the main concepts from the session:
The Support does not have a final bibliography (in its online version); references are inserted at the end of each Block.
The multiple-choice questions consist of fifteen questions related to probability and combinatorial analysis. To view and test your knowledge, click HERE :)
This session does not have sheets to download. During the directed work session dedicated to this topic, we will revisit probability calculations and combinatorial analysis using exercise generators and the Python compiler.
To delve further into the concepts related to probability and combinatorial analysis, you can consult the following documents and videos:
On the Course App, you will find the summary of this Block, as well as series of Directed Work related to it.
You will also find links to multimedia content relevant to the Block.
In the Notifications panel, an update is planned based on the questions raised by students during the Course and Directed Work sessions.
An update will also cover the exams from previous sessions, which will be reviewed during the Directed Work sessions to prepare for the current year's exams.
By using the link below, you can download the Flipbook in PDF format:
In this Python corner, a table summarizes the essentials for probability calculations and combinatorial analysis in Python. The snippets in the Python Code column are minimal suggestions matching the Calculation column; the combinatorial ones assume that the `math` module has been imported.
Parameter | Python Code | Example |
---|---|---|
Simple Probability | `p = 3 / 10` | Suppose an urn contains 3 red balls and 7 blue balls. What is the probability of drawing a red ball? Calculation: p = 3 / 10 = 0.3. Explanation: The probability is 0.3 because there are 3 red balls out of a total of 10 balls. |
Conditional Probability | `p_cond = (3 / 10) / 1` | Let A be the event "drawing a red ball" and B be the event "drawing a ball." What is the probability of drawing a red ball given that a ball has been drawn? Calculation: p_cond = (3/10) / 1 = 0.3. Explanation: The probability remains 0.3 because conditioning on the certain event B does not change it. |
Sum Rule | `p_sum = 0.3 + 0.7 - 0` | Let A be the event "drawing a red ball" and B be the event "drawing a blue ball." What is the probability of drawing either a red ball or a blue ball? Calculation: p_sum = 0.3 + 0.7 − 0 = 1. Explanation: The probability of drawing a red or blue ball is 1 because these are the only possible outcomes (A and B are incompatible, so P(A ∩ B) = 0). |
Product Rule | `p_product = 0.3 * 0.3` | Let A be the event "drawing a red ball on a first draw" and B be the event "drawing a red ball on a second draw, after replacing the first ball." What is the probability of drawing two red balls in a row? Calculation: p_product = 0.3 × 0.3 = 0.09. Explanation: With replacement, the two draws are independent, so their probabilities multiply. |
Arrangement | `arrangement = math.perm(5, 3)` | Suppose a group of 5 people. What is the number of ways to arrange 3 of them? Calculation: arrangement = 5! / (5−3)! = 60. Explanation: There are 60 different ways to choose and order 3 people from a group of 5. |
Permutation | `permutation = math.factorial(4)` | How many different ways are there to arrange 4 people? Calculation: permutation = 4! = 24. Explanation: There are 24 different ways to arrange 4 people. |
Combination | `combination = math.comb(5, 3)` | Suppose a group of 5 people. What is the number of ways to choose 3 of them? Calculation: combination = C(5, 3) = 10. Explanation: There are 10 ways to choose 3 people from a group of 5, since order does not matter. |
The forum allows you to discuss this session. You will notice a subscription button so you can follow discussions about research in the humanities and social sciences. It is also an opportunity for the instructor to address students' concerns and questions.