



Bayes’ Theorem




Bayes’ Theorem: Statement and Formula


Introduction and Purpose

Bayes' Theorem, named after the 18th-century English statistician and theologian Thomas Bayes, is a fundamental theorem in probability that provides a framework for updating the probability of a hypothesis based on new evidence or information. It describes how to revise existing probabilities (prior probabilities) in light of new data to arrive at updated probabilities (posterior probabilities).

In essence, Bayes' Theorem allows us to calculate a conditional probability $P(E_k|A)$ (the probability of event $E_k$ given that event $A$ has occurred) using the "reverse" conditional probabilities $P(A|E_i)$ (the probability of $A$ given various conditions $E_i$) and the initial probabilities of those conditions $P(E_i)$.

It answers the question: Given that a particular outcome or piece of evidence ($A$) has been observed, what is the revised probability that it was caused by or is associated with a specific initial event or hypothesis ($E_k$)?


Setup and Statement of Bayes' Theorem

To state Bayes' Theorem formally, we first need to define a partition of the sample space and the event representing the new evidence.

Let $\{E_1, E_2, \dots, E_n\}$ be a set of $n$ events that form a **partition** of the sample space $S$. This means these events are:

  • **Mutually exclusive:** no two of them can occur together, i.e., $E_i \cap E_j = \emptyset$ for all $i \neq j$.
  • **Exhaustive:** together they cover every possible outcome, i.e., $E_1 \cup E_2 \cup \dots \cup E_n = S$.
  • Of **positive probability:** $P(E_i) > 0$ for each $i = 1, 2, \dots, n$.

Let $A$ be any event associated with the sample space $S$ such that $P(A) > 0$. Event $A$ represents the observed evidence or outcome.

**Bayes' Theorem states** that for any specific event $E_k$ from the partition (where $k$ is one of $1, 2, \dots, n$), the conditional probability of $E_k$ occurring given that event $A$ has occurred is given by the formula:

$$P(E_k|A) = \frac{P(E_k \cap A)}{P(A)}$$

... (1)

Applying the Multiplication Rule ($P(E_k \cap A) = P(E_k) P(A|E_k)$) to the numerator and the Law of Total Probability ($P(A) = \sum_{i=1}^{n} P(E_i) P(A|E_i)$) to the denominator, the formula becomes:

$$P(E_k|A) = \frac{P(E_k) P(A|E_k)}{\sum_{i=1}^{n} P(E_i) P(A|E_i)}$$

... (2)

This expanded form is the most commonly used version of Bayes' Theorem in applications.
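
As a computational companion to formula (2), here is a minimal sketch in Python (the function name and argument layout are illustrative assumptions, not any standard library API) that computes a posterior probability from given priors and likelihoods:

```python
def bayes_posterior(priors, likelihoods, k):
    """Posterior P(E_k | A) via Bayes' Theorem, formula (2).

    priors      -- list of prior probabilities P(E_i); must sum to 1
    likelihoods -- list of conditional probabilities P(A | E_i)
    k           -- 0-based index of the event E_k of interest
    """
    # Denominator: Law of Total Probability, P(A) = sum of P(E_i) P(A|E_i)
    total = sum(p * l for p, l in zip(priors, likelihoods))
    # Numerator: P(E_k) P(A|E_k)
    return priors[k] * likelihoods[k] / total

# Hypothetical two-event partition with equal priors; the evidence is
# twice as likely under E_2, so the posterior of E_1 falls below 1/2.
print(bayes_posterior([0.5, 0.5], [0.2, 0.4], k=0))  # 0.3333...
```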


Derivation of Bayes' Theorem

Bayes' Theorem can be easily derived from the basic definition of conditional probability and the Law of Total Probability.

  1. Start with the definition of conditional probability for $P(E_k|A)$:

    $$P(E_k|A) = \frac{P(E_k \cap A)}{P(A)} \quad (\text{assuming } P(A) > 0)$$

    ... (iii)

  2. Apply the general Multiplication Rule (Formula 2 from Section I1) to the numerator, $P(E_k \cap A)$:

    $$P(E_k \cap A) = P(E_k) P(A|E_k) \quad (\text{assuming } P(E_k) > 0)$$

    ... (iv)

  3. Apply the Law of Total Probability (Formula 2 from Section I4) to the denominator, $P(A)$. Since $\{E_1, E_2, \dots, E_n\}$ is a partition of $S$, the overall probability of event $A$ is the sum of the probabilities of $A$ occurring under each $E_i$:

    $$P(A) = P(E_1) P(A|E_1) + P(E_2) P(A|E_2) + \dots + P(E_n) P(A|E_n)$$

    $$P(A) = \sum\limits_{i=1}^{n} P(E_i) P(A|E_i)$$

    ... (v)

  4. Substitute the expression for the numerator from step 2 (equation iv) and the expression for the denominator from step 3 (equation v) back into the formula from step 1 (equation iii):

    $$P(E_k|A) = \frac{P(E_k) P(A|E_k)}{\sum\limits_{i=1}^{n} P(E_i) P(A|E_i)}$$

    ... (vi)

This completes the derivation of Bayes' Theorem.


Terminology in Bayes' Theorem

The terms in Bayes' Theorem have specific interpretations, particularly in the context of updating beliefs:

  • **Prior probability ($P(E_k)$):** the probability of the hypothesis $E_k$ *before* the evidence $A$ is observed.
  • **Likelihood ($P(A|E_k)$):** the probability of observing the evidence $A$ *given* that $E_k$ has occurred.
  • **Posterior probability ($P(E_k|A)$):** the revised probability of $E_k$ *after* the evidence $A$ has been taken into account.
  • **Total probability of the evidence ($P(A)$):** the overall probability of observing $A$, computed by the Law of Total Probability; it acts as the normalizing constant in the formula.

Bayes' Theorem provides a formal mechanism for learning from data: starting with a prior belief and updating it based on observed evidence to arrive at a posterior belief.



Applications of Bayes’ Theorem (Solving Problems)


Solving Problems using Bayes' Theorem

Bayes' Theorem is a powerful tool for solving problems where we are given the results of an experiment (the evidence) and want to determine the probability of a specific cause or condition that could have led to that result. These are often referred to as "inverse probability" problems because we are inferring back to the cause ($E_k$) based on the effect ($A$).

The general strategy for applying Bayes' Theorem to solve problems is outlined below; a short computational sketch after the list walks through the same steps:

  1. Define the Events Clearly:

    • Identify and define the set of mutually exclusive and exhaustive events ($E_1, E_2, \dots, E_n$) that represent the different possible scenarios, causes, or hypotheses that could exist or occur at the first stage of the problem. These events must form a partition of the sample space ($S$).
    • Identify and define the event $A$ that represents the observed evidence, outcome, or result obtained from the experiment.
  2. Assign the Known Probabilities:

    Determine the numerical values for the probabilities required:

    • **Prior Probabilities ($P(E_i)$):** The probabilities of each of the partition events $E_i$ occurring *before* the evidence $A$ is known. These should sum to 1 ($\sum P(E_i) = 1$).
    • **Conditional Probabilities (Likelihoods, $P(A|E_i)$):** The probabilities of observing the evidence $A$ *given* that each of the partition events $E_i$ has occurred.
  3. State the Goal:

    Clearly identify which posterior probability you need to calculate. This will always be of the form $P(E_k|A)$ for a specific event $E_k$ from the partition, given the evidence $A$.

  4. Calculate the Denominator ($P(A)$):

    Calculate the overall probability of the evidence $A$ occurring. This is typically done using the Law of Total Probability:

    $$P(A) = P(E_1) P(A|E_1) + P(E_2) P(A|E_2) + \dots + P(E_n) P(A|E_n) = \sum\limits_{i=1}^{n} P(E_i) P(A|E_i)$$

    ... (1)

    This sum serves as the denominator in Bayes' formula.

  5. Calculate the Numerator:

    Identify the specific term in the sum from step 4 that corresponds to the event $E_k$ whose posterior probability $P(E_k|A)$ you want to find. The numerator of Bayes' formula for $P(E_k|A)$ is $P(E_k) P(A|E_k)$.

    Numerator for $P(E_k|A) = P(E_k) P(A|E_k)$

    ... (2)

  6. Apply Bayes' Formula:

    Divide the numerator (step 5) by the denominator (step 4) to get the desired posterior probability:

    $$P(E_k|A) = \frac{P(E_k) P(A|E_k)}{P(A)}$$

    ... (3)

    Or, writing out the denominator fully:

    $$P(E_k|A) = \frac{P(E_k) P(A|E_k)}{\sum\limits_{i=1}^{n} P(E_i) P(A|E_i)}$$

    ... (4)

  7. Interpret the Result:

    State your final answer in the context of the original problem, explaining what the calculated posterior probability means.
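
The seven steps above map directly onto a short computation. The following sketch (Python; the three-event partition and all of its priors and likelihoods are hypothetical numbers chosen purely for illustration) labels each line with the step it carries out:

```python
# Steps 1-2: define the partition and assign the known probabilities
priors      = [0.5, 0.3, 0.2]   # P(E_1), P(E_2), P(E_3); sum to 1
likelihoods = [0.1, 0.4, 0.7]   # P(A|E_1), P(A|E_2), P(A|E_3)

# Step 3: the goal is the posterior P(E_2|A), i.e. index k = 1 (0-based)
k = 1

# Step 4: denominator P(A) by the Law of Total Probability, formula (1)
p_A = sum(p * l for p, l in zip(priors, likelihoods))  # 0.05 + 0.12 + 0.14 = 0.31

# Step 5: numerator P(E_k) P(A|E_k), formula (2)
numerator = priors[k] * likelihoods[k]                 # 0.3 * 0.4 = 0.12

# Step 6: Bayes' formula, formula (3)
posterior = numerator / p_A                            # 0.12 / 0.31

# Step 7: interpret -- given the evidence A, the probability that
# scenario E_2 was the one in effect is about 0.3871
print(f"P(E_2|A) = {posterior:.4f}")
```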


Example

Example 1 (Urn Problem Revisited). Urn I contains 2 white (W) and 3 black (B) balls. Urn II contains 4 white and 1 black ball. A fair coin is tossed. If it lands Heads (H), Urn I is chosen and a ball is drawn. If it lands Tails (T), Urn II is chosen and a ball is drawn. If the ball drawn is found to be white, what is the probability that the ball was drawn from Urn I?

Answer:

Given: Two urn compositions, random urn selection based on coin toss, and the outcome that a white ball was drawn.

To Find: The probability that Urn I was chosen, given that the ball drawn was white.

Solution:

Step 1: Identify Events.

  • Let $E_1$ be the event that Urn I is chosen. This happens if the coin is Heads.
  • Let $E_2$ be the event that Urn II is chosen. This happens if the coin is Tails.
  • The set of events $\{E_1, E_2\}$ forms a partition of the sample space (mutually exclusive and exhaustive, and $P(E_1), P(E_2) > 0$).
  • Let $W$ be the event that the ball drawn is white. This is the observed evidence ($A$ in the general formula).

Step 2: Assign Probabilities.

  • **Prior Probabilities:** Since a fair coin is tossed to choose the urn, the probabilities of choosing Urn I or Urn II are equal:

    $$P(E_1) = P(\text{Heads}) = \frac{1}{2}$$

    ... (v)

    $$P(E_2) = P(\text{Tails}) = \frac{1}{2}$$

    ... (vi)

  • **Conditional Probabilities (Likelihoods):** These are the probabilities of drawing a white ball given that a specific urn was chosen.
    • $P(W|E_1)$: Probability of drawing a white ball given Urn I was chosen. Urn I has 2 white balls and 3 black balls (Total 5).

      $$P(W|E_1) = \frac{2}{5}$$

      ... (vii)

    • $P(W|E_2)$: Probability of drawing a white ball given Urn II was chosen. Urn II has 4 white balls and 1 black ball (Total 5).

      $$P(W|E_2) = \frac{4}{5}$$

      ... (viii)

Step 3: State the Goal.

We want to find the probability that the ball was drawn from Urn I, given that it was white. This is the posterior probability $P(E_1|W)$.

Step 4: Calculate the Denominator ($P(W)$).

The denominator of Bayes' formula is the overall probability of the evidence (drawing a white ball), $P(W)$. We calculate this using the Law of Total Probability with the partition $\{E_1, E_2\}$:

$$P(W) = P(E_1) P(W|E_1) + P(E_2) P(W|E_2)$$

... (ix)

Substitute values from (v), (vii), (vi), (viii):

$$P(W) = \left(\frac{1}{2}\right) \times \left(\frac{2}{5}\right) + \left(\frac{1}{2}\right) \times \left(\frac{4}{5}\right)$$

$$P(W) = \frac{2}{10} + \frac{4}{10} = \frac{6}{10}$$

$$P(W) = \frac{3}{5}$$

... (x)

Step 5: Calculate the Numerator.

The numerator for $P(E_1|W)$ is $P(E_1 \cap W)$, which by the multiplication rule is $P(E_1) P(W|E_1)$.

Numerator $= P(E_1) P(W|E_1)$

... (xi)

Substitute values from (v) and (vii):

Numerator $= \left(\frac{1}{2}\right) \times \left(\frac{2}{5}\right) = \frac{2}{10} = \frac{1}{5}$

... (xii)

Step 6: Apply Bayes' Formula.

Using the formula $P(E_k|A) = \frac{P(E_k) P(A|E_k)}{P(A)}$ (Formula 3), where $E_k=E_1$ and $A=W$:

$$P(E_1|W) = \frac{P(E_1) P(W|E_1)}{P(W)}$$

... (xiii)

Substitute the values from (xii) and (x):

$$P(E_1|W) = \frac{1/5}{3/5}$$

$$P(E_1|W) = \frac{1}{\cancel{5}} \times \frac{\cancel{5}}{3} = \frac{1}{3}$$

... (xiv)

Step 7: Interpret the Result.

The probability that the ball was drawn from Urn I, given that it was white, is $\frac{1}{3}$.

Note: The prior probability of choosing Urn I was $P(E_1) = 1/2$. After observing the evidence (drawing a white ball), which is more likely to come from Urn II ($P(W|E_2)=4/5$) than from Urn I ($P(W|E_1)=2/5$), our belief about the ball having come from Urn I decreases from 1/2 to 1/3. This demonstrates how Bayes' Theorem updates our beliefs based on new evidence.
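
As a sanity check (not part of the standard written solution), the arithmetic above can be reproduced exactly with Python's `fractions` module:

```python
from fractions import Fraction

p_E1 = p_E2 = Fraction(1, 2)      # priors: fair coin chooses the urn
p_W_E1 = Fraction(2, 5)           # P(W|E_1): Urn I has 2 white of 5
p_W_E2 = Fraction(4, 5)           # P(W|E_2): Urn II has 4 white of 5

p_W = p_E1 * p_W_E1 + p_E2 * p_W_E2   # Law of Total Probability: 3/5
posterior = p_E1 * p_W_E1 / p_W       # Bayes: (1/5) / (3/5)
print(posterior)                      # 1/3
```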


Example 2 (Medical Test). A certain disease affects 1% of the population. A test for the disease is 95% accurate if the person has the disease (true positive rate) and 98% accurate if the person does not have the disease (true negative rate, meaning a 2% false positive rate). If a randomly selected person tests positive, what is the probability they actually have the disease?

Answer:

Given: Disease prevalence, true positive rate, true negative rate, and a positive test result.

To Find: Probability of having the disease given a positive test.

Solution:

Step 1: Identify Events.

  • Let $D$ be the event that a randomly selected person actually has the disease.
  • Let $D'$ be the event that a randomly selected person does not have the disease (the complement of D).
  • The set of events $\{D, D'\}$ forms a partition of the population (a person either has the disease or does not have it).
  • Let $+$ be the event that a randomly selected person tests positive for the disease. This is the observed evidence ($A$ in the general formula).
  • Let $-$ be the event that a randomly selected person tests negative.

Step 2: Assign Probabilities.

  • **Prior Probabilities:** The prevalence of the disease gives the prior probability of having the disease.

    $$P(D) = 1\% = 0.01$$

    ... (xv)

    The prior probability of not having the disease is:

    $$P(D') = 1 - P(D) = 1 - 0.01 = 0.99$$

    ... (xvi)

  • **Conditional Probabilities (Likelihoods):** The test accuracy provides these conditional probabilities.
    • True Positive Rate: Probability of testing positive given the person HAS the disease.

      $$P(+|D) = 95\% = 0.95$$

      ... (xvii)

    • True Negative Rate: Probability of testing negative given the person does NOT have the disease.

      $$P(-|D') = 98\% = 0.98$$

      ... (xviii)

    From the true negative rate, we can find the False Positive Rate: Probability of testing positive given the person does NOT have the disease ($P(+|D')$).

    Since testing positive and testing negative are complementary events (for a given condition of having or not having the disease), $P(+|D') + P(-|D') = 1$.

    $$P(+|D') = 1 - P(-|D') = 1 - 0.98 = 0.02$$

    ... (xix)

Step 3: State the Goal.

We want to find the probability that the person actually has the disease, given that they tested positive. This is the posterior probability $P(D|+)$.

Step 4: Calculate the Denominator ($P(+)$).

The denominator is the overall probability of testing positive, $P(+)$. We use the Law of Total Probability with the partition $\{D, D'\}$:

$$P(+) = P(D) P(+|D) + P(D') P(+|D')$$

... (xx)

Substitute values from (xv), (xvii), (xvi), (xix):

$$P(+) = (0.01)(0.95) + (0.99)(0.02)$$

$$P(+) = 0.0095 + 0.0198$$

$$P(+) = 0.0293$$

... (xxi)

Step 5: Calculate the Numerator.

The numerator for $P(D|+)$ is $P(D \cap +)$, which is $P(D) P(+|D)$.

Numerator $= P(D) P(+|D)$

... (xxii)

Substitute values from (xv) and (xvii):

Numerator $= (0.01)(0.95) = 0.0095$

... (xxiii)

Step 6: Apply Bayes' Formula.

Using the formula $P(E_k|A) = \frac{P(E_k) P(A|E_k)}{P(A)}$ (Formula 3), where $E_k=D$ and $A=+$:

$$P(D|+) = \frac{P(D) P(+|D)}{P(+)}$$

... (xxiv)

Substitute values from (xxiii) and (xxi):

$$P(D|+) = \frac{0.0095}{0.0293} \approx 0.3242$$

(rounded to four decimal places) ... (xxv)

Step 7: Interpret the Result.

If a randomly selected person tests positive for this disease, the probability that they actually have the disease is approximately 0.3242, or about 32.4%.

Note: This result is often counter-intuitive. Despite the test being quite accurate (95% true positive, 98% true negative), if the disease is rare in the population (low prior probability of 0.01), a significant proportion of positive test results will come from people who do not have the disease (false positives). Bayes' Theorem correctly weighs the test's accuracy against the base rate (prevalence) of the disease.
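
To see this base-rate effect numerically, the sketch below (Python; the alternative prevalence of 20% is a hypothetical value used only for comparison) recomputes $P(D|+)$ and then shows how strongly the prior drives the posterior:

```python
def p_disease_given_positive(prevalence, tpr=0.95, tnr=0.98):
    """P(D|+) from prevalence, true positive rate, true negative rate."""
    fpr = 1 - tnr                                       # false positive rate
    p_pos = prevalence * tpr + (1 - prevalence) * fpr   # Law of Total Probability
    return prevalence * tpr / p_pos                     # Bayes' formula

print(p_disease_given_positive(0.01))  # ≈ 0.3242, matching the result above
print(p_disease_given_positive(0.20))  # ≈ 0.9223: same test, common disease
```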