Bayes’ Theorem: Intuition Over Formula

As people join the focus group, we’ll begin with something fundamental that lies at the heart of quantitative reasoning - Bayes’ Theorem. This concept will underpin a lot of our future discussions on modeling and statistical inference, so it’s important to develop an intuitive understanding of how new information can be incorporated into your probability estimates.

You’ll all be familiar with the formula:

P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}

But we need to understand why it works and what it represents. We’ll start with an awesome video by 3Blue1Brown:

Once you’re done with the video, let’s discuss the famous Monty Hall problem:

You’re on a game show and given the choice of three doors: behind one door is a sports car, behind the other two are goats. You win the prize behind the door that you select for opening.
Initially, you pick Door No. 1. The host, who knows what’s behind each door, opens another door - say Door No. 3, revealing a goat.
He then asks, “Do you want to switch your choice to Door No. 2?”
Is it to your advantage to switch?

This is a problem that you may have seen before, but it is important to understand what is happening behind the scenes:

  • What is the new information provided by the host?
  • How is your estimate of where the car is affected by this new information?
  • Does that new information play to your advantage? What belief can be updated given that the host has provided this new information?
  • Try coding up and running a Python simulation with 3 doors, a host that reveals a goat, and two guessers: one who switches and one who doesn’t. Let your program calculate which strategy wins more often. You can use this link to run your simulations - Bayes' Theorem: Simulating the Monty Hall problem

The host is essentially eliminating one door the car could be behind, leaving us with 2 doors instead of 3. Initially, there was a 1/3 probability that the selected door had the car behind it. In that case, switching means losing. But 2 out of 3 times the selected door has a goat behind it, the host has to reveal the other goat, and switching lands on the car. Hence, always switching increases the probability of success from 1/3 to 2/3.

I also tried to look at a generalised n-door problem mathematically. Initially, the probability of selecting a wrong door is (n-1)/n. In this case, once the host opens another door with a goat, 2 doors are eliminated as switch targets (the one picked by the contestant and the one opened by the host), so a switch to one of the remaining n-2 doors succeeds with probability 1/(n-2). The probability of selecting the correct door on the first try is 1/n, and switching leads to losing in this case. So, applying the total probability theorem, the probability of success if we switch is (n-1)/(n(n-2)), which is greater than the initial probability of 1/n.
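The n-door formula above can be checked with exact fractions. This is only a sketch: the function names are made up for illustration, and it assumes the host opens exactly one goat door and the contestant then switches to one of the other closed doors at random.

```python
from fractions import Fraction


def switch_win_probability(n: int) -> Fraction:
    """Exact chance of winning by switching with n doors and one goat door opened."""
    if n < 3:
        raise ValueError("need at least 3 doors")
    p_first_pick_wrong = Fraction(n - 1, n)        # first pick hides a goat
    p_switch_hits_car = Fraction(1, n - 2)         # one car among n-2 switch targets
    return p_first_pick_wrong * p_switch_hits_car  # total probability; other branch adds 0


def stay_win_probability(n: int) -> Fraction:
    """Exact chance of winning by staying: the first pick was the car."""
    return Fraction(1, n)


for n in (3, 4, 10):
    print(n, switch_win_probability(n), stay_win_probability(n))
```

For n = 3 this gives 2/3 against 1/3, and the gap narrows as n grows because a single opened door carries less information.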

@research-focus-group

If you’ve completed the first two exercises, please reply directly to the post with your answers, thoughts, or doubts, wrapped in spoiler tags like this:

your answer here

This way, everyone can think independently before checking others’ responses.

Let’s keep the discussion active and async - share your reasoning, challenge each other’s approaches, and build on new insights.

We expect all focus group members to make a submission for each topic - this ensures the learning stays collaborative, structured, and valuable for everyone involved.

Looking at the generalized N-door case:

There are N doors: one hides the car and the remaining N-1 hide goats.

Let the host open M doors, 1 ≤ M ≤ N-2, all of them goat doors.

As before, there is a 1/N chance of initially choosing the car door and an (N-1)/N chance of choosing a goat door.

If we initially chose a goat door and the host opens M doors, N-M-1 unopened doors remain besides ours and exactly one of them hides the car, so swapping to one of them at random wins with probability 1/(N-M-1), while staying wins with probability 0.

If we initially chose the car door, swapping wins with probability 0 and staying wins with probability 1.

So the overall probability of winning by swapping = (N-1)/N * 1/(N-M-1)

The probability of winning by staying is always 1/N.
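A rough Monte Carlo check of the N-door, M-opened result, as a sketch only. The function name, trial count, and seed are arbitrary choices, and it assumes the host opens goat doors uniformly at random and the swapper picks uniformly among the other closed doors.

```python
import random


def simulate(n_doors: int, n_opened: int, trials: int = 100_000, seed: int = 0):
    """Return (swap_win_rate, stay_win_rate) for the generalized game."""
    if not 1 <= n_opened <= n_doors - 2:
        raise ValueError("need 1 <= n_opened <= n_doors - 2")
    rng = random.Random(seed)
    swap_wins = stay_wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        # The host may only open goat doors other than the contestant's pick.
        openable = [d for d in range(n_doors) if d != car and d != pick]
        opened = set(rng.sample(openable, n_opened))
        # Swapping means choosing uniformly among the other doors still closed.
        remaining = [d for d in range(n_doors) if d != pick and d not in opened]
        new_pick = rng.choice(remaining)
        stay_wins += pick == car
        swap_wins += new_pick == car
    return swap_wins / trials, stay_wins / trials


swap_rate, stay_rate = simulate(n_doors=5, n_opened=2)
print(swap_rate, stay_rate)
```

With N = 5 and M = 2 the formula predicts (4/5) * (1/2) = 0.4 for swapping and 1/5 = 0.2 for staying, so the simulated rates should land near those values.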

I have used Bayes’ theorem, breaking the problem into priors, likelihoods, and posteriors, to show mathematically how the host’s action changes the odds.

Let C_i = event “the car is behind door i ”.
Our prior beliefs are:

P(C_1) = P(C_2) = P(C_3) = \frac{1}{3}.

Let’s assume we pick Door 1 and the host opens Door 3, showing a goat.
Let H_3 = event “the host opens Door 3”.
We want to find P(C_1 \mid H_3) and P(C_2 \mid H_3).

Likelihoods

| Case | Description | P(H_3 \mid C_i) |
|---|---|---|
| C_1 | Car behind Door 1 (our pick); host can open 2 or 3 equally likely | \frac{1}{2} |
| C_2 | Car behind Door 2; host must open 3 | 1 |
| C_3 | Car behind Door 3; host never opens 3 | 0 |

Applying Bayes’ Theorem

P(C_2 \mid H_3) = \frac{P(C_2)P(H_3 \mid C_2)}{\sum_{i=1}^3 P(C_i)P(H_3 \mid C_i)} = \frac{\frac{1}{3} \cdot 1}{\frac{1}{3}\cdot\frac{1}{2} + \frac{1}{3}\cdot1 + \frac{1}{3}\cdot0} = \frac{\frac{1}{3}}{\frac{1}{6} + \frac{1}{3}} = \frac{\frac{1}{3}}{\frac{1}{2}} = \frac{2}{3}.

Similarly,

P(C_1 \mid H_3) = \frac{1}{3}.

Interpretation

The host’s action is not random: it is chosen knowing where the car is and avoiding it. That rule makes opening Door 3 more likely when the car is behind Door 2, so the posterior probability shifts and Door 2 becomes twice as likely as Door 1. Thus switching is advantageous.
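The same update can be reproduced numerically. The sketch below plugs the priors and likelihoods from the derivation into Bayes’ theorem using exact fractions; the variable and function names are only illustrative.

```python
from fractions import Fraction

# Priors P(C_i): the car is equally likely to be behind each door.
priors = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihoods P(H_3 | C_i), given that we picked Door 1.
likelihoods = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}


def posterior(priors, likelihoods):
    """Apply Bayes' theorem over a finite set of hypotheses."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(H_3)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


post = posterior(priors, likelihoods)
for door, p in post.items():
    print(f"P(C_{door} | H_3) = {p}")
```

Changing the likelihood for C_1 away from 1/2 shows how the answer depends on the host’s tie-breaking rule, which is an assumption of the standard problem.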

  • Since the host knows which door has the car behind it, and at least one of the unpicked doors has a goat behind it, the host can always open an unpicked goat door.
  • The probability that the initially chosen door has the car is 1/3, so the remaining two doors together have 2/3. When the host opens Door 3, P(Door 3 has the car) becomes 0, so switching raises our chance to 2/3 (the remaining 2/3 minus Door 3’s probability, which is now 0), since we won’t pick a door with P = 0.

In terms of Bayes’ theorem, the hypothesis H is that the chosen door is the correct one, with prior P(H) = 1/n (assuming there are n doors). The evidence E is that the host opens a particular set of n-2 doors, so P(E | H) is the probability that the host opens that particular set given that the chosen door is correct, and P(E) is the overall probability that the host opens it. From these we can show that P(H | E) = 1/n and P(not H | E), the switching case, = (n-1)/n.
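One way to fill in that calculation is sketched below. It assumes we pick door 1 and the host leaves door j closed. It also makes an assumption not stated above: when our door is correct, the host chooses uniformly among the n-1 possible sets of n-2 doors. C_j, the event that the car is behind door j, is notation introduced here for illustration.

P(E \mid H) = \frac{1}{n-1}, \quad P(E \mid C_j) = 1, \quad P(E) = \frac{1}{n}\cdot\frac{1}{n-1} + \frac{1}{n}\cdot 1 = \frac{1}{n-1}

P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} = \frac{\frac{1}{n-1}\cdot\frac{1}{n}}{\frac{1}{n-1}} = \frac{1}{n}, \quad P(\text{not } H \mid E) = 1 - \frac{1}{n} = \frac{n-1}{n}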

We know that initially there is a 1/3 probability of choosing the car door and 2/3 probability of choosing a goat door. So initially we just have to make a choice. After this when the host opens one door revealing a goat, we get crucial new information.

Case 1: We choose a goat door, which has a probability of 2/3. Here the host has only one goat door left to open, as one of the goat doors is our pick and the other remaining door has the car. So the host has to open that remaining goat door, and if we make the swap here, we will have chosen the car door.

Case 2: We choose the car door which has a probability of 1/3.

Here the host has two goat doors left to open, so he can randomly pick either of the two remaining doors; which one he picks won’t make a difference. If we swap here, we end up with a goat door.

The thing we notice here is that case 1 has twice the probability of case 2.

So the ideal choice, given the new information from the host, is to swap: about 66.7% of the time we are in Case 1, where swapping wins, and only about 33.3% of the time are we in Case 2, where staying on our current door wins.

Say we pick door A. Initially,
P(goat behind A) = 2/3
P(car behind A) = 1/3
The host makes a choice between B and C only, giving us no new information to update the probabilities for A, as the host must choose a different door.
So,
P(goat behind B | host chooses C) = P(car behind A) = 1/3
P(car behind B | host chooses C) = 1 - P(goat behind B | host chooses C) = 2/3 (due to total probability being 1)
Therefore, after the host has chosen C, the probability for B to have the car behind it is updated to 2/3, while the probability that A has the car behind it remains unchanged at 1/3 → It is beneficial to switch our choice.

>! When the host reveals a door, the new information provided is:

  1. He always opens a door with a goat behind it and never the car door.

  2. He never opens the door that you picked initially.

Now, with this new information, our probabilities of selecting the car door change. Initially, before the host revealed a door, the probability that my selected door had the car behind it was 1/3, and it was 2/3 that the car was behind one of the other 2 doors. Now that the host has revealed one of the other two doors, the 2/3 probability doesn’t get redistributed (because the car and goats stay where they are). It still lies with these two non-selected doors, of which one has been revealed not to have the car, so there’s a 2/3 probability that the car is behind the door which was neither selected by me nor opened by the host. That’s why switching raises my probability of choosing the car door from 1/3 to 2/3.

When I wrote a Python simulation, the win rates for switching and not switching converged to 2/3 and 1/3 respectively as the number of games played grew large. !<
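A minimal sketch of such a simulation, for anyone who wants to compare against their own attempt. It is not the code from the post above: the function names, trial count, and seed are arbitrary, and the host is assumed to pick at random when two goat doors are available.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one 3-door game and report whether the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Exactly one door is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 42) -> float:
    """Estimate the win probability of a strategy over many games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("switch:", win_rate(switch=True))
print("stay:  ", win_rate(switch=False))
```

Raising `trials` tightens the estimates around 2/3 and 1/3; with only a few hundred games the two rates can still look fairly close.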