The Sleeping Beauty Problem

Jun 5, 2025

Philosophy professors are failing a high-school-level word problem

11 Comments

The Dutch book argument for thirding seems pretty obvious and straightforward to me. If on every waking you're asked to place a wager on the result of the coin toss, you maximize your winnings by betting heads:tails at a 1:2 ratio.

If you're asked to place your bet at the end of the experiment, then of course you should bet at a 1:1 ratio. In that case it obviously doesn't matter how many times you woke up.

What I don't really understand is why you seem to think the second one is the obvious interpretation of this word problem. As you correctly point out, it's not exactly clear what "credence" is supposed to mean here. But at least if you interpret it as betting odds, the thirder position is pretty clear.
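A minimal sketch of the arithmetic behind the two betting setups described above, assuming a fair coin, one awakening on heads and two on tails, and a ticket that pays 1 if the coin landed heads. The function names are illustrative and not taken from the article.

```python
from fractions import Fraction


def expected_profit(price, bets_if_heads, bets_if_tails):
    """Expected profit per run of the experiment from buying, at every
    betting opportunity, one ticket that pays 1 if the coin landed heads."""
    half = Fraction(1, 2)  # assumption: the coin is fair
    win = half * bets_if_heads * (1 - price)   # heads: each ticket pays out
    loss = half * bets_if_tails * (-price)     # tails: each ticket is lost
    return win + loss


def break_even_price(bets_if_heads, bets_if_tails):
    """Ticket price at which expected_profit is exactly zero."""
    return Fraction(bets_if_heads, bets_if_heads + bets_if_tails)


# Bet at every awakening: 1 bet on heads, 2 bets on tails.
per_awakening = break_even_price(1, 2)

# Bet once at the end of the experiment, regardless of the coin.
per_experiment = break_even_price(1, 1)

print("fair price, betting at each awakening:", per_awakening)
print("fair price, betting once per experiment:", per_experiment)
```

Under these assumptions the break-even price is 1/3 when the bet is placed at each awakening (heads:tails odds of 1:2) and 1/2 when it is placed once per experiment (odds of 1:1).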

Jeff Jo

Jun 20

The controversy is really rather trivial. Does SB gain the information that makes it a conditional probability problem, or not? And the correct history helps to see how it came to this point.

It was originated by Arnold Zuboff. A very long-lived hypnotist puts SB to sleep, and wakes her either once each day over the next one trillion days, or on just one randomly selected day in that period, based on a coin flip (result unspecified). Every time she is wakened, she is asked for her credence/confidence that this is the only time she was/will be awakened. Note that this is the same as the probability that the coin indicated that she should be wakened only once.

Adam Elga created the popular version, but that wasn't the question he posed. SB is wakened once, or twice, based on a fair coin flip. Heads means once, but no days are mentioned. In order to solve this problem as a conditional probability problem, he labeled the wakenings with the days Monday and Tuesday. There are FOUR, not THREE, combinations of the day and coin.

But Elga found a way to ignore one. If we tell SB that it is Monday, then only two combinations remain: Mon&T and Mon&H. If we tell her that the coin landed T, again only two combinations remain: Mon&T and Tue&T. Since the members of each pair must be equally likely, and Mon&T appears in both pairs, Elga concluded that these three combinations must be equally likely even before SB is told anything. Since they are the only possibilities if SB is awake, each must have a 1/3 probability when she is awake.

The problem is that ignoring Tue&H encourages some to think it is not a possibility in the experiment, as if Tuesday were somehow removed from the calendar when the coin lands on Heads.

We can remove this problem by using four volunteers instead of one. Each will be assigned a different combination from the set {Mon&H, Mon&T, Tue&H, Tue&T}. Three will be wakened on each day, excluding the one who was assigned that day and the actual coin result. Each will be asked, a la Zuboff, for the probability that this is her only waking. Note this is the same problem as the popular one for the volunteer assigned Tue&H, and an equivalent one for each of the other volunteers.

But each knows that their answers should be the same, that the statement is true for exactly one of the three who are awake, and that their answers have to sum to 1. That answer is 1/3.

The correct solution to the popular problem is that there are four equally likely combinations that can apply to a randomly selected day in the experiment. Since SB is awake, Tue&H is eliminated, and the remaining combinations each have a probability of 1/3.
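The conditioning step in this comment can be written out as a short enumeration. This is only a sketch under the comment's own assumption that the four day-and-coin combinations are equally likely for a randomly selected day; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# The four combinations of day and coin result named in the comment.
combos = list(product(["Mon", "Tue"], ["H", "T"]))

# Assumption (from the comment): each combination has prior probability 1/4.
prior = {combo: Fraction(1, 4) for combo in combos}

# SB is asleep on Tue&H, so being awake rules that combination out.
awake = {combo: p for combo, p in prior.items() if combo != ("Tue", "H")}

# Renormalize over the combinations that remain.
total = sum(awake.values())
posterior = {combo: p / total for combo, p in awake.items()}

# Probability of heads given that SB is awake.
p_heads = sum(p for (day, coin), p in posterior.items() if coin == "H")

for (day, coin), p in posterior.items():
    print(f"{day}&{coin}: {p}")
print("P(heads | awake):", p_heads)
```

Each of the three remaining combinations comes out at 1/3, and heads is true in only one of them.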

Kyle Star

Jun 16

You collapse the Monday and Tuesday awakenings into a generic event, and say you're only scoring once per run. Just because Beauty doesn't know whether it's Monday or Tuesday doesn't mean you can ignore them. They're latent variables. There are simpler probability problems that show latent variables can't be merged.

Credence in this problem is evaluated at the actual moment Sleeping Beauty is asked when awoken, not once per experimental run at the end of the week. You change the premises in your post. The mere fact that Beauty experiences waking up is new evidence: it is twice as likely under tails. This sample space has three awakening events (H-Mon, T-Mon, T-Tues). The thirder position is correct.

Leon Voß

Jun 17

I don't believe you fully read my article. If you think you did, you understood less than half of it, and it comes off as if you'd just read half. Yet you're speaking as if you think you read and understood all of it. You must be really bad at recognizing when you don't know something.

Kyle Star

Jun 17

Yeah, overconfidence, snide remarks (towards an entire profession!) and calling your critics stupid while not discussing any of their points is a great way to arrive at the truth.

I at least wish you'd defended merging the Tuesday and Monday events in the Bayesian calculation. A PhD-level statistician doesn't understand the idea of a latent variable?

Leon Voß

Jun 17

>overconfidence

I'm extremely learned so it's just confidence. You have overconfidence, not me.

>I at least wish you’d defended merging Tuesday and Monday events in the Bayesian calculation

I guess you didn't read the 80% of the article where I did that.

>while not discussing any of their points

I guess you missed the huge part of the article where I derived the thirder position and assessed the canonical thirder argument.

>A PHD level statistician doesn’t understand the idea of a latent variable?

Snide remarks, which is funny coming from someone of your literacy ability.

Enon

Jun 13

What is the probability that this protocol made it through an IRB? Pretty damn low.

What is the probability that the problem is not an accurate description of actual events? Virtually certain.

What is the probability that the imaginary psych researchers in the problem are messing with the imaginary girl's head? Very high. Imaginary psych researchers are notorious liars.

Any calculation of probabilities that ignores these issues is wrong.

NebulaPolitics

Jun 25

Hey Joseph, can I get a link to your Discord server?

Comment removed

Jul 10

Comment removed

Leon Voß

Jul 10 (edited)

>I don't see it this way at all. Let's just say, on any day sleeping beauty wakes up, she has to guess the coinflip without credence. If they run the experiment 100 weeks, then they'll end up with roughly 150 wakings 100 of which are tails wakings. You acknowledge this in your P(H) given waking calculation. So if SB guesses tails every time she'll get $100 and heads every time she'll get $50.

Why wouldn't you score her at the end of the week, when she's asked about the coin itself, not whether she has woken up twice in the week?

I feel like I covered this reasoning exactly in my article. Is my writing so bad that you felt the need to repeat what I said? Did you read the entire article? If so, did you feel you understood all of it? Could I have made it more understandable without weakening the substance?

Comment removed

Jul 10 (edited)

Comment removed

Leon Voß

Jul 10

Why didn't you answer any of my questions?

Do you understand that semantic debates are pointless? You can define credence however you like. If you define it as "do you believe, little beauty, that the coin is fair?" then the correct answer is yes, which implies a "credence" of 50% that it landed on heads on the most recent flip (i.e., it's fair).

Comment removed

Jul 10 (edited)

Leon Voß

Jul 10

Are you an LLM? I've asked you:

Is my writing so bad that you felt the need to repeat what I said?

Did you read the entire article? If so, did you feel you understood all of it? Could I have made it more understandable without weakening the substance?

Why didn't you answer any of my questions?

Do you understand that semantic debates are pointless, you can define credence however, if you define it as "do you believe, little beauty, that the coin is fair?"

Which makes 4 questions. You've answered none of them. Here's 1 more:

Can you summarize my argument that it is a reasonable reading of "credence"? You don't seem to have grokked it.

If you don't address all 5 questions in your next comment on this blog, I'll ban you.
