Tue, Oct 18, 2011

This came up for discussion about a year ago at work. It annoyed me then. But I let it pass and had almost forgotten about it, when it came up this morning on a forum that I occasionally post to.

I don't know what it is about this concept that annoys me so much. I suspect it's because it's something so obviously simple, yet so many people get so passionate about being wrong.

So, it typically gets phrased like this: Somebody tells you *"I have two children. One of them is a boy."* What are the odds that their OTHER child is also a boy?

Your first instinct would be to say "50/50" - a child's gender is an independent variable, which means there's no connection between what one child's gender is and what gender their sibling might be. Simple.

Maybe too simple? Because then people think a little more and say *"When you have two children, the gender options can be represented as:*

Boy, Boy

Boy, Girl

Girl, Boy

Girl, Girl

*and since I know one of them is a boy, I can eliminate the bottom row:*

Boy, Boy

Boy, Girl

Girl, Boy

*Thus there are THREE possibilities, not two. And only in one of them is the other child a boy. So the odds are 1/3 that the second child is a boy."*

Somewhat impressively, this appears to show that it's possible to break the rules and link independent variables together to improve your odds.

The problem is, this is impossible. Independent variables are independent. You cannot use one to predict the other. So where does the argument go wrong?

The problem is that the little four-row table we started with only applies when we don't know anything about the two children. Once we've been informed about one of them, it doesn't become the three-row grid. It becomes a different four-row table.

Let's show this by making our table be just letters:

bb

bg

gb

gg

We don't know anything about either child. We are now told one of them is a boy - which we shall represent as B. The new matrix is NOT:

Bb

Bg

gB

as we had above. That is clearly wrong: we've considered the order in which the children are born to be significant if one is a girl, but not if they're both boys. To be consistent, we have to consider it as:

Bb

bB

Bg

gB

So we're back to 50/50
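One way to check that table without any randomness is to list every equally likely combination of family and revealed child, then count. This is only a sketch under the same assumptions as the table: the four birth-order combinations are equally likely, and the parent is equally likely to mention either child. The sub name `count_random_reveal` and the data layout are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the four ordered families are equally likely, and the
# parent mentions either child with equal probability.
sub count_random_reveal {
    my ($told_boy, $both_boys) = (0, 0);
    for my $family ([qw(b b)], [qw(b g)], [qw(g b)], [qw(g g)]) {
        for my $revealed (0, 1) {
            # Skip the cases where the parent would have said "girl"
            next unless $family->[$revealed] eq 'b';
            $told_boy++;
            # The other child is the one that was not revealed
            $both_boys++ if $family->[1 - $revealed] eq 'b';
        }
    }
    return ($both_boys, $told_boy);
}

my ($both_boys, $told_boy) = count_random_reveal();
print "Other child is also a boy in $both_boys of $told_boy cases\n";
```

The two-boy family shows up twice in the count because either of its children could be the one mentioned, which is exactly the Bb and bB rows.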

Usually, when this is being discussed, you'll get told that this is wrong: You can't make boy-boy count twice, because it's not in the original table twice. The simplest way to refute this is to simply imagine that you have four people, each of whom has two children. One has two boys, one has two girls, and the other two have one boy and one girl. A perfect distribution.

One of them HAS to tell you "I have a boy"

One of them HAS to tell you "I have a girl"

Two of them can tell you either, but because our odds are perfect in this hypothesis, one will tell you "I have a boy" and the other will tell you "I have a girl"

If you guess that the first two people's second child is of the same gender, you will be right. If you guess it for the other two, you will be wrong. 50/50

The biggest problem with the alleged paradox isn't mathematical, it's grammatical. Because you CAN easily phrase it so that the odds ARE one in three. Let's take our four people above. Now ask each of them "Is one of your children a boy?"

If they say "Yes", you have a one in three chance that the other child will also be a boy. You should therefore guess that it's a girl. Why is it different when you ask rather than letting them tell you? Simple: Because by asking, you exclude one of your four people: The person with two girls will say "No"

If you're allowed to bias your sample, you can easily make the odds change. That's rather the definition of bias.
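The effect of asking can be counted the same way. This is a minimal sketch, assuming four equally likely families and parents who answer truthfully; a family stays in the sample only when the answer to "Is one of your children a boy?" is "Yes". The sub name `count_after_asking` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the four ordered families are equally likely, and every
# parent answers the question truthfully.
sub count_after_asking {
    my ($said_yes, $both_boys) = (0, 0);
    for my $family ([qw(b b)], [qw(b g)], [qw(g b)], [qw(g g)]) {
        # grep in scalar context returns the number of boys
        my $boys = grep { $_ eq 'b' } @$family;
        # The two-girl family answers "No" and drops out of the sample
        next unless $boys >= 1;
        $said_yes++;
        $both_boys++ if $boys == 2;
    }
    return ($both_boys, $said_yes);
}

my ($both_boys, $said_yes) = count_after_asking();
print "Other child is also a boy in $both_boys of $said_yes cases\n";
```

Three families survive the question and only one of them has two boys, which is where the one in three comes from.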

If people can only tell you "I have a boy" then the odds are one in three. If people can pick one child at random and tell you their gender, then your odds are only 50/50.

The problem is, people understand numbers: if you prove that one equals two, they know full well that no matter how convincing your logic, there must be a trick in there somewhere. But people are generally less good at probability: if you show them that independent variables are not independent, they won't hold onto the certainty that you're tricking them.

And, because I realize that some people are so in love with the idea that they're clever *(because they say the odds are one in three)* that even after all my patient explanations they'll still refuse to budge on their position that an independent variable can be made dependent, I've decided to go beyond proving it logically and just prove it outright. Here is a simple Perl script. It runs the 'paradox' situation both ways a million times. You can run it yourself, and inspect the logic, and you'll still find that the answer to the original question is, was, and always will be, 50/50. And only if you're allowed to bias the sample can you get it to be 1/3.

When I run it, I get the output:

*When told a random child's gender, I got 500163 out of 1000000 right
That is, I was correct 50.0163 percent of the time
When permitted to ask a child's gender, I got 250090 out of 750861 right
That is, I was correct 33.3070967862228 percent of the time*

So either correct the code below, write your own code that proves you right, or just accept you're ****ing wrong and move on with your life.

Thank you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 1000000;
my $counted = 0;
my $i = 0;
my $correct_guesses = 0;

# Run the "One of my children is..." scenario $count times
while ($i++ < $count){
    # Randomly assign genders for two children: either 1 or 0
    my $child_a = get_gender();
    my $child_b = get_gender();

    # Randomly choose one child to reveal the gender of
    my $known_gender = choose_child() eq 'a' ? $child_a : $child_b;

    # Guess that the other child has the same gender
    $correct_guesses++ if (($known_gender * 2) == ($child_a + $child_b));

    # Increment the count of the number of times we guessed
    $counted++;
}

print "When told a random child's gender, I got $correct_guesses out of $counted right\n";
print "That is, I was correct ".(($correct_guesses/$count)*100)." percent of the time\n";

# Reset variables
$i = 0;
$correct_guesses = 0;
$counted = 0;

# Run the "Is one child a boy?" scenario $count times
while ($i++ < $count){
    # Randomly assign genders for two children: either 1 or 0
    my $child_a = get_gender();
    my $child_b = get_gender();

    # Find out if at least one child is '1'
    next unless ($child_a + $child_b);

    # Guess that the other child has the same gender
    $correct_guesses++ if ($child_a == $child_b);

    # Increment the count of the number of times we guessed
    $counted++;
}

print "When permitted to ask a child's gender, I got $correct_guesses out of $counted right\n";
print "That is, I was correct ".(($correct_guesses/$counted)*100)." percent of the time\n";

sub get_gender {
    my $random = int(rand(100));
    return $random % 2;
}

sub choose_child {
    my $random = int(rand(100));
    return $random % 2 ? 'a' : 'b';
}
```

Comment from: Yuta [Visitor]

Lesson on statistics notwithstanding, there is indeed a non-randomness of genders inside sibships. In a sibship containing one male, there is a higher probability of other siblings being male. Moreover, the order of children matters. If we are given information beforehand that the known sibling (male) is the oldest, we have a slightly better chance of predicting that the next one will also be male than we would have otherwise. In fact, this deviation from 50:50 probability of each child in a sibship being male or female is routinely used in statistical textbooks to illustrate clumped binomial distribution = )

27/10/11 @ 18:42

Comment from: carrpin [Visitor]

Bravo! I have found someone that really understands this thing. The answer is 1/2, 50-50, period. There is no paradox. It is no riddle or puzzle. I.e., there exists another child, and that child is either a boy (50%) or a girl (50%), regardless of the gender, or birth order, or birth date, or number of its sibling(s). The answer is never 1/3. Never. The arguments used to defend the answer of 1/3 are always flawed. You addressed the most common flaw by including two possible combinations of Boy/Boy (Bb and bB). I sure would love to debate this with some of those flawed 1/3 people.

23/12/11 @ 03:58

Comment from: shortmanikos [Visitor] · http://youreka.gr

"I have two children. One of them is a boy."

What we know is that someone has two children and one of the two is a boy.

A = they are both boys

B = one of the two children is a boy

( in a two child family )

P(A) = 1/4, P(B) = 3/4 and P(A and B) = P(A) = 1/4 so

P(A|B) = 1/3

( If you don't recognize the symbol P(A|B) there is no point in arguing.)

The somebody in the riddle doesn't say "I have two kids and I pick one at random and it is a boy" - you are assuming that this is what happens. The sentence is simple and it gives you two bits of information, there are two kids and one of the two is a boy. The rest is conditional probability.

( in the above reasoning if we change B to being "a randomly picked child is a boy" we immediately have P(B)= 1/2 and P(A|B)=1/2 )

08/01/12 @ 21:15