**23 people**. In a room of just 23 people there’s a 50-50 chance of two people having the same birthday. In a room of 75 there’s a 99.9% chance of two people matching.

Put down the calculator and pitchfork, I don’t speak heresy. The birthday paradox is strange, counter-intuitive, and **completely true**. It’s only a “paradox” because our brains can’t handle the compounding power of exponents. We expect probabilities to be linear and only consider the scenarios we’re involved in (both faulty assumptions, by the way).

Let’s see why the paradox happens and how it works.

## Problem 1: Exponents aren’t intuitive

We’ve taught ourselves mathematics and statistics, but let’s not kid ourselves: it’s not natural.

Here’s an example: What’s the chance of getting 10 heads in a row when flipping coins? The untrained brain might think like this:

“Well, getting one head is a 50% chance. Getting two heads is twice as hard, so a 25% chance. Getting **ten** heads is probably 10 times harder… so about 50%/10 or a 5% chance.”

And there we sit, smug as a bug on a rug. No dice bub.

**After pounding your head with statistics**, you know not to divide, but use **exponents**. The chance of 10 heads is not .5/10 but .5^{10}, or about .001.

But even after training, we get caught again. At 5% interest we’ll double our money in 14 years, rather than the “expected” 20. Did you naturally infer the Rule of 72 when learning about interest rates? Probably not. Understanding compound exponential growth with our linear brains is hard.
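To make the gap between the linear guess and the real answer concrete, here is a minimal sketch in Python. The variable names are only illustrative, and the doubling-time line assumes interest compounds once a year.

```python
import math

# Linear intuition: "ten heads is ten times harder than one head".
linear_guess = 0.5 / 10

# Independent flips multiply, so the real chance is an exponent.
ten_heads = 0.5 ** 10

# Years for money to double at 5% interest, compounded yearly (assumption),
# next to the Rule of 72 estimate.
doubling_years = math.log(2) / math.log(1.05)
rule_of_72 = 72 / 5

print(f"linear guess:  {linear_guess:.3f}")
print(f"actual chance: {ten_heads:.6f}")
print(f"doubling time: {doubling_years:.1f} years (Rule of 72: {rule_of_72:.1f})")
```

The linear guess is off by a factor of about 50, which is the kind of miss the rest of the article is about.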

## Problem 2: Humans are a tad bit selfish

Take a look at the news. Notice how much of the negative news is the result of acting without considering others. I’m an optimist and *do* have hope for mankind, but that’s a separate discussion :).

In a room of 23, do you think of the 22 comparisons where **your** birthday is being compared against someone else’s? Probably.

Do you think of the **231** comparisons where someone who is not you is being checked against someone else who is not you? Do you realize there are so many? Probably not.

The fact that we neglect the **10 times as many** comparisons that don’t include us helps us see why the “paradox” can happen.
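If you want to check those counts yourself, a quick sketch using Python's standard `math.comb` does it (the variable names are mine, not part of the original article):

```python
from math import comb

people = 23

with_you = people - 1              # your birthday against each other person
without_you = comb(people - 1, 2)  # pairs drawn from the other 22 people
all_pairs = comb(people, 2)        # every pair in the room

print(with_you, without_you, all_pairs)
print(without_you / with_you)      # how many times more comparisons skip you
```

The two groups add up to every pair in the room, and the comparisons that leave you out outnumber yours by roughly ten to one.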

## Ok, fine, humans are awful: Show me the math!

The question: What are the chances that two people share a birthday in a group of 23?

Sure, we could list the pairs and count all the ways they could match. But that’s hard: there could be 1, 2, 3 or even 23 matches!

It’s like asking “What’s the chance of getting one or more heads in 23 coin flips?” There are so many possibilities: heads on the first throw, or the 3rd, or the last, or the 1st and 3rd, the 2nd and 21st, and so on.

How do we solve the coin problem? Flip it around (Get it? Get it?). Rather than counting every way to get heads, **find the chance of getting all tails, our “problem scenario”**.

If there’s a 1% chance of getting all tails (more like .5^23 but work with me here), there’s a 99% chance of having **at least one head**. I don’t know if it’s 1 head, or 2, or 15 or 23: we got heads, and that’s what matters. If we subtract the chance of a problem scenario from 1 we are left with the probability of a good scenario.
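Here is a small sketch of that trick, using 10 flips instead of 23 so every outcome can be listed (the smaller number is my choice, purely to keep the enumeration short). It counts the "hard way" and compares it with the one-line complement.

```python
from itertools import product

flips = 10  # small enough to list every outcome (illustrative choice)

# The hard way: enumerate all 2**10 sequences and count those with a head.
outcomes = list(product("HT", repeat=flips))
with_a_head = sum(1 for seq in outcomes if "H" in seq)
direct = with_a_head / len(outcomes)

# The easy way: one minus the chance of the single all-tails sequence.
complement = 1 - 0.5 ** flips

print(direct, complement)
```

Both routes give the same number, but the second one never has to care how many heads showed up.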

The same principle applies for birthdays. Instead of finding all the ways we match, **find the chance that everyone is different, the “problem scenario”**. We then take the opposite probability and get the chance of a match. It may be 1 match, or 2, or 20, but somebody matched, which is what we need to find.

## Explanation: Counting Pairs

With 23 people we have 253 pairs:

$$\binom{23}{2} = \frac{23 \cdot 22}{2} = 253$$

(Brush up on combinations and permutations if you like).

The chance of 2 people having different birthdays is:

$$1 - \frac{1}{365} = \frac{364}{365} = .997260$$

Makes sense, right? There’s 364 out of 365 birthdays that are “OK”.

Having all **253 pairs** be different is like getting heads 253 times in a row (well, sort-of: let’s assume birthdays are independent). We use exponents to find the probability:

$$\left(\frac{364}{365}\right)^{253} = .4995$$

99.7260% is really close to one, but when you multiply it by itself a few hundred times, it shrinks. Really fast.

The chance that we have a match is: 1 – 49.95% = 50.05%, or just over half! If you want to find the probability of a match for any number of people n the formula is:

$$p(n) = 1 - \left(\frac{364}{365}\right)^{C(n,2)} = 1 - \left(\frac{364}{365}\right)^{n(n-1)/2}$$
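The pair-counting estimate is easy to try for other group sizes. This is a minimal sketch under the same independence assumption as the text; the function name `p_match_pairs` and the `days` parameter are made up for illustration.

```python
from math import comb

def p_match_pairs(n, days=365):
    """Chance of at least one shared birthday, treating every pair as independent."""
    pairs = comb(n, 2)
    p_all_different = ((days - 1) / days) ** pairs
    return 1 - p_all_different

for n in (10, 23, 50, 75):
    print(n, round(p_match_pairs(n), 4))
```

Keep in mind this is the approximation, not the exact figure; Appendix A covers the difference.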

## Interactive Example

I didn’t believe we needed only 23 people. The math works out, but is it real?

You bet. Try the example below: Pick a number of items (365), a number of people (23) and run a few trials. You’ll see the theoretical match and your actual match as you run your trials. Go ahead, click the button (or see the full page).

As you run more and more trials (keep clicking!) the actual probability should approach the theoretical one.
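If the interactive page isn't handy, the same experiment can be roughly reproduced offline. The sketch below is my own stand-in, not the code behind the original example: the function names, the default of 20,000 trials and the fixed seed are all assumptions, and the "theoretical" value uses the exact product worked out in Appendix A.

```python
import random

def has_match(people, items, rng):
    """One trial: does any item get picked twice?"""
    picks = [rng.randrange(items) for _ in range(people)]
    return len(set(picks)) < people

def simulate(people=23, items=365, trials=20_000, seed=1):
    """Fraction of trials with at least one match (seeded for repeatability)."""
    rng = random.Random(seed)
    hits = sum(has_match(people, items, rng) for _ in range(trials))
    return hits / trials

def theoretical(people, items):
    """Exact match chance: 1 minus the product of (items - k) / items."""
    p_different = 1.0
    for k in range(people):
        p_different *= (items - k) / items
    return 1 - p_different

print("actual:     ", simulate())
print("theoretical:", theoretical(23, 365))
```

Raising `trials` plays the role of clicking the button more often: the actual figure should settle closer to the theoretical one.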

## Examples and Takeaways

Here are a few lessons from the birthday paradox:

- **sqrt(n)** is roughly the number you need to have a 50% chance of a match with n items. sqrt(365) is about 20. This comes into play in cryptography for the birthday attack.
- Even though there are 2^{128} (1e38) GUIDs, we only have 2^{64} (1e19) to use up before a 50% chance of collision. And 50% is really, really high.
- You only need 13 people picking letters of the alphabet to have 95% chance of a match. Try it above (people = 13, items = 26).
- Exponential growth rapidly decreases the chance of picking unique items (aka it increases the chances of a match). Remember: exponents are non-intuitive and humans are selfish!
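To see how rough the "sqrt(n)" rule of thumb is, here is a sketch that plugs it into the exponential approximation from Appendix B. The function name is hypothetical, and GUIDs are treated as plain random 128-bit numbers, which is a simplification.

```python
import math

def p_match_approx(n, items):
    """Approximate match chance, 1 - e^(-n^2 / (2 * items)), as in Appendix B."""
    return 1 - math.exp(-n * n / (2 * items))

for label, items in (("birthdays", 365), ("GUIDs", 2 ** 128)):
    root = math.sqrt(items)
    at_root = p_match_approx(root, items)
    at_picky = p_match_approx(1.177 * root, items)  # the "17% more" version
    print(f"{label}: sqrt -> {at_root:.3f}, 1.177 * sqrt -> {at_picky:.3f}")
```

Under this approximation, exactly sqrt(n) items lands a bit under 40%, and the extra 17% is what brings it up to an even 50%. Either way the order of magnitude is the takeaway.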

After thinking about it a lot, the birthday paradox finally clicks with me. But I still check out the interactive example just to make sure.

## Appendix A: Repeated Multiplication Explanation (Geeky Math Alert!)

Remember how we assumed birthdays are independent? Well, they aren’t.

If Person 1 and Person 3 match, and Person 3 and 5 match, we know that 1 and 5 match also. The outcome of 1 and 5 depends on their results with 3, which means the results aren’t an independent 1/365 chance (in our case, it’s a 100% chance of a match).

When counting pairs we did math as if birthdays were like independent coin flips, and multiplied probabilities. This assumption isn’t strictly true but it’s “good enough” for a small number of people (23) compared to the sample size (365). It’s unlikely to have multiple people match and screw up the independence, so it’s a good approximation.

It’s unlikely, but it can happen. Let’s figure out the real chances of each person picking a different number:

- The first person has a 100% chance of a unique number (of course)
- The second has a (1 – 1/365) chance (all but 1 number from the 365)
- The third has a (1 – 2/365) chance (all but 2 numbers)
- The 23rd has a (1 – 22/365) (all but 22 numbers)

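The multiplication looks pretty ugly:

$$p(\text{different}) = 1 \cdot \left(1 - \frac{1}{365}\right) \cdot \left(1 - \frac{2}{365}\right) \cdots \left(1 - \frac{22}{365}\right)$$

But there’s a shortcut we can take. When x is close to 0, a coarse first-order Taylor approximation for e^{x} is:

$$e^x \approx 1 + x$$

so

$$1 - \frac{1}{365} \approx e^{-1/365}$$

Using our handy shortcut we can rewrite the big equation to:

$$p(\text{different}) \approx 1 \cdot e^{-1/365} \cdot e^{-2/365} \cdots e^{-22/365} = e^{-(1 + 2 + \cdots + 22)/365}$$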

But we remember that adding the numbers 1 to n = n(n + 1)/2. Don’t confuse this with n(n-1)/2, which is C(n,2) or the number of pairs of n items. They look almost the same!

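Adding 1 to 22 is (22 * 23)/2 so we get:

$$p(\text{different}) \approx e^{-(22 \cdot 23)/(2 \cdot 365)} = .499998$$

$$p(\text{match}) = 1 - p(\text{different}) \approx .500002$$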

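Phew. This approximation is very close and good enough for government work, as they say. If you simplify the formula a bit (treating n(n-1) as roughly n^2) and swap in *n* for 23 you get:

$$p(\text{different}) \approx e^{-n^2/(2 \cdot 365)}$$

and

$$p(\text{match}) = 1 - p(\text{different}) \approx 1 - e^{-n^2/(2 \cdot 365)}$$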
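To get a feel for how much each shortcut costs, this sketch puts the exact product next to the two exponential approximations. The function names are mine; `approx_sum` keeps the 1 + 2 + … + (n-1) sum, while `approx_square` uses the further simplification to n^2.

```python
import math

def exact_different(n, days=365):
    """Exact chance that n people all differ: product of (1 - k/days)."""
    p = 1.0
    for k in range(n):
        p *= 1 - k / days
    return p

def approx_sum(n, days=365):
    """e^(-(1 + 2 + ... + (n-1)) / days), i.e. e^(-n(n-1) / (2 * days))."""
    return math.exp(-n * (n - 1) / (2 * days))

def approx_square(n, days=365):
    """The simplified version, e^(-n^2 / (2 * days))."""
    return math.exp(-n * n / (2 * days))

for n in (10, 23, 41, 75):
    print(n, round(exact_different(n), 4),
          round(approx_sum(n), 4), round(approx_square(n), 4))
```

For 23 people the three values are roughly .4927, .5000 and .4845, so each step of simplification moves the answer by about a percentage point.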

## Appendix B: The General Birthday Formula

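Let’s generalize the formula to picking *n* people from *T* total items (instead of 365):

$$p(\text{different}) \approx e^{-n^2/(2T)}$$

If we choose a probability (like 50% chance of a match) and solve for *n*:

$$.5 = e^{-n^2/(2T)}$$

$$n^2 = -2 \ln(.5) \cdot T$$

$$n = \sqrt{-2 \ln(.5)} \cdot \sqrt{T} \approx 1.177 \sqrt{T}$$

Voila! If you take sqrt(T) items (17% more if you want to be picky) then you have about a 50-50 chance of getting a match. If you plug in other numbers you can solve for other probabilities:

$$n \approx \sqrt{-2 \ln(1 - m)} \cdot \sqrt{T}$$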

Remember that m is the *desired chance of a match* (it’s easy to get confused, I did it myself). If you want a 90% chance of matching birthdays, plug m=90% and T=365 into the equation and see that you need 41 people.
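Here is the solved formula as a small helper, in case you'd rather not punch it into a calculator. The name `people_needed` is hypothetical, and the result is only as good as the n^2 approximation it comes from.

```python
import math

def people_needed(m, items):
    """Approximate group size for a desired match chance m (0 < m < 1)."""
    return math.sqrt(-2 * math.log(1 - m)) * math.sqrt(items)

for m in (0.5, 0.9, 0.99):
    n = people_needed(m, 365)
    print(f"{m:.0%} chance of a match: about {n:.1f} people")
```

Round up to a whole person when you read the output: a fraction of a person can't share a birthday.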

Wikipedia has even more details to satisfy your inner nerd. Go forth and enjoy.

## Appendix C: Try it out!

Plug in your own numbers into the below:
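If you don't have the calculator in front of you, a rough stand-in in Python might look like this. It is a sketch of my own, not the page's original code: `match_probability` is a made-up name, and it uses the exact product from Appendix A rather than the exponential shortcut.

```python
def match_probability(people, items):
    """Exact chance that at least two of `people` pick the same one of `items`."""
    if people > items:
        return 1.0  # more people than items forces a match (pigeonhole)
    p_different = 1.0
    for k in range(people):
        p_different *= (items - k) / items
    return 1 - p_different

print(match_probability(23, 365))   # the classic birthday case
print(match_probability(75, 365))   # the "almost guaranteed" case
print(match_probability(366, 365))  # a match is certain
```

Because it multiplies the exact fractions, a match only becomes certain once there are more people than items.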

The math here is actually wrong. The chances of individual pairs are not independent. Your math would work if you take each pair and have them name a random number between 1 and 365.

With this math, taking a group of 365 people still results in a non-zero chance that they all have different birthdays.

Thanks for the info, you’re right. I did some more digging (good paper here) and birthdays aren’t mutually independent.

If Person 1 = Person 3, and Person 3 = Person 5, there isn’t an independent event that Person 1 = Person 5. The probability of 1 matching 5 has already been determined by the other statements.

From what I was able to gather, this is only a problem if there are existing overlapping pairs. For a small n relative to the number of outcomes (365), it’s unlikely to have multiple matches that affect the probability, so assuming independence may be ok for computing approximations.

The last formula is incorrect, it should be:

n ~ sqrt(-2 ln(1-p)) sqrt(T)

or else you are finding the probability to miss.

Thanks for the tip! I fixed up the article to use p(different) and p(match), which is much more clear.

The “take-away lesson” about GUIDs is wrong. GUIDs are (theoretically) guaranteed to be globally unique, because they include such things as the MAC address of your network card (something which is globally unique until some cheap NIC manufacturer starts recycling them) and the current time.

The catch is that because of the time factor, the current GUID algorithm won’t last forever. We will run out in a couple of centuries.

Hi, that’s a good point about MAC addresses. However, if you consider GUIDs as just a giant random number (for the purposes of the exercise), you are looking for how many “items” out of a pool of 2^128 you can distribute before having a 50% chance of collision.

For the birthday paradox, it’s about 23 items (of a pool of 365) before a 50% chance of collision. For GUIDs, it will be roughly 2^64 items before a 50% chance of collision.

There’s a bit more information here:

http://en.wikipedia.org/wiki/UUID

Hope this helps,

-Kalid

Can the math in the birthday paradox be applied to the Pick3 lottery?

Hi Allan, I’m not too familiar with the rules of Pick3, but I’ll take a shot.

The birthday paradox helps find the chance that any two random numbers will “collide” in a set.

In Pick3, you don’t really care if two guesses collide… you want the guess to collide with the winning number. In this case, two losing tickets that both guessed 123 (when the real answer was 999) isn’t helpful.

I may be missing something though!

Hey, great blog.

“A coarse first-order Taylor approximation for e^x is: e^x ≈ 1 + x”

that’s just valid if x is far less than 1

I am doin a science fair experiment on this i need help–and i need to know if the math is over my head??!!

@nt: Thanks for the tip, I updated the article to make that more clear.

@Ashton: Hi Ashton, you might want to ask your math teacher to see if you’ve covered the necessary topics in class. You’ll probably need statistics and combinatorics.

hello kalid,

i read a few of your articles and think they are freaking awesome.

thanks and keep up the good work.

Hi Zhao, thanks for the comment! I’ll try to keep cranking out the posts :).

Heyy ;; i have no clue how to do this!

I think that the math behind this birthday paradox is wrong..

The chance of two people having same birthdays is 1/365 = 0.0027397

therefore p(n)= 0.0027397 ^C(n,2)

if we take an example of 23 people

we get p(23)= 0.0027397 ^ 253 ~=0

so how is it possible??

Hi, you’re correct 1/365 is the chance of 2 people having the same birthday. However, (1/365)^253 would be the chance of 253 people having the *same* birthday! (Which, as you see, is pretty close to zero).

For this problem, it’s important not to mix up 1/365 (the chance of 1 collision) and 364/365 (the chance of no collision). We first find the chance that somehow, everyone manages to be different:

p(23 people have different birthdays) = (364/365)^253

If there is a 40% chance that everyone is different, there is 1-40% = 60% chance that there was an overlap somewhere. Hope this helps. (Technically, we are assuming independent events but that subtlety is not important for the main point).

hi,

(364/365)^253 means that 253 people have different birthdays

when you check this for 366 people , there is a >=100% chance for the birthday paradox.

but when you use this formula we get the answer as 1 – 2.6 * 10^-80 which is less than 1

why is it so??

AND I have never seen two people having the same birthday in my group, which has more than 23 people. This cannot be a coincidence!!!

I still doubt that there is a 50% chance of people having the same birthday

Hi, when you make the probability like (364/365)^253, you are assuming independent events. What this means is that each comparison is “fresh”, with no memory of the past. It would be like having 2 people pick the same number out of 365, and choosing a different number each time.

This approximation makes the math easier, and is ok for small values. If you want the actual %, take a look at Appendix A.

Yep, the paradox seems strange, doesn’t it? Take a look at this page and run some experiments on your own to see:

http://betterexplained.com/examples/birthday/birthday.html

As you click “run trial”, you will see the actual match percentage for 23 people approach 50%, which is the predicted one. Hope this helps.

the math for the birthday paradox is in fact quite simple, the “problem scenario” probability is in fact

364/365 times 363/365 times … times (364-22)/365

you should think like this.

-person one chose a day of the year as a birthday

-person two chose a day of the year as its birthday, BUT DIFFERENT than person one’s choice.

-so on

-person 23 does the same BUT DIFFERENT than previous people choices.

This is exactly what I wrote above in probabilities

Oh yeah,… sorry the last fraction is (365-22)/365

bye

Yep, that’s right. Sometimes that multiplication can be long to do out — see Appendix A for a shortcut.

Thanks for this…im gonna use this as an idea for science fair!

Testing to see if the Birthday Paradox holds true.

23 in a room, 50% chance two will match!

Can’t wait!

Sounds great Brittany! And if you have 75 people at your fair, you’re almost guaranteed to have a match :).

It’s funny. There are actually two birthday paradoxes. The other comes from logic and is actually, according to Quine, a veridical paradox, where it appears to be paradoxical, yet is proven true anyway: the fact that someone turns 7 when they are twenty-eight years old (born Feb. 29), much like this birthday paradox.

What is interesting is that the two overlap. So to properly treat the birthday paradox (your version) you would have to take this into account.

So a very interesting treatment would be: what happens to the probability of sharing a birthday when you take into account feb 29, twins, triplets, etc, the fact (i believe) that there are higher frequencies of babies born during certain times of the year than others.

I might work this out, if asked, but I don’t think it would work out to 50% out of 23. It would be interesting to see how close it was though.

thx 4 the info it was confusing but really good, im going 2 use this 4 my science fair project

Does the dependency matter really at all?? I have just read it once, so maybe I don’t get it yet, but it seems you are just looking for at least 1 match?

50/50 chance of at least one match? If that is the case why would the dependency matter?

It seems since you are looking at each individual group at a time, that each event would be independent from the rest. Therefore looking at each group separately each group has a 1/365th possibility of matching?

hmm

I don’t know

I have a question: 6 people, one movie being advertized…3 people having the same birthday…and same birthday show on advertisement at the same time. What would the ‘chances’ be? This was an actual event.

I asked a question on 4-1-09 about the ‘birthday paradox’ and an actual event. The reason I would like to know the ‘odds’ of that happening, is because it was one event out of four similar events. Any suggestions on where I can find some answer to the ‘odds’ other than here?

all my brothers and sisters(not by both same parents)have the birthday of 2or 16

The explanations given are all approximations, in order to get an exact result you follow the start to Appendix A, but don’t attempt to simplify with e^x. The solution is actually fairly simple, for n possibilities (days in the year) and k events (people at the party) we get a probability of:

1 – (P(n-1,k-1)/(n^(k-1))).

Where P(n,k) is the number of ways to pick k elements from a set of n, or n!/(n-k)!.

This will give an exact solution, the probability of finding two people with the same birthday from a crowd of 23 is more accurately: 50.7297234%

I hope that this makes sense, if it doesn’t, look at the page on combinatorics and/or think about the fact that (1-(j/n)) = ((n-j)/n) with reference to Appendix A.

Aren’t there 366 possible birthdays? (feb 29)

this is really interesting!! im doing a math project on this !! nice topic!!

@angelina: Awesome, glad you liked it!

the way i intuitively see the 50% is like this,

imagine throwing 23 point blobs of paint at a calendar on the wall. Then move all the dates that hit into a tidy ~5×5 square in the corner.

Now throw another 23 blobs of paint at the wall.

To me, it is almost inconceivable that no paint blobs will now touch the dates in the 23 blob square in the corner (50/50 maybe)

I got it after third heading….HAHAHAHAHAHA!!!

Thank you.

Well detailed and nicely structured guide for a really misinterpreted problem.

great explanation. it’s helping. i’m doing a math project about diehard randomness test…. can you help me to understand the test that is called birthday spacing test?

Your formula P(different) = e to the power of – (22*23/2*365) is incorrect. If you punch that into a TI-83 Plus, you’ll get zero as an answer. You need 22*23 in its own bracket and the same for the 2*365. The division sign would stay outside of the two brackets, in the middle. It should really look like this: P(different) = e to the power of – (22*23)/(2*365). It took me forever to understand what was wrong with the equations until I finally clicked ‘very close’ and saw the other calculations. Haha. Hope this helps anyone who was as confused as I was. I thought my TI-83 was broken! (:

Umm, I’m not a mathematician… so please excuse me if this is a stupid comment. I understand the principle behind the calculations. I even agree that they are correct. However, one thing which seems to me to be incorrect is the assumption that birthdays are perfectly evenly distributed throughout the year. An equal weight (likelihood) is being assigned to each day of the year. I think in reality that there are far more birthdays at certain times of the year, and therefore on certain days of the year (9 months after xmas, valentines day, etc.) I don’t see anywhere where this is being included into the calculations. Can you explain? Thanks.

@Caro: Great question! Yes, you are absolutely right — we currently assume that birthdays are distributed evenly. To simplify the problem, we ignore the possibility that birthdays could have a certain spread — realistically, it may be slightly more than 23 birthdays to account for this. But, I doubt the real distribution is very much different from the ideal one (certain holidays only celebrated in certain countries, etc.) so it might all average out reasonably. But that’s a great point to bring up.

first i got a bit confused

dat wat d helll is birthday paradox

but after reading dis

it is damn easy

thank U

very much

In our retirement village we have a birthday book, which contains about 80 names. The birthdays are read out each month to a gathering of about 25 people. If a certain person is there the same time as me,we have a match. Otherwise. NO

Please stop confusing people. Let’s stop the confusion all over the world with this annoyingly wrong principle. I don’t mean computationally wrong. I mean, it is wrong to call it the birthday principle. It is a number principle with 365 set numbers.

This problem has been confusing people for the longest time, because no one will explain that it does not do what people think it is supposed to do. Which is calculate the odds that 365 people in a room will find someone else with their birthday.

The problem itself is actually very easy to understand. Even I can understand it and I never learned any advanced math. The equation is cheating. It has nothing to do with any applicable birthdays. There is no reason to delete each match after it is made.

This is not a paradox. This is a simple math problem, and its title confuses people into thinking that something impossible is happening, when it’s not, they are just being confused by an incorrectly named title of a principle.

Old thread, but still interesting. Here’s a simpler way of doing it – look at your Facebook birthdays, how many shared birthdays are there?

Indeed, export your friends birthdays, pick a sample of 23, and see if they match up – quite surprising!

LOL! At first I thought about my classroom, and instinct said how unlikely it was that two people had the same birthday, and then I realized we had a set of twins…

@Kat: Hah, an even easier way to see it in action

Your equation right after you mention “the multiplication looks pretty ugly” looks like it could be computed using factorial(!) notation, which many scientific calculators have:

1*(1-1/365)*(1-2/365)*…*(1-22/365)=

1*(364/365)*(363/365)*…*(343/365)=

365!/{(365-23)!*365^23}

But 365! is likely too big for many calculators to handle.

@Andy: Great point — and yep, probably much too large for normal calculators.

@gavin I went ahead and created a Facebook app to show the Birthday Paradox with your friends: http://apps.facebook.com/thebirthdayparadox/

…and then sometimes I put too much egg white into the batter….

oh dang it!! Wrong website!!

On average, Facebook members have befriended 130 other Facebook members, which means there is nearly 100% chance that every average member of Facebook shares a birthday with a person on his/her friend list. If any staff member from Facebook is listening, I’d like to know if the outcome matches the theory!

@Joe: Actually, in this case it means among your 130 friends it’s almost a 100% chance that two of them share a birthday (not necessarily with you though! Friend A and friend B could have a birthday in common).

Why doesn’t the following work :

We start with the first person in the group. The probability of another person in the group with the same birthday is 22/365 (since the probability of any one person having the same birthday is 1/365 and these are independent probabilities). Then we go to the next person. There is 21/365 chance of finding another person in the group with the same birthday. The next one is 20/365 and so on. And since these are all independent of each other we can add the probabilities, which gives us 253/365. This is the probability of finding 2 people in the group with the same bday.

What am I missing here?

Thanks!

@sonny: You can’t add the probabilities :). By that reasoning, if we had 30 people in the group, the chance would be (1 + 2 + … + 29) / 365 = 435 / 365, which is greater than 1. It is the right idea to consider each pair, though.

Thanks Khalid. So is there a way to solve this without using the ‘negative’.. that is not by calculating the probability of someone else in the group not having the same bday? Do it directly instead?

@sonny: Great question — I don’t think my probability knowledge is strong enough :). The issue is you need to enumerate every possible type of collision: 1 with 3, 1 and 2 with 3, 1 and 3 and 14… all of which are “problem scenarios”. It’s a bit like writing a spellcheck where you keep track of the possible typos vs. having the correct word and seeing if what you wrote is different from that :).

This is great, you’re the bomb man, how did you figure this out, science fair project here we go

@Chris: Glad you liked it.

I’m no mathematician but I am very intrigued by it. I have come up with a simple answer for this problem for those who think in a way I do. I start off by assuming that there are on average 30 days in each month, so imagine a calendar with 30 days. Don’t imagine the specific days, but when you think that each month has the same numbers it actually makes more sense. So instead of writing all the math, think in normal ratio terms. If you get 23 people in a room with the same birth number, not month, then you know you have about a 23 out of 30 chance. Now that’s pretty high. Well, there’s only 12 months and those chances slim down a bit, but anyway that’s my quick thought on it.

Hi, Kalid. I loved this post. Very interesting stuff!

I work with someone who was born on the same day in the same year in the same state only a few hours away (opposite coast from current residence) and only 2–3 hours apart. What are the chances?

@Jamal: Interesting way to think about it — breaking it down by birth “day” and then birth “month” (might be easier to see how common it is).

@roy: Thanks, glad you liked it! Wow… there should be a name for that, virtual twin :).

Thank You for the awesome facts!!! I love it. Really helped with my algebra project xD.

@Kat: Thanks!

No problem. I totally understand this and Im only in 7th grade! woo hoo. But yea. Im gonna recommend to my friends some of them are doing the birthday paradox project too Thanks again Kalid!

@Kat: Whoa, that’s awesome you’re getting this so early, you’ve got quite a head start! More than welcome!

When I was in 7th grade my science teacher bet that there weren’t 2 people in our class of about 30 people who had the same birthday. We laughed our butts off at him because right away we had a set of twins in the class. Even once we removed them we all said our birthdays and we found the set of twins, me and a guy who all had the same birthday (Aug 3). There was also another pair of unrelated people who had the same birthday.

The math is slightly flawed in the respect that there are actually 366 days/year during leap years. Very interesting though.. -d

Went through entire grade school without anyone sharing my same birthday?

elementary K-6

secondary 7 classes twice a year for another 6 years.

Someone explain the chances of that happening?

@Don: True :). Might need to make slight adjustments or plug in 365.25 into the equation =)

@Kristina: Yep, with 30 people it starts to get pretty likely there’d be an overlap! Pretty amazing.

@Nathan: The trick to remember is the paradox is about everyone else not getting overlaps either (i.e. Billy and Joey could have an overlap, and it would count).

Hi

I need to calculate the probability of concurrency of 3 or more accident which are the same in the particular period. Is there any way to do this?

Indeed “only consider the scenarios we’re involved in”. Thanks for the reminder.

OK, I’m awful at math. I’m writing a piece right now about Olympians sharing the same birthday. Every day I log on to the official Olympic website and check the birthdays. There are roughly 10,960 athletes competing. From what my untrained eye can tell, it averages to about 30 birthdays per day. Can anyone help me out and explain the math a little better for me? Writing is my thing. Math makes my brain hurt. Thanks!

Hi Chris! Yep, for about 10,960 athletes you’d expect 10960 / 365 ~ 30 birthdays per day. In a room (or specific event), you can use the formula to figure out the chance of at least two people having a common birthday. If a track heat has 12 people, there’s a 16% chance of two people having the same birthday (see the formula at the bottom, but it’s 1 – e^(-12*11/(2 * 365)).

Thanks Kalid! That’s perfect.

Here’s what I came up with if anyone is interested. Just having some fun with numbers and the Olympics.

If I got something wrong, let me know.

http://tucsoncitizen.com/bear-down-and-blog/2012/08/02/happy-birthday-from-london-breaking-down-olympic-birthdays/

In high school I recall my teacher explaining this paradox. She said theoretically if there were 23 students in our class, the probability of two or more students having the same birthday is 50 percent. So my question is, in a class where everyone is born in the same year, would this reduce the probability?

@Abdul: Great question. Offhand, I don’t think people being born in the same year should change things.

Solution below is much simpler. Just find probability of each one having diff B’day and then subtract from 1 to get 0.507297234 answer. Any thoughts?

364/365*363/365*362/365*361/365*360/365*359/365*358/365*357/365*356/365*355/365*354/365*353/365*352/365*351/365*350/365*349/365*348/365*347/365*346/365*345/365*344/365*343/365=0.492702766

1 -0.492702766 = 0.507297234

@Shambhu: Yep, that works! But it’s a pain to compute manually. The formula in Appendix A gives a shortcut vs. having to do all those 23 multiplications out.

I think you discount the formula at the beginning of Appendix A (1-1*(1-1/365)*(1-2/365)*…) too much by jumping immediately into an approximate shortcut. Let’s see what happens if we simplify that equation first.

The denominator of the equation is simple to work out – it’s 365 multiplied by itself as many times as there are people. For x people, the denominator will be 365^x.

The numerator also has a familiar pattern. For x people, it will be 365*364*363*…*(365-x+1).

So, we have a pattern something like a factorial, but that stops after x numbers. How do we handle that? Yep, it’s our old friend the permutation formula!

So, the short form of the formula would be P(365,x)/365^x. Writing the long form of the formula, we end up with: 365!/(((365-x)!)365^x)

Prefer to think of it with combinations instead of permutations? Permutations are just combinations with redundancies taken into account to focus on particular orders of events, or mathematically: P(365,x)=(C(365,x))x!

This makes the full formula: ((C(365,x))x!)/365^x

Yes the formula you write out at the start of Appendix A looks bad, but it simplifies quickly to a clear and understandable form. Examining it several ways in terms of combinations and permutations helps make it clearer.

As I wrote the formula above, that is, of course, the formula for no 2 people sharing a birthday.

To find the probability of at least 2 people sharing a birthday, as mentioned, we still need to subtract all that from 1.

i’m doing the project too

Hey.

I was wanting to take leap day into account and so I figured I should use 365.25

And by the way my sample size is 50 people

Would this work…

50*49=2450

2450/2=1225

so 1225 combinations

364.5/365.5=99.7264022%

so 99.7264022 is the chance of a combination not matching

.997264022^1225=3.48%

so 3.48 is the chance that all 1225 combinations don’t match

1-3.48%=96.5131327%

so 96.5131327 is the chance of at least one of them matching

That is how I worked it out and I’m not sure if it is correct so help me out please.

Hey.

I just realized that in my math I used 364.5 and 365.5 instead of 364.25 and 365.25 and that messed it up, but if I change that, did I have the right idea?

Haha. Oops.

Thanks so much for the explanation. I apologize–I’m confused on one point which indicates: “…we could list the pairs and count all the ways they could match. But that’s hard: there could be 1, 2, 3 or even 23 matches!”

I don’t understand why wouldn’t the limit on matches be 22? The “target” can’t match with himself, (or can he?) Sorry, I’m sure I’m missing something obvious.

Something doesn’t add up here. The first calculator shows that the birthday example with 365 persons would result in a 100% match, meaning at least 2 persons should have the same birthday. But it’s possible that all 365 persons have different birthdays (the first person born on January 1, the second on January 2 and the last on December 31).

Hi Mark, great catch. Yes, that should be 22 matches, appreciate the correction.

Hi Shark, the equation is a probabilistic argument. In fact, you hit “100%” (i.e., the limit of the JavaScript programming language) at around 90 people. At that point, the difference between 99.9999999… and 100.0 is too small to represent on the computer!

So it is theoretically possible to have 365 random people with 365 different birthdays. Practically, at around 100 randomly chosen people, you are virtually guaranteed to have a match (i.e., the probability of not having one is tiny, too small to be shown on a computer :)).
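Here is a small sketch of that rounding effect, assuming 365 equally likely birthdays and ordinary double-precision floats (the function name is illustrative). The chance that everyone is different never reaches zero for 365 people or fewer, but subtracting a small enough number from 1 leaves exactly 1.0.

```python
def p_all_different(people, days=365):
    """Probability that everyone in the room has a distinct birthday."""
    p = 1.0
    for k in range(people):
        p *= (days - k) / days   # person k+1 must avoid the k birthdays already taken
    return p

for people in (90, 100, 365):
    q = p_all_different(people)
    print(people, q, 1 - q)
```

For 365 people `q` is still a positive number, yet `1 - q` prints as `1.0` because the difference is far below what a float can hold. A calculator that shows only a few decimal places will display 100% much sooner than that.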

Khalid-thanks for the clarification and the website–it’s a remarkable service and resource.

This is the best webpage on the Birthday Paradox that I’ve found!

We are doing an elementary school “science” project. We picked the Birthday Paradox and did 40 trials (using mostly the internet) and came out with the expected (though counter-intuitive) result of about 50% pairs.

Now we’ve gotten to writing the “conclusion” of the report and realize that the answer involves apparently college-level math! Question: is there a simplified way to explain the paradox, at least to hint at why it works, that a smart elementary school student could understand?

Thanks!

Hi Steve, great question. I might try this: put the kids in groups of 10 and have them guess (before they start) how many handshakes they need so everyone in the group shakes hands. They might guess 10 (or 9, since that’s how many THEY need to do), but you’ll see it’s quite a large number (10*9/2 = 45). In the same way, the number of “birthdays to check” is not you against everyone else (22), but everyone by everyone (a much larger number). In rough terms, that’s why the odds are much closer to 50-50 instead of 22/365.
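The handshake count can be checked by brute force. This is a minimal sketch that simply lists every pair; `count_pairs` is a hypothetical helper name.

```python
from itertools import combinations

def count_pairs(group_size):
    """Count the distinct handshakes (unordered pairs) in a group."""
    return sum(1 for _ in combinations(range(group_size), 2))

print(count_pairs(10))   # handshakes in a group of 10
print(count_pairs(23))   # birthday comparisons in a room of 23
```

The result always equals `group_size * (group_size - 1) // 2`, which is the shortcut used in the reply above.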

I used the Birthday Paradox concept in a math project of mine, and they told me some professor objected to it cause it’s his idea. It was my research and they were my results and raw data. What do you think i should do, submit it or change it?

So if I ask 23 people to pick a number between 1 and 365, there’s a 50% chance that at least two of them would choose the same number. This is the scariest thing I’ve ever heard of.

My son used the birthday paradox for his science fair project. I do have one question. Is it possible to calculate the odds of 3 people in a group of 25 having the same birthday (month/day)? Could I do this using Appendix C, by simply changing the /2 to /3 in the R3 line {pairs = (people * (people -1)) / 2}?

My 3rd grade daughter is testing the birthday paradox for a science fair project. The math itself is a little difficult, but testing the paradox is easier to understand. We have tested 14 samples and 12 samples produced a birthday match. Wouldn’t you expect it to be closer to half the samples? Is there an optimal sample size? I saw someone mention they did 40 samples.

Hi KGW, not sure what you mean about samples. I.e., you tested 14 samples (of 23 people), and of those 14 samples, 12 had a match? Yep, in theory it should be about half, but with a relatively small population, it’s easy to skew. Also, in the real world, birthdays probably aren’t perfectly evenly distributed, and the “clumpiness” may make matches easier.

For the purposes of the paradox though, it’s still startling that such small groups have so many matches! (So I think the experiment still makes that point :)).
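To put a rough number on "easy to skew", the sketch below treats each sample as an independent group of 23 with 365 equally likely birthdays (both are assumptions about the experiment, which the comment does not spell out) and asks how often 12 or more of 14 groups would contain a match.

```python
from math import comb, perm

# Chance that one room of 23 contains a match, assuming 365 equally likely days.
p_match = 1 - perm(365, 23) / 365**23

def p_at_least(successes, trials, p):
    """Binomial tail: chance of at least `successes` hits in `trials` tries."""
    return sum(
        comb(trials, k) * p**k * (1 - p)**(trials - k)
        for k in range(successes, trials + 1)
    )

print(p_at_least(12, 14, p_match))
```

Under those assumptions the tail probability is below 1%, so 12 of 14 would be an unusual run. That hints that the real samples may differ from the idealized setup, for example larger groups or clumped birthdays.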

I just completed the following survey: I asked people in the office to pick a number between 1 and 365 inclusive. I moved around office so nobody could hear any other person’s response.

It took 20 tries before one person matched an earlier response! Interestingly, the 18th person and the 20th person I asked picked “6.” So, if I had chosen the respondents in a different order, I could have gotten the match earlier.

If you’re interested in “triples”… a room with 88 or more people has an over-50% chance of at least one birthday being held by at least three people. For a room with 733 or more people, it’s guaranteed! (Per the pigeonhole principle. Hypothetically, a room of 732 people could consist of 366 pairs of people each sharing a birthday unique to that pair. The next person to walk into the room must have a birthday belonging to one of the pairs.)
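The triple case can be counted exactly, too. The sketch below assumes equally likely days and counts the rooms with no triple: pick k days that are shared by exactly two people, let everyone else have a day to themselves, and sum over k. The function name and the counting layout are my own illustration, not something from the article.

```python
from math import factorial

def p_triple(people, days=365):
    """Probability that some birthday is shared by at least three people."""
    ways_without_triple = 0
    # k = number of days shared by exactly two people; everyone else is a single.
    for k in range(max(0, people - days), people // 2 + 1):
        singles = people - 2 * k
        unused_days = days - k - singles
        # Choose which days are pairs/singles/unused, then seat the people.
        ways_without_triple += (factorial(days) * factorial(people)) // (
            factorial(k) * factorial(singles) * factorial(unused_days) * 2**k
        )
    return 1 - ways_without_triple / days**people

for people in (23, 50, 88):
    print(people, p_triple(people))
```

With `days=366` the loop has nothing to count once `people` reaches 733, which is the pigeonhole guarantee described above.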

I’ve been trying and failing to solve the problem for uneven distributions, such as if we assume all birthdays except Leap Day are equally likely and Leap Day is 25% as likely as the rest. There doesn’t seem to exist a simple formula for it on the Net…
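There may be no tidy closed form, but uneven distributions are easy to handle numerically. A minimal sketch, assuming birthdays are independent: build up, one day at a time, the probability that j people all landed on different days. The weights list and the function name are illustrative, and the 0.25 weight for Leap Day is the assumption from the comment above.

```python
def p_shared_weighted(people, weights):
    """P(at least two of `people` share a birthday) for unevenly likely days."""
    total = sum(weights)
    # distinct[j] = probability that j people all have different birthdays,
    # counting only the days processed so far.
    distinct = [1.0] + [0.0] * people
    for w in weights:
        p = w / total
        # Go downward so each day is used by at most one person.
        for j in range(people, 0, -1):
            distinct[j] += j * p * distinct[j - 1]
    return 1 - distinct[people]

uniform = [1.0] * 365
with_leap_day = [1.0] * 365 + [0.25]   # Leap Day 25% as likely as other days

print(p_shared_weighted(23, uniform))
print(p_shared_weighted(23, with_leap_day))
```

With these weights the answer for 23 people moves only slightly below the plain 365-day figure, so the threshold of 23 stays the same.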

I always explain it with a dartboard example. Put up a dartboard with 365 squares on it, put on a blindfold, and start throwing. You can start to see that randomly hitting it will reduce the space that you can hit that has not been hit before. Humans can “see” that analogy pretty well.

This is a wonderful website. I learn a lot from this.

Somebody explain, I still don’t get it.

SAMAKSHI:

The more people there are, the more opportunities for two of those people to share a birthday. When there are 23 people in a room, the number of opportunities is SO big that the chances of a match between two people is better than half.

So about half of all 23-person rooms should be able to say “Yes, two of the people in this room share a birthday”.

We’re not interested in a specific birthday, such as January 15. We’re asking about the situation where any two people have a birthday in common.

It can help to remember that if there were 367 people in the room, then the chances we have a hit are not just very high, but are actually 100%. EVERY room with 367 or more people has two or more people sharing a birthday. That’s because the only other possibility is that there are 367 unique birthdays in that room, and there aren’t that many birthdays to go around.

What if there were 366 people? Then it’s possible they each have their own birthday (including a Leap Day birthday!), but that’s EXTREMELY unlikely. So the chances of a match are very close to 100%.

What if there were 365 people? Just a bit less. 364 people? Slightly less than that.

… and so on, down to 23. At 23 people, the chances are just over 50%. At 22, they are below 50%. At 2, they are about 1/365, or well below 1%.

The whole thing is just a graph that curves differently than you might expect.

Great explanation.

For those who continue to doubt mathematical equations (for some reason), I encourage you to go to a random number generator website such as random.org and choose the range between 1 and 365. Write the numbers down and see how many you write down before you get a match/repeated number… not long!

Where a lot of people seem to get confused and doubt the equation is that they are stuck on themselves (such as comment 73)… It is not the odds of YOU having the same number as someone else in the room, it’s the odds of anyone having the same number as anyone, which you explained well. Great site.
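If you would rather not write numbers down by hand, the same experiment can be simulated. This is a sketch that assumes uniformly random picks from 1 to 365, which is not how people choose numbers; the function names and the fixed seed are illustrative.

```python
import random

def draws_until_repeat(days=365, rng=random):
    """Draw numbers from 1..days until one repeats; return how many draws it took."""
    seen = set()
    while True:
        pick = rng.randint(1, days)
        if pick in seen:
            return len(seen) + 1   # count the draw that produced the repeat
        seen.add(pick)

def average_draws(trials=20000, days=365, seed=1):
    """Average number of draws before the first repeat, over many runs."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    return sum(draws_until_repeat(days, rng) for _ in range(trials)) / trials

print(average_draws())
```

The average comes out in the mid-twenties, which fits the experience of getting a repeat after "not long".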

Getting 10 heads in a row is actually .5^9 Because the .5 is accounting after you flipped a coin once.

Exact and easy formula is:

P(at least one shared birthday) = 1 - 364!/((365^(n-1))*(365-n)!)

= 1 - 365!/((365^n)*(365-n)!)

Where n is the number of people in the room, and 365 is used as the number of days, thus not taking into account leap years.

The formula was found by simplifying:

P = 1 - 365/365 * 364/365 * … * (365-(n-2))/365 * (365-(n-1))/365

I don’t see how the interactive example is correct. If you put 1000 items and only 142 people it says there is a 100% chance of a match. Surely this isn’t correct, for example if person 1 picked 1, person 2 picked 2, 3 picked 3 and so on up to 142 there wouldn’t be a match. Not very likely I know but possible (in fact is it not just as possible as every other selection?), so there can’t be 100% likelihood of a match. Plus there are many other combinations that wouldn’t create a matching pair.

If you keep with the birthday example all you need is 86 people for 100% chance of a matching pair, but it is “possible” for 86 people all to have different birthdays so how can it be 100%?

Sorry if I’m way off the mark, I just don’t get it.

@spope – I think it probably rounds.. so it would be 99.999102% (or something like that) so it rounds to 100%.. just a guess..

The formula can’t be exact. Using this formula, you would calculate a (though small) chance that in a room of 366 people there would be not a single mutual birthday. This can’t be correct as there are in this case more people than days in a year. Therefore the possibility to have not a single same birthday should be as zero as it could ever be.

Bo, good example. When you have 366 people the probability of everyone having different birthdays has a term which is zero. When you multiply by this zero term you get zero, meaning a probability of one that two people share the same birthday.

How come towards the end you minus it from 1?

Hi Jenny,

When you are calculating, you are working out the chance that people DON’T have the same birthday as one another. Subtracting that from 1 gives you the chance that at least two people do share a birthday.

1) 23 individuals x 22 partners = 506 couples. Divide by 2 for unique couples (i.e., Tom and Jane are the same couple as Jane and Tom), which leaves 253 couples.

2) 364/365 (the chance that one couple does NOT share a birthday) raised to the power of 253 couples = 0.4995, the chance that no couple shares a birthday. Subtracting 1 on a calculator shows -0.5005, which is only a quick way of seeing the answer; the real equation is 1 - 0.4995, giving 0.5005 (about 50.05% likely that someone has the same birthday as another person).

I approached this by figuring that there are 253 pairs and each pair has a 1/365 chance of having the same birthday. The probability that at least one pair would have the same birthday is 253*(1/365). Where is the flaw in my reasoning?

Thanks

In a room of 32 people a teacher picked a pupil at random. What is the probability of him picking someone with the same birthday as him?

dan: There are two problems with your approach.

One is that you are treating the probabilities as independent when they aren’t. Consider just three people: A, B, and C. There are three pairs out of these people (AB, AC, BC). By itself, the probability that a given pair has the same birthday is 1/365. But if AB shares a birthday, and BC shares a birthday, then the probability that AC shares a birthday is not 1/365 but simply 1. However, for the sake of a simple estimate this fact doesn’t make much difference, because with only 23 people, triple-birthdays are very rare.

The second problem is that independent probabilities don’t work that way: you can’t just add them up. The probability of rolling a one on a 6-sided die is 1/6. But the probability of rolling at least 1 one out of two dice is *not* 2/6. If it worked that way, then rolling 6 dice would give us a probability of 6/6, or 1, not only guaranteeing at least 1 one, but at least 1 of every number, since all of the numbers are equivalent here. Of course it doesn’t work that way: if I roll six dice, getting a “straight” of all the numbers is extremely unlikely, not guaranteed.

Instead, you work out the dice probabilities by asking the opposite question: What is the probability of rolling anything but a one? With two dice, there are five-times-five equally-likely ways to do this (as if the dice only had 5 sides): a two and a two, a two and a three, etc, all the way to a six and a six. And there are 36 total equally-likely possible combinations: one-one, one-two, and so on. So our answer to *this* question is 25/36. That means the answer to our original question is just 1-25/36, or 11/36.

Getting back to the birthdays, the way to determine the probability with 23 people is not to add the independent probabilities, because then you get a false guarantee of a pair at 28 people (there are 378 pairs, so you would find a probability of 378/365 for a pair, and that number can’t even be right in itself, since probabilities should never exceed 1). Instead, we do multiplication, and what we multiply is the inverse scenario: 364/365, or the probability that two people don’t share a birthday.

We multiply this number by itself 253 times (in other words, raise it to the 253rd power). The result is a number slightly *less than* 1/2: 0.4995. This means that the probability for the reverse question is slightly more than 1/2, about 0.5005.

However, this is slightly off because of the non-independence factor. The more correct way to calculate it is to build the sets of possible rooms that lack matches, and this isn’t as hard as you might think. (It’s simpler to understand than the methods shown in the blog post.)

The first person who enters the room is permitted to have any of 365 birthdays. The second person we put into it is only allowed one of 364 birthdays, because they can’t be a match. The third person is allowed one of 363 birthdays, because she can’t match either of the first two. All these possibilities multiply out, for the twenty-three people: 365 x 364 x 363 x 362 x 361… x 345 x 344 x 343.

That’s our numerator: The total set of equally-likely combinations that fulfill a principle of lacking any matched pairs. For our denominator, we need the total set of possible combinations in general, and that would be 365 x 365 x 365 x 365… twenty-three times.

The resultant number is about 0.4927 — which, as before, is just under one-half. Hence, although we have a more precise probability, we still find that the same number of people (23) is the threshold here.
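The dice part of this reply is small enough to check by listing every roll. A minimal sketch, assuming fair six-sided dice (the helper name is made up): it counts the rolls that contain the target face and compares that with the "just add them up" guess.

```python
from fractions import Fraction
from itertools import product

def p_at_least_one(face, dice):
    """Exact chance that `face` shows on at least one of `dice` fair dice."""
    rolls = list(product(range(1, 7), repeat=dice))   # every equally likely roll
    hits = sum(1 for roll in rolls if face in roll)
    return Fraction(hits, len(rolls))

print(p_at_least_one(1, 2), "versus the additive guess", Fraction(2, 6))
print(p_at_least_one(1, 6), "versus the additive guess", Fraction(6, 6))
```

Counting the rolls gives the same answer as the "opposite question" route, 1 - (5/6)^dice, and it stays below 1 however many dice you add.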

Anonymous (Ref – Comment 118)

Your example is different from the birthday paradox. Your example asks what the probability is of someone having the same birthday as a specific person…as opposed to someone having the same birthday as anyone in the room.

To answer your question…you could have 10,000 people in a room it doesn’t make a difference….the odds of picking a singular pupil at random and them having the same birthday as you is 1/365 (not halved again as some may think, as you are trying to match a pre-existing specific number)

Hope that makes sense?

-Clarkey-