Why can't we treat equations like one of Aesop's fables, with a lesson buried inside?

Take a look at the Pythagorean Theorem. It seems to be about triangles, right?

Well, sure. But is it only that? The equation:

$$a^{2} + b^{2} = c^{2}$$

has a literal interpretation: "The sides of a triangle have a specific mathematical relationship."

Ok, fine. That's some nice zoology. Stepping back, what's really happening?

*The sides of a triangle, which point in different directions, have a mathematical relationship.*

Or better yet:

*Two objects, which exist in different dimensions, can still be compared.*

Here's an analogy: Who's the better athlete, Michael Jordan or Muhammad Ali?

We shouldn't ask who has a better jump shot or uppercut -- that comparison stays within a single dimension, clearly favoring one or the other. A better comparison would encompass all dimensions: *Who was more dominant compared to the competition? Who held more championships? Who advanced the sport more?*

Ah. We need a different type of "expansive" comparison to help each component see the other side.

In the triangle case, we have an "A direction" (East/West) and a "B direction" (North/South). Instead of comparing them directly, we square them to make area. Why?

Here's an intuition. Individual directions point in a single direction within our 2d universe. But area spans *every* direction available in our universe. By converting a single direction to its square (which points in all directions), we have a common "all directions" format that can be compared. It becomes "universal area vs. universal area" and not "pointing North vs. pointing East" (aka jump shots vs. uppercuts).

Will any type of universal self-comparison work? (Squaring, cubing, etc.)

No, unfortunately. The Pythagorean Theorem is special because it shows the *specific* comparison of squaring (a^{2} + b^{2} = c^{2}) keeps a simple relationship between the whole and its parts. There's probably a relationship for cubing, but it's not as clear cut.
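Here's a quick numeric check of that claim. It's only a sketch: the 3-4-5 right triangle and the helper name `parts_match_whole` are my own illustrative choices, not part of the theorem.

```python
# Sides of a 3-4-5 right triangle (an illustrative choice).
a, b, c = 3, 4, 5

def parts_match_whole(power):
    """Return True if a^power + b^power equals c^power."""
    return a**power + b**power == c**power

squares_match = parts_match_whole(2)  # compares 9 + 16 with 25
cubes_match = parts_match_whole(3)    # compares 27 + 64 with 125

print("squares keep the simple relationship:", squares_match)
print("cubes keep the simple relationship:", cubes_match)
```

For this triangle, squaring lines up and cubing doesn't. One example isn't a proof, but it shows why squaring is the comparison worth remembering.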

**Pythagorean Theorem Fact:** The sides of a right triangle follow $a^{2} + b^{2} = c^{2}$.

**Pythagorean Theorem Wisdom:** To compare different things, find a universal way to compare them. (Specifically, square yourself to create area.)

Equations aren't so boring when you look for the moral of the story, right?

In the Pythagorean Theorem, we can imagine spinning our 1d lines into area and comparing that. Here, we can spin each side into a circle:

And yowza, the area matches up!

This is a demonstration of the theorem; the *proof* shows that area will always combine neatly like this. (Read more.)
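We can sanity-check the circle version numerically. This sketch assumes each side is used as the radius of its circle (using it as the diameter would scale every area by the same factor), and the 3-4-5 triangle is again just an example.

```python
import math

def circle_area(radius):
    """Area of a circle spun out from a line of the given length."""
    return math.pi * radius**2

a, b = 3.0, 4.0
c = math.hypot(a, b)  # hypotenuse of the right triangle

area_a = circle_area(a)
area_b = circle_area(b)
area_c = circle_area(c)

# Floating-point math, so compare with a tolerance rather than ==.
areas_match = math.isclose(area_a + area_b, area_c)
print(area_a + area_b, area_c, areas_match)
```

The factor of pi appears in all three areas, so it cancels and we are back to comparing squares.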

We can make our "self comparison" analogy more technical with vectors.

- $\vec{a}$ is the vector $(a, 0)$
- $\vec{b}$ is a vector in a different dimension, $(0, b)$
- $\vec{c}$ is a vector made from both: $\vec{c} = \vec{a} + \vec{b} = (a, 0) + (0, b) = (a, b)$

The Pythagorean Theorem says "If we compare each item to itself, the combined self-comparisons of $\vec{a}$ and $\vec{b}$ equal the self-comparison of $\vec{c}$".

`(c compared to itself) = (a compared to itself) + (b compared to itself)`

In the vector world, what's a self-comparison? A dot product with yourself.

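The Pythagorean Theorem is stating:

$$\vec{c} \cdot \vec{c} = \vec{a} \cdot \vec{a} + \vec{b} \cdot \vec{b}$$

which works because:

$$\vec{c} \cdot \vec{c} = (\vec{a} + \vec{b}) \cdot (\vec{a} + \vec{b}) = \vec{a} \cdot \vec{a} + 2 (\vec{a} \cdot \vec{b}) + \vec{b} \cdot \vec{b} = \vec{a} \cdot \vec{a} + \vec{b} \cdot \vec{b}$$

(Since $\vec{a}$ and $\vec{b}$ are perpendicular, their dot product is zero.)

And for the parts:

$$\vec{a} \cdot \vec{a} = (a, 0) \cdot (a, 0) = a^{2}$$

$$\vec{b} \cdot \vec{b} = (0, b) \cdot (0, b) = b^{2}$$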

This specific self-comparison maintains a simple relationship between the whole and its parts (addition).

A simple glance at two vectors doesn't offer a built-in way to compare them; instead, use a derived *scalar* (single number) that allows a comparison to be made.
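To make the dot-product version concrete, here's a minimal sketch in plain Python. The `dot` helper and the side lengths 3 and 4 are assumptions for illustration. A real project would likely use a numerics library instead.

```python
def dot(u, v):
    """Dot product of two equal-length vectors stored as tuples."""
    return sum(x * y for x, y in zip(u, v))

# Illustrative side lengths.
vec_a = (3, 0)  # points along the "A direction"
vec_b = (0, 4)  # points along the "B direction"
vec_c = (vec_a[0] + vec_b[0], vec_a[1] + vec_b[1])  # (3, 4)

cross_term = dot(vec_a, vec_b)  # perpendicular vectors, so this is 0
self_a = dot(vec_a, vec_a)      # a compared to itself
self_b = dot(vec_b, vec_b)      # b compared to itself
self_c = dot(vec_c, vec_c)      # c compared to itself

print(self_c, "=", self_a, "+", self_b)
```

Each self-comparison collapses a vector into a single number, and those numbers add the way the theorem promises.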

Learning isn't really about conveying information. That happens, sure, but the precondition is an environment where you can ask questions freely, risk being wrong, and update your mental model for how things work. The student's curiosity ignites a feedback loop of progress, and facts come along for the ride.

However, this all hinges on psychological safety. When people say "I hate math" they mean "I hate how math makes me feel."

And by that, they really mean "I hate being expected to do things I never understood. I feel stupid/worthless/not good enough."

It's not the Pythagorean Theorem or fractions they hate -- it's what it means if you can't do them right. Swimming doesn't work when you're so tense your hands stay closed. Neither does cramming facts into a clenched mind. (By the way, not using the Pythagorean Theorem correctly simply means your mental model needs adjusting.)

I brushed against the edge of my psychological safety in college. A poorly-taught class made me question whether I was competent at math, good enough to do it, worthy enough to continue the path I wanted. Thankfully, my overall academic experience defended me from that conclusion ("I've made it this far...") so I came to another conclusion:

*My teacher doesn't seem to care if I truly understand these ideas. I have to find what makes sense for me.*

In a psychologically safe world, you ask questions and update your misunderstandings with complete ease. Tests are things you look *forward* to: don't you want to catch the leaky roof in your house early, so you can fix it?

But that's not the norm. Getting things wrong means you're stupid, or won't pass a checkpoint, not that your mental model has a hole to plug. Education theater has us nodding along and checking the boxes until the next class.

(Pet peeve: lessons on imaginary numbers that state "negative numbers have square roots" and ignore the utter confusion this creates in a student's mind. The population of France is 67 million, negative numbers have square roots. A fact is a fact, what's hard about that? Argh!)

When seeing a lesson, I silently run through a psychological safety gut check:

- Can I imagine the teacher taking feedback from a student? When was the lesson last improved? How comfortable am I asking questions?
- How long did this topic take to be discovered historically? Did the teacher struggle when learning the idea? Have they told students what confused them most?
- Does the teacher want me to truly understand the material, not just memorize and move on?

When I personify Wikipedia, I see a hyper-literal robot that answers correctly but not helpfully. *You asked for a definition, and WikiBot3000 gave it to you. Your understanding was not my objective.*

When confusing ideas aren't acknowledged as such, I lose my trust in the teacher. Are they blind to how strange the concept is to newcomers? Are they trying to maintain a reputation as the un-befuddleable genius in the room? My math spidey sense goes haywire: what else will they gloss over?

The ELI5 (explain like I'm 5) subreddit, by contrast, fosters psychological safety by assuming you're speaking with a child. *This explanation is simplified, but in the right direction. I genuinely care if you understand it.*

Let's foster the psychological safety to ask foolish questions, acknowledge our confusion (in both teacher and student), and constantly refine our understanding. A teacher saying "Here's what confused me, and what finally made sense" goes a long way to an environment where we can actually learn.

Happy math.

Richard Feynman is one of my explanation heroes. His biography contains dozens of stories, capturing a spirit of curiosity, awe, and a desire to figure things out your own way. Feynman's approach informed how both Scott and I approach learning.

Key takeaways:

Kalid: I think the first takeaway is recognizing when things make sense, and that it’s okay for things not to make sense. Something might click, great. If it doesn’t, maybe you can resolve it then, or maybe you can write it down for later.

Everyone has their own checklist or their own requirements but you should have something… Have a standard for yourself, it’s important.

I think Feynman would stop people and ask them for a plain English example if they were explaining something and he couldn’t understand it. Whatever it is for you, having that standard is important.

Scott: I think the biggest takeaway from how Feynman did things is to be curious. That sounds really simple, but you can see (in the book) how he gets himself into situations where your intuitive response, what you’d do by reflex, is not actually what he does.

It might be his confidence or his charisma but I think a lot of it is his curiosity. He’s genuinely interested in trying to find out about things.

I feel as though this was brought up in the discussion about this curiosity. A lot of people asked “aren’t interest or curiosity just inherent qualities?”

You know, you can’t just snap your fingers and be as smart as Feynman so you shouldn’t be able to snap your fingers and be as curious. But, I actually disagree here.

I think that curiosity is something that you cultivate and it’s because a lot of the things that push us away from curiosity are these encrusted fears and aversions that we have to things from maybe negative exposures in the past. Particularly through school.

I think if you take it from the perspective that curiosity and interest is something you can cultivate to the extent that you want to tear down those barriers, I think there’s a huge benefit and possibility.

Feynman inspired our own learning techniques:

- ADEPT method - use Analogies, Diagrams, Examples, Plain-English and Technical descriptions
- Feynman technique - to learn something new, attempt to teach it in simple terms (make an ELI5 explanation for yourself or another student)

It's inspiring to read about a Nobel-prize winning expert unafraid to admit their ignorance and break difficult concepts into simpler terms (*"No, I don't understand, can you give me an example?"*). He had the humility to question his own understanding and the confidence to know he could eventually figure it out -- an approach we can all learn from.

Join Scott's book club for updates and discussions on upcoming books. Thanks again to Scott for having me on!

Here are a few mental models I use to keep multiplication and addition straight in counting problems.

Let's take a simple situation: You have 4 shirts and 8 pants, how many outfits can you make?

In essence, you are picking a spot on this grid:

Shirts and Pants exist in separate dimensions, whose area represents distinct solutions. We can pick any spot *in the grid* and we have 4 x 8 = 32 options.

Now, suppose we had 4 shirts and 8 pants and had to pick a single item to sell. Here, they lie along the same "clothing item" dimension:

We can randomly pick any point *along the line* and have 4 + 8 = 12 options.

Think "different dimensions vs. same dimension" or "grid vs. line".
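Here's the grid vs. line idea as a small sketch. The item names are made up. The only facts taken from the example are the counts: 4 shirts and 8 pants.

```python
from itertools import product

# Hypothetical item labels for illustration.
shirts = [f"shirt{i}" for i in range(1, 5)]  # 4 shirts
pants = [f"pants{i}" for i in range(1, 9)]   # 8 pants

# Different dimensions: every (shirt, pants) pair is a spot in the grid.
outfits = list(product(shirts, pants))

# Same dimension: all items laid end to end along one line.
single_items = shirts + pants

print(len(outfits), "outfits")
print(len(single_items), "single items")
```

`product` builds the grid one cell at a time, while list concatenation just extends the line.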

Another interpretation is AND (multiplication) vs. OR (addition).

Let's say we must pick one of 4 shirts AND one of 8 pants. We need both to stay out of trouble. The scenario is:

`pick among 4 shirts AND among 8 pants = 4 * 8 = 32 choices`

What if McDonald's softens their regulations and allows a shirt OR pants? (But not both -- yikes.) Then, we have:

`pick among 4 shirts OR among 8 pants = 4 + 8 = 12 choices`

Writing out the scenario is often easier to think through, especially with numerous dimensions (shirts, pants, hats, shoes).
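With more dimensions, the written-out scenario translates directly into code. In this sketch the hat and shoe counts are invented for illustration. Only the 4 shirts and 8 pants come from the example above. Note that `math.prod` needs Python 3.8 or later.

```python
import math

# Counts per clothing category; hats and shoes are hypothetical.
wardrobe = {"shirts": 4, "pants": 8, "hats": 3, "shoes": 2}

# One of each (shirt AND pants AND hat AND shoes): multiply.
and_choices = math.prod(wardrobe.values())

# Exactly one item overall (shirt OR pants OR hat OR shoes): add.
or_choices = sum(wardrobe.values())

print("AND:", and_choices)
print("OR:", or_choices)
```

The same dictionary feeds both answers. The only decision is whether the scenario reads as AND or as OR.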

As you internalize the analogies, you'll quickly recognize whether multiplication or addition is needed.

Let's go meta for a minute. The permutation formula is:

$$P(n,k) = \frac{n!}{(n-k)!}$$

How can we think about this?

The numerator (n!) is the max volume assuming each of the n choices has its own dimension. The number of rearrangements of 8 people is 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1.

But suppose we only care about the first 3 decisions -- picking a Gold, Silver and Bronze among 8 contestants. In this case, we shrink our solution space by dividing out the 5 dimensions we aren't using (which has 5! options on its own). We are left with 8! / 5! = 8 * 7 * 6 = 336 choices, with the general formula $\frac{n!}{(n-k)!}$.

(If multiplication creates dimensions, then division should remove them.)
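We can check the "divide out the unused dimensions" idea by brute force. This is a sketch: the contestants are just the numbers 0 through 7, and `math.perm` requires Python 3.8 or later.

```python
import math
from itertools import permutations

contestants = range(8)  # 8 contestants, labeled 0..7 for illustration

# Every ordered (Gold, Silver, Bronze) assignment, listed explicitly.
podiums = list(permutations(contestants, 3))

# Full volume (8!) with the 5 unused dimensions (5!) divided out.
by_formula = math.factorial(8) // math.factorial(8 - 3)

# The standard library's built-in permutation count, for comparison.
by_builtin = math.perm(8, 3)

print(len(podiums), by_formula, by_builtin)
```

Listing the podiums and dividing factorials give the same count, which is a nice reassurance that the formula is doing what the story says.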

Now, let's say the medals are identical: we're giving a tin can to 3 out of 8 people. We need to further remove dimensions, because we have 3! = 3 * 2 * 1 = 6 redundancies for each permutation in our solution space. We again shrink our solution space:

$$C(n,k) = \frac{n!}{(n-k)! \, k!} \qquad \text{here: } \frac{8!}{5! \, 3!} = \frac{336}{6} = 56$$

(I imagine the solution space volume getting denser.)

Ah! That's what's happening with the combination and permutation formula. We create the max volume and shrink it by the dimensions we are not using. Mentally translate the scenario into a version that makes sense to you.
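The redundancy can be seen directly by collapsing orderings. In this sketch (labels and variable names are illustrative), each ordered podium is turned into an unordered set of winners, and we count how many orderings land on each set. `math.comb` requires Python 3.8 or later.

```python
import math
from itertools import permutations

# All ordered ways to hand out 3 distinct medals among 8 people.
ordered = list(permutations(range(8), 3))

# With identical tin cans, order stops mattering: keep only the set of winners.
unordered = {frozenset(p) for p in ordered}

# How many ordered versions collapse into each unordered group.
redundancy = len(ordered) // len(unordered)

by_builtin = math.comb(8, 3)

print(len(ordered), len(unordered), redundancy, by_builtin)
```

The redundancy factor is exactly the 3! we divide by when moving from permutations to combinations.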

Here's how I think through a few sample problems.

*You flip a coin 10 times. How many ways can you get at least 7 heads?*

First off, the total number of possibilities is 2^10 = 1024. Intuitively, I see each flip as a decision along a different dimension, not the same number line. This means we have 2 * 2 * 2 * ... possibilities, not 2 + 2 + 2 + ... possibilities.

Geometrically, this would be a 10-dimensional "choice space", or, written out:

`(Heads OR tails) AND (Heads OR tails) AND (Heads OR tails) AND ...`

Ok. Now, how can we get at least 7 heads? That means we had 0 tails [10 heads], 1 tail [9 heads], 2 tails [8 heads], or 3 tails [7 heads].

Switching to the written description, this becomes:

`choices we want = (0 tails OR 1 tail OR 2 tails OR 3 tails)`

Given our 10 flips, the number of outcomes for each case is:

- 0 tails = 1 choice (all heads)
- 1 tail = 10 choices (exactly one flip was tails)
- 2 tails = C(10,2) = `10 * 9 / (2 * 1) = 45 choices`, based on the combination formula
- 3 tails = C(10,3) = `10 * 9 * 8 / (3 * 2 * 1) = 720 / 6 = 120 choices`

So, the total is

`choices we want = (1 + 10 + 45 + 120) = 176`

And for kicks, the chance of seeing this happen is:

`176 / 1024 = 17.2%`
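Since there are only 1024 outcomes, we can also just list them all and count. This brute-force sketch (variable names are my own) checks the arithmetic above without using any formula.

```python
from itertools import product

flips = 10

# (Heads OR Tails) AND (Heads OR Tails) AND ... ten times over.
outcomes = list(product("HT", repeat=flips))

# Keep the outcomes with at least 7 heads.
at_least_7_heads = [o for o in outcomes if o.count("H") >= 7]

total = len(outcomes)
wanted = len(at_least_7_heads)
probability = wanted / total

print(f"{wanted} / {total} = {probability:.1%}")
```

Enumeration stops being practical as the number of flips grows, which is when the combination formula earns its keep.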

Multiplication goes beyond "repeated addition". It's a general notion of combining for which I'm still discovering interpretations. Let's not get tied into a single meaning.

Happy math.

Turning AND/OR statements into arithmetic maps nicely to Boolean logic.

If A and B are variables with the values 1 or 0, we can write:

`A AND B = A * B`

`A OR B = A + B`

In many languages, any nonzero number evaluates to "true", so A + B = 2 is true. Note that this OR is an "inclusive OR" that allows both values to be true. To force an exclusive OR, we could take the remainder after dividing by two:

`A XOR B = (A + B) % 2`

Most programming languages have separate operators for AND (`&&`), OR (`||`) and XOR (`^`), but it's nice seeing how logic works with regular arithmetic.
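Here's a small truth-table check of the arithmetic versions against Python's own operators. The function names are mine, and Python spells its logical operators `and` / `or` (with `^` for bitwise XOR) rather than `&&` / `||`.

```python
def arith_and(a, b):
    return a * b

def arith_or(a, b):
    return a + b  # may be 2, which still counts as "true"

def arith_xor(a, b):
    return (a + b) % 2

truth_table = []
for a in (0, 1):
    for b in (0, 1):
        row = (a, b, arith_and(a, b), bool(arith_or(a, b)), arith_xor(a, b))
        truth_table.append(row)
        # Compare each arithmetic result with the built-in operator.
        assert bool(arith_and(a, b)) == bool(a and b)
        assert bool(arith_or(a, b)) == bool(a or b)
        assert arith_xor(a, b) == a ^ b

for row in truth_table:
    print(row)
```

Each row holds `(A, B, A AND B, A OR B, A XOR B)`, with the OR column converted to a boolean so the value 2 doesn't look odd.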

Additionally, "if/then/else" statements can be converted to arithmetic.

If `y` is a variable (1 or 0) that determines a result, instead of:

`if (y) { result = ResultIfTrue; } else { result = ResultIfFalse; }`

we can use the single statement:

`result = y * ResultIfTrue + (1 - y) * ResultIfFalse`

This version avoids the need for branching, which can be expensive for a CPU, and is a formula we can optimize with Calculus (used in machine learning algorithms).
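Here's the formula as a tiny sketch. The `select` function and the sample values are hypothetical. In Python this is an illustration of the idea rather than a speedup, since the interpreter's overhead dwarfs any branching cost.

```python
def select(y, result_if_true, result_if_false):
    """Arithmetic if/else: y = 1 picks the first value, y = 0 the second."""
    return y * result_if_true + (1 - y) * result_if_false

picked_true = select(1, 10, 20)   # the "if" branch
picked_false = select(0, 10, 20)  # the "else" branch

# A value of y between 0 and 1 blends the two results smoothly.
blended = select(0.5, 10, 20)

print(picked_true, picked_false, blended)
```

Because the output changes smoothly as `y` moves between 0 and 1, the expression has a slope to follow, which is what calculus-based optimization needs and what a hard if/else doesn't offer.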
