This completed grid is the *outer product*, which can be separated into the:

- **Dot product**, the interactions between similar dimensions (`x*x`, `y*y`, `z*z`)
- **Cross product**, the interactions between different dimensions (`x*y`, `y*z`, `z*x`, etc.)

The dot product (vec(a) · vec(b)) measures similarity because it only accumulates interactions in matching dimensions. It’s a simple calculation with 3 components.

The cross product (written vec(a) × vec(b)) has to measure a half-dozen “cross interactions”. The calculation looks complex but the concept is simple: accumulate 6 individual differences for the total.
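
To make the grid idea concrete, here is a minimal Python sketch. The function names and the sample vectors are made up for illustration; the point is only that the diagonal of the grid adds up to the dot product, and the six off-diagonal entries pair up into the cross product.

```python
def outer_product(a, b):
    """Return the 3x3 grid of every pairwise interaction a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]


def dot_from_grid(grid):
    """Sum the diagonal: interactions between matching dimensions."""
    return sum(grid[i][i] for i in range(3))


def cross_from_grid(grid):
    """Pair up the six off-diagonal entries into a vector."""
    return (
        grid[1][2] - grid[2][1],  # yz - zy lands in x
        grid[2][0] - grid[0][2],  # zx - xz lands in y
        grid[0][1] - grid[1][0],  # xy - yx lands in z
    )


# Hypothetical sample vectors, chosen only for the demonstration.
a = (1, 0, 2)
b = (3, 1, 0)
grid = outer_product(a, b)

print(dot_from_grid(grid))    # 3
print(cross_from_grid(grid))  # (-2, 6, 1)
```

Nine entries in the grid: three feed the dot product, six feed the cross product.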

Instead of thinking “When do I need the cross product?” think “When do I need interactions between different dimensions?”.

Area, for example, is formed by vectors pointing in different directions (the more orthogonal, the better). Indeed, the cross product measures the area spanned by two 3d vectors.

(The “cross product” assumes 3d vectors, but the concept extends to higher dimensions.)

Did the key intuition click? Let’s hop into the details.

The dot product represents vector similarity with a single number:

vec(a) · vec(b) = |a| |b| cos(theta)

(Remember that trig functions are percentages.) Should the cross product (difference between interacting vectors) be a single number too?

Let’s try. Sine is the percentage difference, so we could use:

|a| |b| sin(theta)

Unfortunately, we’re missing a lot of detail. `x` is 100% different from both `y` and `z`, but shouldn’t `x*y` and `x*z` be different from each other? As Tolstoy wrote, “All happy families are alike; each unhappy family is unhappy in its own way.”

Instead, let’s express these unique differences as a vector:

- The *size* of the cross product is the numeric “amount of difference” (with sin(theta) as the percentage)
- The *direction* of the cross product is based on both inputs: it’s the direction orthogonal to both (i.e., favoring neither)

A vector result represents the `x*y` and `x*z` separately, even though `y` and `z` are both “100% different” from `x`.

(Should the dot product be turned into a vector too? Well, we have the inputs and a similarity percentage. There’s no new direction that isn’t available from either input.)

Two vectors determine a plane, and the cross product points in a direction different from both.

Here’s the problem: there’s two perpendicular directions. By convention, we assume a “right-handed system”.

If you hold your first two fingers like the diagram shows, your thumb will point in the direction of the cross product. I make sure the orientation is correct by sweeping my first finger from vec(a) to vec(b). With the direction figured out, the magnitude of the cross product is |a| |b| sin(theta), which is proportional to the magnitude of each vector and the “difference percentage” (sine).

To remember the right hand rule, write the `xyz` order twice: `xyzxyz`. Next, find the pattern you’re looking for:

- `xy => z` (`x` cross `y` is `z`)
- `yz => x` (`y` cross `z` is `x`; we looped around: `y` to `z` to `x`)
- `zx => y`

Now, `xy` and `yx` have opposite signs because they are forward and backward in our `xyzxyz` setup.
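
The `xyzxyz` trick is mechanical enough to write down as a lookup. This is a small Python sketch (the helper name `cross_axes` is invented here): a pair that reads forward in the pattern is positive, a pair that reads backward is negative, and the result is whichever axis was left out.

```python
def cross_axes(first, second):
    """Return (sign, axis) for crossing two unit axes via the xyzxyz pattern."""
    pattern = "xyzxyz"
    if first == second:
        return (0, None)  # an axis has no cross interaction with itself
    # Forward pairs ("xy", "yz", "zx") appear in the pattern; backward ones do not.
    sign = 1 if first + second in pattern else -1
    (axis,) = set("xyz") - {first, second}  # the one axis not involved
    return (sign, axis)


print(cross_axes("x", "y"))  # (1, 'z')
print(cross_axes("y", "x"))  # (-1, 'z')
print(cross_axes("z", "x"))  # (1, 'y')
```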

So, without a formula, you should be able to calculate:

(1, 0, 0) × (0, 1, 0) = (0, 0, 1)

Again, this is because `x` cross `y` is positive `z` in a right-handed coordinate system. I used unit vectors, but we could scale the terms:

A single vector can be decomposed into its 3 orthogonal parts:

vec(a) = a_x vec(i) + a_y vec(j) + a_z vec(k)

When the vectors are crossed, each pair of orthogonal components (like a_x × b_y) casts a vote for where the orthogonal vector should point. 6 components, 6 votes, and their total is the cross product. (Similar to the gradient, where each axis casts a vote for the direction of greatest increase.)

- `xy => z` and `yx => -z` (assume vec(a) is first, so `xy` means a_x b_y)
- `yz => x` and `zy => -x`
- `zx => y` and `xz => -y`

`xy` and `yx` fight it out in the `z` direction. If those terms are equal, such as in (2, 1, 0) × (2, 1, 1), there is no cross product component in the `z` direction (2 – 2 = 0).

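The final combination is:

vec(a) × vec(b) = (a_y b_z – a_z b_y) vec(i) + (a_z b_x – a_x b_z) vec(j) + (a_x b_y – a_y b_x) vec(k) = |a| |b| sin(theta) vec(n)

where vec(n) is the unit vector normal to vec(a) and vec(b).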

Don’t let this scare you:

- There’s 6 terms, 3 positive and 3 negative
- Two dimensions vote on the third (so the `z` term must only have `y` and `x` components)
- The positive/negative order is based on the `xyzxyz` pattern

If you like, there is an algebraic proof that the formula is both orthogonal and of size |a| |b| sin(theta), but I like the “proportional voting” intuition.

Again, we should do simple cross products in our head:

(1, 0, 0) × (0, 1, 0) = (0, 0, 1)

Why? We crossed the `x` and `y` axes, giving us `z` (or vec(i) × vec(j) = vec(k), using those unit vectors). Crossing the other way gives -vec(k).

Here’s how I walk through more complex examples, such as (1, 2, 3) × (4, 5, 6):

- Let’s do the last term, the z-component. That’s (1)(5) minus (4)(2), or 5 – 8 = -3. I did `z` first because it uses `x` and `y`, the first two terms. Try seeing (1)(5) as “forward” as you scan from the first vector to the second, and (4)(2) as backwards as you move from the second vector to the first.
- Now the `y` component: (3)(4) – (6)(1) = 12 – 6 = 6
- Now the `x` component: (2)(6) – (5)(3) = 12 – 15 = -3

So, the total is (-3, 6, -3) which we can verify with Wolfram Alpha.
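
If Wolfram Alpha isn't handy, the “6 votes” walkthrough can be checked with a few lines of Python. This is only a sketch of the voting idea (the name `cross_by_votes` is invented for illustration), not an optimized routine: it loops over every pair of different axes, signs the vote with the `xyzxyz` pattern, and adds it to the axis left out.

```python
AXES = "xyz"


def cross_by_votes(a, b):
    """Accumulate the six cross interactions a_i * b_j (i != j) as signed votes."""
    total = {"x": 0, "y": 0, "z": 0}
    for i, first in enumerate(AXES):
        for j, second in enumerate(AXES):
            if first == second:
                continue  # matching dimensions belong to the dot product
            sign = 1 if first + second in "xyzxyz" else -1
            (target,) = set(AXES) - {first, second}  # axis receiving the vote
            total[target] += sign * a[i] * b[j]
    return (total["x"], total["y"], total["z"])


print(cross_by_votes((1, 2, 3), (4, 5, 6)))  # (-3, 6, -3)
print(cross_by_votes((1, 0, 0), (0, 1, 0)))  # (0, 0, 1)
```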

In short:

- The cross product tracks all the “cross interactions” between dimensions
- There are 6 interactions (2 in each dimension), with signs based on the `xyzxyz` order

**Connection with the Determinant**

You can calculate the cross product using the determinant of this matrix:

There’s a neat connection here, as the determinant (“signed area/volume”) tracks the contributions from orthogonal components.

There are theoretical reasons why the cross product (as an orthogonal vector) is only available in 0, 1, 3 or 7 dimensions. However, the cross product as a single number is essentially the determinant (a signed area, volume, or hypervolume as a scalar).

**Connection with Curl**

Curl measures the twisting force a vector field applies to a point, and is measured with a vector perpendicular to the surface. Whenever you hear “perpendicular vector” start thinking “cross product”.

We take the “determinant” of this matrix:

Instead of multiplication, the interaction is taking a partial derivative. As before, the vec(i) component of curl is based on the vectors and derivatives in the vec(j) and vec(k) directions.

**Relation to the Pythagorean Theorem**

The cross and dot product are like the orthogonal sides of a triangle:

|vec(a) × vec(b)|^{2} + (vec(a) · vec(b))^{2} = |a|^{2} |b|^{2}

For unit vectors, where |a| = |b| = 1, we have:

sin^{2}(theta) + cos^{2}(theta) = 1

I cheated a bit in the grid diagram, as we have to track the squared magnitudes (as done in the Pythagorean Theorem).
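
A quick numeric check of the triangle relationship, assuming it is the standard identity |a × b|² + (a · b)² = |a|² |b|². The helper functions and sample vectors below are illustrative only.

```python
def dot(a, b):
    """Sum of interactions between matching dimensions."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Standard 3d cross product."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


a = (1, 2, 3)
b = (4, 5, 6)
c = cross(a, b)

lhs = dot(c, c) + dot(a, b) ** 2  # |a x b|^2 + (a . b)^2
rhs = dot(a, a) * dot(b, b)       # |a|^2 * |b|^2
print(lhs, rhs)  # 1078 1078
```

The squared “difference” (54) and squared “similarity” (1024) together use up everything the two vectors had to offer.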

**Advanced Math**

The cross product & friends get extended in Clifford Algebra and Geometric Algebra. I’m still learning these.

**Cross Products of Cross Products**

Sometimes you’ll have a scenario like:

(vec(a) × vec(b)) × vec(c)

First, the cross product isn’t associative: order matters.

Next, remember what the cross product is doing: finding orthogonal vectors. If any two components are parallel (vec(a) parallel to vec(b)) then there are no dimensions pushing on each other, and the cross product is zero (which carries through to 0 × vec(c)).

But it’s ok for vec(a) and vec(c) to be parallel, since they are never directly involved in a cross product, for example:

(vec(i) × vec(j)) × vec(i) = vec(k) × vec(i) = vec(j)

Whoa! How’d we get back to vec(j)? We asked for a direction perpendicular to both vec(i) and vec(j), and made that direction perpendicular to vec(i) again. Being “doubly perpendicular” means you’re back on the original axis.
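
Here is a short Python sketch of both points, using unit vectors (the `cross` helper is a plain textbook implementation, written out so the snippet stands alone). Moving the parentheses changes the answer, and the “doubly perpendicular” case lands back on the original axis.

```python
def cross(a, b):
    """Standard 3d cross product."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Doubly perpendicular: i x j gives k, and k x i brings us back to j.
print(cross(cross(i, j), i))  # (0, 1, 0)

# Not associative: the same three vectors, grouped two ways.
print(cross(cross(i, i), j))  # (0, 0, 0)  parallel inputs zero out first
print(cross(i, cross(i, j)))  # (0, -1, 0)
```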

**Dot Product of Cross Products**

Now if we take

vec(a) × vec(b) · vec(c)

what happens? We’re forced to do vec(a) × vec(b) first, because vec(b) · vec(c) returns a scalar (single number) which can’t be used in a cross product.

If vec(a) and vec(c) are parallel, what happens? Well, vec(a) × vec(b) is perpendicular to vec(a), which means it’s perpendicular to vec(c), so the dot product with vec(c) will be zero.
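
The parallel case is easy to try numerically. In this sketch the vectors are arbitrary examples, with one `c` deliberately set to twice `a`:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


a = (1, 2, 3)
b = (4, 5, 6)
c_parallel = (2, 4, 6)  # 2 * a, so parallel to a
c_other = (1, 0, 0)     # not parallel to a

print(dot(cross(a, b), c_parallel))  # 0
print(dot(cross(a, b), c_other))     # -3
```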

I never really memorized these rules; I have to think through the interactions.

**Other Coordinate Systems**

The Unity game engine is left-handed, while OpenGL (like most math/physics tools) is right-handed. Why?

In a computer game, `x` goes horizontal, `y` goes vertical, and `z` goes “into the screen”. This results in a left-handed system. (Try it: using your right hand, you can see `x` cross `y` should point out of the screen).

**Applications of the Cross Product**

- Find the direction perpendicular to two given vectors.
- Find the signed area spanned by two vectors.
- Determine if two vectors are orthogonal (checking for a dot product of 0 is likely faster though).
- “Multiply” two vectors when only perpendicular cross-terms make a contribution (such as finding torque).
- With the quaternions (4d complex numbers), the cross product performs the work of rotating one vector around another (another article in the works!).

Happy math.

While true, there’s a deeper principle at work.

**The Law of Interactions: The whole is based on the parts and the interaction between them.**

The wording “Law of Cosines” gets you thinking about the mechanics of the formula, not what it means. Part of my learning strategy is rewording ideas into ones that make sense.

The Law of Cosines, after cranking through geometric steps we’re prone to forget, looks like c^{2} = a^{2} + b^{2} – 2abcos(C).

This is suspiciously like the expansion that if c = (a + b), then c^{2} = a^{2} + b^{2} + 2ab.

The difference is that 2ab has an extra factor, cos(C), which measures the “actual overlap percentage” (2ab assumes we fully overlap, i.e. where cos(C) = 1).

So, the Law of Cosines is really a generalization of how c^{2} = (a + b)^{2} expands when components aren’t fully lined up. We’re treating geometric lines as terms in an algebraic expansion.

Imagine a restaurant with a single chef, Alice. She’s overworked, so Bob is hired as her assistant (sous chef).

Based on Alice’s current performance, and Bob’s performance in his interview, what happens when they work together?

Surely the new result must be their combined effort:

Total = Alice + Bob

Hah! Office workers everywhere are rolling their eyes. You can’t just assume people contribute identically when they’re put together: there are interactions to account for.

Beyond their individual contributions, the two might slow each other down (*Where’d you put the whisk again?*), or find ways to work together (*I’m peeling carrots anyway, use some of mine.*).

In a system with several parts, start with the individual contributions and then ask if their interaction will:

- Help each other
- Hurt each other
- Ignore each other

The original idea that “Total = Alice + Bob” is more generally expressed as:

We need to separate the *list* of participants (Alice, Bob) from the result of their interaction.

Take the numbers 5 and 3. We can write them like so:

- Parts = (5, 3)

and we’re pretty sure they combine to make 8. But is there another way to get that conclusion?

Yes: we multiply. Beyond repeated counting, multiplication shows what happens when the parts of a system interact:

We’ve gone from “parts view”, (5, 3), to “interaction view”, (5 + 3)^{2}. The result of interaction mode says the system would result in 64 if it *did* interact with itself.

One caveat: when going to interaction view, we wrote down (5 + 3)(5 + 3), but we can’t simplify (5 + 3) = 8 on the outset. We’re using addition for bookkeeping until multiplication can combine the parts.
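
To see the bookkeeping, here is a tiny Python sketch of “interaction view” (the function name is invented for illustration). Each part multiplies against every part, itself included, and only at the end do we add the areas and convert back to a single part.

```python
def interaction_grid(parts):
    """Every part interacting with every part (including itself)."""
    return [[p * q for q in parts] for p in parts]


parts = (5, 3)
grid = interaction_grid(parts)
total_area = sum(sum(row) for row in grid)

print(grid)               # [[25, 15], [15, 9]]
print(total_area)         # 64
print(total_area ** 0.5)  # 8.0
```

The two 15s are the crossover terms (5 with 3, and 3 with 5); the 25 and 9 are the self-interactions.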

Oh, another caveat: why can we just add the interactions, but not the parts? Great question. The individual parts might be pointing in different dimensions, and don’t line up nicely on the same scale. The interacting parts turn into *area*, which can be combined to the same result no matter the orientation.

(I’ll investigate this concept more in a follow-up. It’s a neat idea that area is a generic, easily combinable quantity but individual paths are not.)

Simple setups like (5, 3) are easy to think through, like eyeballing 2x + 3 = 7 and guessing x = 2. But a more complex scenario like x^{2} + 3x = 15 requires a systematic approach.

The Law of Cosines is a systematic approach to working through the parts:

- List the parts
- Get every interaction as area
- Add to find the total contribution
- Convert into the equivalent “single part”

The last step is often implied. Once we’ve merged the jumble of interactions, we want the *single* part that could represent the entire system. Is there a single person (Charlie) whose efforts are identical to that of Alice and Bob working together?

The Law of Cosines gives us a way to find Charlie.

When two parts interact, they can help, hurt, or ignore each other:

- Perfect alignment means they help 100% (5 and 3)
- Perfect mis-alignment means they hurt 100% (5 and -3)
- Partial alignment or mis-alignment means they help or hurt by a percentage
- No alignment means they ignore each other

How do we measure alignment? With cosine.

Using our trig analogy, cosine is the *percentage* an angle moves along the ground.

A 0-degree angle follows the ground perfectly (100%), and moving vertically doesn’t follow it at all (0%). Other angles are a fraction in-between.

If the parts in our system can be written as paths, and we know the angle between them is theta (θ), then we can measure the overlap with cosine. One path acts as the ground, and the other is the path we’re following:

When paths are perfectly aligned, their full strength is used (ab and ba). The interaction factor cos(theta) modifies that strength to show how much they *actually* work together.

So, our jumble of interactions becomes:

c = √(a^{2} + b^{2} + 2abcos(theta))

Phew! And that’s the Law of Cosines: collect every interaction, account for the alignment, and simplify it to a single part. (The formula is usually written without the square root, but usually you want c, not c^{2}.)
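
As a rough Python sketch of that recipe (the function name `combine` is made up, and theta here is the *alignment* angle, so the interaction term is added rather than subtracted):

```python
import math


def combine(a, b, theta_degrees):
    """Single part equivalent to a and b interacting at alignment angle theta."""
    theta = math.radians(theta_degrees)
    # Self-interactions plus the two crossover terms, scaled by the overlap.
    area = a ** 2 + b ** 2 + 2 * a * b * math.cos(theta)
    return math.sqrt(area)


print(round(combine(5, 3, 0), 6))    # 8.0  (fully helping)
print(round(combine(5, 3, 180), 6))  # 2.0  (fully hurting)
print(round(combine(3, 4, 90), 6))   # 5.0  (ignoring each other)
```

The three calls cover the help, hurt, and ignore cases; every other angle falls somewhere in between.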

Now, why is the Law of Cosines often written with a negative sign? Well, the assumption is that in a typical triangle, a small *internal* angle C means the sides are negatively aligned, while theta (θ) is an *external* look at their alignment:

Similarly, a large internal angle means the sides are positively aligned, and will help each other. Typically, a small angle means you’re moving in the same direction, but this internal/external difference means we reverse the sign.

Personally, I don’t memorize whether there’s a positive or negative sign: I think about whether the parts will help or hurt each other in the scenario, and make the interaction positive or negative. Don’t be a slave to the formula.

Let’s say my triangle has side a = 10 and side b = 20. What is side c when the angle between a and b is:

**45 degrees in alignment**

Here, we need the Law of Cosines. a and b are pointing partially in the same direction. We switch to interaction mode to get to a common, combinable unit (area):

- a^{2} = 100
- b^{2} = 400
- 2ab = 2 · 10 · 20 = 400, but we need to adjust by the interaction factor. That is cos(45) = .707, so the real interaction factor is 400 · .707 = 282.8

The overall interactions are:

100 + 400 + 282.8 = 782.8

and the equivalent single side (c) is:

c = √(782.8) ≈ 27.98

**70 degrees in mis-alignment**

Again, we need the Law of Cosines. We can see that the angles fight each other, so the interaction will be negative:

Our intuition says this arrangement should be *smaller* than the previous one (since the sides aren’t working together), and it is.
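
Both scenarios can be checked with a few lines of Python. This is a sketch under the sign convention used above (add the interaction when the sides help, subtract when they fight); the variable names are illustrative.

```python
import math

a, b = 10, 20

# 45 degrees in alignment: the interaction helps, so it is added.
aligned_45 = math.sqrt(a**2 + b**2 + 2 * a * b * math.cos(math.radians(45)))

# 70 degrees in mis-alignment: the interaction hurts, so it is subtracted.
misaligned_70 = math.sqrt(a**2 + b**2 - 2 * a * b * math.cos(math.radians(70)))

print(round(aligned_45, 2))     # 27.98
print(round(misaligned_70, 2))  # 19.06
```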

**Full alignment or mis-alignment**

When our “triangle” has an angle of 0 degrees (or 180), all the parts are lying flat. Here, the parts are in the same dimension, and can be treated as regular numbers:

- Fully aligned: 10 + 20 = 30
- Fully mis-aligned: 10 – 20 = -10 (pointing in direction of B).

The Law of Cosines still works, of course:

- Full alignment: a^{2} + b^{2} + 2abcos(theta) = 100 + 400 + 400cos(0) = 900 and c = √(900) = 30
- Full mis-alignment: a^{2} + b^{2} + 2abcos(theta) = 100 + 400 + 400cos(180) = 100 which means c = √(100) = 10 (pointing backwards).

Again, we shouldn’t robotically follow the formula: have a rough idea what the result should be, and think through the calculations. (“The overall interaction is this, so the individual side would be that…”).

Thinking of interactions is one interpretation: next time, we’ll see it as the Law of Projections.

Happy math.

The Law of Cosines resembles the Pythagorean Theorem, no?

Now you might suspect why. The Pythagorean Theorem is the special case of *zero interaction*, which happens when the sides are at right angles. After all, a 90-degree angle is vertical, and has 0% overlap with the ground.

The Law of Cosines becomes:

c^{2} = a^{2} + b^{2} + 2abcos(90) = a^{2} + b^{2}

If we know the parts won’t interact, we can ignore interaction effects. However, the *self-interactions* are still there and must be combined: a^{2} and b^{2} are fine, but the crossover terms ab and ba disappear.

Here’s another version of the Pythagorean Theorem. We can’t combine a and b directly, so combine their interactions and reduce them to a single part:

You might be hankering for a geometric proof. Here’s one from quora, based on a paper by Knuth:

The insight is that we take our original a-b-c triangle and scale it by a (giving the a^{2}-ab-ac triangle) and b (giving the ab-b^{2}-bc triangle). These two triangles build a larger, similar triangle ac-bc-c^{2}, and with some trig, the bottom portion can be shown to equal a^{2} + b^{2} – 2abcos(theta).

While interesting, I don’t like these types of proofs up front. The Law of Cosines is about interactions, not re-arranging triangles. Does this explanation get you thinking about what cosine represents? About when it should be positive, negative, or zero?

Concept | Law of Cosines |
---|---|
Analogy | Imagine an assistant chef whose interactions may (or may not) be helpful. |
Diagram | |
Example | Suppose a = 10 and b = 20 in a triangle. If they are aligned at 45 degrees, their interaction is a^{2} + b^{2} + 2abcos(45) = 782.8 and the remaining side is √(782.8) = 27.98 units long. |
Plain-English | The Law of Interactions: The whole is based on the parts and the interaction between them. |
Technical | Triangle with internal angle C: c^{2} = a^{2} + b^{2} – 2abcos(C). General interaction: c^{2} = a^{2} + b^{2} + 2abcos(theta) |

Ok, that’s a neat connection, and maybe we can prove it by drawing some right triangles (of course) and re-arranging terms.

But what does it mean?

Rather than the Law of Sines, think of the Law of Equal Perspectives:

**Each angle & side can independently find the circle that wraps up the whole triangle.** This connection lets us start with one angle and work out facts about the others.

I occasionally frighten the neighborhood children by unchaining the mutant gorilla in my front yard.

The kids run screaming, telling different stories of what they’ve seen:

“Alice claims the monster was 20 feet tall, but we all know she exaggerates by doubling. And Billy’s a bit of a crybaby, and said it was 30 feet tall. Charlie’s fairly no-nonsense and said the beast was exactly 10 feet high.”

If we know a kid’s “exaggeration factor” and the size they claim, we can deduce the true size of the monster. (Furious George has a name, you know.)

Even better, we can predict what *other* kids might have said: If Alice claimed it was 40 feet, what would Charlie have said?
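
The monster logic fits in a few lines of Python. The factors below are read off the story (Alice doubles, Charlie is exact, and Billy's factor of 3 is inferred from his 30-foot claim against Charlie's 10); the function names are made up for this sketch.

```python
# Exaggeration factor per kid: claimed size = true size * factor.
exaggeration = {"Alice": 2.0, "Billy": 3.0, "Charlie": 1.0}


def true_size(kid, claimed):
    """Undo a kid's exaggeration to recover the real height."""
    return claimed / exaggeration[kid]


def predicted_claim(kid, actual):
    """What a kid would report for a monster of the given real height."""
    return actual * exaggeration[kid]


print(true_size("Alice", 20))  # 10.0
print(true_size("Billy", 30))  # 10.0

# If Alice claimed 40 feet, what would Charlie have said?
actual = true_size("Alice", 40)
print(predicted_claim("Charlie", actual))  # 20.0
```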

What do kids running from monsters have to do with triangles? Well, every triangle is trapped inside its own Monster Circle:

Whatever triangle we draw, there’s *some* circle trying to gobble it up (technically, “circumscribe it”). Try this page to explore an example on your own.

Now here’s the magic: just knowing a single angle and its corresponding side, we can figure out the Monster Circle.

Here’s how. Let’s say we have a triangle like this:

We don’t know anything except the angle A (call it 30 degrees) and the length of side a (call it an inch).

First off: is this the correct drawing of the triangle? Probably not! We don’t know the other sides, so this is equally valid:

It still has the same angle (A = 30 degrees) and the size of the base hasn’t changed (still one inch).

What if we start drawing more possibilities?

Whoa. From A’s point of view, *all* the possible triangles that have “A=30 degrees, a=1 inch” are on this circle. Whatever B and C end up being, they need to pick an option from this circle.

Similarly, we can argue this from the other perspectives:

- We can lock down angle B and side b, and trace out a circle of possibilities
- We can lock down angle C and side c, and trace out a circle of possibilities

This is the meaning of the Law of Sines: each angle unknowingly generates the same circle as the others.

(How do we *prove*, not just see that the possibilities lie on a circle? That’s the Inscribed Angle Theorem, for another day.)

We’ve figured out that there *is* a Monster Circle, now let’s see how big it is. Um… how?

Remember, we can slide around the circle and keep A (30 degrees) and a (1 inch) the same. So let’s slide until we make a right triangle:

Ah! Now we can use sine. Remember that sine is the percentage height compared to the max possible. The max possible height is the full diameter (d) of the Monster Circle.

(Why is a 90-degree angle across from the full diameter? Draw a square inside the circle, touching the sides. It must be symmetric, the diagonals pass through the center along the diameter, and are opposite a 90-degree angle.)

With a little re-arranging, we get:

d = a / sin(A)

Using the same logic for the other sides, we get:

a / sin(A) = b / sin(B) = c / sin(C) = d

In a way, sin(A) is the “exaggeration factor” that converts the size the angle measured (a) to the full diameter (d). Each angle is a different kid, and some really misjudge the size of the full circle based on what they see. (90-degrees is right on target.)

In our example above, A is 30 degrees and a is 1 inch.

We can calculate the diameter pretty fast. First, we get the sine:

sin(30) = 0.5

That means our length a is 50% of the max height, so the full diameter must be 2 inches.

This isn’t enough to figure out the triangle by itself. Let’s say angle B comes along and says it is 45 degrees. How long is b?

Well, sin(45) = .707, which means that b is .707 of the max diameter. Therefore, b = .707 · 2 = 1.414 inches.

Previously, I would plug numbers into the Law of Sines formula and chug away algebraically. Now I can think in terms of the Monster Circle: “Ok, I have the max diameter. I take the sine, and get the fraction of the max diameter for that side.”
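
That Monster Circle way of thinking translates directly into code. A minimal Python sketch, with function names invented for illustration: one helper finds the diameter from any angle and its side, the other takes a fraction of that diameter for a new angle.

```python
import math


def circle_diameter(side, angle_degrees):
    """Diameter of the wrapping circle, from one side and its opposite angle."""
    return side / math.sin(math.radians(angle_degrees))


def side_for_angle(diameter, angle_degrees):
    """Side opposite an angle: sine's fraction of the full diameter."""
    return diameter * math.sin(math.radians(angle_degrees))


d = circle_diameter(1, 30)
print(round(d, 3))                      # 2.0
print(round(side_for_angle(d, 45), 3))  # 1.414
```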

Most books write the formula with sin(A) in the numerator. It might read better “Sine A over A” but it distorts the conclusion that a / sin(A) is the size of the circle.

Put the concept in your own words. The “Law of Sines” is a generic description of what’s in the formula, but the “Law of Equal Perspectives” explains what it means:

- All parts of the triangle have a perspective on the whole
- Sine is the “exaggeration factor” that scales up an individual side to the full diameter. (Sine is the percentage of the max possible, and we divide by it.)

Happy math.

Technically, because B is over 90 degrees, we can’t ever spin it and have either A or C be a right angle (if we could, the triangle would have over 180 degrees).

What to do? Realize the supplement of B (180 degrees minus B; call it B·) acts like a stand-in on the other side:

B· has the same sine as B, which should make sense: they both point upwards along the same trajectory. To help us sleep better at night, we start with B· in the right-angle setup:

and get to the same conclusion as before. Phew.

However, the fact that B and B· can be swapped can lead to problems.

If I have a triangle where I know A (30 degrees) and a (1 inch), and then say b is 1.5 inches, what can you deduce?

The max diameter is 2 inches as before, so sin(B) = 1.5 / 2 = .75.

Unfortunately, there are two angles with that sine value: a calculator says sin^{-1}(.75) = 48.6 degrees, but 180 – 48.6 = 131.4 degrees would work too (more details).

Also, the triangle may not be possible given a hypothetical scenario. If I say b is 3 inches, you know something’s amiss. The max diameter was already calculated to be 2. Even a 90-degree angle, the best possible, could only have a side of 2 inches.
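
Both gotchas (two candidate angles, or none at all) show up naturally in a small Python sketch. The function name is hypothetical, and this version only lists the candidates; it does not check whether each one leaves room for the other angles in the triangle.

```python
import math


def possible_angles(diameter, side):
    """Angles (in degrees) whose sine matches side / diameter."""
    ratio = side / diameter
    if ratio > 1:
        return []  # the side is longer than the diameter: no such triangle
    acute = math.degrees(math.asin(ratio))
    if ratio == 1:
        return [acute]  # exactly 90 degrees, no second option
    return [acute, 180 - acute]


print([round(x, 1) for x in possible_angles(2, 1.5)])  # [48.6, 131.4]
print(possible_angles(2, 3))                           # []
```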

ADEPT Topic | Law of Sines |
---|---|
Analogy | Imagine kids describing the same monster with varying degrees of exaggeration. |
Diagram | |
Example | Suppose A = 30 degrees and a = 1 inch. Since sin(A) = 0.5, the Monster Circle is 1 / 0.5 = 2 inches wide. Given another angle, I can figure out the length of its side. If B = 45 degrees, then side b takes up sin(45) = .707 of the diameter, and is 1.414 inches. |
Plain-English | Any angle + side can deduce the size of the wrapping circle. |
Technical | a / sin(A) = b / sin(B) = c / sin(C) = d, the diameter of the wrapping circle |

**Make explanations ADEPT: Use an Analogy, Diagram, Example, Plain-English description, and then a Technical description.**

Here’s how to teach yourself a difficult idea, or explain one to others.

Most new concepts are variations, extensions, or combinations of what we already know. So start there!

In our decades of life, we’ve encountered thousands of objects and experiences. Surely *one* of them is vaguely similar to this new topic and can be the starting point.

Here’s an example: Imaginary numbers. Most lessons introduce them in a void, simply saying “negative numbers can have square roots too.”

Argh. How about this:

- Negative numbers were distrusted until the 1700s: How could you have *less* than nothing?
- We overcame this by realizing numbers could exist on a number line, allowing us to move forward or backward from zero.
- Imaginary numbers express the idea that we can move upwards and downwards, or *rotate* around the number line.

Instead of just going East/West, we can go North/South too – or even spin around in a circle. Neat!

Analogies are fuzzy, not 100% accurate, and yet astoundingly useful. They’re a raft to get across the river, and to leave behind once you’ve crossed.

We often think diagrams are a crutch if you aren’t macho enough to directly interpret the symbols. Guess what? Academic progress on imaginary numbers took off only *after* the diagrams were made!

Favor the easiest-to-absorb explanation, whether that comes from text, diagram, or interpretative dance. From there, we can work to untangle the symbols.

So, here’s a visualization:

Imaginary numbers let us rotate around the number line, not just move side-to-side.

Starting to get a visceral sense for what they can *do*, right?

Half our brain is dedicated to vision processing, so let’s use it. (And hey, maybe for this topic, twirling around in an interpretative dance would help.)

Oh, now’s our chance to hit the student with the fancy terminology, right?

Nope. Don’t tell someone the way things are: let them experience it. (How fun is hearing about the great dinner I had last night? The movie you didn’t get to see?)

But that’s what we do for math. “Someone smarter than you thought this through, found out all the cool connections, and labeled the pieces. Memorize what they discovered.”

That’s no fun: let people make progress themselves. Using the rotation analogy, what happens after 4 turns?

How about 2 turns? 4 turns clockwise?

If you genuinely experienced an idea, you should be excited to describe it:

- Imaginary numbers seem to point North, and we can get to them with a single counter-clockwise turn.
- Oh! I guess they can point South too, by turning the other way.
- 4 turns gets us pointing in the positive direction again
- It seems like two turns points us backwards

These are all correct conclusions, just not yet written in the language of math. But you can still reason in plain English!

The final step is to convert our personal understanding to the formal notation. It’s like sharing a song you’ve made up: you can hum it to yourself, but need sheet music for other people to use.

Math is the sheet music we’ve agreed upon to share ideas. So, here’s the technical terminology:

- We say *i* (lowercase) is 1.0 in the imaginary dimension
- Multiplying by *i* is a 90-degree counter-clockwise turn, to face “up” (here’s why). Multiplying by *-i* points us South
- It’s true that starting at 1.0 and taking 4 turns puts us at our starting point:

1 · i · i · i · i = 1

And two turns points us negative:

1 · i · i = -1

which simplifies to:

i^{2} = -1

so

i = √(-1)

In other words, *i* is “halfway” to -1. (Square roots find the halfway point when using multiplication.)
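
Python has complex numbers built in (it spells *i* as `1j`), so the turning experiment can be run directly. A small sketch, with a helper name invented for illustration:

```python
def turn(position, quarter_turns):
    """Apply counter-clockwise quarter turns by repeatedly multiplying by i."""
    for _ in range(quarter_turns):
        position *= 1j
    return position


start = 1 + 0j  # facing the positive direction

print(turn(start, 1) == 1j)     # True: one turn faces "up"
print(turn(start, 2) == -1)     # True: two turns face backwards
print(turn(start, 4) == start)  # True: four turns return to the start
```

Two identical turns land on -1, which is the sense in which *i* sits “halfway” there.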

Starting to get a feel for it? Just spitting out “i is the square root of -1” isn’t helpful. It’s not explaining, it’s *telling*. Nothing was experienced, nothing was internalized.

Give people the chance to make an idea their own.

I used to be satisfied with a technical description and practice problem. Not anymore.

ADEPT is a checklist of what I need to feel comfortable with an idea. I don’t think I’ve actually learned a topic unless I have a metaphor that ties everything together. Here’s a few places to look:

- Analogy – ?
- Diagram – Google Images
- Example – Khan Academy for practice problems
- Plain-English – Forums like /r/math or Math Overflow
- Technical – Wikipedia or MathWorld

Unfortunately, there aren’t many resources focused on analogies, especially for math, so you have to make your own. (This site exists to share mine.)

It seems logical to assume we can present facts in order, like transmitting data to a computer. But who actually learns like that?

I prefer the blurry-to-sharp approach to teaching:

Start with a rough analogy and sharpen it until you’re covering the technical details.

Sometimes, you need to untangle a technical description on your own, so you must work backwards to the analogy.

Starting with the technical details:

- Can you explain them in your own words?
- Can you solve an example problem, describing the steps in your own words?
- Can you create a diagram that represents how the concept fits together for you?
- Can you relate the concept to what you already know?

With this initial analogy, layer in new details and examples, and see if it holds up. (It doesn’t need to be perfect, but iterate.)

If we’re honest, we’ll admit that we forget 95% of what we learn in a class. What sticks? A scattered analogy or diagram. So, make them for yourself, to bootstrap the rest of the understanding as needed.

In a year, you probably won’t remember much about imaginary numbers. But the quick analogy of “rotation” or “spinning” might trigger a flurry of recognition.

I’m wary of making a contrived acronym, but ADEPT does capture what I need to internalize a new concept. Let’s stop being shy about thinking out loud: does a fact-only presentation really work for you? What other components do you need? I have a soft, squishy brain that needs the connecting glue, not just data.

Scott Young uses the Feynman Technique to explain concepts in everyday words and work backwards to an analogy and diagram. (Richard Feynman was a world-class expositor and physicist, and one of my teaching heroes.)

Prof. Barb Oakley runs an excellent, free course on Learning How To Learn. I was honored to do an interview with her for the class:

Click to watch the interview — I recommend the full course. The first session had over 180,000 students and was a great success.

Beyond any technique, raise your standards to find (or create) explanations that truly work for you. It’s the only way to have concepts stick.

Happy math.

“BE” is a nice prefix for the style to use when teaching:

Brevity is beautiful.

Empathy makes us human. Use your natural style, relate to common experience, and anticipate questions in your explanation.

I’ve yet to complain that a lesson respected my time too much, or related too well to how I thought.

ADEPT is like a nutrition label for an explanation: what are the key ingredients?

Concept | Euler’s Formula |
---|---|
Analogy | Imaginary numbers spin exponential growth into a circle. |
Diagram | |
Example | Let’s figure out the value of `3^i`. (It’s on the unit circle.) |
Plain-English | Raising an exponent to an imaginary power spins you on the unit circle. The same destination can be written with polar (distance and angle) or rectangular coordinates (real part and imaginary part). |
Technical | e^{ix} = cos(x) + i·sin(x) |
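
The `3^i` example can be tried in Python, which supports complex exponents natively. This sketch only confirms the claim in the table (the result sits on the unit circle); the angle comparison assumes the usual rewrite 3^i = e^{i·ln(3)}.

```python
import cmath
import math

value = 3 ** 1j  # Python writes i as 1j

# Distance from the origin: should be 1, i.e. on the unit circle.
print(round(abs(value), 6))  # 1.0

# Angle travelled around the circle, compared with ln(3), both in radians.
print(round(cmath.phase(value), 4), round(math.log(3), 4))  # 1.0986 1.0986
```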

Concept | Fourier Transform |
---|---|
Analogy | Like filtering a smoothie into ingredients, the Fourier Transform extracts the circular paths within a pattern. |
Diagram | Smoothie being filtered: |
Example | Split the sequence `(4 0 0 0)` into circular components: |
Plain-English / Technical | |
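
For the `(4 0 0 0)` example, here is a bare-bones discrete Fourier transform in Python. It is a sketch, not a tuned implementation, and it assumes one particular convention (dividing by the number of samples so each component is an average strength); other references put that scaling elsewhere.

```python
import cmath


def dft(signal):
    """Strength of each circular component, averaged over the samples."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)) / n
        for k in range(n)
    ]


components = dft([4, 0, 0, 0])
print([round(abs(c), 6) for c in components])  # [1.0, 1.0, 1.0, 1.0]
```

Under that convention, the spike of 4 at the start splits into four circular paths of equal strength.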

Concept | Distributed Version Control |
---|---|
Analogy | Distributed Version Control is like sharing changes to a group shopping list with your friends. |
Diagram / Example | |
Plain-English | We check out, check in, branch, and share differences (“diffs”). |
Technical | `git checkout -b branchname` `git diff branchname` |

Combine ingredients with your own style. Steps might merge, but shouldn’t be skipped without a good reason (*“Zombies coming, no time for biochem, use this serum for the cure.”*). The site cheatsheet has a large collection of analogies.