Easy Trig Identities With Euler’s Formula

Trig identities are notoriously difficult to memorize: here’s how to learn them without losing your mind.

Starting from the Pythagorean Theorem and similar triangles, we can find connections between sin, cos, tan and friends (read the article on trig).

Can we go deeper? Maybe we can connect sine with itself (sin-ception). In math terms, we’re looking for formulas like this (full cheatsheet):

\displaystyle{ \sin(a + b) = \sin(a)\cos(b) + \sin(b)\cos(a) }

\displaystyle{ \cos(a + b) = \cos(a)\cos(b) - \sin(a)\sin(b) }

Instead of memorizing these bad mamma jammas, let’s learn to draw the formulas. Euler’s Formula makes it easy.

Connections In Algebra

In algebra, we study relationships like this:

\displaystyle{ (a+b)^2 = a^2 + 2ab + b^2 }

Working out 17^2 directly is cumbersome. But we can simplify it to:

\displaystyle{17^2 = (10 + 7)^2 = 10^2 + (2 \times 10 \times 7) + 7^2 = 100 + 140 + 49 = 289}

In the computer era, sure, we can just crunch 17^2 directly. The important aspect is realizing that (a + b)^2 can be broken into simpler ingredients: a^2, b^2, a, b. This is useful in factoring, simplifying equations, and so on.

Connections In Trig

Let’s turn trig into plain English. What does this mean?

\displaystyle{\sin(a + b) = ?}

Remembering that sine is “height (as a percentage of max)”, this equation asks: If we add two angles, what is their total height?

A quick guess might be to combine the individual heights:

\displaystyle{\sin(a + b) = \sin(a) + \sin(b)}

It looks clean, but isn’t quite right. If we keep adding up angles, their height increases until the max (100%), then starts decreasing.

The relationship between angle and height can’t be simple addition.

Now here’s the weird thing: I can draw what the new height should be (It’s right there!), but I can’t turn my drawing into an equation.

Or can I?

Drawing With Euler’s Formula

Euler’s Formula lets us create a circular path using complex numbers:

\displaystyle{e^{ix} = \cos(x) + i\sin(x)}

Crucially, multiplying complex numbers performs a rotation. Aha! We can use Euler’s Formula to draw the rotation we need:

  • Start with 1.0, which is at 0 degrees.
  • Multiply by e^(ia), which rotates by a.
  • Multiply by e^(ib), which rotates by b.
  • Final position = 1.0 · e^(ia) · e^(ib) = e^(i(a+b)), or 1.0 at the angle (a+b)

The complex exponential e^(i(a+b)) is pretty gnarly. Just like breaking apart 17^2, let’s multiply out the pieces:

\displaystyle{e^{i(a+b)} = e^{ia} \cdot e^{ib} }

\displaystyle{ = [\cos(a) + i\sin(a)] \cdot [\cos(b) + i\sin(b)]}

\displaystyle{ = [\cos(a)\cos(b) - \sin(a)\sin(b)] + i[\sin(a)\cos(b) + \sin(b)\cos(a)]}

\displaystyle{ = [\text{combined width}] + i[\text{combined height}]}

Now we’re talking! This version easily separates the horizontal position (real component) and vertical position (imaginary component):

  • Combined height: sin(a + b) = sin(a)cos(b) + sin(b)cos(a)
  • Combined width: cos(a + b) = cos(a)cos(b) – sin(a)sin(b)

Boom: two annoying-to-remember trig identities in a single computation. Not a bad deal.
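
If you'd like to see the rotation argument hold up numerically, here is a minimal sketch. The function name `rotate_sum` and the test angles are my own choices for illustration; the only library calls are Python's standard `cmath.exp` and `math.isclose`.

```python
import cmath
import math

def rotate_sum(a: float, b: float) -> complex:
    """Start at 1.0, rotate by a, then rotate by b (angles in radians)."""
    return 1.0 * cmath.exp(1j * a) * cmath.exp(1j * b)

a, b = 0.7, 0.4  # arbitrary test angles
position = rotate_sum(a, b)

# Real part = combined width, imaginary part = combined height.
width_from_identity = math.cos(a) * math.cos(b) - math.sin(a) * math.sin(b)
height_from_identity = math.sin(a) * math.cos(b) + math.sin(b) * math.cos(a)

print(math.isclose(position.real, width_from_identity))   # True
print(math.isclose(position.imag, height_from_identity))  # True
print(math.isclose(position.imag, math.sin(a + b)))       # True
```

The multiplication never mentions a trig identity, yet the real and imaginary parts land exactly where the identities say they should.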

Understanding The Equation

Now that we’ve found the equation, let’s grok its meaning. When we add the heights, here’s what’s happening:

  • The full height of the blue triangle (sin(a)) can’t be used, since the red triangle doesn’t extend as far. (Why? When we add angle b, we’re moving at a steeper angle with the same hypotenuse. We gained vertical distance and lost horizontal distance.) We’re effectively “sliding back” sin(a), reducing it by a factor of cos(b).
  • The full height of the red triangle (sin(b)) can’t be used either, since it’s at an angle. We’re “turning” sin(b), reducing it by a factor of cos(a).

Remember that sine and cosine are percentages. In this case,

\displaystyle{\sin(a + b) = [\sin(a) \times \text{\% we get for a}] + [\sin(b) \times \text{\% we get for b}]}


\displaystyle{\sin(a + b) = [\sin(a) \times \cos(b)] + [\sin(b) \times \cos(a)]}

Sure, we would like to get the full height of each triangle. But from the diagram, we see a slides back and b is twisted, so the height we actually get is reduced. Think of each cosine as a tax on your height, reducing the amount you take home. (Have a height of .90? That’s nice, Papa Cosine will let you keep 75%. Pay up the rest, sucka!).

Now, what happens for small angles, like sin(.01 + .02)?

We could plug and chug this. But I’m guessing the result is about:

\displaystyle{\sin(.01 + .02) \sim \sin(.03) \sim .03}

Why? My mental diagram for small angles is this:

There’s no perceptible difference between the ideal heights (sin(a) and sin(b)) and the “taxed” versions (sin(a)cos(b) and sin(b)cos(a)).

  • For tiny angles, sin(a + b) is a vertical line. It barely loses any height due to the parts sliding or twisting.
  • For small angles, cosine (the percent we keep), is close to 100%. We’re keeping the vast, vast majority of the height we have.
  • sin(x) ~ x is a common approximation for small angles (often used in Calculus). Essentially, it says sin(x) is a line for a brief time period. For small angles, sin(a + b) ~ sin(a) + sin(b) ~ a + b.
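
Here's a quick plug-and-chug of the small-angle guess, as a rough sketch. The variable names (`exact`, `taxed`, `untaxed`, `linear`) are just labels I picked for the four versions discussed above; angles are in radians.

```python
import math

a, b = 0.01, 0.02  # small angles, in radians

exact = math.sin(a + b)
taxed = math.sin(a) * math.cos(b) + math.sin(b) * math.cos(a)  # the real identity
untaxed = math.sin(a) + math.sin(b)                            # the "quick guess"
linear = a + b                                                 # sin(x) ~ x

print(exact, taxed, untaxed, linear)
print(math.cos(a), math.cos(b))  # the "percent we keep" is nearly 100%
```

All four heights agree to within about 0.00001, which is why the naive "just add the heights" guess looks fine for tiny angles and only falls apart as the angles grow.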

For cosine, we have a similar diagram:

  • This time, the conversion factor matches up (cosine with cosine, sine with sine).
  • The full width of the first triangle (cos(a)) gets scaled down to match the width of the second.
  • The sine term is negative since it pushes us backwards, reducing our width. We can use similar triangles to extract this piece.

I’m not typically thinking about the parts in the diagram, though it’s nice to see how they work a few times. If you just need the trig identity, crank through it algebraically with Euler’s Formula.

Why do we care about trig identities?

Good question. A few reasons:

1. Because you have to (the worst reason). Many trig classes have you memorize these identities so you can be quizzed later (argh). You don’t need to memorize them, you can work out the formula in about a minute. Save your precious brain space for something else.

2. We can now “factor” trig functions into simpler parts. Separating sine into smaller pieces is useful in Calculus.

For example, to find the derivative of sine, we need:

\displaystyle{\lim_{dx \to 0} \frac{\sin(a + dx) - \sin(a)}{dx}}

and we let dx go to zero. This is tricky to work on directly, but using the sin(a + b) formula we have

\displaystyle{\frac{\sin(a + dx) - \sin(a)}{dx} = \frac{\sin(a)\cos(dx) + \sin(dx)\cos(a)  - \sin(a)}{dx}}

As dx goes to zero, cos(dx) approaches 1 (zero angle is full width), so we have:

\displaystyle{= \frac{\sin(a)(1) + \sin(dx)\cos(a)  - \sin(a)}{dx} = \frac{\sin(dx)\cos(a)}{dx} = \left(\frac{\sin(dx)}{dx}\right)\cos(a)}

And as dx goes to zero, sin(dx) and dx become equal:

\displaystyle{\lim_{dx \to 0} \frac{\sin(dx)}{dx} = 1}

Plugging this in, we get cos(a) as the derivative of sin(a). Phew! Working with trig functions isn’t always easy, but at least it’s manageable.
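
To watch the limit happen, here is a small sketch that shrinks dx and compares the slope to cos(a). The helper name `sine_slope` and the sample point a = 1.0 are assumptions for illustration only.

```python
import math

def sine_slope(a: float, dx: float) -> float:
    """Difference quotient (sin(a + dx) - sin(a)) / dx."""
    return (math.sin(a + dx) - math.sin(a)) / dx

a = 1.0  # sample angle in radians
for dx in (0.1, 0.01, 0.001):
    # As dx shrinks, the slope closes in on cos(a), and sin(dx)/dx closes in on 1.
    print(dx, sine_slope(a, dx), math.cos(a), math.sin(dx) / dx)
```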

3. It’s computationally efficient. If you’re doing computer graphics, and frequently calculating sine/cosine (for dot products let’s say), trig identities are useful shortcuts. In the past, these identities were used much like log tables to make hand-done calculations easier.

4. Math is about seeing connections. Because trig functions are derived from circles and exponential functions, they seem to show up everywhere. Sometimes you simplify a scenario by going from trig to exponents, or vice versa.

5. Deepen your knowledge of Euler’s Formula. Master Euler’s formula and you’ve mastered circles. And from there, the world! (Editor’s note: Kalid’s pinky appears to be affixed to his mouth. We’re working on it.)

See, Euler’s formula lets us draw a circle and read off a position. That’s amazing! We can avoid a lot of painful geometry with a few multiplications. If you’re doing any advanced math, letting Leonhard Euler deep into your soul is well worth it. He’s good company.

That’s it for today. Happy math.

Appendix: Resources and Extended Formulas

You can mix & match trig identities to create a bunch of new ones.

Subtraction formula: replace b with -b

\displaystyle{\sin(a - b) = \sin(a)\cos(-b) + \sin(-b)\cos(a)}
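
This can be tidied up with two mirror rules: cos(-b) = cos(b), since width is unchanged when the angle flips below the axis, and sin(-b) = -sin(b), since height flips sign. Applying just those two facts gives:

\displaystyle{\sin(a - b) = \sin(a)\cos(b) - \sin(b)\cos(a)}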

Double-angle formula: replace b with a

\displaystyle{\sin(2a) = \sin(a + a) = \sin(a)\cos(a) + \sin(a)\cos(a) = 2\sin(a)\cos(a)}

This makes sense: after accounting for the conversion factor, we add the height to itself.

Half-angle formula: replace and solve

Start with the double-angle formula and solve for sin(a), which is half the angle used in sin(2a). Trig without tears (a great resource and name) has more details.


Avoiding The Adjective Fallacy

You’re reading this, so I’ll assume your English is pretty good. What’s wrong with these phrases?

  • Old little lady
  • Red big dog
  • Vietnamese spicy food

Do you have a logical reason why they sound strange? Or are they just “off”?

You probably didn’t think, “In 3rd grade I mastered the Royal Order of Adjectives:

  1. Determiner
  2. Observation
  3. Size
  4. Shape
  5. Age
  6. Color
  7. Origin
  8. Material
  9. Qualifier

… and upon applying them, noticed several errors. Old little lady is incorrect because rules #3 and #5 are swapped — a childish mistake, really. The next…”

Ugh. Describing Gran Gran isn’t a logic puzzle. But guess what students learning English are taught?

Even as a native speaker, could you construct this chart? Is this how you’d teach someone English?

The Adjective Fallacy is trying to learn by mastering the formal rules. Just because a concept can be rigorously defined doesn’t mean we should study it that way.

We didn’t become good at English by studying a chart: we developed an ear for the language and know how it should sound. And “old little lady” sounds off.

Similarly, getting good at math doesn’t mean marching through a gauntlet of rules on every problem. It’s having a native speaker’s feeling about what works or doesn’t.

“303 x 13 = 5074” looks strange, but not because we computed the left-hand side. It’s weird because odd numbers can’t multiply to become even (intuition). The last digit of the result should be 3×3= 9. 5074 is too large, since 300 x 10 (similar numbers nearby) is only 3000. Our Spidey Sense is blaring that the computation looks wrong.
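
Two of those gut checks are mechanical enough to write down. This is only a sketch of the idea: the function name `looks_plausible` is made up, and the "too large" estimate is left out because a rough size check is a judgment call.

```python
def looks_plausible(x: int, y: int, claimed: int) -> bool:
    """Cheap sanity checks that avoid computing x * y in full."""
    # Odd times odd must be odd.
    both_odd = x % 2 == 1 and y % 2 == 1
    parity_ok = claimed % 2 == 1 if both_odd else True
    # The last digit depends only on the last digits of the factors.
    last_digit_ok = claimed % 10 == ((x % 10) * (y % 10)) % 10
    return parity_ok and last_digit_ok

print(looks_plausible(303, 13, 5074))  # False
print(looks_plausible(303, 13, 3939))  # True
```

Passing these checks doesn't prove an answer is right; it only says nothing smells wrong yet.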

My learning goal is knowing enough to make rough predictions on my own. I want a horse sense for algebra, calculus, trig, and even imaginary exponents, without scurrying off to apply an equation.

Rules aren’t inherently bad: they summarize, resolve ambiguous cases, and help us practice our weak spots. The question is how much to use them when starting off.

Learn enough rules to get started – don’t attempt to master them from the outset. See examples in a larger context and let the pattern-matching machinery of your brain get to work.

Learning Math

Math is a language too. Here’s a gut check: Would my current math study technique have helped me learn English?

If an English class spent a month on the adjective chart we’d have a talk with the teacher. But a Calculus class that spends weeks on the formal theory of limits is typical. Can we admit that studying this much detail, this early, doesn’t build fluency?

Pondering that question made me realize I had large gaps in trigonometry and calculus. I could only describe concepts using the adjective chart I’d memorized with a furrowed brow. (I’ll describe my grandma, just give me a minute!)

Enough was enough: embrace approaches that actually help you, like seeing the big picture first. In Calculus, that might mean seeing an integral in the first lesson:

That’s what Calculus does: break a shape into pieces (the derivative), and glue it together in various ways (the integral). If you like this style of teaching, check out the full Calculus series.

A typical calculus syllabus covers integrals in week 12, after months of “building a foundation”. Better not use a complete sentence until we’ve studied adjectives, nouns and verbs separately, right? (My hand wringing could solve the energy crisis.)

The path to understanding isn’t always the most structured.

Happy math.

Update: After research, this concept is called tacit knowledge, or “we know more than we can tell” (Michael Polanyi). Tacit knowledge is acquired through experience, and complements the explicit knowledge written as rules.

Learning To Learn: Intuition Isn’t Optional

My learning progress skyrocketed after adopting a new standard: Intuition Isn’t Optional.

Imagine a chef who follows a new recipe to the letter. No matter how it looks, no matter the reviews the recipe has, if the dish doesn’t taste good we know something is wrong. A sense of taste is the ultimate cooking tool.

When learning, we defer to external indicators (tests, teachers) to inform us we’ve learned something. External standards are made to be objective and easily-verified (Did you pick the correct answer?), but the important, subjective question is how well a concept sits in your mind. Did you actually experience it?

My checklist of truly learning a topic means it is:

  • Understandable: Did I have an aha! moment? Can I explain the concept in simple language? Does it connect to other topics I know?

  • Memorable: Do I have an analogy, diagram, or example that will stick with me for months or years?

  • Enjoyable: Do I want to revisit or use this knowledge? Don’t study literature in a way that makes you hate reading.

That’s my current definition of “intuitive understanding”, and for subjects I care about, I keep digging until I have all three aspects.

It’s ok to take your time (calculus took years to become enjoyable) and it’s ok to not care about everything equally (biology isn’t particularly compelling for me). I firmly believe any subject can become intuitive if I put in the effort to find analogies, diagrams, examples, plain-English descriptions, and technical details (the ADEPT method).

So, how do you set your own learning standard?

Step 1: Study Famous Learners

Let’s not recreate the wheel: famous learners have already described their thinking process, which we can adopt. It’s not about memorizing Einstein’s Theory of Relativity, it’s about internalizing the mindset that could lead to that idea.

Here’s a few viewpoints that resonated for me:

“Education is what remains after one has forgotten what one has learned in school.” —Albert Einstein

“The only real valuable thing is intuition.” —Albert Einstein

  • True learning goes beyond memorized facts. While I can forget the equation of a circle, I can’t forget that it’s round. And knowing it’s perfectly round quickly leads me back to the equation.

“The noblest pleasure is the joy of understanding.” —Da Vinci

  • True understanding implies joy. And practically, you’ll only continue studying what you like.

“To teach effectively a teacher must develop a feeling for his subject; he cannot make his students sense its vitality if he does not sense it himself. He cannot share his enthusiasm when he has no enthusiasm to share. How he makes his point may be as important as the point he makes; he must personally feel it to be important.” —George Pólya

“Education is the kindling of a flame, not the filling of a vessel.” —Socrates

  • We aren’t robots, and we should embrace the subjective aspects of learning. A teacher’s goal goes beyond knowledge-transfer to enjoyment-transfer.

The Humane Representation of Thought from Bret Victor

  • There are deeper, richer levels of understanding than what’s traditionally used. Explore a higher standard.

“I think most people can learn a lot more than they think they can. They sell themselves short without trying. One bit of advice: it is important to view knowledge as sort of a semantic tree — make sure you understand the fundamental principles, ie the trunk and big branches, before you get into the leaves/details or there is nothing for them to hang on to.” —Elon Musk

  • Your own standards greatly influence your understanding. External tests won’t check if facts are comfortably connected.

I have a larger collection of quotes that help align my thinking.

Step 2: Ask Questions That Check Your Standards

After rummaging through quotes that resonate, build a set of questions that capture your standard. For me, it became:

  • Do I have a visceral, ingrained analogy? Can it help solve problems?
  • Can I explain the concept to others? Do they want to explain it to their friends afterwards?
  • Will I remember the essential idea after a few months or years?
  • Can I find something to enjoy in the topic? Will I return after I inevitably forget 95% of it?

Questions seem to prompt more interest than a statement: “Do I have an analogy?” vs. “I must have an analogy”.

With this approach, strange corners of math I didn’t previously enjoy (like Euler’s Formula) became mysteries to solve: what is the insight here? Can I express it in a plain-English sentence? (Here’s a shot: Continuous rotation means you’re moving in a circle.)

Setting new standards helps take control of your education and overcome longstanding demons.

When people say “I hate math” I doubt they actually hate numbers (arithmetic), patterns & relationships (algebra), or shapes (geometry). They hate lessons that don’t contain insight, enjoyment, and basic human empathy. It’s fine to be disinterested in Ancient Egyptian Civilization, but hate comes from getting lost on a tour and spending the night near a sarcophagus.

These are the questions that helped me: what are your standards for learning?

(Thanks to Scott Young, Uri Bram, and Tom Miller for brainstorming ideas.)

Vector Calculus: Understanding the Cross Product

The cross product accumulates interactions between different dimensions. Taking two vectors, we can write every combination of components in a grid:

[Figure: cross product interaction grid]

This completed grid is the outer product, which can be separated into the:

  • Dot product, the interactions between similar dimensions (x*x, y*y, z*z)

  • Cross product, the interactions between different dimensions (x*y, y*z, z*x, etc.)

The dot product (vec(a) · vec(b)) measures similarity because it only accumulates interactions in matching dimensions. It’s a simple calculation with 3 components.

The cross product (written vec(a) × vec(b)) has to measure a half-dozen “cross interactions”. The calculation looks complex but the concept is simple: accumulate 6 individual differences for the total.
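
The grid idea can be sketched in a few lines. This is an illustration under my own naming (`outer`, `grid`), using plain Python lists rather than any particular math library. The diagonal of the grid feeds the dot product and the six off-diagonal entries pair up into the cross product.

```python
def outer(a, b):
    """Grid of every pairwise product a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]

a = (1, 2, 3)
b = (4, 5, 6)
grid = outer(a, b)

# Matching dimensions sit on the diagonal: their sum is the dot product.
dot = sum(grid[i][i] for i in range(3))

# Different dimensions sit off the diagonal: 6 entries, paired into 3 differences.
cross = (
    grid[1][2] - grid[2][1],  # y*z - z*y -> x
    grid[2][0] - grid[0][2],  # z*x - x*z -> y
    grid[0][1] - grid[1][0],  # x*y - y*x -> z
)

print(dot)    # 32
print(cross)  # (-3, 6, -3)
```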

Instead of thinking “When do I need the cross product?” think “When do I need interactions between different dimensions?”.

Area, for example, is formed by vectors pointing in different directions (the more orthogonal, the better). Indeed, the cross product measures the area spanned by two 3d vectors (source):

(The “cross product” assumes 3d vectors, but the concept extends to higher dimensions.)

Did the key intuition click? Let’s hop into the details.

Defining the Cross Product

The dot product represents vector similarity with a single number:

\displaystyle{\text{dot product} = (a_x, a_y, a_z) \cdot (b_x, b_y, b_z) = a_x b_x + a_y b_y + a_z b_z = \|\vec{a}\| \|\vec{b}\| \cos(\theta)}

(Remember that trig functions are percentages.) Should the cross product (difference between interacting vectors) be a single number too?

Let’s try. Sine is the percentage difference, so we could use:

\displaystyle{\text{cross product candidate} = \text{amount of difference} = \|\vec{a}\| \|\vec{b}\| \sin(\theta)}

Unfortunately, we’re missing a lot of detail. x is 100% different from both y and z, but shouldn’t x*y and x*z be different from each other? As Tolstoy wrote, “All happy families are alike; each unhappy family is unhappy in its own way.”

Instead, let’s express these unique differences as a vector:

  • The size of the cross product is the numeric “amount of difference” (with sin(θ) as the percentage)

  • The direction of the cross product is based on both inputs: it’s the direction orthogonal to both (i.e., favoring neither)

A vector result represents the x*y and x*z separately, even though y and z are both “100% different” from x.

(Should the dot product be turned into a vector too? Well, we have the inputs and a similarity percentage. There’s no new direction that isn’t available from either input.)

Geometric Interpretation

Two vectors determine a plane, and the cross product points in a direction different from both (source):

Here’s the problem: there are two perpendicular directions. By convention, we assume a “right-handed system” (source):

If you hold your first two fingers like the diagram shows, your thumb will point in the direction of the cross product. I make sure the orientation is correct by sweeping my first finger from vec(a) to vec(b). With the direction figured out, the magnitude of the cross product is |a| |b| sin(θ), which is proportional to the magnitude of each vector and the “difference percentage” (sine).

The Cross Product For Orthogonal Vectors

To remember the right hand rule, write the xyz order twice: xyzxyz. Next, find the pattern you’re looking for:

  • xy => z (x cross y is z)
  • yz => x (y cross z is x; we looped around: y to z to x)
  • zx => y

Now, xy and yx have opposite signs because they are forward and backward in our xyzxyz setup.

So, without a formula, you should be able to calculate:

\displaystyle{\vec{x} \times \vec{y} = (1, 0, 0) \times (0, 1, 0) = (0, 0, 1) = \vec{z}}

Again, this is because x cross y is positive z in a right-handed coordinate system. I used unit vectors, but we could scale the terms:

\displaystyle{(3, 0, 0) \times (0, 4, 0) = (0, 0, 12)}
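
The xyzxyz trick is simple enough to turn into a lookup. Here's a minimal sketch; `ORDER` and `axis_cross` are hypothetical names, and the function only handles the three unit axes.

```python
ORDER = "xyzxyz"  # the axis order written twice

def axis_cross(first: str, second: str):
    """Return (axis, sign) for the cross product of two unit axes."""
    if first == second:
        return None, 0  # parallel axes: nothing to cross
    if first + second in ORDER:
        forward, sign = first + second, +1  # reading forward, e.g. "xy"
    else:
        forward, sign = second + first, -1  # reading backward, e.g. "yx"
    # The result is the axis that follows the forward pair in xyzxyz.
    axis = ORDER[ORDER.index(forward) + 2]
    return axis, sign

print(axis_cross("x", "y"))  # ('z', 1)
print(axis_cross("y", "x"))  # ('z', -1)
print(axis_cross("z", "x"))  # ('y', 1)
```

Scaled versions follow by multiplying the lengths: (3, 0, 0) and (0, 4, 0) give 12 along +z.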

Calculating The Cross Product

A single vector can be decomposed into its 3 orthogonal parts:

\displaystyle{ \vec{a} = (a_x, a_y, a_z) = (a_x, 0, 0)  + (0, a_y, 0) + (0, 0, a_z)}

\displaystyle{ \vec{b} = (b_x, b_y, b_z) = (b_x, 0, 0)  + (0, b_y, 0) + (0, 0, b_z)}

When the vectors are crossed, each pair of orthogonal components (like a_x × b_y) casts a vote for where the orthogonal vector should point. 6 components, 6 votes, and their total is the cross product. (Similar to the gradient, where each axis casts a vote for the direction of greatest increase.)

  • xy => z and yx => -z (assume vec(a) is first, so xy means a_x b_y)
  • yz => x and zy => -x
  • zx => y and xz => -y

xy and yx fight it out in the z direction. If those terms are equal, such as in (2, 1, 0) × (2, 1, 1), there is no cross product component in the z direction (2 – 2 = 0).

The final combination is:

\displaystyle{(a_x, a_y, a_z) \times (b_x, b_y, b_z) = (a_y b_z - a_z b_y, a_z b_x - a_x b_z, a_x b_y - a_y b_x) = \|a\| \|b\| \sin(\theta) \vec{n}}

where vec(n) is the unit vector normal to vec(a) and vec(b).

Don’t let this scare you:

  • There’s 6 terms, 3 positive and 3 negative
  • Two dimensions vote on the third (so the z term must only have y and x components)
  • The positive/negative order is based on the xyzxyz pattern

If you like, there is an algebraic proof that the formula is both orthogonal and of size |a| |b| sin(θ), but I like the “proportional voting” intuition.
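
Short of the algebraic proof, you can spot-check both claims numerically. This sketch writes the component formula as a plain function (the names `dot`, `cross`, and `norm` are mine) and tests it on the (2, 1, 0) and (2, 1, 1) pair from earlier. A spot check on one pair is evidence, not a proof.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def norm(v):
    return math.sqrt(dot(v, v))

a = (2.0, 1.0, 0.0)
b = (2.0, 1.0, 1.0)
c = cross(a, b)

# Claim 1: the result is orthogonal to both inputs (both dot products are 0).
print(dot(c, a), dot(c, b))

# Claim 2: its size is |a| |b| sin(theta), with theta recovered from the dot product.
cos_theta = dot(a, b) / (norm(a) * norm(b))
sin_theta = math.sqrt(1 - cos_theta ** 2)
print(norm(c), norm(a) * norm(b) * sin_theta)
```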

Example Time

Again, we should do simple cross products in our head:

\displaystyle{(1, 0, 0) \times (0, 1, 0) = (0, 0, 1)}

Why? We crossed the x and y axes, giving us z (or vec(i) × vec(j) = vec(k), using those unit vectors). Crossing the other way gives -vec(k).

Here’s how I walk through more complex examples:

\displaystyle{(1, 2, 3) \times (4, 5, 6) = ?}

  • Let’s do the last term, the z-component. That’s (1)(5) minus (4)(2), or 5 – 8 = -3. I did z first because it uses x and y, the first two terms. Try seeing (1)(5) as “forward” as you scan from the first vector to the second, and (4)(2) as backwards as you move from the second vector to the first.
  • Now the y component: (3)(4) – (6)(1) = 12 – 6 = 6
  • Now the x component: (2)(6) – (5)(3) = 12 – 15 = -3

So, the total is (-3, 6, -3) which we can verify with Wolfram Alpha.

In short:

  • The cross product tracks all the “cross interactions” between dimensions
  • There are 6 interactions (2 in each dimension), with signs based on the xyzxyz order


Connection with the Determinant

You can calculate the cross product using the determinant of this matrix:

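\displaystyle{\mathbf{u \times v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}}

\displaystyle{\mathbf{u \times v} = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \mathbf{k}}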

There’s a neat connection here, as the determinant (“signed area/volume”) tracks the contributions from orthogonal components.

There are theoretical reasons why the cross product (as an orthogonal vector) is only available in 0, 1, 3 or 7 dimensions. However, the cross product as a single number is essentially the determinant (a signed area, volume, or hypervolume as a scalar).

Connection with Curl

Curl measures the twisting force a vector field applies to a point, and is measured with a vector perpendicular to the surface. Whenever you hear “perpendicular vector” start thinking “cross product”.

We take the “determinant” of this matrix:

\displaystyle{\begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}}

\displaystyle{\nabla \times \vec{F} = \left(\frac{\partial F_z}{\partial y}  - \frac{\partial F_y}{\partial z}\right) \vec{i} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \vec{j} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \vec{k}}

Instead of multiplication, the interaction is taking a partial derivative. As before, the vec(i) component of curl is based on the vectors and derivatives in the vec(j) and vec(k) directions.
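
To make "the interaction is a partial derivative" concrete, here is a rough numerical sketch. Everything in it is an assumption for illustration: the `curl` helper approximates each partial derivative with a central difference (step `h`), and `spin` is a made-up field that swirls around the z axis.

```python
def curl(F, p, h=1e-5):
    """Approximate the curl of F at point p using central differences.

    F maps (x, y, z) to a tuple (Fx, Fy, Fz).
    """
    def partial(component: int, axis: int) -> float:
        forward = list(p)
        backward = list(p)
        forward[axis] += h
        backward[axis] -= h
        return (F(*forward)[component] - F(*backward)[component]) / (2 * h)

    X, Y, Z = 0, 1, 2
    return (
        partial(Z, Y) - partial(Y, Z),  # i: dFz/dy - dFy/dz
        partial(X, Z) - partial(Z, X),  # j: dFx/dz - dFz/dx
        partial(Y, X) - partial(X, Y),  # k: dFy/dx - dFx/dy
    )

def spin(x, y, z):
    """A field that rotates around the z axis."""
    return (-y, x, 0.0)

print(curl(spin, (1.0, 2.0, 3.0)))  # approximately (0, 0, 2)
```

The swirl lives entirely in the x-y plane, so the only surviving vote is in the z direction, perpendicular to the rotation.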

Relation to the Pythagorean Theorem

The cross and dot product are like the orthogonal sides of a triangle:

\displaystyle{a^2 + b^2 = c^2 }

For unit vectors, where |a| = |b| = 1 , we have:

\displaystyle{ \|\text{dot product}\|^2 + \|\text{cross product}\|^2 = \cos^2(\theta) + \sin^2(\theta) = 1}

I cheated a bit in the grid diagram, as we have to track the squared magnitudes (as done in the Pythagorean Theorem).
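
A quick numerical check of the Pythagorean relationship, sketched with helper names of my own (`unit`, `dot`, `cross`) and two arbitrary starting vectors scaled to unit length:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def unit(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

a = unit((1.0, 2.0, 3.0))
b = unit((4.0, 5.0, 6.0))

similarity = dot(a, b)    # cos(theta)
difference = cross(a, b)  # vector of size sin(theta)

# Squared magnitudes add up to 1 (up to floating-point rounding).
total = similarity ** 2 + dot(difference, difference)
print(total)
```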

Advanced Math

The cross product & friends get extended in Clifford Algebra and Geometric Algebra. I’m still learning these.

Cross Products of Cross Products

Sometimes you’ll have a scenario like:

\displaystyle{\vec{a} \times \vec{b} \times \vec{c} = ? }

First, the cross product isn’t associative: order matters.

Next, remember what the cross product is doing: finding orthogonal vectors. If the two vectors being crossed are parallel (say vec(a) parallel to vec(b)) then there are no dimensions pushing on each other, and the cross product is zero (which carries through to 0 × vec(c)).

But it’s ok for vec(a) and vec(c) to be parallel, since they are never directly involved in a cross product, for example:

\displaystyle{\vec{i} \times \vec{j} \times \vec{i} = \vec{k} \times \vec{i} = \vec{j} }

Whoa! How’d we get back to vec(j)? We asked for a direction perpendicular to both vec(i) and vec(j), and made that direction perpendicular to vec(i) again. Being “doubly perpendicular” means you’re back on the original axis.
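
Here's a small sketch showing why the grouping matters. It reuses the component formula as a hypothetical `cross` helper and works with the unit axes as integer tuples.

```python
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# The example above, grouped left to right: (i x j) x i = k x i = j
print(cross(cross(i, j), i))  # (0, 1, 0)

# Same three inputs, different grouping, different answers:
print(cross(cross(i, i), j))  # (0, 0, 0)  parallel pair first, so everything dies
print(cross(i, cross(i, j)))  # (0, -1, 0) i x k = -j
```

Unless parentheses say otherwise, this sketch assumes the left-to-right reading used in the example.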

Dot Product of Cross Products

Now if we take

\displaystyle{\vec{a} \times \vec{b} \cdot \vec{c} = ? }

what happens? We’re forced to do vec(a) × vec(b) first, because vec(b) · vec(c) returns a scalar (single number) which can’t be used in a cross product.

If vec(a) and vec(c) are parallel, what happens? Well, vec(a) × vec(b) is perpendicular to vec(a), which means it’s perpendicular to vec(c), so the dot product with vec(c) will be zero.

I never really memorized these rules; I have to think through the interactions.

Other Coordinate Systems

The Unity game engine is left-handed, OpenGL (and most math/physics tools) are right-handed. Why?

In a computer game, x goes horizontal, y goes vertical, and z goes “into the screen”. This results in a left-handed system. (Try it: using your right hand, you can see x cross y should point out of the screen).

Applications of the Cross Product

  • Find the direction perpendicular to two given vectors.
  • Find the signed area spanned by two vectors.
  • Determine if two vectors are orthogonal (checking for a dot product of 0 is likely faster though).
  • “Multiply” two vectors when only perpendicular cross-terms make a contribution (such as finding torque).
  • With the quaternions (4d complex numbers), the cross product performs the work of rotating one vector around another (another article in the works!).

Happy math.

Intuition For The Law Of Cosines

The Law of Cosines is presented as a geometric result that relates the parts of a triangle:

\displaystyle{ c^2 = a^2 + b^2 - 2ab\cos(C) }

While true, there’s a deeper principle at work.

The Law of Interactions: The whole is based on the parts and the interaction between them.

The wording “Law of Cosines” gets you thinking about the mechanics of the formula, not what it means. Part of my learning strategy is rewording ideas into ones that make sense.

  • The Law of Cosines, after cranking through geometric steps we’re prone to forget, looks like c^2 = a^2 + b^2 – 2ab cos(C).

  • This is suspiciously like the expansion that if c = (a + b), then c^2 = a^2 + b^2 + 2ab

  • The difference is that 2ab has an extra factor, cos(C), which measures the “actual overlap percentage” (2ab assumes we fully overlap, i.e. where cos(C) = 1).

  • So, the Law of Cosines is really a generalization of how c^2 = (a + b)^2 expands when components aren’t fully lined up. We’re treating geometric lines as terms in an algebraic expansion.
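
If you want to convince yourself the formula matches actual geometry, here is a minimal sketch. The function names are mine, and the setup (vertex C at the origin, side a along the x-axis) is just one convenient way to place the triangle. It uses `math.dist`, which needs Python 3.8 or later.

```python
import math

def third_side(a: float, b: float, angle_c: float) -> float:
    """Side c opposite angle C (radians), from the Law of Cosines."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(angle_c))

def third_side_by_coordinates(a: float, b: float, angle_c: float) -> float:
    """Measure the same side directly by placing the triangle on a grid."""
    end_of_a = (a, 0.0)                                         # side a along the x-axis
    end_of_b = (b * math.cos(angle_c), b * math.sin(angle_c))   # side b at angle C
    return math.dist(end_of_a, end_of_b)

print(third_side(3, 4, math.pi / 2))                 # right angle: the familiar 3-4-5
print(third_side(5, 3, 1.0), third_side_by_coordinates(5, 3, 1.0))
```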

Analogy: The Assistant Chef

Imagine a restaurant with a single chef, Alice. She’s overworked, so Bob is hired as her assistant (sous chef).

Based on Alice’s current performance, and Bob’s performance in his interview, what happens when they work together?

Surely the new result must be their combined effort:

\displaystyle{ \text{Total Contribution} = \text{Alice's Work} + \text{Bob's Work}}

Hah! Office workers everywhere are rolling their eyes. You can’t just assume people contribute identically when they’re put together: there are interactions to account for.

Beyond their individual contributions, the two might slow each other down (Where’d you put the whisk again?), or find ways to work together (I’m peeling carrots anyway, use some of mine.).

In a system with several parts, start with the individual contributions and then ask if their interaction will:

  • Help each other
  • Hurt each other
  • Ignore each other

The original idea that “Total = Alice + Bob” is more generally expressed as:

\displaystyle{ \text{Total Contribution} = \text{Alice's Work} + \text{Bob's Work} + \text{Interaction effects} }

Exploring The Scenario

We need to separate the list of participants (Alice, Bob) from the result of their interaction.

Take the numbers 5 and 3. We can write them like so:

  • Parts = (5, 3)

and we’re pretty sure they combine to make 8. But is there another way to get that conclusion?

Yes: we multiply. Beyond repeated counting, multiplication shows what happens when the parts of a system interact:

\displaystyle{(5 + 3)(5 + 3) = 5^2 + (5)(3) + (3)(5) + 3^2 = 25 + 15 + 15 + 9 = 64 }

We’ve gone from “parts view”, (5, 3), to “interaction view”, (5 + 3)^2. The result of interaction mode says the system would result in 64 if it did interact with itself.

One caveat: when going to interaction view, we wrote down (5 + 3)(5 + 3), but we can’t simplify (5 + 3) = 8 at the outset. We’re using addition for bookkeeping until multiplication can combine the parts.

Oh, another caveat: why can we just add the interactions, but not the parts? Great question. The individual parts might be pointing in different dimensions, and don’t line up nicely on the same scale. The interacting parts turn into area, which can be combined to the same result no matter the orientation.

(I’ll investigate this concept more in a follow-up. It’s a neat idea that area is a generic, easily combinable quantity but individual paths are not.)

Generalizing the Principle

Simple setups like (5, 3) are easy to think through, like eyeballing 2x + 3 = 7 and guessing x = 2. But a more complex scenario like x^2 + 3x = 15 requires a systematic approach.

The Law of Cosines is a systematic approach to working through the parts:

  1. List the parts
  2. Get every interaction as area
  3. Add to find the total contribution
  4. Convert into the equivalent “single part”

The last step is often implied. Once we’ve merged the jumble of interactions, we want the single part that could represent the entire system. Is there a single person (Charlie) whose efforts are identical to that of Alice and Bob working together?

The Law of Cosines gives us a way to find Charlie.

What’s the Deal with Cosine?

When two parts interact, they can help, hurt, or ignore each other:

  • Perfect alignment means they help 100% (5 and 3)
  • Perfect mis-alignment means they hurt 100% (5 and -3)
  • Partial alignment or mis-alignment means they help or hurt by a percentage
  • No alignment means they ignore each other

How do we measure alignment? With cosine.

Using our trig analogy, cosine is the percentage an angle moves along the ground.

A 0-degree angle follows the ground perfectly (100%), and moving vertically doesn’t follow it at all (0%). Other angles are a fraction in-between.

If the parts in our system can be written as paths, and we know the angle between them is theta (θ), then we can measure the overlap with cosine. One path acts as the ground, and the other is the path we’re following:

\displaystyle{\text{Overlap percentage} = \cos(\text{angle between them}) }

\displaystyle{\text{Actual interaction} = \text{Max Interaction} \cdot \text{Overlap percentage} = ab\cos(\theta) }

When paths are perfectly aligned, their full strength is used (ab and ba). The interaction factor cos(θ) modifies that strength to show how much they actually work together.

So, our jumble of interactions becomes:

\displaystyle{\text{Overall behavior} = a^2 + b^2 + ab\cos(\theta) + ba\cos(\theta) }

\displaystyle{\text{Overall behavior} = a^2 + b^2 + 2ab\cos(\theta)}

\displaystyle{\text{Single part} = \sqrt{ a^2 + b^2 + 2ab\cos(\theta)}}

Phew! And that’s the Law of Cosines: collect every interaction, account for the alignment, and simplify it to a single part. (The formula is usually written without the square root, but usually you want c, not c^2.)
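The recipe above (collect the interactions, scale the crossover terms by the overlap, reduce to a single part) can be sketched in a few lines. The name `single_part` and the choice to take the angle in degrees are assumptions made for this example.

```python
import math

def single_part(a, b, theta_degrees):
    """Equivalent single side for parts a and b with alignment angle theta."""
    overlap = math.cos(math.radians(theta_degrees))  # overlap percentage
    overall = a**2 + b**2 + 2 * a * b * overlap      # self + crossover interactions
    return math.sqrt(max(overall, 0.0))              # guard against tiny float error

print(single_part(10, 20, 0))    # fully aligned parts
print(single_part(10, 20, 180))  # fully mis-aligned parts
```

Here theta is the external alignment angle, so positive overlap helps and negative overlap hurts.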

Now, why is the Law of Cosines often written with a negative sign? Well, the assumption is that in a typical triangle, a small internal angle C means the sides are negatively aligned, while theta (θ) is an external look at their alignment:

Similarly, a large internal angle means the sides are positively aligned, and will help each other. Typically, a small angle means you’re moving in the same direction, but this internal/external difference means we reverse the sign.

Personally, I don’t memorize whether there’s a positive or negative sign: I think about whether the parts will help or hurt each other in the scenario, and make the interaction positive or negative. Don’t be a slave to the formula.

Quick Practice Problem

Let’s say my triangle has side a = 10 and side b = 20. What is side c when the angle between a and b is:

45 degrees in alignment

Here, we need the Law of Cosines. a and b are pointing partially in the same direction. We switch to interaction mode to get to a common, combinable unit (area):

  • a^2 = 100
  • b^2 = 400
  • 2ab = 2 · 10 · 20 = 400, but we need to adjust by the interaction factor. That is cos(45) = .707, so the real interaction is 400 · .707 = 282.8

The overall interactions are:

\displaystyle{100 + 400 + 282.8 = 782.8}

and the equivalent single side (c) is:

\displaystyle{ \sqrt{782.8} = 27.98 \text{ cm} }

70 degrees in mis-alignment

Again, we need the Law of Cosines. We can see that the sides fight each other, so the interaction will be negative:

\displaystyle{\text{total interaction} = a^2 + b^2 - 2ab\cos(\theta) = 100 + 400 - (2)(10)(20)\cos(70) = 363.19}

\displaystyle{ c = \sqrt{363.19} = 19.06 \text{ cm} }

Our intuition says this arrangement should be smaller than the previous one (since the sides aren’t working together), and it is.
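As a sanity check on the two cases so far, here is a small sketch that picks the sign by asking whether the sides help or hurt each other, rather than baking it into the formula. The `aligned` flag and the function name `side_c` are illustrative choices, not standard notation.

```python
import math

a, b = 10, 20

def side_c(a, b, angle_degrees, aligned=True):
    """Combine two sides; the crossover term helps if aligned, hurts if not."""
    cross = 2 * a * b * math.cos(math.radians(angle_degrees))
    total = a**2 + b**2 + (cross if aligned else -cross)
    return math.sqrt(total)

c_aligned = side_c(a, b, 45, aligned=True)      # sides partially help
c_misaligned = side_c(a, b, 70, aligned=False)  # sides partially hurt

# Print both sides rounded to two decimals, then confirm the intuition
print(round(c_aligned, 2), round(c_misaligned, 2))
print(c_misaligned < c_aligned)
```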

Full alignment or mis-alignment

When our “triangle” has an angle of 0 degrees (or 180), all the parts are lying flat. Here, the parts are in the same dimension, and can be treated as regular numbers:

  • Fully aligned: 10 + 20 = 30
  • Fully mis-aligned: 10 – 20 = -10 (pointing in direction of b).

The Law of Cosines still works, of course:

  • Full alignment: a^2 + b^2 + 2ab cos(θ) = 100 + 400 + 400 cos(0) = 900 and c = √(900) = 30
  • Full mis-alignment: a^2 + b^2 + 2ab cos(θ) = 100 + 400 + 400 cos(180) = 100 which means c = √(100) = 10 (pointing backwards).

Again, we shouldn’t robotically follow the formula: have a rough idea what the result should be, and think through the calculations. (“The overall interaction is this, so the individual side would be that…”).

Thinking of interactions is one interpretation: next time, we’ll see it as the Law of Projections.

Happy math.

Appendix: Pythagorean Theorem

The Law of Cosines resembles the Pythagorean Theorem, no?

Now you might suspect why. The Pythagorean Theorem is the special case of zero interaction, which happens when the sides are at right angles. After all, a 90-degree angle is vertical, and has 0% overlap with the ground.

The Law of Cosines becomes:

\displaystyle{ c^2 = a^2 + b^2 + \text{interaction} }

\displaystyle{ c^2 = a^2 + b^2 + 0}

If we know the parts won’t interact, we can ignore interaction effects. However, the self-interactions are still there and must be combined: a^2 and b^2 are fine, but the crossover terms ab and ba disappear.

Here’s another version of the Pythagorean Theorem. We can’t combine a and b directly, so combine their interactions and reduce them to a single part:

\displaystyle{c = \sqrt{a^2 + b^2} }
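To watch the crossover terms fade out as the sides approach a right angle, here is a short sketch. The sides 3 and 4 are sample values chosen for this example, and `interaction` is a made-up helper name.

```python
import math

def interaction(a, b, theta_degrees):
    """Crossover terms ab + ba, scaled by the overlap percentage."""
    return 2 * a * b * math.cos(math.radians(theta_degrees))

a, b = 3, 4  # sample sides, not from the article

# The crossover contribution shrinks as the angle approaches 90 degrees
for theta in (0, 45, 90):
    print(theta, round(interaction(a, b, theta), 6))

# With no interaction left, only the self-interactions remain
c_right = math.sqrt(a**2 + b**2 + interaction(a, b, 90))
print(c_right, math.hypot(a, b))
```

In floating point the 90-degree interaction is a tiny number rather than an exact zero, which is why the check rounds it.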

Appendix: The Geometric Proof

You might be hankering for a geometric proof. Here’s one from Quora, based on a paper by Knuth:

The insight is that we take our original a-b-c triangle and scale it by a (giving the a^2-ab-ac triangle) and b (giving the ab-b^2-bc triangle). These two triangles build a larger, similar triangle ac-bc-c^2, and with some trig, the bottom portion can be shown to equal a^2 + b^2 – 2ab cos(θ).

While interesting, I don’t like these types of proofs up front. The Law of Cosines is about interactions, not re-arranging triangles. Does this explanation get you thinking about what cosine represents? About when it should be positive, negative, or zero?

Appendix: Another Way to Remember

Imagine sides a and b are pointing in the same direction along the horizontal number line. This means c = a + b and the Law of Cosines reduces to:

\displaystyle{(a + b)^2 = a^2 + b^2 + 2ab}

So, for a 180-degree interior angle, we get a regular algebraic statement. This helps me remember, on the fly, when to add vs. subtract. We add 2ab cos(θ) when the interior angle is large.

ADEPT Summary

Concept Law of Cosines
Analogy Imagine an assistant chef whose interactions may (or may not) be helpful.
Example Suppose a = 10 and b = 20 in a triangle. If they are aligned 45-degrees, their interaction is a^2 + b^2 + 2ab cos(45) = 782.8 and the remaining side is √(782.8) = 27.98 units long.
Plain-English The Law of Interactions: The whole is based on the parts and the interaction between them.
Technical Triangle with internal angle C: c^2 = a^2 + b^2 – 2ab cos(C)
General interaction: c^2 = a^2 + b^2 + 2ab cos(θ)

Intuition For The Law Of Sines

The Law Of Sines is something I memorized in a class once, but didn’t internalize:

\displaystyle{\frac{\sin(A)}{a} = \frac{\sin(B)}{b} = \frac{\sin(C)}{c} }

Ok, that’s a neat connection, and maybe we can prove it by drawing some right triangles (of course) and re-arranging terms.

But what does it mean?

Rather than the Law of Sines, think of the Law of Equal Perspectives:

Each angle & side can independently find the circle that wraps up the whole triangle. This connection lets us start with one angle and work out facts about the others.

Analogy: Kids Describing A Monster

I occasionally frighten the neighborhood children by unchaining the mutant gorilla in my front yard.

The kids run screaming, telling different stories of what they’ve seen:

“Alice claims the monster was 20 feet tall, but we all know she exaggerates by doubling. And Billy’s a bit of a crybaby, and said it was 30 feet tall. Charlie’s fairly no-nonsense and said the beast was exactly 10 feet high.”

If we know a kid’s “exaggeration factor” and the size they claim, we can deduce the true size of the monster. (Furious George has a name, you know.)

Even better, we can predict what other kids might have said: If Alice claimed it was 40 feet, what would Charlie have said?
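The monster arithmetic fits in two one-line functions. This is a toy sketch: the function names are invented, and Billy’s factor of 3 is inferred from his 30-foot claim about a 10-foot monster rather than stated in the story.

```python
def true_size(claimed, exaggeration):
    """Undo a kid's exaggeration to recover the real height."""
    return claimed / exaggeration

def predicted_claim(actual, exaggeration):
    """What a kid with this exaggeration factor would report."""
    return actual * exaggeration

# Exaggeration factors: Alice doubles, Charlie is no-nonsense,
# Billy's factor is inferred (30 claimed / 10 actual)
factors = {"Alice": 2, "Billy": 3, "Charlie": 1}

print(true_size(20, factors["Alice"]))  # Alice's 20-foot claim

# If Alice had claimed 40 feet, what would Charlie have said?
actual = true_size(40, factors["Alice"])
charlie_says = predicted_claim(actual, factors["Charlie"])
print(charlie_says)
```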

Triangles And The Monster Circle

What do kids running from monsters have to do with triangles? Well, every triangle is trapped inside its own Monster Circle:

Whatever triangle we draw, there’s some circle trying to gobble it up (technically, “circumscribe it”). Try this page to explore an example on your own.

Now here’s the magic: just knowing a single angle and its corresponding side, we can figure out the Monster Circle.

Here’s how. Let’s say we have a triangle like this:

We don’t know anything except the angle A (call it 30 degrees) and the length of side a (call it an inch).

First off: is this the correct drawing of the triangle? Probably not! We don’t know the other sides, so this is equally valid:

It still has the same angle (A = 30 degrees) and the size of the base hasn’t changed (still one inch).

What if we start drawing more possibilities?

Whoa. From A’s point of view, all the possible triangles that have “A=30 degrees, a=1 inch” are on this circle. Whatever B and C end up being, they need to pick an option from this circle.

Similarly, we can argue this from the other perspectives:

  • We can lock down angle B and side b, and trace out a circle of possibilities
  • We can lock down angle C and side c, and trace out a circle of possibilities

This is the meaning of the Law of Sines: each angle unknowingly generates the same circle as the others.

(How do we prove, not just see, that the possibilities lie on a circle? That’s the Inscribed Angle Theorem, for another day.)

Calculating The Actual Size

We’ve figured out that there is a Monster Circle, now let’s see how big it is. Um… how?

Remember, we can slide around the circle and keep A (30 degrees) and a (1 inch) the same. So let’s slide until we make a right triangle:

Ah! Now we can use sine. Remember that sine is the percentage height compared to the max possible. The max possible height is the full diameter (d) of the Monster Circle.

(Why is a 90-degree angle across from the full diameter? Draw a square inside the circle, touching the sides. It must be symmetric, the diagonals pass through the center along the diameter, and are opposite a 90-degree angle.)

With a little re-arranging, we get:

\displaystyle{ \frac{a}{\sin(A)} = d }

Using the same logic for the other sides, we get:

\displaystyle{ \frac{a}{\sin(A)} = \frac{b}{\sin(B)} = \frac{c}{\sin(C)} = d }

In a way, sin(A) is the “exaggeration factor” that converts the size the angle measured (a) to the full diameter (d). Each angle is a different kid, and some really misjudge the size of the full circle based on what they see. (90-degrees is right on target.)

Practice Problem

In our example above, A is 30 degrees and a is 1 inch.

We can calculate the diameter pretty fast. First, we get the sine:

\displaystyle{\sin(30) = 0.5}

That means our length a is 50% of the max height, so the full diameter must be 2 inches.

This isn’t enough to figure out the triangle by itself. Let’s say angle B comes along and says it is 45 degrees. How long is b?


\displaystyle{\sin(45) = .707}

which means that b is .707 of the max diameter. Therefore,

\displaystyle{ b = .707 \cdot \text{2 inches} =  \text{1.414 inches}}

Previously, I would plug numbers into the Law of Sines formula and chug away algebraically. Now I can think in terms of the Monster Circle: “Ok, I have the max diameter. I take the sine, and get the fraction of the max diameter for that side.”
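The Monster Circle way of thinking translates directly into code: one function to find the diameter from any angle and side pair, and one to read off another side. The function names are illustrative, and angles are assumed to be in degrees.

```python
import math

def circle_diameter(side, angle_degrees):
    """Any side divided by the sine of its opposite angle gives the diameter."""
    return side / math.sin(math.radians(angle_degrees))

def side_from_angle(diameter, angle_degrees):
    """A side is the sine's fraction of the full diameter."""
    return diameter * math.sin(math.radians(angle_degrees))

d = circle_diameter(1, 30)  # A = 30 degrees, a = 1 inch
b = side_from_angle(d, 45)  # B = 45 degrees

print(round(d, 3), round(b, 3))
```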

Most books write the formula with sin(A) in the numerator. It might read better (“sine A over a”) but it obscures the conclusion that a / sin(A) is the size of the circle.

Put the concept in your own words. The “Law of Sines” is a generic description of what’s in the formula, but the “Law of Equal Perspectives” explains what it means:

  • All parts of the triangle have a perspective on the whole
  • Sine is the “exaggeration factor” that scales up an individual side to the full diameter. (Sine is the percentage of the max possible, and we divide by it.)

Happy math.

Appendix: Obtuse Angles

Technically, because B is over 90 degrees, we can’t ever spin it and have either A or C be a right angle (if we could, the triangle would have over 180 degrees).

What to do? Realize the 180-degree complement of B (call it B*) acts like a stand-in on the other side:

B* has the same sine as B, which should make sense: they both point upwards along the same trajectory. To help us sleep better at night, we start with B* in the right-angle setup:

\displaystyle{\sin(B*) = \sin(B) = \frac{b}{d} }

and get to the same conclusion as before. Phew.

However, the fact that B and B* can be swapped can lead to problems.

If I have a triangle where I know A (30 degrees) and a (1 inch), and then say b is 1.5 inches, what can you deduce?

The max diameter is 2 inches as before, so

\displaystyle{\sin(B) = \frac{1.5}{2} = .75}

Unfortunately, there are two angles with that sine value: a calculator says sin^-1(.75) ≈ 48.6 degrees, but 180 – 48.6 = 131.4 degrees would work too (more details).

Also, the triangle may not be possible given a hypothetical scenario. If I say b is 3 inches, you know something’s amiss. The max diameter was already calculated to be 2. Even a 90-degree angle, the best possible, could only have a side of 2 inches.
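Both pitfalls (two candidate angles, or no triangle at all) can be caught with a small helper. This is a sketch under simple assumptions: `possible_angles` is a made-up name, and it does not check whether a candidate angle still leaves room for the triangle’s other angles.

```python
import math

def possible_angles(side, diameter):
    """Angles (in degrees) whose sine equals side / diameter."""
    ratio = side / diameter
    if ratio > 1:
        return []  # side is longer than the diameter: no such triangle
    acute = math.degrees(math.asin(ratio))
    if math.isclose(acute, 90.0):
        return [90.0]  # the right angle is its own stand-in
    return [acute, 180 - acute]

angles = possible_angles(1.5, 2)  # b = 1.5 inches, diameter = 2 inches
print([round(x, 1) for x in angles])

print(possible_angles(3, 2))  # b = 3 inches cannot fit in a 2-inch circle
```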

ADEPT Summary

ADEPT Topic Law of Sines
Analogy Imagine kids describing the same monster with varying degrees of exaggeration.
Example Suppose A=30 and a=1 inch. Since sin(A) = 0.5, the Monster Circle is 1 / 0.5 = 2 inches wide. Given another angle, I can figure out the length of its side. If B = 45 degrees, then side b takes up sin(45) = .707 of the diameter, and is 1.414 inches.
Plain-English Any angle + side can deduce the size of the wrapping circle.
Technical \displaystyle{\frac{a}{\sin(A)} = \frac{b}{\sin(B)} = \frac{c}{\sin(C)} = \text{diameter of circle} }

Learn Difficult Concepts with the ADEPT Method

After a decade of writing explanations, I’ve simplified the strategy I use to get new concepts to click.

Make explanations ADEPT: Use an Analogy, Diagram, Example, Plain-English description, and then a Technical description.

ADEPT method of learning

Here’s how to teach yourself a difficult idea, or explain one to others.

Analogy: What Else Is It Like?

Most new concepts are variations, extensions, or combinations of what we already know. So start there!

In our decades of life, we’ve encountered thousands of objects and experiences. Surely one of them is vaguely similar to this new topic and can be the starting point.

Here’s an example: Imaginary numbers. Most lessons introduce them in a void, simply saying “negative numbers can have square roots too.”

Argh. How about this:

  • Negative numbers were distrusted until the 1700s: How could you have less than nothing?
  • We overcame this by realizing numbers could exist on a number line, allowing us to move forward or backward from zero.
  • Imaginary numbers express the idea that we can move upwards and downwards, or rotate around the number line.

Instead of just going East/West, we can go North/South too – or even spin around in a circle. Neat!

Analogies are fuzzy, not 100% accurate, and yet astoundingly useful. They’re a raft to get across the river, and leave behind once you’ve crossed.

Diagram: Engage That Half Of Your Brain

We often think diagrams are a crutch if you aren’t macho enough to directly interpret the symbols. Guess what? Academic progress on imaginary numbers took off only after the diagrams were made!

Favor the easiest-to-absorb explanation, whether that comes from text, diagram, or interpretative dance. From there, we can work to untangle the symbols.

So, here’s a visualization:

imaginary numbers

Imaginary numbers let us rotate around the number line, not just move side-to-side.

Starting to get a visceral sense for what they can do, right?

Half our brain is dedicated to vision processing, so let’s use it. (And hey, maybe for this topic, twirling around in an interpretative dance would help.)

Example: Let Me Experience The Idea

Oh, now’s our chance to hit the student with the fancy terminology, right?

Nope. Don’t tell someone the way things are: let them experience it. (How fun is hearing about the great dinner I had last night? The movie you didn’t get to see?)

But that’s what we do for math. “Someone smarter than you thought this through, found out all the cool connections, and labeled the pieces. Memorize what they discovered.”

That’s no fun: let people make progress themselves. Using the rotation analogy, what happens after 4 turns?

How about 2 turns? 4 turns clockwise?

Plain-English Description: Use Your Own Words

If you genuinely experienced an idea, you should be excited to describe it:

  • Imaginary numbers seem to point North, and we can get to them with a single counter-clockwise turn.
  • Oh! I guess they can point South too, by turning the other way.
  • 4 turns gets us pointing in the positive direction again
  • It seems like two turns points us backwards

These are all correct conclusions, just not yet written in the language of math. But you can still reason in plain English!

Technical Description: Learn The Formalities

The final step is to convert our personal understanding to the formal notation. It’s like sharing a song you’ve made up: you can hum it to yourself, but need sheet music for other people to use.

Math is the sheet music we’ve agreed upon to share ideas. So, here’s the technical terminology:

  • We say i (lowercase) is 1.0 in the imaginary dimension
  • Multiplying by i is a 90-degree counter-clockwise turn, to face “up” (here’s why). Multiplying by -i points us South
  • It’s true that starting at 1.0 and taking 4 turns puts us at our starting point:

\displaystyle{1 * i * i * i * i = 1 }

And two turns points us negative:

\displaystyle{1 * i * i = -1 }

which simplifies to:

\displaystyle{i^2 = -1}


\displaystyle{i = \sqrt{-1}}

In other words, i is “halfway” to -1. (Square roots find the halfway point when using multiplication.)
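If you want to experience the turns yourself, Python’s built-in complex numbers (where i is written `1j`) make a handy sandbox. A minimal sketch:

```python
import cmath

i = 1j  # Python's spelling of the imaginary unit

four_turns = 1 * i * i * i * i  # four quarter-turns
two_turns = 1 * i * i           # two quarter-turns

print(four_turns == 1)   # back at the starting point
print(two_turns == -1)   # pointing backwards

# The "halfway to -1" idea: the square root of -1 is one quarter-turn
print(cmath.sqrt(-1) == i)
```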

Starting to get a feel for it? Just spitting out “i is the square root of -1” isn’t helpful. It’s not explaining, it’s telling. Nothing was experienced, nothing was internalized.

Give people the chance to make an idea their own.

The Mental Checklist

I used to be satisfied with a technical description and practice problem. Not anymore.

ADEPT is a checklist of what I need to feel comfortable with an idea. I don’t think I’ve actually learned a topic unless I have a metaphor that ties everything together. Here are a few places to look:

Unfortunately, there aren’t many resources focused on analogies, especially for math, so you have to make your own. (This site exists to share mine.)

Modifying the Learning Order

It seems logical to assume we can present facts in order, like transmitting data to a computer. But who actually learns like that?

I prefer the blurry-to-sharp approach to teaching:

Start with a rough analogy and sharpen it until you’re covering the technical details.

Sometimes, you need to untangle a technical description on your own, so you must work backwards to the analogy.

Starting with the technical details:

  • Can you explain them in your own words?
  • Can you solve an example problem, describing the steps in your own words?
  • Can you create a diagram that represents how the concept fits together for you?
  • Can you relate the concept to what you already know?

With this initial analogy, layer in new details and examples, and see if it holds up. (It doesn’t need to be perfect, but iterate.)

If we’re honest, we’ll admit that we forget 95% of what we learn in a class. What sticks? A scattered analogy or diagram. So, make them for yourself, to bootstrap the rest of the understanding as needed.

In a year, you probably won’t remember much about imaginary numbers. But the quick analogy of “rotation” or “spinning” might trigger a flurry of recognition.

The Goal: Explanations That Actually Work

I’m wary of making a contrived acronym, but ADEPT does capture what I need to internalize a new concept. Let’s stop being shy about thinking out loud: does a fact-only presentation really work for you? What other components do you need? I have a soft, squishy brain that needs the connecting glue, not just data.

Scott Young uses the Feynman Technique to explain concepts in everyday words and work backwards to an analogy and diagram. (Richard Feynman was a world-class expositor and physicist, and one of my teaching heroes.)

Prof. Barb Oakley runs an excellent, free course on Learning How To Learn. I was honored to do an interview with her for the class:

Click to watch the interview — I recommend the full course. The first session had over 180,000 students and was a great success.

Beyond any technique, raise your standards to find (or create) explanations that truly work for you. It’s the only way to have concepts stick.

Happy math.


“BE” is a nice prefix for the style to use when teaching:

  • Brevity is beautiful.

  • Empathy makes us human. Use your natural style, relate to common experience, and anticipate questions in your explanation.

I’ve yet to complain that a lesson respected my time too much, or related too well to how I thought.

Appendix: ADEPT Summaries

ADEPT is like a nutrition label for an explanation: what are the key ingredients?

Concept Euler’s Formula
Analogy Imaginary numbers spin exponential growth into a circle.
Example Let’s figure out the value of 3^i. (It’s on the unit circle.)
Plain-English Raising an exponent to an imaginary power spins you on the unit circle. The same destination can be written with polar (distance and angle) or rectangular coordinates (real part and imaginary part).
Technical \displaystyle{e^{ix} = \cos(x) + i\sin(x)}

Concept Fourier Transform
Analogy Like filtering a smoothie into ingredients, the Fourier Transform extracts the circular paths within a pattern.
Diagram Smoothie being filtered:
Example Split the sequence (4 0 0 0) into circular components:
Plain-English / Technical

Concept Distributed Version Control
Analogy Distributed Version Control is like sharing changes to a group shopping list with your friends.
Diagram / Example
Plain-English We check out, check in, branch, and share differences (“diffs”).
Technical git checkout -b branchname
git diff branchname

Combine ingredients with your own style. Steps might merge, but shouldn’t be skipped without a good reason (“Zombies coming, no time for biochem, use this serum for the cure.”). The site cheatsheet has a large collection of analogies.
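As one way to “experience” the Euler’s Formula summary above, here is a quick numerical spot-check. The angle 0.75 is an arbitrary value picked for this sketch.

```python
import cmath
import math

x = 0.75  # arbitrary angle in radians

# Technical row: e^(ix) = cos(x) + i*sin(x)
lhs = cmath.exp(1j * x)
rhs = complex(math.cos(x), math.sin(x))
print(abs(lhs - rhs) < 1e-12)

# Example row: 3^i should land on the unit circle (distance 1 from zero)
z = 3 ** 1j
print(abs(z))
```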

Learning math? Think like a cartoonist.

What’s the essential skill of a cartoonist? Drawing ability? Humor? A deep well of childhood trauma?

I’d say it’s an eye for simplification, capturing the essence of an idea.

For example, let’s say we want to understand Ed O’Neill:

A literal-minded artist might portray him like this:

While the technical skill is impressive, does it really capture the essence of the man? Look at his eyes in particular.

A cartoonist might draw this:

Wow! The cartoonist recognizes:

  • The unique shape of his head. Technically, his head is an oval, like yours. But somehow, making his jaw wider than the rest of his head is perfect.

  • The wide-eyed bewilderment. The whites of his eyes, the raised brows, the pursed lips – the cartoonist saw and amplified the emotion inside.

So, who really “gets it”? It seems the technical artist worries more about the shading of his eyes than the message they contain.

Numbers Began With Cartoons

Think about the first numbers, the tally system:


Those are… drawings! Cartoons! Caricatures of an idea!

They capture the essence of “existing” or “having something” without the specifics of what it represents.

Og the Caveman Accountant might have tried drawing individual stick figures, buffalos, trees, and so on. Eventually he might realize a shortcut: draw a line and call it a buffalo. This captures the essence of “something is there” and our imaginations do the rest.

Math is an ongoing process of simplifying ideas to their cartoon essence. Even the beloved equals sign (=) started as a drawing of two identical lines, and now we can write “3 + 5 = 8” instead of “three plus five is equal to eight”. Much better, right?

So let’s be cartoonists, seeing an idea — really capturing it — without getting trapped in technical mimicry. Perfect reproductions come in after we’ve seen the essence.

Technically Correct: The Worst Kind Of Correct

We agree that multiplication makes things bigger, right?

Ok. Pick your favorite number. Now, multiply it by a random number. What happens?

  • If that random number is negative, your number goes negative
  • If that random number is between 0 and 1, your number is destroyed or gets smaller
  • If that random number is greater than 1, your number will get larger

Hrm. It seems multiplication is more likely to reduce a number. Maybe we should teach kids “Multiplication generally reduces the original number.” It’ll save them from making mistakes later.

No! It’s a technically correct and real-life-ily horrible way to teach, and will confuse them more. If the technically correct behavior of multiplication is misleading, can you imagine what happens when we study the formal definitions of more advanced math?

There’s a fear that without every detail up front, people get the wrong impression. I’d argue people get the wrong impression because you provide every detail up front.

As George Box wrote, “All models are wrong, but some are useful.”

A knowingly-limited understanding (“Multiplication makes things bigger”) is the foothold to reach a more nuanced understanding. (“People generally multiply positive numbers greater than 1, so multiplication makes things larger. Let’s practice. Later, we’ll explore what happens if numbers are negative, or less than one.”)


I wrap my head around math concepts by reducing them to their simplified essence:

  • Imaginary numbers let us rotate numbers. Don’t start by defining i as the square root of -1. Show how if negative numbers represent a 180-degree rotation, imaginary numbers represent a 90-degree one.

  • The number e is a little machine that grows as fast as it can. Don’t start with some arcane technical definition based on limits. Show what happens when we compound interest with increasing frequency.

  • The Pythagorean Theorem explains how all shapes behave (not just triangles). Don’t whip out a geometric proof specific to triangles. See what circles, squares, and triangles have in common, and show that the idea works for any shape.

  • Euler’s Formula makes a circular path. Don’t start by analyzing sine and cosine. See how exponents and imaginary numbers create “continuous rotation”, i.e. a circle.

Avoid the trap of the guilty expert, pushed to describe every detail with photorealism. Be the cartoonist who seeks the exaggerated, oversimplified, and yet accurate truth of the idea.

Happy math.

PS. Here’s my cheatsheet full of “cartoonified” descriptions of math ideas.

How To Think With Exponents And Logarithms

Here’s a trick for thinking through problems involving exponents and logs. Just ask two questions:

Are we talking about inputs (cause of the change) or outputs (the actual change that happened)?

  • Logarithms reveal the inputs that caused the growth
  • Exponents find the final result of growth

Are we talking about the grower’s perspective, or an observer’s?

  • e and the natural log are from the grower’s instant-by-instant perspective
  • Base 10, Base 2, etc. are measurements convenient for a human observer

In my head, I put the options in a table:

exponent points of view

I have thoughts like “I need the cause, from the grower’s perspective… that’s the natural log.” (Natural log is abbreviated with lowercase LN, from the high-falutin’ logarithmus naturalis.)
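Here is one way to read the two questions as code, using functions from Python’s `math` module. The doubling example and the variable names are assumptions for illustration.

```python
import math

growth = 2.0  # an observed output: something doubled

# Inputs (causes): logarithms, from each perspective
cause_grower = math.log(growth)         # natural log: grower's view
cause_observer_10 = math.log10(growth)  # base 10: observer's view
cause_observer_2 = math.log2(growth)    # base 2: observer's view

# Outputs (results): exponents turn each cause back into the growth
result_grower = math.exp(cause_grower)
result_observer_10 = 10 ** cause_observer_10
result_observer_2 = 2 ** cause_observer_2

print(cause_grower, cause_observer_10, cause_observer_2)
print(result_grower, result_observer_10, result_observer_2)
```

Each perspective describes the same doubling; only the unit used to measure the cause changes.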

I was frustrated with classes that described the inner part of the table, the raw functions, without the captions that explained when to use them!

That won’t fly: let’s get direct practice thinking with logs and exponents.

Scenario: Describing GDP Growth

Here’s a typical example of growth:

  • From 2000 to 2010, the US GDP changed from 9.9 trillion to 14.4 trillion

Ok, sure, those numbers show change happened. But we probably want insight into the cause: What average annual growth rate would account for this change?

Immediately, my brain thinks “logarithms” because we’re working backwards from the growth to the rate that caused it. I start with a thought like this:

\displaystyle{\text{logarithm of change} \rightarrow \text{cause of growth} }

A good start, but let’s sharpen it up.

First, which logarithm should we use?

By default, I pick the natural logarithm. Most events end up being in terms of the grower (not observer), and I like “riding along” with the growing element to visualize what’s happening. (Radians are similar: they measure angles in terms of the mover.)

Next question: what change do we apply the logarithm to?

We’re really just interested in the ratio between start and finish: 9.9 trillion to 14.4 trillion in 10 years. This is the same growth rate as going from $9.90 to $14.40 in the same period.

We can sharpen our thought:

\displaystyle{\text{natural logarithm of growth ratio} \rightarrow \text{cause of growth} }

\displaystyle{\ln(\frac{14.4}{9.9}) = .374}

Ok, the cause was a rate of .374 or 37.4%. Are we done?

Not yet. Logarithms don’t know about how long a change took (we didn’t plug in 10 years, right?). They give us a rate as if all the change happened in a single time period.

The change could indeed be a single year of 37.4% continuous growth, or 2 years of 18.7% growth, or some other combination.

From the scenario, we know the change took 10 years, so the rate must have been:

\displaystyle{ \text{rate} = \frac{.374}{10} = .0374 = 3.74\%}

From the viewpoint of instant, continuous growth, the US economy grew by 3.74% per year.
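The two steps so far (take the natural log of the growth ratio, then spread it over the time period) look like this as a sketch. The variable names are illustrative.

```python
import math

start, end, years = 9.9, 14.4, 10  # GDP in trillions, 2000 to 2010

# Logarithm of the growth ratio -> total cause of growth
total_cause = math.log(end / start)

# Spread the cause evenly over the time the change took
continuous_rate = total_cause / years

print(round(total_cause, 3), round(continuous_rate, 4))
```

Because only the ratio matters, swapping in 9.90 and 14.40 dollars gives the same rate.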

Are we done now? Not quite!

This continuous rate is from the grower’s perspective, as if we’re “riding along” with the economy as it changes. A banker probably cares about the human-friendly, year-over-year difference. We can figure this out by letting the continuous growth run for a year:

\displaystyle{\text{exponent with rate and time} \rightarrow \text{effect of growth} }

\displaystyle{e^{\text{rate} \cdot \text{time}} = \text{growth}}

\displaystyle{e^{.0374 \cdot 1} = 1.0381}

The year-over-year gain is 3.81%, slightly higher than the 3.74% instantaneous rate due to compounding. Here’s another way to put it:

  • On an instant-by-instant basis, a given part of the economy is growing by 3.74%, modeled by e^(.0374 · years)
  • On a year-by-year basis, with compounding effects worked out, the economy grows by 3.81%, modeled by 1.0381^years

In finance, we may want the year-over-year change which can be compared nicely with other trends. In science and engineering, we prefer modeling behavior on an instantaneous basis.
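To see that the grower’s and the observer’s models describe the same change, here is a sketch that runs both for the full ten years. It recomputes the continuous rate from the GDP figures so the block stands alone.

```python
import math

start, end, years = 9.9, 14.4, 10
continuous_rate = math.log(end / start) / years  # grower's perspective

# Let the continuous growth run for one year: observer's yearly factor
yearly_factor = math.exp(continuous_rate * 1)

# Both models should land on the same ten-year result
via_continuous = start * math.exp(continuous_rate * years)
via_yearly = start * yearly_factor ** years

print(round(yearly_factor, 4))
print(round(via_continuous, 2), round(via_yearly, 2))
```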

Scenario: Describing Natural Growth

I detest contrived examples like “Assume bacteria doubles every 24 hours, find its growth formula.” Do bacteria colonies replicate on clean human intervals, and do we wait around for an exact doubling?

A better scenario: “Hey, I found some bacteria, waited an hour, and the lump grew from 2.3 grams to 2.32 grams. I’m going to lunch now. Figure out how much we’ll have when I’m back in 3 hours.”

Let’s model this. We’ll need a logarithm to find the growth rate, and then an exponent to project that growth forward. Like before, let’s keep everything in terms of the natural log to start.

The growth rate is:

\displaystyle{\text{logarithm of change} \rightarrow \text{cause of growth} }

\displaystyle{\ln(\text{growth}) = \ln(2.32/2.3) = .0086 = .86\%}

That’s the rate for one hour, and the general model to project forward will be

\displaystyle{\text{exponent with rate and time} \rightarrow \text{effect of growth} }

\displaystyle{e^{.0086 \cdot \text{hours}} \rightarrow \text{effect of growth} }

If we start with 2.32 and grow for 3 hours we’ll have:

\displaystyle{2.32 \cdot e^{.0086 \cdot 3} = 2.38}

Just for fun, how long until the bacteria doubles? Imagine waiting for 1 to turn to 2:

\displaystyle{1 \cdot e^{.0086 \cdot \text{hours}} = 2}

We can mechanically take the natural log of both sides to “undo the exponent”, but let’s think intuitively.

If 2 is the final result, then ln(2) is the growth input that got us there (some rate × time). We know the rate was .0086, so the time to get to 2 would be:

\displaystyle{ \text{hours} = \frac{\ln(2)}{\text{rate}} = \frac{.693}{.0086} = 80.58}

The colony will double after ~80 hours. (Glad you didn’t stick around?)
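Here's the bacteria scenario as a short Python sketch, assuming the two weighings from the story. The helper `project` is a made-up name for illustration. The code uses the unrounded rate, so the doubling time comes out near 80 hours rather than matching the hand-rounded figure digit for digit.

```python
import math

# Two weighings, one hour apart (grams).
before, after = 2.3, 2.32

# Logarithm of the change -> continuous hourly rate (a bit under 1%)
rate = math.log(after / before)

def project(amount, hours):
    """Grow `amount` at the measured continuous rate for `hours` hours."""
    return amount * math.exp(rate * hours)

after_lunch = project(after, 3)        # about 2.38 grams
doubling_hours = math.log(2) / rate    # about 80 hours

print(f"hourly rate:   {rate:.4f}")
print(f"after 3 hours: {after_lunch:.2f} grams")
print(f"doubling time: {doubling_hours:.0f} hours")
```

The doubling time is just ln(2) split into rate-sized pieces: plugging `doubling_hours` back into `project(1, ...)` returns 2.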

What Does The Perspective Change Really Mean?

Figuring out whether you want the input (cause of growth) or output (result of growth) is pretty straightforward. But how do you visualize the grower’s perspective?

Imagine we have little workers who are building the final growth pattern (see the article on exponents):

compound interest

If our growth rate is 100%, we’re telling our initial worker (Mr. Blue) to work steadily and create a 100% copy of himself by the end of the year. If we follow him day-by-day, we see he does finish a 100% copy of himself (Mr. Green) at the end of the year.

But… that worker he was building (Mr. Green) starts working as well. If Mr. Green first appears at the 6-month mark, he has a half-year to work (same annual rate as Mr. Blue) and he builds Mr. Red. Of course, Mr. Red ends up being half done, since Mr. Green only has 6 months.

What if Mr. Green showed up after 4 months? A month? A day? A second? If workers begin growing immediately, we get the instant-by-instant curve defined by e^x:

continuous growth

The natural log gives a growth rate in terms of an individual worker’s perspective. We plug that rate into e^x to find the final result, with all compounding included.

Using Other Bases

Switching to another type of logarithm (base 10, base 2, etc.) means we’re looking for some pattern in the overall growth, not what the individual worker is doing.

Each logarithm asks a question when seeing a change:

  • Log base e: What was the instantaneous rate followed by each worker?
  • Log base 2: How many doublings were required?
  • Log base 10: How many 10x-ings were required?

Here’s a scenario to analyze:

  • Over 30 years, the transistor counts on typical chips went from 1000 to 1 billion

How would you analyze this?

  • Microchips aren’t a single entity that grow smoothly over time. They’re separate editions, from competing companies, and indicate a general tech trend.
  • Since we’re not “riding along” with an expanding microchip, let’s use a scale made for human convenience. Doubling is easier to think about than 10x-ing.

With these assumptions we get:

\displaystyle{\text{logarithm of change} \rightarrow \text{cause of growth} }

\displaystyle{\log_2(\frac{\text{1 billion}}{1000}) = \log_2(\text{1 million}) \sim \text{20 doublings}}

The “cause of growth” was 20 doublings, which we know occurred over 30 years. This averages 2/3 doublings per year, or 1.5 years per doubling — a nice rule of thumb.

From the grower’s perspective, we’d compute ln(1 billion / 1000) / 30 years = 46% continuous growth (a bit harder to relate to in this scenario).
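Both viewpoints on the transistor scenario fit in a few lines of Python. This is a sketch with my own variable names, using the round numbers from the scenario:

```python
import math

# Transistor counts from the scenario: 1000 -> 1 billion over 30 years.
start, finish, years = 1_000, 1_000_000_000, 30
ratio = finish / start

# Observer's view: how many doublings were required?
doublings = math.log2(ratio)
years_per_doubling = years / doublings

# Grower's view: what continuous rate per year would produce the same change?
continuous_rate = math.log(ratio) / years

print(f"doublings:          {doublings:.1f}")
print(f"years per doubling: {years_per_doubling:.2f}")
print(f"continuous rate:    {continuous_rate:.0%} per year")
```

Same change, two captions: "about 20 doublings" or "about 46% continuous growth per year".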

We can summarize our analysis in a table:


Learning is about finding the hidden captions behind a concept. When is it used? What point of view does it bring to the problem?

My current interpretation is that exponents ask about cause vs. effect and grower vs. observer. But we’re never done; part of the fun is seeing how we can recaption old concepts.

Happy math.

Appendix: The Change Of Base Formula

Here’s how to think about switching bases. Assuming a 100% continuous growth rate,

  • ln(x) is the time to grow to x
  • ln(2) is the time to grow to 2

Since we have the time to double, we can see how many would “fit” in the total time to grow to x:

\displaystyle{\text{number of doublings from 1 to x} = \frac{\ln(x)}{\ln(2)} = \log_2(x)}

For example, how many doublings occur from 1 to 64?

Well, ln(64) = 4.158. And ln(2) = .693. The number of doublings that fit is:

\displaystyle{\frac{\ln(64)}{\ln(2)} = \frac{4.158}{.693} = 6}

In the real world, calculators may lose precision, so use a direct log base 2 function if possible. And of course, we can have a fractional number: Getting from 1 to the square root of 2 is “half” a doubling, or log_2(1.414) = 0.5.

Changing to log base 10 means we’re counting the number of 10x-ings that fit:

\displaystyle{\text{number of 10x-ings from 1 to x} = \frac{\ln(x)}{\ln(10)} = \log_{10}(x) }

Neat, right? Read Using Logarithms in the Real World for more examples.
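To see the change of base formula at work, here's a small Python sketch. The function name `count_growths` is hypothetical; `math.log` is the natural log and `math.log2` is Python's direct base-2 function.

```python
import math

def count_growths(x, base):
    """How many `base`-fold growths fit in going from 1 to x: ln(x) / ln(base)."""
    return math.log(x) / math.log(base)

doublings = count_growths(64, 2)               # about 6
half_doubling = count_growths(math.sqrt(2), 2) # about 0.5
tenfolds = count_growths(1000, 10)             # about 3

# The dedicated base-2 function avoids dividing two rounded logarithms.
direct = math.log2(64)

print(doublings, half_doubling, tenfolds, direct)
```

The division can come out a hair off a whole number because of floating-point rounding, which is the precision loss mentioned above, so compare results with a tolerance rather than `==`.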

Understand Ratios with “Oomph” and “Often”

Ratios summarize a scenario with a number, such as “income per day”. Unfortunately, this hides the explanation for how the result came about.

For example, look at two businesses:

  • Annie’s Art Gallery sells a single, $1000 piece every day
  • Frank’s Fish Emporium sells 250 trout at $4/each every day

By the numbers, they’re identical $1000/day operations, right? Hah.

Here’s how each business actually behaves:

\displaystyle{\mathit{\frac{Dollars}{Day} = \frac{Dollars}{Transaction} \cdot \frac{Transactions}{Day} }}

Transactions are the workhorse that drive income, but they’re lost in the dollars/day description. When studying an idea, separate the results into Oomph and Often:

\displaystyle{\mathit{ Result = Oomph \cdot Often = \frac{Dollars}{Transaction} \cdot \frac{Transactions}{Day} }}

With Oomph and Often, I visualize two distinct levers to increase. A ratio like dollars/day makes me stumble through thoughts like: “For better results, I need 1/day to improve… which means the day gets shorter… How’s that possible? Oh, that must be the portion of the day used for each transaction…”.

Why make it difficult? Rewrite the ratio to include the root cause: What’s the Oomph, and how Often does it happen?
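A tiny Python sketch makes the two levers concrete. The function name `daily_income` and the "one extra sale" experiment are my own illustration, using the prices from the two businesses above:

```python
def daily_income(oomph, often):
    """Result = Oomph * Often (dollars per transaction * transactions per day)."""
    return oomph * often

annie = daily_income(oomph=1000, often=1)   # one $1000 piece per day
frank = daily_income(oomph=4, often=250)    # 250 trout at $4 each

# Pull the "Often" lever: what is one extra sale per day worth to each?
annie_extra = daily_income(1000, 2) - annie
frank_extra = daily_income(4, 251) - frank

print(annie, frank)              # same dollars/day
print(annie_extra, frank_extra)  # very different effect per extra sale
```

The dollars/day ratio is identical, but the same nudge to Often moves Annie's income far more than Frank's.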

Horsepower, Torque, RPM

In physics, we define everyday concepts like “power” with a formal ratio:

\displaystyle{\mathit{ Power = \frac{Work}{Time} }}

Ok. Power can be explained by a ratio, but we’re already in inverted-thinking mode. Just another hassle when exploring an already-tricky concept.

How about this:

\displaystyle{\mathit{ Power = Oomph \cdot Often }}

Easier, I think. What could Oomph and Often mean?

Well, Oomph is probably the work we do (such as moving a weight) and Often is how frequently we do it (how many reps did you put in?).

In the same minute, suppose Frank lifted 100lbs ten times, while Annie lifted 1000lbs once. From the equation, they have the same power (though to be honest, I’m more frightened by Annie.)

An engine mechanic might internalize power like this:

\displaystyle{\mathit{ Power = \frac{Work}{Revolution} \cdot \frac{Revolutions}{Time} }}

\displaystyle{\mathit{ Horsepower = Torque \cdot RPM }}

What does that mean?

  • Torque is the Oomph, or how much weight (and how far) can be moved by a turn of the engine (i.e., moving 500lbs by 1 foot)

  • RPM (revolutions per minute) is how frequently the engine turns

A motorcycle engine is designed for reps, i.e. spinning the wheels quickly. It doesn’t need much torque — just enough to pull itself and a few passengers — but it needs to send that to the wheels again and again.

A bulldozer is designed for “Oomph”, such as knocking over a wall. We don’t need to tap into that work very frequently, as one destroyed wall per minute is great, thanks.

I’m not a physicist or car guy, but I can at least conceptualize the tradeoffs with the Oomph/Often metaphor.

Gears can change the tradeoff between Oomph and Often in a given engine. If you’re going uphill, fighting gravity, what do you want more of? If you’re cruising on a highway? Trying to start from a standstill? Driving over slippery snow? Lost the brakes and need to slow down the car?

Oomph/Often gets me thinking intuitively, Work/Time does not.
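Here's a rough Python sketch of the torque/RPM tradeoff. Two things here are not from the article and are labeled as assumptions: the engine numbers are invented for illustration, and the conversion uses the standard definition of one mechanical horsepower as 33,000 foot-pounds of work per minute.

```python
import math

FT_LB_PER_MINUTE_PER_HP = 33_000  # definition of one mechanical horsepower

def horsepower(torque_lb_ft, rpm):
    """Power = (work per revolution) * (revolutions per minute), in horsepower."""
    work_per_revolution = 2 * math.pi * torque_lb_ft  # Oomph: ft-lb per turn
    return work_per_revolution * rpm / FT_LB_PER_MINUTE_PER_HP

# Hypothetical engines: one built for reps, one built for Oomph.
high_rev = horsepower(torque_lb_ft=50, rpm=10_000)
high_torque = horsepower(torque_lb_ft=500, rpm=1_000)

print(f"high-rev engine:    {high_rev:.1f} hp")
print(f"high-torque engine: {high_torque:.1f} hp")
```

Both come out to the same power, just like Frank and Annie at the gym: ten times the Oomph at a tenth of the Often.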

Variation: Electric Power

Electric power has the same ratio as mechanical power:

\displaystyle{\mathit{ Electric \ Power = \frac{Work}{Time} }}

Yikes. It’s not clear what this means. How about:

\displaystyle{\mathit{ Electric \ Power = Oomph \cdot Often }}

It’s hard to have ideas out of the blue, but we might imagine something (a mini-engine?) is moving the Oomph around inside the wire. If we call it a “charge” then we have:

\displaystyle{\mathit{ Electric \ Power = \frac{Work}{Charge} \cdot \frac{Charges}{Time} }}

And we can give those subparts formal names:

  • Voltage (Oomph): How much work each charge contributes

  • Current (Often): How quickly charges are moving through the wire

Now we get the familiar:

\displaystyle{\mathit{ Electric \ Power = Voltage \cdot Current }}

Boomshakala! I don’t have a good intuition for electricity, but at least my goal is clear: find analogies where voltage means Oomph, and current means Often.

And still, we can take a crack at intuitive thinking: when you get zapped by a doorknob in winter, was that Oomph or Often? What attribute should batteries maximize? What’s better for moving energy through stubborn power lines? (Vive la résistance!)

The ratio reduces every type of power to a generic Work/Time calculation. The Oomph/Often metaphor gets us thinking about Torque/RPM in one scenario and Voltage/Current in another.

What’s Really Going On? Parameters, Baby.

The Oomph/Often viewpoint lets us think about the true cause of the ratio. Instead of dollars and days, we wonder how the actual transactions affect the outcome:

  • Can we increase the size of each transaction?

  • Can we increase the number each day?

In formal terms, we’ve introduced a new parameter to explain the interaction. To change a ratio from a/b to one parameterized by x, we can do:

\displaystyle{\frac{a}{b} = \frac{(a/x)}{(b/x)} = (a/x) \cdot \frac{1}{(b/x)} = \frac{a}{x} \cdot \frac{x}{b} }

We change our viewpoint to see x as the key component. In math, we often switch viewpoints to simplify problems:

  • Instead of asking what happens to the observer, can we change parameters and ask what the mover sees? (Degrees vs. radians.)

  • Can we see a giant function as being parameterized by smaller ones? (See the chain rule.)

  • Can we express probabilities as odds, instead of percentages? (It makes Bayes Theorem easier.)

Adjusting parameters is a way to morph an idea that doesn’t click into one that does. Since I don’t naturally think with inverted units, I’ve made it easier on myself: deal with two multiplications, instead of a division.

Happy math.

How To Learn Trigonometry Intuitively

Trig mnemonics like SOH-CAH-TOA focus on computations, not concepts:

body proportions

TOA explains the tangent about as well as x^2 + y^2 = r^2 describes a circle. Sure, if you’re a math robot, an equation is enough. The rest of us, with organic brains half-dedicated to vision processing, seem to enjoy imagery. And “TOA” evokes the stunning beauty of an abstract ratio.

I think you deserve better, and here’s what made trig click for me.

  • Visualize a dome, a wall, and a ceiling
  • Trig functions are percentages to the three shapes

Motivation: Trig Is Anatomy

Imagine Bob The Alien visits Earth to study our species.

Without new words, humans are hard to describe: “There’s a sphere at the top, which gets scratched occasionally” or “Two elongated cylinders appear to provide locomotion”.

After creating specific terms for anatomy, Bob might jot down typical body proportions:

  • The armspan (fingertip to fingertip) is approximately the height
  • A head is 5 eye-widths wide
  • Adults are 8 head-heights tall

body proportions

How is this helpful?

Well, when Bob finds a jacket, he can pick it up, stretch out the arms, and estimate the owner’s height. And head size. And eye width. One fact is linked to a variety of conclusions.

Even better, human biology explains human thinking. Tables have legs, organizations have heads, crime bosses have muscle. Our biology offers ready-made analogies that appear in man-made creations.

Now the plot twist: you are Bob the alien, studying creatures in math-land!

Generic words like “triangle” aren’t overly useful. But labeling sine, cosine, and hypotenuse helps us notice deeper connections. And scholars might study haversine, exsecant and gamsin, like biologists who find a link between your tibia and clavicle.

And because triangles show up in circles…

body proportions

…and circles appear in cycles, our triangle terminology helps describe repeating patterns!

Trig is the anatomy book for “math-made” objects. If we can find a metaphorical triangle, we’ll get an armada of conclusions for free.

Sine/Cosine: The Dome

Instead of staring at triangles by themselves, like a caveman frozen in ice, imagine them in a scenario, hunting that mammoth.

Pretend you’re in the middle of your dome, about to hang up a movie screen. You point to some angle “x”, and that’s where the screen will hang.

Trig dome

The angle you point at determines:

  • sine(x) = sin(x) = height of the screen, hanging like a sign
  • cosine(x) = cos(x) = distance to the screen along the ground [“cos” ~ how “close”]
  • the hypotenuse, the distance to the top of the screen, is always the same

Want the biggest screen possible? Point straight up. It’s at the center, on top of your head, but it’s big dagnabbit.

Want the screen the furthest away? Sure. Point straight across, 0 degrees. The screen has “0 height” at this position, and it’s far away, like you asked.

The height and distance move in opposite directions: bring the screen closer, and it gets taller.

Tip: Trig Values Are Percentages

Nobody ever told me in my years of schooling: sine and cosine are percentages. They vary from +100% to 0 to -100%, or max positive to nothing to max negative.

Let’s say I paid $14 in tax. You have no idea if that’s expensive. But if I say I paid 95% in tax, you know I’m getting ripped off.

An absolute height isn’t helpful, but if your sine value is .95, I know you’re almost at the top of your dome. Pretty soon you’ll hit the max, then start coming down again.

How do we compute the percentage? Simple: divide the current value by the maximum possible (the radius of the dome, aka the hypotenuse).

That’s why we’re told “Sine = Opposite / Hypotenuse”. It’s to get a percentage! A better wording is “Sine is your height, as a percentage of the hypotenuse”. (Sine becomes negative if your angle points “underground”. Cosine becomes negative when your angle points backwards.)

Let’s simplify the calculation by assuming we’re on the unit circle (radius 1). Now we can skip the division by 1 and just say sine = height.

Every circle is really the unit circle, scaled up or down to a different size. So work out the connections on the unit circle and apply the results to your particular scenario.

Try it out: plug in an angle and see what percent of the height and width it reaches:

The growth pattern of sine isn’t an even line. The first 45 degrees cover 70% of the height, and the final 10 degrees (from 80 to 90) only cover 2%.

This should make sense: at 0 degrees, you’re moving nearly vertical, but as you get to the top of the dome, your height changes level off.
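If you'd like to try the numbers yourself, here's a minimal Python sketch. The function name `dome_percentages` is made up for this example. Python's trig functions expect radians, so the angle is converted first.

```python
import math

def dome_percentages(degrees):
    """Return (height, width) on the unit dome as fractions of the radius."""
    radians = math.radians(degrees)
    return math.sin(radians), math.cos(radians)

for angle in (0, 30, 45, 80, 90):
    height, width = dome_percentages(angle)
    print(f"{angle:>2} degrees: height {height:6.1%}, width {width:6.1%}")

# How much height do the last 10 degrees add?
last_stretch = dome_percentages(90)[0] - dome_percentages(80)[0]
print(f"80 to 90 degrees adds {last_stretch:.1%} of the height")
```

The first 45 degrees reach roughly 70% of the max height, while the final stretch from 80 to 90 adds only a percent or two.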

Tangent/Secant: The Wall

One day your neighbor puts up a wall right next to your dome. Ack, your view! Your resale value!

But can we make the best of a bad situation?

Trig dome

Sure. What if we hang our movie screen on the wall? You point at an angle (x) and figure out:

  • tangent(x) = tan(x) = height of screen on the wall
  • distance to screen: 1 (the screen is always the same distance along the ground, right?)
  • secant(x) = sec(x) = the “ladder distance” to the screen

We have some fancy new vocab terms. Imagine seeing the Vitruvian “TAN GENTleman” projected on the wall. You climb the ladder, making sure you can “SEE, CAN’T you?”. (Yeah, he’s naked… won’t forget the analogy now, will you?)

Let’s notice a few things about tangent, the height of the screen.

  • It starts at 0, and goes infinitely high. You can keep pointing higher and higher on the wall, to get an infinitely large screen! (That’ll cost ya.)

  • Tangent is just a bigger version of sine! It’s never smaller, and while sine “tops off” as the dome curves in, tangent keeps growing.

How about secant, the ladder distance?

  • Secant starts at 1 (ladder on the floor to the wall) and grows from there
  • Secant is always longer than tangent. The leaning ladder used to put up the screen must be longer than the screen itself, right? (At enormous sizes, when the ladder is nearly vertical, they’re close. But secant is always a smidge longer.)

Remember, the values are percentages. If you’re pointing at a 50-degree angle, tan(50) = 1.19. Your screen is 19% larger than the distance to the wall (the radius of the dome).

(Plug in x=0 and check your intuition that tan(0) = 0, and sec(0) = 1.)
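The wall facts are easy to check with a short Python sketch. The helper name `wall` is my own, and secant is computed as 1/cos since Python's `math` module has no secant function.

```python
import math

def wall(degrees):
    """Return (tangent, secant) for an angle in degrees, 0 <= angle < 90."""
    x = math.radians(degrees)
    return math.tan(x), 1 / math.cos(x)

for angle in (0, 30, 50, 80, 89):
    screen, ladder = wall(angle)
    print(f"{angle:>2} degrees: screen height {screen:7.3f}, ladder {ladder:7.3f}")
```

The screen starts at 0 and the ladder at 1. Both grow without limit as the angle nears 90 degrees, with the ladder always a bit longer than the screen.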

Cotangent/Cosecant: The Ceiling

Amazingly enough, your neighbor now decides to build a ceiling on top of your dome, far into the horizon. (What’s with this guy? Oh, the naked-man-on-my-wall incident…)

Well, time to build a ramp to the ceiling, and have a little chit chat. You pick an angle to build and work out:

Trig dome

  • cotangent(x) = cot(x) = how far the ceiling extends before we connect
  • cosecant(x) = csc(x) = how long we walk on the ramp
  • the vertical distance traversed is always 1

Tangent/secant describe the wall, and COtangent and COsecant describe the ceiling.

Our intuitive facts are similar:

  • If you pick an angle of 0, your ramp is flat (infinite) and never reaches the ceiling. Bummer.
  • The shortest “ramp” is when you point 90-degrees straight up. The cotangent is 0 (we didn’t move along the ceiling) and the cosecant is 1 (the “ramp length” is at the minimum).
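The ceiling facts can be sketched in Python the same way. The helper name `ceiling` is hypothetical, and cotangent and cosecant are built from `tan` and `sin` because Python's `math` module doesn't provide them directly.

```python
import math

def ceiling(degrees):
    """Return (cotangent, cosecant) for an angle in degrees, 0 < angle <= 90."""
    x = math.radians(degrees)
    return 1 / math.tan(x), 1 / math.sin(x)

for angle in (1, 10, 45, 89, 90):
    along_ceiling, ramp = ceiling(angle)
    print(f"{angle:>2} degrees: along ceiling {along_ceiling:7.3f}, ramp {ramp:7.3f}")
```

Tiny angles give enormous ramps, and at 90 degrees the ramp shrinks to its minimum of 1. An angle of exactly 0 raises a division-by-zero error, which is the code's way of saying the ramp never reaches the ceiling.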

Visualize The Connections

A short time ago I had zero “intuitive conclusions” about the cosecant. But with the dome/wall/ceiling metaphor, here’s what we see:

Trig overall

Whoa, it’s the same triangle, just scaled to reach the wall and ceiling. We have vertical parts (sine, tangent), horizontal parts (cosine, cotangent), and “hypotenuses” (secant, cosecant). (Note: the labels show where each item “goes up to”. Cosecant is the full distance from you to the ceiling.)

Now the magic. The triangles have similar facts:

Trig identities

From the Pythagorean Theorem (a^2 + b^2 = c^2) we see how the sides of each triangle are linked.

And from similarity, ratios like “height to width” must be the same for these triangles. (Intuition: step away from a big triangle. Now it looks smaller in your field of view, but the internal ratios couldn’t have changed.)

This is how we find out “sine/cosine = tangent/1”.

I’d always tried to memorize these facts, when they just jump out at us when visualized. SOH-CAH-TOA is a nice shortcut, but get a real understanding first!
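Skeptical? Here's a small Python sketch that spot-checks the Pythagorean and similarity connections at a few angles. The function names are my own, and `math.isclose` is used because floating-point results are rarely exactly equal.

```python
import math

def trig_parts(degrees):
    """All six values for an angle in degrees, 0 < angle < 90."""
    x = math.radians(degrees)
    return {
        "sin": math.sin(x), "cos": math.cos(x),        # dome
        "tan": math.tan(x), "sec": 1 / math.cos(x),    # wall
        "cot": 1 / math.tan(x), "csc": 1 / math.sin(x) # ceiling
    }

def identities_hold(degrees):
    p = trig_parts(degrees)
    return all([
        # Pythagorean Theorem on each of the three triangles
        math.isclose(p["sin"] ** 2 + p["cos"] ** 2, 1),
        math.isclose(p["tan"] ** 2 + 1, p["sec"] ** 2),
        math.isclose(1 + p["cot"] ** 2, p["csc"] ** 2),
        # Similarity: height-to-width is the same on the dome and the wall
        math.isclose(p["sin"] / p["cos"], p["tan"] / 1),
    ])

print(all(identities_hold(angle) for angle in (10, 30, 45, 60, 80)))
```

A handful of passing angles isn't a proof, but it's a quick way to catch a mislabeled side on your own cheatsheet.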

Gotcha: Remember Other Angles

Psst… don’t over-focus on a single diagram, thinking tangent is always smaller than 1. If we increase the angle, we reach the ceiling before the wall:

Trig alternative

The Pythagorean/similarity connections are always true, but the relative sizes can vary.

(But, you might notice that sine and cosine are always smallest, or tied, since they’re trapped inside the dome. Nice!)

Summary: What Should We Remember?

For most of us, I’d say this is enough:

  • Trig explains the anatomy of “math-made” objects, such as circles and repeating cycles
  • The dome/wall/ceiling analogy shows the connections between the trig functions
  • Trig functions return percentages, that we apply to our specific scenario

You don’t need to memorize 1^2 + cot^2 = csc^2, except for silly tests that mistake trivia for understanding. In that case, take a minute to draw the dome/wall/ceiling diagram, fill in the labels (a tan gentleman you can see, can’t you?), and create a cheatsheet for yourself.

In a follow-up, we’ll learn about graphing, complements, and using Euler’s Formula to find even more connections.

Appendix: The Original Definition Of Tangent

You may see tangent defined as the length of the tangent line from the circle to the x-axis (geometry buffs can work this out).


As expected, at the top of the circle (x=90) the tangent line can never reach the x-axis and is infinitely long.

I like this intuition because it helps us remember the name “tangent”, and here’s a nice interactive trig guide to explore:

Trig interactive

Still, it’s critical to put the tangent vertical and recognize it’s just sine projected on the back wall (along with the other triangle connections).

Appendix: Inverse Functions

Trig functions take an angle and return a percentage. sin(30) = .5 means a 30-degree angle is 50% of the max height.

The inverse trig functions let us work backwards, and are written sin^-1 or arcsin (“arcsine”), and often written asin in various programming languages.

If our height is 25% of the dome, what’s our angle?
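In Python, for instance, the question can be answered with `math.asin`, which returns radians, so the sketch below converts to degrees:

```python
import math

height = 0.25  # 25% of the dome's max height

angle = math.degrees(math.asin(height))
print(f"{angle:.1f} degrees")

# Working forwards again should return the original percentage.
round_trip = math.sin(math.radians(angle))
print(f"{round_trip:.2f}")
```

The angle comes out to about 14.5 degrees: a fairly shallow angle already gets you a quarter of the way up.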

Now what about something exotic, like inverse secant? Oftentimes it’s not available as a calculator function (even the one I built, sigh).

Looking at our trig cheatsheet, we find an easy ratio where we can compare secant to 1. For example, secant to 1 (hypotenuse to horizontal) is the same as 1 to cosine:

\displaystyle{\frac{\sec}{1} = \frac{1}{\cos}}

Suppose our secant is 3.5, i.e. 350% of the radius of the unit circle. What’s the angle to the wall?

\displaystyle{\begin{aligned}
\frac{\sec}{1} &= \frac{1}{\cos} = 3.5 \\
\cos &= \frac{1}{3.5} \\
\arccos(\frac{1}{3.5}) &= 73.4
\end{aligned}}
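Python's `math` module has no inverse secant either, so the same workaround applies. A minimal sketch, with `arcsec_degrees` as a made-up helper name:

```python
import math

def arcsec_degrees(secant):
    """Inverse secant via arccos, using sec/1 = 1/cos. Needs |secant| >= 1."""
    return math.degrees(math.acos(1 / secant))

angle = arcsec_degrees(3.5)
print(f"{angle:.1f} degrees")
```

A secant of exactly 1 (ladder flat on the floor) gives an angle of 0, which matches the wall picture.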

Appendix: A Few Examples

Example: Find the sine of angle x.

Sine Example

Ack, what a boring question. Instead of “find the sine” think, “What’s the height as a percentage of the max (the hypotenuse)?”.

First, notice the triangle is “backwards”. That’s ok. It still has a height, in green.

What’s the max height? By the Pythagorean theorem, we know

\displaystyle{\begin{aligned}
3^2 + 4^2 &= \text{hypotenuse}^2 \\
25 &= \text{hypotenuse}^2 \\
5 &= \text{hypotenuse}
\end{aligned}}

Ok! The sine is the height as a percentage of the max, which is 3/5 or .60.

Follow-up: Find the angle.

Of course. We have a few ways. Now that we know sine = .60, we can just do:

\displaystyle{\arcsin(.60) = 36.9}

Here’s another approach. Instead of using sine, notice the triangle is “up against the wall”, so tangent is an option. The height is 3, the distance to the wall is 4, so the tangent height is 3/4 or 75%. We can use arctangent to turn the percentage back into an angle:

\displaystyle{\tan = \frac{3}{4} = .75 }

\displaystyle{\arctan(.75) = 36.9}
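Both routes to the angle can be checked in a few lines of Python. This is a sketch with my own variable names, taking the height of 3 and the distance of 4 from the example:

```python
import math

height, distance = 3, 4

# Max height (the hypotenuse) from the Pythagorean theorem
hypotenuse = math.hypot(distance, height)

# Dome view: height as a percentage of the max, then back to an angle
sine = height / hypotenuse
angle_from_sine = math.degrees(math.asin(sine))

# Wall view: height as a percentage of the distance to the wall
tangent = height / distance
angle_from_tangent = math.degrees(math.atan(tangent))

print(f"hypotenuse {hypotenuse}, sine {sine:.2f}, tangent {tangent:.2f}")
print(f"{angle_from_sine:.1f} degrees vs {angle_from_tangent:.1f} degrees")
```

Dome or wall, both approaches agree on an angle of about 36.9 degrees.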

Example: Can you make it to shore?

Boat Example

You’re on a boat with enough fuel to sail 2 miles. You’re currently .25 miles from shore. What’s the largest angle you could use and still reach land? Also, the only reference available is Hubert’s Compendium of Arccosines, 3rd Ed. (Truly, a hellish voyage.)

Ok. Here, we can visualize the beach as the “wall” and the “ladder distance” to the wall is the secant.

First, we need to normalize everything in terms of percentages. We have 2 / .25 = 8 “hypotenuse units” worth of fuel. So, the largest secant we could allow is 8 times the distance to the wall.

We’d like to ask “What angle has a secant of 8?”. But we can’t, since we only have a book of arccosines.

We use our cheatsheet diagram to relate secant to cosine: Ah, I see that “sec/1 = 1/cos”, so

\displaystyle{\begin{aligned}
\sec &= \frac{1}{\cos} = 8 \\
\cos &= \frac{1}{8} \\
\arccos(\frac{1}{8}) &= 82.8
\end{aligned}}

A secant of 8 implies a cosine of 1/8. The angle with a cosine of 1/8 is arccos(1/8) = 82.8 degrees, the largest we can afford.
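The boat problem makes a nice reusable function. Here's a sketch in Python, where `largest_angle` is a hypothetical name and the only inverse function used is arccosine, in the spirit of Hubert's Compendium:

```python
import math

def largest_angle(fuel_miles, shore_miles):
    """Largest heading angle (degrees) that still reaches shore.

    Normalizes fuel to "distance to the wall" units to get the max secant,
    then converts secant to an angle with arccos(1 / secant).
    """
    max_secant = fuel_miles / shore_miles
    return math.degrees(math.acos(1 / max_secant))

angle = largest_angle(fuel_miles=2, shore_miles=0.25)
print(f"{angle:.1f} degrees")

# Double-check: sailing at that angle should use up exactly the fuel we have.
distance_sailed = 0.25 / math.cos(math.radians(angle))
print(f"{distance_sailed:.2f} miles")
```

Running the check forwards (angle back to distance) is a cheap way to confirm the secant/cosine flip went the right direction.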

Not too bad, right? Before the dome/wall/ceiling analogy, I’d be drowning in a mess of computations. Visualizing the scenario makes it simple, even fun, to see which trig buddy can help us out.

In your problem, think: am I interested in the dome (sin/cos), the wall (tan/sec), or the ceiling (cot/csc)?

Happy math.

Update: The owner of Grey Matters put together interactive diagrams for the analogies (drag the slider on the left to change the angle):



Site Update: New Design + Intuition Cheatsheet

After months of work with the help of Neil, a great designer, and my Excel-blogging friend Andrew, I’m happy to launch a brand-new design.

My goals were to be friendly, readable, and easy-to-navigate. Here’s a quick before-and-after:

New Logo

Neil did a fantastic job here — I’d been looking for a way to convey a welcoming, conversational tone.

New Homepage

A site about explanations should describe what it does simply, right?

Better Readability

The fonts are bumped up, there’s more breathing room, and pages are optimized for iPads/iPhones. Instead of a text-dense cram session, I want an unhurried walkthrough of insights.

Intuition Cheatsheet

My favorite feature is a site summary that reduces insights to a few words. Previously, I had trouble navigating the various articles, and I bet you did too :). Readers of the newsletter got a sneak peek, and I have a PDF version I’ll be sending out to subscribers as well.

Overall, BetterExplained is an excited friend who shares what really helps ideas click, not an authority trying to be the grand poombah of math. Let’s have a good time on this journey of learning.

Hope you enjoy the new site, feedback is welcome!