]]>Starting from the Pythagorean Theorem and similar triangles, we can find connections between sin, cos, tan and friends (read the article on trig).

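Can we go deeper? Maybe we can connect sine with *itself* (sin-ception). In math terms, we’re looking for formulas like this (full cheatsheet):

$$\sin(a + b) = \sin(a)\cos(b) + \sin(b)\cos(a)$$

$$\cos(a + b) = \cos(a)\cos(b) - \sin(a)\sin(b)$$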

Instead of memorizing these bad mamma jammas, let’s learn to *draw* the formulas. Euler’s Formula makes it easy.

In algebra, we study relationships like this:

$$(a + b)^2 = a^2 + 2ab + b^2$$

Working out 17^{2} directly is cumbersome. But we can simplify it to:

$$(10 + 7)^2 = 10^2 + 2 \cdot 10 \cdot 7 + 7^2 = 100 + 140 + 49 = 289$$

In the computer era, sure, we can just crunch 17^{2} directly. The important aspect is realizing that (a + b)^{2} can be broken into simpler ingredients: a^{2}, b^{2}, a, b. This is useful in factoring, simplifying equations, and so on.

Let’s turn trig into plain English. What does this mean?

$$\sin(a + b) = \ ?$$

Remembering that sine is “height (as a percentage of max)”, this equation asks: *If we add two angles, what is their total height?*

A quick guess might be to combine the individual heights:

$$\sin(a + b) \stackrel{?}{=} \sin(a) + \sin(b)$$

It looks clean, but isn’t quite right. If we keep adding up angles, their height increases until the max (100%), then starts decreasing.

The relationship between angle and height can’t be simple addition.

Now here’s the weird thing: I can draw what the new height should be (*It’s right there!*), but I can’t turn my drawing into an equation.

Or can I?

Euler’s Formula lets us create a circular path using complex numbers:

$$e^{ix} = \cos(x) + i\sin(x)$$

Crucially, multiplying complex numbers performs a rotation. Aha! We can use Euler’s Formula to draw the rotation we need:

- Start with 1.0, which is at 0 degrees.
- Multiply by e^{ia}, which rotates by a.
- Multiply by e^{ib}, which rotates by b.
- Final position = 1.0 · e^{ia} · e^{ib} = e^{i(a+b)}, or 1.0 at the angle (a+b)

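The complex exponential e^{i(a+b)} is pretty gnarly. Just like breaking apart 17^{2}, let’s multiply out the pieces:

$$e^{ia} \cdot e^{ib} = [\cos(a) + i\sin(a)] \cdot [\cos(b) + i\sin(b)]$$

$$= [\cos(a)\cos(b) - \sin(a)\sin(b)] + i[\sin(a)\cos(b) + \sin(b)\cos(a)]$$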

Now we’re talking! This version easily separates the horizontal position (real component) and vertical position (imaginary component):

- Combined height: sin(a + b) = sin(a)cos(b) + sin(b)cos(a)
- Combined width: cos(a + b) = cos(a)cos(b) – sin(a)sin(b)

Boom: two annoying-to-remember trig identities in a single computation. Not a bad deal.
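To sanity-check the two identities numerically, here is a minimal Python sketch using the standard `cmath` module. The function name `add_angles` and the sample angles are illustrative assumptions, not something from the derivation itself.

```python
import cmath
import math

def add_angles(a, b):
    """Start at 1.0, rotate by a, then by b; return (width, height)."""
    position = 1.0 * cmath.exp(1j * a) * cmath.exp(1j * b)
    return position.real, position.imag

a, b = 0.5, 0.8  # arbitrary sample angles, in radians
width, height = add_angles(a, b)

# The expanded identities, written out term by term
expanded_height = math.sin(a) * math.cos(b) + math.sin(b) * math.cos(a)
expanded_width = math.cos(a) * math.cos(b) - math.sin(a) * math.sin(b)

print(height, expanded_height, math.sin(a + b))
print(width, expanded_width, math.cos(a + b))
```

Each printed row should show three values that agree up to floating-point rounding: the rotated position, the expanded identity, and the direct `sin`/`cos` of the combined angle.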

Now that we’ve found the equation, let’s grok its meaning. When we add the heights, here’s what’s happening:

- The full height of the blue triangle (sin(a)) can’t be used, since the red triangle doesn’t extend as far. (Why? When we add angle b, we’re moving at a steeper angle with the same hypotenuse. We gained vertical distance and lost horizontal distance.) We’re effectively “sliding back” sin(a), reducing it by a factor of cos(b).
- The full height of the red triangle (sin(b)) can’t be used either, since it’s at an angle. We’re “turning” sin(b), reducing it by a factor of cos(a).

Remember that sine and cosine are percentages. In this case,

or

Sure, we would *like* to get the full height of each triangle. But from the diagram, we see a slides back and b is twisted, so the height we *actually* get is reduced. Think of each cosine as a tax on your height, reducing the amount you take home. (*Have a height of .90? That’s nice, Papa Cosine will let you keep 75%. Pay up the rest, sucka!*).

Now, what happens for small angles, like sin(.01 + .02)?

We could plug and chug this. But I’m guessing the result is about:

$$\sin(.01 + .02) \approx .01 + .02 = .03$$

Why? My mental diagram for small angles is this:

There’s no perceptible difference between the ideal heights (sin(a) and sin(b)) and the “taxed” versions (sin(a)cos(b) and sin(b)cos(a)).

- For tiny angles, sin(a + b) is a vertical line. It barely loses any height due to the parts sliding or twisting.
- For small angles, cosine (the percent we keep) is close to 100%. We’re keeping the vast, vast majority of the height we have.
- sin(x) ≈ x is a common approximation for small angles (often used in Calculus). Essentially, it says sin(x) is a line for a brief time period. For small angles, sin(a + b) ≈ sin(a) + sin(b) ≈ a + b.
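A quick numeric check of the small-angle claim, as a rough Python sketch. The variable names are made up for illustration; the angles are the ones from the question above.

```python
import math

a, b = 0.01, 0.02  # small angles, in radians

exact = math.sin(a + b)
taxed = math.sin(a) * math.cos(b) + math.sin(b) * math.cos(a)  # full identity
untaxed = math.sin(a) + math.sin(b)                            # ignore the cosine "tax"
just_angles = a + b                                            # sin(x) ≈ x

print(exact, taxed, untaxed, just_angles)
print("percent kept:", math.cos(a), math.cos(b))
```

All four values should land very close to .03, and both cosines should print as nearly 1.0, which is why the "tax" is barely noticeable here.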

For cosine, we have a similar diagram:

- This time, the conversion factor matches up (cosine with cosine, sine with sine).
- The full width of the first triangle (cos(a)) gets scaled down to match the width of the second.
- The sine term is negative since it pushes us backwards, reducing our width. We can use similar triangles to extract this piece.

I’m not typically thinking about the parts in the diagram, though it’s nice to see how they work a few times. If you just need the trig identity, crank through it algebraically with Euler’s Formula.

Good question. A few reasons:

**1. Because you have to (the worst reason).** Many trig classes have you memorize these identities so you can be quizzed later (argh). You don’t need to *memorize* them, you can work out the formula in about a minute. Save your precious brain space for something else.

**2. We can now “factor” trig functions into simpler parts.** We can now separate sine into smaller parts, which is useful in Calculus.

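For example, to find the derivative of sine, we need:

$$\frac{\sin(a + dx) - \sin(a)}{dx}$$

and we let dx go to zero. This is tricky to work on directly, but using the sin(a + b) formula we have

$$\frac{\sin(a)\cos(dx) + \sin(dx)\cos(a) - \sin(a)}{dx}$$

As dx goes to zero, cos(dx) = 1 (zero angle is full width), so we have:

$$\frac{\sin(a) + \sin(dx)\cos(a) - \sin(a)}{dx} = \frac{\sin(dx)}{dx}\cos(a)$$

And as dx goes to zero, sin(dx) and dx become equal:

$$\frac{\sin(dx)}{dx} = 1$$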

Plugging this in, we get cos(a) as the derivative of sin(a). Phew! Working with trig functions isn’t always easy, but at least it’s manageable.
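Here is a small Python sketch of that limit, assuming we just shrink dx and watch the result. The helper name `sine_slope` is hypothetical; it expands sin(a + dx) with the angle-addition formula before taking the difference.

```python
import math

def sine_slope(a, dx):
    """Difference quotient of sine, with sin(a + dx) expanded via the identity."""
    expanded = math.sin(a) * math.cos(dx) + math.sin(dx) * math.cos(a)
    return (expanded - math.sin(a)) / dx

a = 1.0  # arbitrary sample angle, in radians
for dx in (0.1, 0.01, 0.001):
    print(dx, sine_slope(a, dx), math.cos(a))
```

As dx shrinks, the middle column should creep toward the last one, cos(a).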

**3. It’s computationally efficient.** If you’re doing computer graphics and frequently calculating sine/cosine (for dot products, let’s say), trig identities are useful shortcuts. In the past, these identities were used much like log tables to make hand-done calculations easier.
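One possible shortcut of this kind, sketched in Python: to get the sine of many evenly spaced angles, compute sin and cos of the step once, then advance with the angle-addition formulas using only multiplies and adds. The function name `sines_by_rotation` is an assumption for illustration, and whether this actually beats calling `math.sin` depends on the platform.

```python
import math

def sines_by_rotation(step, count):
    """Return sin(k * step) for k = 0..count-1 using the angle-addition formulas."""
    sin_step, cos_step = math.sin(step), math.cos(step)  # the only trig calls
    s, c = 0.0, 1.0  # sin(0), cos(0)
    result = []
    for _ in range(count):
        result.append(s)
        # sin(x + step) and cos(x + step), built from sin(x), cos(x)
        s, c = s * cos_step + c * sin_step, c * cos_step - s * sin_step
    return result

values = sines_by_rotation(0.1, 5)
print(values)
print([math.sin(k * 0.1) for k in range(5)])
```

The two printed lists should match up to rounding. Small errors accumulate with each step, so long runs may need an occasional fresh `math.sin` call.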

**4. Math is about seeing connections.** Because trig functions are derived from circles and exponential functions, they seem to show up everywhere. Sometimes you simplify a scenario by going from trig to exponents, or vice versa.

**5. Deepen your knowledge of Euler’s Formula.** Master Euler’s formula and you’ve mastered circles. And from there, the world! (*Editor’s note: Kalid’s pinky appears to be affixed to his mouth. We’re working on it.*)

See, Euler’s formula lets us *draw* a circle and read off a position. That’s amazing! We can avoid a lot of painful geometry with a few multiplications. If you’re doing any advanced math, letting Leonhard Euler deep into your soul is well worth it. He’s good company.

That’s it for today. Happy math.

You can mix & match trig identities to create a bunch of new ones.

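**Subtraction formula: replace b with -b**

$$\sin(a - b) = \sin(a)\cos(b) - \sin(b)\cos(a)$$

$$\cos(a - b) = \cos(a)\cos(b) + \sin(a)\sin(b)$$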

**Double-angle formula: replace b with a**

$$\sin(2a) = \sin(a)\cos(a) + \sin(a)\cos(a) = 2\sin(a)\cos(a)$$

This makes sense: after accounting for the conversion factor, we add the height to itself.

**Half-angle formula: replace and solve**

Start with the double-angle formula and solve for sin(a), which is half the angle used in sin(2a). Trig without tears (a great resource and name) has more details:

http://brownmath.com/twt/double.htm

]]>

]]>- Old little lady
- Red big dog
- Vietnamese spicy food

Do you have a logical reason why they sound strange? Or are they just “off”?

You probably didn’t think, “In 3rd grade I mastered the Royal Order of Adjectives:

- Determiner
- Observation
- Size
- Shape
- Age
- Color
- Origin
- Material
- Qualifier

… and upon applying them, noticed several errors. *Old little lady* is incorrect because rules #3 and #5 are swapped — a childish mistake, really. The next…”

Ugh. Describing Gran Gran isn’t a logic puzzle. But guess what students learning English are taught?

Even as a native speaker, could you construct this chart? Is this how you’d teach someone English?

**The Adjective Fallacy is trying to learn by mastering the formal rules.** Just because a concept *can* be rigorously defined doesn’t mean we should study it that way.

We didn’t become good at English by studying a chart: we developed an ear for the language and know how it *should* sound. And “old little lady” sounds off.

Similarly, getting good at math doesn’t mean marching through a gauntlet of rules on every problem. It’s having a native speaker’s feeling about what works or doesn’t.

“303 x 13 = 5074” looks strange, but not because we computed the left-hand side. It’s weird because odd numbers can’t multiply to become even (intuition). The last digit of the result should be 3×3= 9. 5074 is too large, since 300 x 10 (similar numbers nearby) is only 3000. Our Spidey Sense is blaring that the computation looks wrong.
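The first two gut checks can be written down as a tiny Python sketch. The function name `gut_check` is invented for illustration, it assumes non-negative integers, and it leaves out the rough size estimate, which is more of a judgment call.

```python
def gut_check(a, b, claimed):
    """Cheap checks on a claimed product a * b, without doing the full multiplication."""
    return {
        # odd x odd is odd; any even factor makes the product even
        "parity": claimed % 2 == (a % 2) * (b % 2),
        # the last digit depends only on the last digits of the factors
        "last_digit": claimed % 10 == (a % 10) * (b % 10) % 10,
    }

print(gut_check(303, 13, 5074))  # both checks fail
print(gut_check(303, 13, 3939))  # both checks pass
```

Passing these checks doesn’t prove an answer is right; failing them shows it’s wrong.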

My learning goal is knowing enough to make rough predictions on my own. I want a horse sense for algebra, calculus, trig, and even imaginary exponents, *without* scurrying off to apply an equation.

Rules aren’t inherently bad: they summarize, resolve ambiguous cases, and help us practice our weak spots. The question is how much to use them when starting off.

Learn enough rules to get started – don’t attempt to master them from the outset. See examples in a larger context and let the pattern-matching machinery of your brain get to work.

Math is a language too. Here’s a gut check: **Would my current math study technique have helped me learn English?**

If an English class spent a month on the adjective chart we’d have a talk with the teacher. But a Calculus class that spends weeks on the formal theory of limits is typical. Can we admit that studying this much detail, this early, doesn’t build fluency?

Pondering that question made me realize I had large gaps in trigonometry and calculus. I could only describe concepts using the adjective chart I’d memorized with a furrowed brow. (*I’ll describe my grandma, just give me a minute!*)

Enough was enough: embrace approaches that *actually* help you, like seeing the big picture first. In Calculus, that might mean seeing an integral in the first lesson:

That’s what Calculus *does*: break a shape into pieces (the derivative), and glue it together in various ways (the integral). If you like this style of teaching, check out the full Calculus series.

A typical calculus syllabus covers integrals in week 12, after months of “building a foundation”. Better not use a complete sentence until we’ve studied adjectives, nouns and verbs separately, right? (My hand wringing could solve the energy crisis.)

The path to understanding isn’t always the most structured.

Happy math.

Update: After research, this concept is called tacit knowledge, or “we know more than we can tell” (Michael Polanyi). Tacit knowledge is acquired through experience, and complements the explicit knowledge written as rules.

]]>

Imagine a chef who follows a new recipe to the letter. No matter how it looks, no matter the reviews the recipe has, if the dish *doesn’t taste good* we know something is wrong. A sense of taste is the ultimate cooking tool.

When learning, we defer to external indicators (tests, teachers) to inform us we’ve learned something. External standards are made to be objective and easily-verified (Did you pick the correct answer?), but the important, subjective question is how well a concept sits in your mind. Did you actually experience it?

My checklist for truly learning a topic is that it’s:

- **Understandable**: Did I have an aha! moment? Can I explain the concept in simple language? Does it connect to other topics I know?
- **Memorable**: Do I have an analogy, diagram, or example that will stick with me for months or years?
- **Enjoyable**: Do I want to revisit or use this knowledge? Don’t study literature in a way that makes you hate reading.

That’s my current definition of “intuitive understanding”, and for subjects I care about, I keep digging until I have all three aspects.

It’s ok to take your time (calculus took years to become enjoyable) and it’s ok to not care about everything equally (biology isn’t particularly compelling for me). I firmly believe any subject can become intuitive if I put in the effort to find analogies, diagrams, examples, plain-English descriptions, and technical details (the ADEPT method).

So, how do you set your own learning standard?

Let’s not recreate the wheel: famous learners have already described their thinking process, which we can adopt. It’s not about memorizing Einstein’s Theory of Relativity, it’s about internalizing the mindset that could lead to that idea.

Here’s a few viewpoints that resonated for me:

“Education is what remains after one has forgotten what one has learned in school.” —Albert Einstein

“The only real valuable thing is intuition.” —Albert Einstein

- True learning goes beyond memorized facts. While I can forget the equation of a circle, I can’t forget that it’s round. And knowing it’s perfectly round quickly leads me back to the equation.

“The noblest pleasure is the joy of understanding.” —Da Vinci

- True understanding implies joy. And practically, you’ll only continue studying what you like.

“To teach effectively a teacher must develop a feeling for his subject; he cannot make his students sense its vitality if he does not sense it himself. He cannot share his enthusiasm when he has no enthusiasm to share. How he makes his point may be as important as the point he makes; he must personally feel it to be important.” —George Pólya

“Education is the kindling of a flame, not the filling of a vessel.” —Socrates

- We aren’t robots, and we should embrace the subjective aspects of learning. A teacher’s goal goes beyond knowledge-transfer to enjoyment-transfer.

The Humane Representation of Thought from Bret Victor

- There are deeper, richer levels of understanding than what’s traditionally used. Explore a higher standard.

“I think most people can learn a lot more than they think they can. They sell themselves short without trying. One bit of advice: it is important to view knowledge as sort of a semantic tree — make sure you understand the fundamental principles, ie the trunk and big branches, before you get into the leaves/details or there is nothing for them to hang on to.” —Elon Musk

- Your own standards greatly influence your understanding. External tests won’t check if facts are comfortably connected.

I have a larger collection of quotes that help align my thinking.

After rummaging through quotes that resonate, build a set of questions that capture your standard. For me, it became:

- Do I have a visceral, ingrained analogy? Can it help solve problems?
- Can I explain the concept to others? Do they want to explain it to their friends afterwards?
- Will I remember the essential idea after a few months or years?
- Can I find something to enjoy in the topic? Will I return after I inevitably forget 95% of it?

Questions seem to prompt more interest than a statement: “Do I have an analogy?” vs. “I must have an analogy”.

With this approach, strange corners of math I didn’t previously enjoy (like Euler’s Formula) became mysteries to solve: what *is* the insight here? Can I express it in a plain-English sentence? (Here’s a shot: Continuous rotation means you’re moving in a circle.)

Setting new standards helps take control of your education and overcome longstanding demons.

When people say “I hate math” I doubt they actually hate numbers (arithmetic), patterns & relationships (algebra), or shapes (geometry). They hate lessons that don’t contain insight, enjoyment, and basic human empathy. It’s fine to be disinterested in Ancient Egyptian Civilization, but *hate* comes from getting lost on a tour and spending the night near a sarcophagus.

These are the questions that helped me: what are your standards for learning?

(*Thanks to Scott Young, Uri Bram, and Tom Miller for brainstorming ideas.*)

]]>This completed grid is the *outer product*, which can be separated into the:

- **Dot product**, the interactions between similar dimensions (`x*x`, `y*y`, `z*z`)
- **Cross product**, the interactions between different dimensions (`x*y`, `y*z`, `z*x`, etc.)

The dot product ($\vec{a} \cdot \vec{b}$) measures similarity because it only accumulates interactions in matching dimensions. It’s a simple calculation with 3 components.

The cross product (written $\vec{a} \times \vec{b}$) has to measure a half-dozen “cross interactions”. The calculation looks complex but the concept is simple: accumulate 6 individual differences for the total.

Instead of thinking “When do I need the cross product?” think “When do I need interactions between different dimensions?”.
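To make the grid idea concrete, here is a minimal Python sketch. The function names are hypothetical, and the pairing and signs in `cross_from_grid` assume the standard right-handed convention.

```python
def outer_product(a, b):
    """Grid of every pairwise interaction: grid[i][j] = a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]

def dot_from_grid(grid):
    """Add up the matching-dimension interactions (the diagonal)."""
    return sum(grid[i][i] for i in range(3))

def cross_from_grid(grid):
    """Combine the different-dimension interactions (off the diagonal)."""
    x, y, z = 0, 1, 2
    return (
        grid[y][z] - grid[z][y],
        grid[z][x] - grid[x][z],
        grid[x][y] - grid[y][x],
    )

grid = outer_product((3, 0, 0), (0, 2, 0))
print(dot_from_grid(grid))    # 0: no matching dimensions interact
print(cross_from_grid(grid))  # (0, 0, 6)
```

The same nine numbers feed both results: the diagonal gives the dot product, and the six off-diagonal entries give the cross product.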

Area, for example, is formed by vectors pointing in different directions (the more orthogonal, the better). Indeed, the cross product measures the area spanned by two 3d vectors (source):

(The “cross product” assumes 3d vectors, but the concept extends to higher dimensions.)

Did the key intuition click? Let’s hop into the details.

The dot product represents vector similarity with a single number:

$$\vec{a} \cdot \vec{b} = |a| |b| \cos(\theta)$$

(Remember that trig functions are percentages.) Should the cross product (difference between interacting vectors) be a single number too?

Let’s try. Sine is the percentage difference, so we could use:

$$\vec{a} \times \vec{b} \stackrel{?}{=} |a| |b| \sin(\theta)$$

Unfortunately, we’re missing a lot of detail. `x` is 100% different from both `y` and `z`, but shouldn’t `x*y` and `x*z` be different from each other? As Tolstoy wrote, “All happy families are alike; each unhappy family is unhappy in its own way.”

Instead, let’s express these unique differences as a vector:

- The *size* of the cross product is the numeric “amount of difference” (with $\sin(\theta)$ as the percentage)
- The *direction* of the cross product is based on both inputs: it’s the direction orthogonal to both (i.e., favoring neither)

A vector result represents the `x*y` and `x*z` separately, even though `y` and `z` are both “100% different” from `x`.

(Should the dot product be turned into a vector too? Well, we have the inputs and a similarity percentage. There’s no new direction that isn’t available from either input.)

Two vectors determine a plane, and the cross product points in a direction different from both (source):

Here’s the problem: there’s two perpendicular directions. By convention, we assume a “right-handed system” (source):

If you hold your first two fingers like the diagram shows, your thumb will point in the direction of the cross product. I make sure the orientation is correct by sweeping my first finger from $\vec{a}$ to $\vec{b}$. With the direction figured out, the magnitude of the cross product is $|a| |b| \sin(\theta)$, which is proportional to the magnitude of each vector and the “difference percentage” (sine).

To remember the right hand rule, write the `xyz` order twice: `xyzxyz`. Next, find the pattern you’re looking for:

- `xy => z` (`x` cross `y` is `z`)
- `yz => x` (`y` cross `z` is `x`; we looped around: `y` to `z` to `x`)
- `zx => y`

Now, `xy` and `yx` have opposite signs because they are forward and backward in our `xyzxyz` setup.

So, without a formula, you should be able to calculate:

$$(1, 0, 0) \times (0, 1, 0) = (0, 0, 1)$$

Again, this is because `x` cross `y` is positive `z` in a right-handed coordinate system. I used unit vectors, but we could scale the terms:

$$(a, 0, 0) \times (0, b, 0) = (0, 0, ab)$$
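The `xyzxyz` trick translates almost literally into code. A small Python sketch follows; the function name `cross_axes` and its return format are made up here, and the identical-axes case is added on the assumption that parallel axes produce no cross product.

```python
def cross_axes(first, second):
    """Cross two axis names; return (sign, axis), or (0, None) for identical axes."""
    if first == second:
        return 0, None
    # Forward pairs (xy, yz, zx) appear inside "xyzxyz"; backward pairs do not
    sign = 1 if first + second in "xyzxyz" else -1
    (third,) = set("xyz") - {first, second}
    return sign, third

print(cross_axes("x", "y"))  # (1, 'z')
print(cross_axes("y", "x"))  # (-1, 'z')
print(cross_axes("z", "x"))  # (1, 'y')
```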

A single vector can be decomposed into its 3 orthogonal parts:

$$\vec{a} = a_x \vec{i} + a_y \vec{j} + a_z \vec{k}$$

When the vectors are crossed, each pair of orthogonal components (like $a_x \times b_y$) casts a vote for where the orthogonal vector should point. 6 components, 6 votes, and their total is the cross product. (Similar to the gradient, where each axis casts a vote for the direction of greatest increase.)

- `xy => z` and `yx => -z` (assume $\vec{a}$ is first, so `xy` means $a_x b_y$)
- `yz => x` and `zy => -x`
- `zx => y` and `xz => -y`

`xy` and `yx` fight it out in the `z` direction. If those terms are equal, such as in $(2, 1, 0) \times (2, 1, 1)$, there is no cross product component in the `z` direction (2 – 2 = 0).

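The final combination is:

$$\vec{a} \times \vec{b} = (a_y b_z - a_z b_y)\vec{i} + (a_z b_x - a_x b_z)\vec{j} + (a_x b_y - a_y b_x)\vec{k} = |a| |b| \sin(\theta)\, \vec{n}$$

where $\vec{n}$ is the unit vector normal to $\vec{a}$ and $\vec{b}$.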

Don’t let this scare you:

- There’s 6 terms, 3 positive and 3 negative
- Two dimensions vote on the third (so the `z` term must only have `y` and `x` components)
- The positive/negative order is based on the `xyzxyz` pattern

If you like, there is an algebraic proof that the formula is both orthogonal and of size $|a| |b| \sin(\theta)$, but I like the “proportional voting” intuition.

Again, we should do simple cross products in our head:

$$(1, 0, 0) \times (0, 1, 0) = (0, 0, 1)$$

Why? We crossed the `x` and `y` axes, giving us `z` (or $\vec{i} \times \vec{j} = \vec{k}$, using those unit vectors). Crossing the other way gives $-\vec{k}$.

Here’s how I walk through more complex examples:

$$(1, 2, 3) \times (4, 5, 6) = \ ?$$

- Let’s do the last term, the z-component. That’s (1)(5) minus (4)(2), or 5 – 8 = -3. I did `z` first because it uses `x` and `y`, the first two terms. Try seeing (1)(5) as “forward” as you scan from the first vector to the second, and (4)(2) as backwards as you move from the second vector to the first.
- Now the `y` component: (3)(4) – (6)(1) = 12 – 6 = 6
- Now the `x` component: (2)(6) – (5)(3) = 12 – 15 = -3

So, the total is (-3, 6, -3) which we can verify with Wolfram Alpha.
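If Wolfram Alpha isn’t handy, the same walk-through fits in a few lines of Python. This is a plain sketch of the component-by-component calculation; the function name `cross` is just a convenient label.

```python
def cross(a, b):
    """Cross product of two 3d vectors, one component at a time."""
    ax, ay, az = a
    bx, by, bz = b
    return (
        ay * bz - az * by,  # x component: y and z vote
        az * bx - ax * bz,  # y component: z and x vote
        ax * by - ay * bx,  # z component: x and y vote
    )

print(cross((1, 2, 3), (4, 5, 6)))  # (-3, 6, -3)
print(cross((2, 1, 0), (2, 1, 1)))  # (1, -2, 0): the z votes cancel
```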

In short:

- The cross product tracks all the “cross interactions” between dimensions
- There are 6 interactions (2 in each dimension), with signs based on the `xyzxyz` order

**Connection with the Determinant**

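You can calculate the cross product using the determinant of this matrix:

$$\vec{a} \times \vec{b} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$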

There’s a neat connection here, as the determinant (“signed area/volume”) tracks the contributions from orthogonal components.

There are theoretical reasons why the cross product (as an orthogonal vector) is only available in 0, 1, 3 or 7 dimensions. However, the cross product as a single number is essentially the determinant (a signed area, volume, or hypervolume as a scalar).
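As a small illustration of the “single number” version, here is the 2d case in Python: the 2x2 determinant gives the signed area spanned by two vectors. The function name `signed_area` is an assumption for this sketch.

```python
def signed_area(a, b):
    """2x2 determinant: signed area of the parallelogram spanned by 2d vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]

print(signed_area((3, 0), (0, 2)))  # 6
print(signed_area((0, 2), (3, 0)))  # -6: swapping the order flips the sign
print(signed_area((1, 2), (2, 4)))  # 0: parallel vectors span no area
```

This is the same `xy` vs. `yx` contest that decides the `z` component in the 3d formula.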

**Connection with Curl**

Curl measures the twisting force a vector field applies to a point, and is measured with a vector perpendicular to the surface. Whenever you hear “perpendicular vector” start thinking “cross product”.

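We take the “determinant” of this matrix:

$$\nabla \times \vec{F} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}$$

Instead of multiplication, the interaction is taking a partial derivative. As before, the $\vec{i}$ component of curl is based on the vectors and derivatives in the $\vec{j}$ and $\vec{k}$ directions.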

**Relation to the Pythagorean Theorem**

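The cross and dot product are like the orthogonal sides of a triangle:

$$|\vec{a} \times \vec{b}|^2 + (\vec{a} \cdot \vec{b})^2 = |a|^2 |b|^2$$

For unit vectors, where |a| = |b| = 1, we have:

$$|\vec{a} \times \vec{b}|^2 + (\vec{a} \cdot \vec{b})^2 = \sin^2(\theta) + \cos^2(\theta) = 1$$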

I cheated a bit in the grid diagram, as we have to track the squared magnitudes (as done in the Pythagorean Theorem).
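A quick numeric check of this Pythagorean relationship, sketched in Python with hypothetical helper names. It also confirms the cross product’s size is |a| |b| sin(theta) for one sample pair of vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

a, b = (2, 1, 0), (2, 1, 1)  # sample vectors
c = cross(a, b)

lhs = dot(c, c) + dot(a, b) ** 2  # |a x b|^2 + (a . b)^2
rhs = dot(a, a) * dot(b, b)       # |a|^2 |b|^2
print(lhs, rhs)                   # 30 30

# Size of the cross product vs. |a| |b| sin(theta)
theta = math.acos(dot(a, b) / math.sqrt(rhs))
print(math.sqrt(dot(c, c)), math.sqrt(rhs) * math.sin(theta))
```

The last line should print two nearly identical values (about 2.236, the square root of 5).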

**Advanced Math**

The cross product & friends get extended in Clifford Algebra and Geometric Algebra. I’m still learning these.

**Cross Products of Cross Products**

Sometimes you’ll have a scenario like:

$$(\vec{a} \times \vec{b}) \times \vec{c}$$

First, the cross product isn’t associative: the order of operations matters.

Next, remember what the cross product is doing: finding orthogonal vectors. If any two components are parallel ($\vec{a}$ parallel to $\vec{b}$) then there are no dimensions pushing on each other, and the cross product is zero (which carries through to $0 \times \vec{c}$).

But it’s ok for $\vec{a}$ and $\vec{c}$ to be parallel, since they are never directly involved in a cross product, for example:

$$(\vec{i} \times \vec{j}) \times \vec{i} = \vec{k} \times \vec{i} = \vec{j}$$

Whoa! How’d we get back to $\vec{j}$? We asked for a direction perpendicular to both $\vec{i}$ and $\vec{j}$, and made that direction perpendicular to $\vec{i}$ again. Being “doubly perpendicular” means you’re back on the original axis.
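The grouping effects are easy to poke at in code. A minimal Python sketch, with a hypothetical `cross` helper and the unit vectors written out as tuples:

```python
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(cross(i, j), i))  # (0, 1, 0): doubly perpendicular, back to j
print(cross(cross(i, i), j))  # (0, 0, 0): parallel inputs zero everything out
print(cross(i, cross(i, j)))  # (0, -1, 0): same vectors, different grouping
```

The last two lines use the same three vectors in the same order, yet give different answers, which is what “not associative” means in practice.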

**Dot Product of Cross Products**

Now if we take

$$\vec{a} \times \vec{b} \cdot \vec{c}$$

what happens? We’re forced to do $\vec{a} \times \vec{b}$ first, because $\vec{b} \cdot \vec{c}$ returns a scalar (single number) which can’t be used in a cross product.

If $\vec{a}$ and $\vec{c}$ are parallel, what happens? Well, $\vec{a} \times \vec{b}$ is perpendicular to $\vec{a}$, which means it’s perpendicular to $\vec{c}$, so the dot product with $\vec{c}$ will be zero.
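Here is that reasoning as a short Python sketch. The helper names, including `triple`, are assumptions for illustration, and the sample vectors are chosen so that the third is parallel to the first.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple(a, b, c):
    """Cross a with b first, then dot the result with c."""
    return dot(cross(a, b), c)

a, b = (1, 2, 3), (4, 5, 6)
c = (2, 4, 6)  # parallel to a (twice as long)

print(triple(a, b, c))                          # 0
print(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # 1
```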

I never really memorized these rules; I have to think through the interactions.

**Other Coordinate Systems**

The Unity game engine is left-handed; OpenGL (and most math/physics tools) are right-handed. Why?

In a computer game, `x` goes horizontal, `y` goes vertical, and `z` goes “into the screen”. This results in a left-handed system. (Try it: using your right hand, you can see `x` cross `y` should point out of the screen).

**Applications of the Cross Product**

- Find the direction perpendicular to two given vectors.
- Find the signed area spanned by two vectors.
- Determine if two vectors are orthogonal (checking for a dot product of 0 is likely faster though).
- “Multiply” two vectors when only perpendicular cross-terms make a contribution (such as finding torque).
- With the quaternions (4d complex numbers), the cross product performs the work of rotating one vector around another (another article in the works!).

Happy math.

]]>