Let's try a broader interpretation: **The Pythagorean Theorem explains how 2D area can be combined.**

Here's what I mean. Suppose we have two lines lying around (the creatively named *Line A* and *Line B*). We can spin them to create area:

Ok, fun enough. Where's the mystery?

Well, what happens if we *combine* the line segments before spinning them?

Whoa. The area swept out seems to change. Should simply *moving* the lines, not lengthening them, change the area?

Eyeballing the diagram above, it sure seems like the area grew. Let's work out the specifics.

As an example, suppose $a = 6$ and $b = 8$. When they're swept into circles ($\text{area} = \pi r^2$) we get:

- Circle A: $\pi \cdot 6^2 = 36\pi$
- Circle B: $\pi \cdot 8^2 = 64\pi$

For a total of $36\pi + 64\pi = 100\pi$.

The combined segment has length $c = a + b = 14$, and when we spin it we get:

$\pi \cdot 14^2 = 196\pi$

Uh oh. That's way more area than before.

What happened? Well, Circle A didn't change. But Circle B is much smaller than Ring B (just look at it!).

The issue: When Line B spins on its own, it can only reach 8 units out as it sweeps. When we attach Line B to Line A, it reaches out 6 + 8 = 14 units. Now the circular sweep covers more area, meaning Circle B is smaller than Ring B.

Mathematically, here's what happened.

Ignore $\pi$ for a moment since it's a common term. When expanding $c^2 = (a + b)^2 = a^2 + 2ab + b^2$, there's a new $2ab$ term that has to go somewhere. Because Circle A doesn't change, this extra area must appear in Ring B.
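To check the bookkeeping, here is a minimal Python sketch using the example lengths $a = 6$ and $b = 8$. The helper names `circle_area` and `ring_area` are made up for illustration; the point is that the surplus in Ring B should match the $2ab$ term (times $\pi$).

```python
import math

def circle_area(radius: float) -> float:
    """Area swept when a segment of this length spins about one end."""
    return math.pi * radius ** 2

def ring_area(inner: float, outer: float) -> float:
    """Area of the band between two concentric circles."""
    return circle_area(outer) - circle_area(inner)

a, b = 6, 8

separate = circle_area(a) + circle_area(b)  # 36*pi + 64*pi
combined = circle_area(a + b)               # one circle of radius 14
ring_b = ring_area(a, a + b)                # what Line B sweeps from the tip of Line A

# The surplus of Ring B over Circle B should be the 2ab term, times pi.
extra = ring_b - circle_area(b)

# Print everything in units of pi to compare with the text.
print(separate / math.pi, combined / math.pi, extra / math.pi)
```

Dividing by `math.pi` before printing keeps the numbers in the same "ignore $\pi$" units used above.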

It... sort of makes sense that the area changes, but I don't like it. Just moving things around shouldn't have this effect! Can the area ever be the same?

Sure, if we remove the $2ab$ term. The easy fix is to set $a=0$, but that's cheating and you know it.

Let's find a clever solution. Intuitively, the question is: **How can Line A's length not help Line B as it spins?**

Tilt it! As we rotate Line B, there's less benefit from Line A's length. Ladders are useless when lying on the floor, right?

When we go Full Perpendicular™, the $2ab$ term disappears and Circle B = Ring B. (In vector terms, the dot product is zero: $a \cdot b = 0$).
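Here's a rough sketch of the tilt, assuming we measure the angle between Line A's direction and Line B (0 degrees means a straight extension, 90 means perpendicular). It uses the Law of Cosines in the form $c^2 = a^2 + b^2 + 2ab\cos\theta$; the helper `ring_b_area` is a name I made up, and it treats Ring B as Circle C minus Circle A.

```python
import math

def ring_b_area(a: float, b: float, tilt_deg: float) -> float:
    """Circle C minus Circle A when Line B is attached to the tip of Line A.

    tilt_deg is the angle between Line A's direction and Line B
    (0 = straight extension, 90 = perpendicular). Hypothetical helper.
    """
    theta = math.radians(tilt_deg)
    # Law of Cosines, written with the angle between the two directions.
    c_squared = a**2 + b**2 + 2 * a * b * math.cos(theta)
    return math.pi * (c_squared - a**2)

a, b = 6, 8
circle_b = math.pi * b**2

for tilt in (0, 30, 60, 90, 120):
    surplus = (ring_b_area(a, b, tilt) - circle_b) / math.pi
    print(f"tilt {tilt:>3} deg: Ring B - Circle B = {surplus:+.1f} pi")
```

The surplus shrinks as the tilt grows, vanishes at 90 degrees, and turns negative once Line B starts pointing back inside.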

Ah -- that's the meaning of the Pythagorean Theorem. **When line segments are perpendicular, the same area is swept whether the lines are combined or separated.**

It's not a bad idea to make sure the numbers line up.

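Since the segments are now perpendicular, we know $c^2 = a^2 + b^2$, so:

$c^2 - a^2 = b^2$

Now we can calculate:

$\text{Ring B} = \pi c^2 - \pi a^2 = \pi (c^2 - a^2) = \pi b^2 = \text{Circle B}$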

Tada! The Ring and Circle sweep the same area.

In our example, we have Circle A = $36\pi$, Circle B = $64 \pi$, $c = \sqrt{36 + 64} = 10$. The ring width is $10 - 6 = 4$, and the ring's area is $\pi (10^2 - 6^2) = 64\pi$, the same as Circle B.

The Pythagorean Theorem is about more than triangles. When components are perpendicular, the area they make is independent of how they are arranged.

- The Law of Cosines explicitly shows the $2ab$ term, which is assumed to be zero in the Pythagorean Theorem. The area of Ring B can even be "negative" if we tilt Line B to point inside.
- We can combine area from multiple dimensions ($x^2 + y^2 + z^2 + ...$). As long as they are mutually perpendicular, the areas swept by the individual dimensions add up to the area swept by the total.
- The Pythagorean Theorem is a relationship in the 2D area domain ($c^2 = a^2 + b^2$). We start here and convert this to a relationship in the 1D domain ($c = \sqrt{a^2 + b^2}$). The conversion happens so often we forget where it began.
- More on sweeping area: https://www.cut-the-knot.org/Curriculum/Geometry/PythFromRing.shtml

Happy math.

The standard quadratic formula is a lot to remember:

$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

It's a maze of numbers, letters, and square roots. It's derived from "completing the square" on a general quadratic equation ($ax^2 + bx + c = 0$). There are several good explanations for the standard formula; here's my intuition for a variation.

Here's our typical starting equation:

$ax^2 + bx + c = 0$

First off, why leave $a$ hanging around? Divide that fella out, and get:

$x^2 + \frac{b}{a}x + \frac{c}{a} = 0$

In fact, pretend $a$ was never there. You'd combine similar terms ($3x + 4x = 7x$) before doing any other work, right? In a similar way, demand that $x^2$ appear by itself (with no coefficients) before you begin solving.

After dividing out any coefficient attached to $x^2$, our equation is in the format:

$x^2 + bx + c = 0$

Ah, that's a better starting point.

(Note: $b$ and $c$ are what we label the coefficients after all simplifications. For example, when starting with $3x^2 + 6x + 9 = 0$ and simplifying to $x^2 + 2x + 3 = 0$, we'd assign $b=2$ and $c=3$.)

Let's put on our geometry goggles and assume our quadratic equation refers to area:

An ongoing insight is that math doesn't have dimensions or units -- just raw quantities. We can *decide* that in this scenario, every quantity refers to the area of a 2d shape:

- $x^2$ is our square ($x * x$)
- $bx$ is a rectangular overhang ($b * x$)
- $c$ is an offset independent of $x$ ($c * 1$)

Solving the equation means: what length $x$ makes the square, overhang, and offset cancel to zero?

Without an offset ($x^2 + bx = 0$), canceling the total area is easy: just set $x = 0$ or $x = -b$, which collapses one side of the rectangle or the other. (Note that $x$ can have *negative* length, to cancel the width of the overhang.)

But that offset makes us do extra work.

The trick to canceling the offset is completing the square. First, we move half the overhang to the top of the square:

Next, we borrow from the Bank of Zero to fill in the corner:

This part is magic. We can conjure up any quantity if we promise to cancel it later (0 = 1 - 1). So, we borrow material to complete the square, and subtract it again:

Then we can move the extra pieces to the other side:

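Let's fill in some specifics. How big is the corner? Half the overhang ($\frac{b}{2}$), squared. Time for some algebra:

$x^2 + bx + c = 0$

$\left(x + \frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2 + c = 0$

$\left(x + \frac{b}{2}\right)^2 = \left(\frac{b}{2}\right)^2 - c$

$x = -\frac{b}{2} \pm \sqrt{\left(\frac{b}{2}\right)^2 - c}$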

Tada! It's a... slightly less complex quadratic formula.

This equation is simpler than the quadratic formula, but it's still gnarly.

$b$ is the width of the full overhang, and $\frac{b}{2}$ is the piece we move. Since that's the plan, why not write things in terms of the part we want? Let's make $b$ half the overhang:

This means our starting equation can be written:

$x^2 + 2bx + c = 0$

Where $b$ is now the "radius" (not full diameter) of the overhang. Completing the square and solving gives us:

$x = -b \pm \sqrt{b^2 - c}$

Pretty clean!

Let's solve this equation:

$3x^2 + 6x + 24 = 0$

My thought process: first, divide everything by 3. No need to leave things sitting around.

$x^2 + 2x + 8 = 0$

Next, let's find the radius of the overhang. The entire linear coefficient is 2, so the radius is 1. Using the radius formula, we get:

$x = -1 \pm \sqrt{1^2 - 8} = -1 \pm \sqrt{-7} = -1 \pm i\sqrt{7}$

Pretty fast, right?

And to factor the equation (writing it as a set of multiplications) we do:

$3x^2 + 6x + 24 = 3(x + 1 - i\sqrt{7})(x + 1 + i\sqrt{7})$

(Verify with wolfram alpha: roots of 3x^2 + 6x + 24 )
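If you'd rather check with code, here is a small Python sketch of the radius approach, applied to $3x^2 + 6x + 24 = 0$. The function name `solve_by_radius` is hypothetical; it divides out $a$, halves the linear coefficient, and applies $x = -r \pm \sqrt{r^2 - \text{offset}}$. It uses `cmath` so negative values under the square root come back as complex roots.

```python
import cmath

def solve_by_radius(a: complex, b: complex, c: complex) -> tuple[complex, complex]:
    """Solve a*x^2 + b*x + c = 0 using the 'radius' form.

    Hypothetical helper: divide out a, halve the linear coefficient,
    then apply x = -radius +/- sqrt(radius^2 - offset).
    """
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    radius = (b / a) / 2   # half the overhang, after removing a
    offset = c / a
    spread = cmath.sqrt(radius ** 2 - offset)
    return (-radius + spread, -radius - spread)

roots = solve_by_radius(3, 6, 24)
for x in roots:
    # Plug each root back into the original equation; the residual should be ~0.
    print(x, abs(3 * x**2 + 6 * x + 24))
```

Plugging the roots back in is the same sanity check as the Wolfram Alpha query, just done locally.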

Which version of the formula should you use? I'd rather use a simple formula on a simple equation, vs. a complicated formula on a complicated equation.

**Don't be afraid to rewrite equations**

The standard quadratic formula is fine, but I found it hard to memorize. Who says we can't modify equations to fit our thinking? Ideas like "remove $a$ from the equation" and "use the radius, not diameter" simplify things, and nicknames like "square, overhang, offset" make the parts memorable.

Practically, we often memorize the equations we're given, but it doesn't mean you can't try a version that makes sense for you.

**Why are the roots negative**?

It seems strange to have formulas that begin with a negative sign:

$x = -b \pm \sqrt{b^2 - c}$

Typically, we need negative lengths to fight the area added by the overhang and make the area collapse to zero. Depending on the values of $a$, $b$ and $c$, the solutions can be positive, negative, zero, or complex.

**What is negative area?**

This seems to be overlooked in discussions, but when completing the square we can have "negative" area. Negative area is created by sides of imaginary length.

Instead of positive and negative area, I think of colors (green/red). Green area is positive, with real sides (healthy land that grows crops). Red area is negative, with imaginary sides (poisoned land?). The math works out ($3i \cdot 3i = 9 i^2 = 9(-1) = -9$), but our geometric concepts might need some upgrades.

**Moving from 2d to 1d**

Another aha! moment is realizing what happens when we take a square root. We're changing our interpretation from 2 dimensions back down to 1:

Taking the square root is like looking at our shapes *edgewise* and comparing the resulting lengths. The equations have no fixed dimensions -- just interpretations of quantities -- but I like this perspective shift. The unimaginative among us can see completing the square as pure symbol manipulation.

Happy math.

We have a bunch of coefficients ($c_0, c_1...$ ) multiplied by various powers of $x$.

Why are we forced to learn about them? Here are the insights I wish I had.

How many lines are here?

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Why, it's one hundred twenty-three, of course. Oh, you'd probably have preferred the number in its polynomial form:

$1 \cdot 10^2 + 2 \cdot 10^1 + 3 \cdot 10^0$

The digits "123" are simply the coefficients for various powers of 10, written largest to smallest. The "simple" idea of counting by adding powers of 10 is pretty game-changing, right?

We can even switch bases. The general interpretation of "123", for any base, is:

$1 \cdot x^2 + 2 \cdot x^1 + 3 \cdot x^0$

We're already familiar with decimals, so turn the coefficients "123" into a quantity we can understand. For hexadecimal numbers, we plug in $x = 16$ and work out $(1)16^2 + (2)16^1 + (3)16^0 = 291$.

Strictly speaking, 123.4 involves negative powers, since $0.1 = 10^{-1}$. Still, the original notion of polynomials gave us a good start, better than counting with lines in the sand.
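The "digits are coefficients" idea fits in a few lines of Python. This is only a sketch: `digits_to_value` is a made-up name, and it evaluates the polynomial with Horner's method (multiply by the base, add the next digit) rather than computing each power separately.

```python
def digits_to_value(digits: list[int], base: int) -> int:
    """Treat digits (most significant first) as polynomial coefficients
    and evaluate the polynomial at x = base using Horner's method."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

print(digits_to_value([1, 2, 3], 10))  # 123
print(digits_to_value([1, 2, 3], 16))  # 291
```

Same coefficients, different $x$, different quantity.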

Imagine you're taking notes for someone who's too lazy to use a calculator. They might say:

"Take the number 15. Actually, add 3. Multiply by 7. Now multiply by the original number again."

As arithmetic, you might write:

$(15 + 3) \cdot 7 \cdot 15 = 1890$

This is an exact expression, with specific numbers. But maybe your buddy is indecisive.

Making this more general -- with any starting number -- we'd write:

$(x + 3) \cdot 7 \cdot x = 7x^2 + 21x$

Notice that all the intermediate steps have been reduced to 2 terms. A polynomial!

Instead of thinking "When will I see this equation", think "This equation is the simplified version of a bunch of arithmetic". The steps don't look similar (where's the "add 3" portion?) but the polynomial boils everything to the essentials.

So, another big insight: we don't always need to *solve* a polynomial, we can use it to simplify many stages of arithmetic.

But... now that it's in its equation form, we *could* solve it. If we wanted to. Because it seems like a mathy thing to do.

Let's say we want to know when this process hits 100. What input number would that be?

$7x^2 + 21x = 100$

Solving this gives $x \approx -5.57$ and $x \approx 2.57$. (Yes, in the real world we can use tools to solve equations; grinding through the quadratic formula is an intuition for another day.)
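To see that the polynomial really is the boiled-down arithmetic, here is a short Python sketch. `step_by_step` follows the spoken instructions literally, `simplified` is the two-term version $7x^2 + 21x$ you get by multiplying out $(x + 3) \cdot 7 \cdot x$, and the last part solves for the inputs that hit 100. Both function names are just illustrative.

```python
import math

def step_by_step(x: float) -> float:
    """Follow the spoken instructions one at a time."""
    result = x           # "Take the number"
    result = result + 3  # "add 3"
    result = result * 7  # "Multiply by 7"
    result = result * x  # "multiply by the original number again"
    return result

def simplified(x: float) -> float:
    """The same process boiled down to two terms."""
    return 7 * x**2 + 21 * x

print(step_by_step(15), simplified(15))

# When does the process hit 100?  Rearranged: 7x^2 + 21x - 100 = 0
a, b, c = 7, 21, -100
disc = math.sqrt(b**2 - 4 * a * c)
roots = ((-b - disc) / (2 * a), (-b + disc) / (2 * a))
for r in roots:
    # Feed each root back through the original steps as a check.
    print(round(r, 4), round(step_by_step(r), 6))
```

Feeding the roots back through `step_by_step` is the useful part: the literal instructions and the simplified formula agree.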

The big aha is that polynomials are the boiled down version of arithmetic, assuming you can only:

- Add/subtract/multiply/divide by a constant, or:
- Multiply by your original number, x

Strictly speaking, polynomials don't allow for square roots ($\sqrt{x}$) or variable exponents ($2^x$), just the simple operations above.

Why limit ourselves this way? Well, we can see how far we can get.

Linear algebra is limited to linear functions, but even then we can model complex behavior if we chain enough of them together.

Similarly, polynomials are simple, but with *enough terms* we can model pretty complex behavior. The simple structure gives us several nice properties:

- Adding/multiplying polynomials gives us a polynomial
- Divide a polynomial by one of its root factors, $(x - r)$, and get a polynomial (like dividing a composite number by one of its factors)
- Feed polynomials into each other, $f(g(x))$, and get a polynomial
- The derivative / integral of a polynomial is a polynomial, making them easy to optimize
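A few of these properties are easy to poke at in code. The sketch below assumes a polynomial is stored as a plain list of coefficients, lowest power first (so `[3, 2, 1]` means $3 + 2x + x^2$); that representation and the helper names are my own choices for illustration.

```python
def poly_add(p: list[float], q: list[float]) -> list[float]:
    """Add two polynomials given as coefficient lists (lowest power first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p: list[float], q: list[float]) -> list[float]:
    """Multiply two polynomials: every term of p meets every term of q."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_derivative(p: list[float]) -> list[float]:
    """Power Rule on each term: c * x^k becomes k * c * x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

print(poly_add([3, 2, 1], [1, 1]))   # (3 + 2x + x^2) + (1 + x)
print(poly_mul([6, 1], [-2, 1]))     # (6 + x)(-2 + x)
print(poly_derivative([3, 2, 1]))    # d/dx (3 + 2x + x^2)
```

Each function takes coefficient lists in and hands a coefficient list back, which is the "polynomials in, polynomial out" property in miniature.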

Polynomials are like the integers -- simpler than real numbers, and still useful.

What does $x$ mean? Well, it's a number, a quantity, something we want to represent.

What does $x \cdot x = x^2$ mean? It's an *interaction*: a quantity interacting with itself. In the case of length interacting with length, the result is area: length * length = square units. Or, we could have speed interacting with time to make distance: 30 mph for 30 hours is 900 miles.

A big aha: math doesn't care about units. A polynomial does the bookkeeping to show *that* an interaction like $x \cdot x$ happened, along with plain old $x$ and $x^{17}$.

In physics, you can't sensibly write $3cm + 5cm^2$ because the powers of the units don't match. You can't write $3 cm + 5kg$ because the types of units don't match.

But math doesn't care. Numbers are numbers, and "1 + 2*2 = 5" works, even though "2 times 2" represents an interaction.

A polynomial is a general-purpose accounting system where we can throw *anything* in the pot of soup. Will it taste good? Not the polynomial's problem. You'll get a formula and decide what, if anything, to do with it.

Given an input $x$, polynomials show what happens when we perform an ungodly amount of arithmetic to that input.

It seems, if we twiddle enough levers, we can get that polynomial to behave like almost any pattern we choose.

You got it: that's the Taylor series.

With enough terms, our humble polynomial -- composed of *basic arithmetic* -- can model undulating sine waves and other things that give students nightmares.

Each term in the polynomial gets us a better approximation of the original function:
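As a quick experiment, here is a Python sketch that builds the Taylor polynomial for $\sin(x)$ around 0 (the series $x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$) and compares it against `math.sin`. The function name `sin_taylor` and the test point $x = 2$ are arbitrary choices.

```python
import math

def sin_taylor(x: float, terms: int) -> float:
    """Partial sum of the Taylor series for sin(x) around 0:
    x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    for n in range(terms):
        power = 2 * n + 1
        total += (-1) ** n * x ** power / math.factorial(power)
    return total

x = 2.0
for terms in (1, 2, 3, 5, 8):
    approx = sin_taylor(x, terms)
    # Show how far the polynomial is from the real sine at this point.
    print(terms, approx, abs(approx - math.sin(x)))
```

Nothing but addition, multiplication, and division by constants, and the error keeps shrinking as terms are added.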

We can analyze the coefficients in the polynomial and extract the "DNA" inside a function:

If the patterns in the coefficients are similar, the functions are likely related.

Back to the decimal analogy: 12 and 120 are similar, but that's only obvious when I write them as "decimal polynomials". If I spilled 12 and 120 skittles on the floor, you wouldn't notice any obvious connection between the two piles. Well, one might look slightly more appetizing.

One of the big insights in math -- called the Fundamental Theorem of Algebra -- is that a polynomial can be rewritten as a sequence of *multiplications*.

That is, adding powers of $x$ can *also* be seen as multiplying $x$ with various offsets:

$c_3 x^3 + c_2 x^2 + c_1 x + c_0 = c_3 (x - a_1)(x - a_2)(x - a_3)$

There are as many roots ($a_1, a_2, a_3$) as the highest power of $x$ (roots can be used multiple times).

Why is this important? Well, besides being incredibly surprising (*added* powers can be converted to *multiplications*), it makes solving equations much easier (see: why do we factor equations?).

In short, it's not instantly obvious how to satisfy this:

$x^2 + 4x - 12 = 0$

Want to guess answers? Plug in $x$ and it shows up in two terms which need to be balanced. But what about this:

$(x + 6)(x - 2) = 0$

It's the same scenario, factored into multiplications. If *any* factor goes to zero, the entire product becomes zero and we're done. Just pick $(x + 6)$ or $(x - 2)$, make it zero ($x = -6$ or $x = 2$), and we've solved the puzzle.
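Here's a small Python check that the two forms really are the same scenario. Multiplying out $(x + 6)(x - 2)$ gives $x^2 + 4x - 12$, so the sketch compares the added-up and factored versions at several inputs and confirms both vanish at $x = -6$ and $x = 2$. The function names are illustrative.

```python
def expanded(x: float) -> float:
    """Added powers of x: x^2 + 4x - 12."""
    return x**2 + 4 * x - 12

def factored(x: float) -> float:
    """The same polynomial as a product of offsets."""
    return (x + 6) * (x - 2)

# Both forms agree everywhere we look...
for x in range(-8, 5):
    assert expanded(x) == factored(x)

# ...but only the factored form makes the zeros easy to spot.
print(expanded(-6), expanded(2))  # 0 0
```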

Remember how "123" was secretly a polynomial? For fun, using the accursed quadratic formula, we can factor it:

$x^2 + 2x + 3 = (x + 1 - i\sqrt{2})(x + 1 + i\sqrt{2})$

What does this mean? We can get the digits "123" to equal zero if we use a complex base! Not that you'd *want* to... but it's possible.
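For the skeptical, here is a sketch that actually plugs a complex base into the digits "123". It finds the roots of $x^2 + 2x + 3$ with the standard quadratic formula (via `cmath`, since the value under the square root is negative) and then evaluates the digits at each root. `evaluate_digits` is a hypothetical helper.

```python
import cmath

def evaluate_digits(digits, base):
    """Evaluate digits (most significant first) as a polynomial in `base`."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

# Roots of x^2 + 2x + 3 = 0 from the quadratic formula (a=1, b=2, c=3).
disc = cmath.sqrt(2**2 - 4 * 1 * 3)
bases = ((-2 + disc) / 2, (-2 - disc) / 2)

for base in bases:
    # In these bases, the digits "123" add up to (essentially) zero.
    print(base, abs(evaluate_digits([1, 2, 3], base)))
```

The printed magnitudes should be zero up to floating-point rounding.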

A polynomial is simple: just a bunch of powers of x. That means it's easy to use the Power Rule in Calculus to find the derivative, and find the min/max.

The reverse works as well: we can take a bunch of data points, fit a polynomial to them, and estimate the min/max of the sequence. The more terms, the more accurate (but beware overfitting).

A simple structure has its advantages.

Polynomials usually evoke memories of the quadratic equation, or equations force-fit into word problems. Ugh, let's skip past that.

Polynomials are a simple, powerful model used from arithmetic to algebra to calculus. They're as applicable in the "real world" as a 2-digit number.

Happy math.

In math, we can get misleading intuitions about what can (or can't) be rearranged.

After learning addition, we've memorized facts like 2 + 4 = 6. But this might stray into the idea that "whenever I see 2 and 4, I can simplify to 6".

Although 2 + 4 = 6, "baked(2) + baked(4)" is not "baked(6)". Baking unmixed ingredients in the exponential oven we get:

$2^2 + 4^2 \neq 6^2$

We can only confidently say:

$(2 + 4)^2 = 6^2$

We combine the ingredients, *then* bake the result. Exponents, like baking an apple pie, modify the original ingredients so they can't be easily combined later. While we might *recognize* the original 2 and 4, they aren't directly available. Two baked pies can't be smashed together to consolidate the filling.
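A tiny Python sketch of the oven, assuming "baking" means squaring as in the example above (the function name `bake` is made up):

```python
def bake(x: float) -> float:
    """Hypothetical 'oven': squaring stands in for the exponent."""
    return x ** 2

separately = bake(2) + bake(4)   # bake first, combine later
together = bake(2 + 4)           # combine first, then bake

print(separately, together)      # 20 36
# The gap is the cross term 2ab from (a + b)^2 = a^2 + 2ab + b^2.
print(together - separately, 2 * 2 * 4)
```

The gap between the two results is exactly the $2ab$ interaction that only appears when the ingredients are mixed before baking.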

This confusion gummed me up in calculus, when learning derivatives (the bad boy of baking).

In algebra, we internalize rules like:

But our intuition leads us astray when we get to the derivative.

because

Raw polynomials can be multiplied, but the *derivatives* of multiplied polynomials can't be rearranged so easily. Multiplication makes functions interact in a way that makes taking the derivative more complex:

Working through the Product Rule we get:

When learning Calculus, I was confused how standard interactions (like multiplication) needed special handling. I thought I was done learning new rules for "arithmetic".

But no: functions, when multiplied, interact in funky ways. See how each side grows its own sliver of area (`df * g` and `dg * f`)? The functions being multiplied are "baked together" and the overall effect depends on them both, simultaneously. We can't examine them in isolation (e.g., `df` or `dg` by itself).
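We can watch this happen numerically. The sketch below picks $f(x) = x^2$ and $g(x) = x^3$ as example functions (my choice, purely for illustration), estimates derivatives with a central difference, and compares the true derivative of the product against the tempting-but-wrong "multiply the derivatives" shortcut and against the Product Rule.

```python
def derivative(func, x: float, h: float = 1e-6) -> float:
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x: float) -> float:
    return x ** 2

def g(x: float) -> float:
    return x ** 3

def product(x: float) -> float:
    return f(x) * g(x)   # x^5

x = 2.0
actual = derivative(product, x)                                    # d/dx x^5 = 5x^4
naive = derivative(f, x) * derivative(g, x)                        # df * dg, the tempting shortcut
product_rule = derivative(f, x) * g(x) + f(x) * derivative(g, x)   # df * g + dg * f

print(actual, naive, product_rule)
```

At $x = 2$ the true derivative is $5 \cdot 2^4 = 80$; the shortcut lands near 48, while the Product Rule matches.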

Now, there are setups when the inputs *can* be processed separately and combined later (linear algebra). The cooking equivalent might be a smoothie: An apple/banana smoothie mixed with a peach/mango smoothie is the same as blending all ingredients in the beginning.

A common assumption is that operations are usually linear, but $\sin(a + b) \not= \sin(a) + \sin(b)$ and $(a + b)^2 \not= a^2 + b^2$. Sorry, we have to carefully cook the ingredients if we want the math to taste right.
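One way to run the pie-or-smoothie test in code: check whether processing the inputs separately and adding the results matches processing the combined input. This is only a spot check at one pair of inputs (passing it does not prove linearity), and `is_additive` and the `smoothie` function (scaling by 3) are illustrative names.

```python
import math

def is_additive(op, a: float, b: float, tol: float = 1e-9) -> bool:
    """Does op(a + b) equal op(a) + op(b) for this particular pair?"""
    return math.isclose(op(a + b), op(a) + op(b), abs_tol=tol)

def smoothie(x: float) -> float:
    return 3 * x       # a linear operation: blend now or blend later

def pie(x: float) -> float:
    return x ** 2      # baking: the ingredients interact

a, b = 2.0, 4.0
print(is_additive(smoothie, a, b))   # True
print(is_additive(pie, a, b))        # False
print(is_additive(math.sin, a, b))   # False
```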

When our intuition for a math rule doesn't make sense, ask "Are we making a pie, or a smoothie?"
