Vector Calculus – BetterExplained

Vector Calculus: Understanding the Cross Product

kalid — Wed, 15 Apr 2015 15:00:29 +0000

Taking two vectors, we can write every combination of components in a grid:

This completed grid is the outer product, which can be separated into the:

Dot product, the interactions between similar dimensions (x*x, y*y, z*z)
Cross product, the interactions between different dimensions (x*y, y*z, z*x, etc.)

The dot product ($\vec{a} \cdot \vec{b}$) measures similarity because it only accumulates interactions in matching dimensions. It’s a simple calculation with 3 components.

The cross product (written $\vec{a} \times \vec{b}$) has to measure a half-dozen “cross interactions”. The calculation looks complex but the concept is simple: accumulate 6 individual differences for the total difference.
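To make the grid concrete, here is a minimal Python sketch. The helper names (`outer_grid`, `dot_from_grid`, `cross_from_grid`) and the sample vectors are illustrative assumptions. It builds every pairing of components, reads the dot product off the diagonal, and reads the cross product off the differences between mirrored off-diagonal cells.

```python
def outer_grid(a, b):
    """Every combination a[i] * b[j], as a 3x3 grid (list of rows)."""
    return [[ai * bj for bj in b] for ai in a]

def dot_from_grid(grid):
    """Matching dimensions (x*x, y*y, z*z) sit on the diagonal."""
    return sum(grid[i][i] for i in range(3))

def cross_from_grid(grid):
    """Different dimensions sit off the diagonal; each output component
    is the difference of two mirrored cells (6 cells, 3 differences)."""
    x = grid[1][2] - grid[2][1]   # a_y*b_z - a_z*b_y
    y = grid[2][0] - grid[0][2]   # a_z*b_x - a_x*b_z
    z = grid[0][1] - grid[1][0]   # a_x*b_y - a_y*b_x
    return (x, y, z)

a = (1, 2, 3)
b = (4, 5, 6)
grid = outer_grid(a, b)
dot = dot_from_grid(grid)
cross = cross_from_grid(grid)
print(grid)
print(dot, cross)
```

The diagonal and the off-diagonal cells never mix, which is one way to see why the two products answer different questions.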

Instead of thinking “When do I need the cross product?” think “When do I need interactions between different dimensions?”.

Area, for example, is formed by vectors pointing in different directions (the more orthogonal, the better). Indeed, the cross product measures the area spanned by two 3d vectors (source):

(The “cross product” assumes 3d vectors, but the concept extends to higher dimensions.)

Did the key intuition click? Let’s hop into the details.

Defining the Cross Product

The dot product represents the similarity between vectors as a single number:

$$\vec{a} \cdot \vec{b} = |a| |b| \cos(\theta)$$

For example, we can say that North and East are 0% similar since $(0, 1) \cdot (1, 0) = 0$. Or that North and Northeast are 70% similar ($\cos(45^\circ) \approx .707$, remember that trig functions are percentages.) The similarity shows the amount of one vector that “shows up” in the other.

Should the cross product, the difference between vectors, be a single number too?

Let’s try. Sine is the percentage difference, so we could write:

$$|a| |b| \sin(\theta)$$

Unfortunately, we’re missing some details. Let’s say we’re looking down the x-axis: both y and z point 100% away from us. A number like “100%” tells us there’s a big difference, but we don’t know what it is! We need extra information to tell us “the difference between $\vec{x}$ and $\vec{y}$ is this” and “the difference between $\vec{x}$ and $\vec{z}$ is that“.

So, let’s express the cross product as a vector:

The size of the cross product is the numeric “amount of difference” (with $\sin(\theta)$ as the percentage). By itself, this doesn’t distinguish $\vec{x} \times \vec{y}$ from $\vec{x} \times \vec{z}$.
The direction of the cross product is based on both inputs: it’s the direction orthogonal to both (i.e., favoring neither).

Now $\vec{x} \times \vec{y}$ and $\vec{x} \times \vec{z}$ have different results, each with a magnitude indicating they are “100%” different from $\vec{x}$.

(Should the dot product be a vector result too? Well, we’re tracking the similarity between $\vec{a}$ and $\vec{b}$. The similarity measures the overlap between the original vector directions, which we already have.)

Geometric Interpretation

Two vectors determine a plane, and the cross product points in a direction different from both (source):

Here’s the problem: there’s two perpendicular directions. By convention, we assume a “right-handed system” (source):

If you hold your first two fingers like the diagram shows, your thumb will point in the direction of the cross product. I make sure the orientation is correct by sweeping my first finger from $\vec{a}$ to $\vec{b}$. With the direction figured out, the magnitude of the cross product is $|a| |b| \sin(\theta)$, which is proportional to the magnitude of each vector and the “difference percentage” (sine).
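Here is a quick numeric check of both claims (direction and size). It is a sketch under assumptions: the helper names and the two sample vectors are mine, chosen so the answer is easy to eyeball.

```python
import math

def cross(a, b):
    """Cross product of two 3d vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

a = (3.0, 0.0, 0.0)          # along x
b = (2.0, 2.0, 0.0)          # 45 degrees counterclockwise from a, in the xy-plane
c = cross(a, b)

theta = math.acos(dot(a, b) / (norm(a) * norm(b)))
expected_size = norm(a) * norm(b) * math.sin(theta)

print(c)                      # points along +z, as the right-hand rule predicts
print(dot(c, a), dot(c, b))   # both zero: c is orthogonal to a and b
print(norm(c), expected_size) # the two sizes agree (up to rounding)
```

Swapping the arguments, `cross(b, a)`, flips the result to point along -z: sweeping from the second vector to the first reverses the thumb.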

The Cross Product For Orthogonal Vectors

To remember the right hand rule, write the xyz order twice: xyzxyz. Next, find the pattern you’re looking for:

xy => z (x cross y is z)
yz => x (y cross z is x; we looped around: y to z to x)
zx => y

Now, xy and yx have opposite signs because they are forward and backward in our xyzxyz setup.
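The xyzxyz lookup is mechanical enough to write down. A minimal sketch, assuming axes are named by the letters 'x', 'y', 'z' (the function name `axis_cross` and the `(sign, axis)` return shape are my own choices):

```python
def axis_cross(first, second):
    """Cross two axis names using the xyzxyz pattern.

    Returns (sign, axis): sign is +1 for a forward pair, -1 for a
    backward pair, and 0 (with axis None) when an axis is crossed with itself.
    """
    cycle = "xyzxyz"
    if first == second:
        return (0, None)                 # parallel: nothing to cross
    if first + second in cycle:
        start = cycle.index(first + second)
        return (1, cycle[start + 2])     # forward: take the next letter
    start = cycle.index(second + first)
    return (-1, cycle[start + 2])        # backward: same axis, opposite sign

print(axis_cross("x", "y"))   # forward pair
print(axis_cross("y", "x"))   # backward pair
print(axis_cross("z", "x"))   # loops around the pattern
```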

So, without a formula, you should be able to calculate:

Again, this is because x cross y is positive z in a right-handed coordinate system. I used unit vectors, but we could scale the terms:

Calculating The Cross Product

A single vector can be decomposed into its 3 orthogonal parts:

$$\vec{a} = a_x \vec{i} + a_y \vec{j} + a_z \vec{k}$$

When the vectors are crossed, each pair of orthogonal components (like $a_x \times b_y$) casts a vote for where the orthogonal vector should point. 6 components, 6 votes, and their total is the cross product. (Similar to the gradient, where each axis casts a vote for the direction of greatest increase.)

xy => z and yx => -z (assume $\vec{a}$ is first, so xy means $a_x b_y$)
yz => x and zy => -x
zx => y and xz => -y

xy and yx fight it out in the z direction. If those terms are equal, such as in $(2, 1, 0) \times (2, 1, 1)$, there is no cross product component in the z direction (2 – 2 = 0).

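The final combination is:

$$\vec{a} \times \vec{b} = (a_y b_z - a_z b_y)\vec{i} + (a_z b_x - a_x b_z)\vec{j} + (a_x b_y - a_y b_x)\vec{k} = |a| |b| \sin(\theta) \, \vec{n}$$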

where $\vec{n}$ is the unit vector normal to $\vec{a}$ and $\vec{b}$.

Don’t let this scare you:

There’s 6 terms, 3 positive and 3 negative
Two dimensions vote on the third (so the z term must only have y and x components)
The positive/negative order is based on the xyzxyz pattern

If you like, there is an algebraic proof, that the formula is both orthogonal and of size $|a| |b| \sin(\theta)$, but I like the “proportional voting” intuition.
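The “proportional voting” idea translates almost word for word into a loop. This is a sketch, not an optimized routine: the name `cross_by_votes` and the index trick for finding the third axis are my own assumptions.

```python
def cross_by_votes(a, b):
    """Accumulate the 6 cross interactions; each pair of different
    axes votes (with a sign) on the remaining third axis."""
    result = [0, 0, 0]
    for i in range(3):                # axis taken from a (0=x, 1=y, 2=z)
        for j in range(3):            # axis taken from b
            if i == j:
                continue              # matching axes belong to the dot product
            k = 3 - i - j             # the third axis, the one being voted on
            # forward in xyzxyz (x->y, y->z, z->x) is positive, backward is negative
            sign = 1 if (i + 1) % 3 == j else -1
            result[k] += sign * a[i] * b[j]
    return tuple(result)

print(cross_by_votes((2, 1, 0), (2, 1, 1)))   # the z votes cancel: 2 - 2 = 0
```

The loop runs exactly 6 times past the `continue`: 3 positive votes and 3 negative ones, two per output axis.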

Example Time

Again, we should do simple cross products in our head:

$$(1, 0, 0) \times (0, 1, 0) = (0, 0, 1)$$

Why? We crossed the x and y axes, giving us z (or $\vec{i} \times \vec{j} = \vec{k}$, using those unit vectors). Crossing the other way gives $-\vec{k}$.

Here’s how I walk through more complex examples:

$$(1, 2, 3) \times (4, 5, 6)$$

Let’s do the last term, the z-component. That’s (1)(5) minus (4)(2), or 5 – 8 = -3. I did z first because it uses x and y, the first two terms. Try seeing (1)(5) as “forward” as you scan from the first vector to the second, and (4)(2) as backwards as you move from the second vector to the first.
Now the y component: (3)(4) – (6)(1) = 12 – 6 = 6
Now the x component: (2)(6) – (5)(3) = 12 – 15 = -3

So, the total is $(-3, 6, -3)$ which we can verify with Wolfram Alpha.

In short:

The cross product tracks all the “cross interactions” between dimensions
There are 6 interactions (2 in each dimension), with signs based on the xyzxyz order

Appendix

Connection with the Determinant

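You can calculate the cross product using the determinant of this matrix:

$$\vec{a} \times \vec{b} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$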

There’s a neat connection here, as the determinant (“signed area/volume”) tracks the contributions from orthogonal components.

There are theoretical reasons why the cross product (as an orthogonal vector) is only available in 0, 1, 3 or 7 dimensions. However, the cross product as a single number is essentially the determinant (a signed area, volume, or hypervolume as a scalar).

Connection with Curl

Curl measures the twisting force a vector field applies to a point, and is measured with a vector perpendicular to the surface. Whenever you hear “perpendicular vector” start thinking “cross product”.

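We take the “determinant” of this matrix:

$$\nabla \times \vec{F} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}$$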

Instead of multiplication, the interaction is taking a partial derivative. As before, the $\vec{i}$ component of curl is based on the vectors and derivatives in the $\vec{j}$ and $\vec{k}$ directions.
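To see the “partial derivatives instead of multiplication” idea in action, here is a rough numeric sketch. Everything in it is an assumption for illustration: the function names, the step size `h`, and the sample field $(-y, x, 0)$, a simple whirlpool spinning counterclockwise around the z-axis.

```python
def curl(field, point, h=1e-5):
    """Estimate curl at a point using central differences.
    `field` takes (x, y, z) and returns a 3-tuple (Fx, Fy, Fz)."""
    def partial(component, axis):
        # d(field[component]) / d(axis), estimated numerically
        lo = list(point)
        hi = list(point)
        lo[axis] -= h
        hi[axis] += h
        return (field(*hi)[component] - field(*lo)[component]) / (2 * h)

    # Same xyzxyz pattern as the cross product: two axes vote on the third.
    return (partial(2, 1) - partial(1, 2),    # dFz/dy - dFy/dz
            partial(0, 2) - partial(2, 0),    # dFx/dz - dFz/dx
            partial(1, 0) - partial(0, 1))    # dFy/dx - dFx/dy

def whirlpool(x, y, z):
    return (-y, x, 0.0)

print(curl(whirlpool, (1.0, 2.0, 3.0)))   # roughly (0, 0, 2): the twist is around z
```

The x and y parts of the field only produce a z component of curl, just as $a_x b_y$ and $a_y b_x$ only vote on the z direction.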

Relation to the Pythagorean Theorem

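The cross and dot product are like the orthogonal sides of a triangle:

$$|\vec{a} \times \vec{b}|^2 + (\vec{a} \cdot \vec{b})^2 = |a|^2 |b|^2$$

For unit vectors, where $|a| = |b| = 1$, we have:

$$|\vec{a} \times \vec{b}|^2 + (\vec{a} \cdot \vec{b})^2 = \sin^2(\theta) + \cos^2(\theta) = 1$$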

I cheated a bit in the grid diagram, as we have to track the squared magnitudes (as done in the Pythagorean Theorem).

Advanced Math

The cross product & friends get extended in Clifford Algebra and Geometric Algebra. I’m still learning these.

Cross Products of Cross Products

Sometimes you’ll have a scenario like:

$$(\vec{a} \times \vec{b}) \times \vec{c}$$

First, the cross product isn’t associative: order matters.

Next, remember what the cross product is doing: finding orthogonal vectors. If any two components are parallel ($\vec{a}$ parallel to $\vec{b}$) then there are no dimensions pushing on each other, and the cross product is zero (which carries through to $0 \times \vec{c}$).

But it’s ok for $\vec{a}$ and $\vec{c}$ to be parallel, since they are never directly involved in a cross product, for example:

$$(\vec{i} \times \vec{j}) \times \vec{i} = \vec{k} \times \vec{i} = \vec{j}$$

Whoa! How’d we get back to $\vec{j}$? We asked for a direction perpendicular to both $\vec{i}$ and $\vec{j}$, and made that direction perpendicular to $\vec{i}$ again. Being “doubly perpendicular” means you’re back on the original axis.
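These rules are easy to poke at in code. A small sketch (the `cross` helper and variable names are my own) that checks the “doubly perpendicular” result and shows why grouping matters:

```python
def cross(a, b):
    """Cross product of two 3d vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

doubly_perpendicular = cross(cross(i, j), i)   # (i x j) x i = k x i
inner_parallel = cross(cross(i, i), j)         # (i x i) x j: inner pair is parallel
regrouped = cross(i, cross(i, j))              # i x (i x j) = i x k

print(doubly_perpendicular)   # back on the j axis
print(inner_parallel)         # zero vector
print(regrouped)              # not zero, so the grouping changed the answer
```

The last two lines use the same three vectors in the same order, and only the parentheses differ.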

Dot Product of Cross Products

Now if we take

$$\vec{a} \times \vec{b} \cdot \vec{c}$$

what happens? We’re forced to do $\vec{a} \times \vec{b}$ first, because $\vec{b} \cdot \vec{c}$ returns a scalar (single number) which can’t be used in a cross product.

If $\vec{a}$ and $\vec{c}$ are parallel, what happens? Well, $\vec{a} \times \vec{b}$ is perpendicular to $\vec{a}$, which means it’s perpendicular to $\vec{c}$, so the dot product with $\vec{c}$ will be zero.

I never really memorized these rules, I have to think through the interactions.
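When I'm unsure, a two-line check beats memorizing. A sketch with made-up sample vectors (the names `cross`, `dot`, and `triple` are assumptions):

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triple(a, b, c):
    """(a x b) . c -- the cross product has to happen first."""
    return dot(cross(a, b), c)

a = (1, 2, 3)
b = (4, 5, 6)

print(triple(a, b, (2, 4, 6)))   # c is parallel to a (c = 2a), so the result is 0
print(triple(a, b, (0, 0, 1)))   # c picks out the z-component of a x b
```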

Other Coordinate Systems

The Unity game engine is left-handed, OpenGL (and most math/physics tools) are right-handed. Why?

In a computer game, x goes horizontal, y goes vertical, and z goes “into the screen”. This results in a left-handed system. (Try it: using your right hand, you can see x cross y should point out of the screen).

Applications of the Cross Product

Find the direction perpendicular to two given vectors.
Find the signed area spanned by two vectors.
Determine if two vectors are orthogonal (checking for a dot product of 0 is likely faster though).
“Multiply” two vectors when only perpendicular cross-terms make a contribution (such as finding torque).
With the quaternions (4d complex numbers), the cross product performs the work of rotating one vector around another (another article in the works!).

Happy math.

Vector Calculus: Understanding the Dot Product

kalid — Wed, 22 Feb 2012 23:00:31 +0000

I think of the dot product as directional multiplication. Multiplication goes beyond repeated counting: it's applying the essence of one item to another. (For example, complex multiplication is rotation, not repeated counting.)

When dealing with simple growth rates, multiplication scales one rate by another:

"3 x 4" can mean "Take your 3x growth and make it 4x as large, to get 12x"

When dealing with vectors ("directional growth"), there's a few operations we can do:

Add vectors: Accumulate the growth contained in several vectors.
Multiply by a constant: Make an existing vector stronger (in the same direction).
Dot product: Apply the directional growth of one vector to another. The result is how much stronger we've made the original vector (positive, negative, or zero).

Today we'll build our intuition for how the dot product works.

Getting the Formula Out of the Way

You've seen the dot product equation everywhere:

$$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y = |a| |b| \cos(\theta)$$

And also the justification: "Well Billy, the Law of Cosines (you remember that, don't you?) says the following calculations are the same, so they are." Not good enough -- it doesn't click! Beyond the computation, what does it mean?

The goal is to apply one vector to another. The equation above shows two ways to accomplish this:

Rectangular perspective: combine x and y components
Polar perspective: combine magnitudes and angles

The "this stuff = that stuff" equation just means "Here are two equivalent ways to 'directionally multiply' vectors".

Seeing Numbers as Vectors

Let's start simple, and treat 3 x 4 as a dot product:

$$(3, 0) \cdot (4, 0) = 12$$

The number 3 is "directional growth" in a single dimension (the x-axis, let's say), and 4 is "directional growth" in that same direction. 3 x 4 = 12 means we get 12x growth in a single dimension. Ok.

Now, suppose 3 and 4 refer to different dimensions. Let's say 3 means "triple your bananas" (x-axis) and 4 means "quadruple your oranges" (y-axis). Now they're not the same type of number: what happens when we apply growth (use the dot product) in our "bananas, oranges" universe?

(3,0) means "Triple your bananas, destroy your oranges"
(0,4) means "Destroy your bananas, quadruple your oranges"

Applying (0,4) to (3,0) means "Destroy your banana growth, quadruple your orange growth". But (3, 0) had no orange growth to begin with, so the net result is 0 ("Destroy all your fruit, buddy").

See how we're "applying" and not simply adding? With regular addition, we smush the vectors together: (3,0) + (0, 4) = (3, 4) [a vector which triples your bananas and quadruples your oranges].

"Application" is different. We're mutating the original vector based on the rules of the second. And the rules of (0, 4) are "Destroy your banana growth, and quadruple your orange growth." When applied to something with only bananas, like (3, 0), we're left with nothing.

The final result of the dot product process can be:

Zero: we don't have any growth in the original direction
Positive number: we have some growth in the original direction
Negative number: we have negative (reverse) growth in the original direction

Understanding the Calculation

"Applying vectors" is still a bit abstract. I think "How much energy/push is one vector giving to the other?". Here's how I visualize it:

Rectangular Coordinates: Component-by-component overlap

Like multiplying complex numbers, see how each x- and y-component interacts:

We list out all four combinations (x with x, y with x, x with y, y with y). Since the x- and y-coordinates don't affect each other (like holding a bucket sideways under a waterfall -- nothing falls in), the total energy absorption is absorption(x) + absorption(y):

Polar coordinates: Projection

The word "projection" is so sterile: I prefer "along the path". How much energy is actually going in our original direction?

Here's one way to see it:

Take two vectors, a and b. Rotate our coordinates so b is horizontal: it becomes (|b|, 0), and everything is on this new x-axis. What's the dot product now? (It shouldn't change just because we tilted our head).

Well, vector a has new coordinates (a1, a2), and we get:

$$\vec{a} \cdot \vec{b} = a_1 |b| + a_2 \cdot 0 = a_1 |b|$$

a1 is really "What is the x-coordinate of a, assuming b is the x-axis?". That is |a|cos(θ), aka the "projection":

Analogies for the Dot Product

The common interpretation is "geometric projection", but it's so bland. Here's some analogies that click for me:

Energy Absorption

One vector is the solar rays, the other is where the solar panel is pointing (yes, yes, the normal vector). Larger numbers mean stronger rays or a larger panel. How much energy is absorbed?

Energy = Overlap in direction * Strength of rays * Size of panel

If you hold your panel sideways to the sun, no rays hit (cos(θ) = 0).

But... but... solar rays are leaving the sun, and the panel is facing the sun, and the dot product is negative when vectors are opposed! Take a deep breath, and remember the goal is to embrace the analogy (besides, physicists lose track of negative signs all the time).

Mario-Kart Speed Boost

In Mario Kart, there are "boost pads" on the ground that increase your speed (Never played? I'm sorry.)

Imagine the red vector is your speed (x and y direction), and the blue vector is the orientation of the boost pad (x and y direction). Larger numbers are more power.

How much boost will you get? For the analogy, imagine the pad gives a speed bonus like this:

If you come in going 0, you'll get nothing. (If you are dropped onto the pad, there's no boost.)
If you cross the pad perpendicularly, you'll get 0 benefit. (Just like the banana obliteration, there's 0x boost in the perpendicular direction.)
If our direction and pad are aligned, our x-speed contributes an x-boost, and our y-speed gives us a y-boost:

Neat, eh? Another way to see it: your incoming speed is $|a|$, and the max boost is $|b|$. The percentage of boost you actually get (based on how you're lined up) is $\cos(\theta)$, for an overall boost of $|a||b|\cos(\theta)$, which is the dot product.
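Both ways of computing the boost can be checked against each other. A sketch for the analogy only: the function names and the sample speeds and pads are invented, and the angle is measured with `math.atan2`.

```python
import math

def boost_by_components(speed, pad):
    """Rectangular view: x-speed meets x-boost, y-speed meets y-boost."""
    return speed[0] * pad[0] + speed[1] * pad[1]

def boost_by_angle(speed, pad):
    """Polar view: |a| * |b| * cos(theta)."""
    theta = math.atan2(pad[1], pad[0]) - math.atan2(speed[1], speed[0])
    return math.hypot(*speed) * math.hypot(*pad) * math.cos(theta)

cases = {
    "aligned":       ((3.0, 4.0), (6.0, 8.0)),
    "perpendicular": ((3.0, 4.0), (-4.0, 3.0)),
    "dropped on":    ((0.0, 0.0), (6.0, 8.0)),
}
for name, (speed, pad) in cases.items():
    print(name, boost_by_components(speed, pad), round(boost_by_angle(speed, pad), 9))
```

The two columns agree up to floating-point rounding, which is the "this stuff = that stuff" equation in numbers.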

Fruit Stand Analogy

Let's say your store sells apples, bananas, and clementines. They cost \$1, \$2, and \$3 each, respectively.

A customer wants to buy 2 apples, 3 bananas, and 4 clementines. What does it cost?

cost = (A quantity) * (A price) + (B quantity) * (B price) + (C quantity) * (C price) 
cost = 2*1 + 3*2 + 4*3 = 20

This is the dot product between the "quantity" vector and the "price" vector! We're multiplying the matching entries and getting the total. We ignore entries that don't "make sense" to multiply (why should the banana quantity and clementine price impact each other?).
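The same purchase in Python, as a tiny sketch (the dictionary layout is my own choice; the numbers are the ones above):

```python
quantity = {"apples": 2, "bananas": 3, "clementines": 4}
price = {"apples": 1, "bananas": 2, "clementines": 3}   # dollars each

# Multiply matching entries only, then total them up: a dot product.
cost = sum(quantity[fruit] * price[fruit] for fruit in quantity)
print(cost)
```

Keying both vectors by fruit name makes the "matching entries" rule explicit: there is simply no place where banana quantity could meet clementine price.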

Physics Physics Physics

The dot product appears all over physics: some field (electric, gravitational) is pulling on some particle. We'd love to multiply, and we could if everything were lined up. But that's never the case, so we take the dot product to account for potential differences in direction.

It's all a useful generalization: Integrals are "multiplication, taking changes into account" and the dot product is "multiplication, taking direction into account".

And what if your direction is changing? Why, take the integral of the dot product, of course!

Onward and Upward

Don't settle for "Dot product is the geometric projection, justified by the law of cosines". Find the analogies that click for you! Happy math.

Understanding Pythagorean Distance and the Gradient

kalid — Fri, 04 Nov 2011 16:24:48 +0000

The Pythagorean Theorem shows how strange our concept of distance is. Using the rule $a^2 + b^2 = c^2$, we can trade some "a" to get more "b".

Starting with

$$13^2 + 0^2 = 13^2$$

means "A 13-inch pizza equals a 13-inch pizza". Sure. But we can trade an inch and get:

$$12^2 + 5^2 = 13^2$$

Huh? A 12-inch pizza and a 5-inch pizza equal a 13-inch pizza?

The math works (144 + 25 = 169) but, but... we gave up an inch and got a five-inch pizza!

Let's understand why the tradeoff happens, and how to use it.

Explanation 1: Shaving the Square

A key insight: Bigger numbers are harder to square.

Imagine laying tiles on a porch -- as your porch grows, the outer layer needs more tiles. Trimming a 13x13 porch to 12x12 frees up 25 tiles, which is enough to make a new 5x5 porch!

I call this "shaving the square". Trimming 1 unit from the outside of a large square has more "shavings" which can contribute to a smaller one (trimming an inch from a giant fro can make a sweater for an infant). As we continue to trim, the benefit diminishes because our starting point is smaller and smaller.
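A few lines of Python show how the shavings shrink as the square does. The function name `shavings` and the sample sizes are just for illustration:

```python
def shavings(side):
    """Tiles freed by trimming a side x side square down by one unit."""
    return side ** 2 - (side - 1) ** 2      # simplifies to 2 * side - 1

for side in (13, 12, 6, 2):
    print(side, "->", side - 1, "frees", shavings(side), "tiles")
```

Trimming 13 to 12 frees 25 tiles (enough for the 5x5 porch), while trimming 2 to 1 frees only 3.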

Explanation 2: Sliding the Chopstick

A second insight: Slide a little, pivot a lot.

Imagine a chopstick wedged in a corner: the length is fixed, and the ends of the chopstick must touch a wall. What're the options?

Well, laying on a single wall means 100% for one side (like saying $13^2 + 0^2 = 13^2$). Not that interesting.

By sliding the chopstick (from 13 to 12) we can swing it out by 5 on the other wall!

You need to try it -- a small slide gives a giant pivot. As we keep sliding, the tradeoff (How much pivot do we get?) changes.

So What's the Tradeoff?

Time to see how the a/b tradeoff works. First, let's use grid coordinates: x & y (horizontal and vertical). Given a fixed distance (13 units, let's say), our options lie on the circle where $x^2 + y^2 = 13^2$:

A few points:

Each possibility is the same distance, but has a different ratio of x to y (100% x, 100% y, or a mix like (12,5))
We can only move to neighboring points on the circle (options at the same distance)
The tradeoff we face is how much "x" we get for "y" when moving to a neighbor. If we're at (0, 13) we could move to (5, 12). This trades 1 y for 5 x's.

This is the "chunky" tradeoff where we're using an entire unit at a time. What about .5 units? .01?

Enter the tangent! The tangent line shows the trajectory of our current path, the direction to our neighbor. We follow the tangent for a tiny, microscopic amount to get our next neighbor. The tangent is an approximation -- it's not pointing exactly at our nearest neighbor, but it's pretty close.

The tangent shows the tradeoff you are about to make.

What's the actual amount? Any point (x,y) has a slope of y/x, and a tangent line with slope -x/y, so the tradeoff is...getting confused yet?

Less mindless algebra, more intuition:

Circles have a tangent line perpendicular to the radius at the current point
If you're at (5,12) then tangent slope is some ratio of 5 and 12
Remember "shaving the square": you get a better deal in the direction of the smaller coordinate (increasing a large square is tough).
So, at (5, 12) you're "heavy on the y" and the trade will favor improving your x: it should be "trade 5 y's for 12 x's". And why not the other way? It doesn't make sense that the more y you have, the easier it is to get y! That'd spiral off into exponential growth, not a circle.
Lastly, we can't trade an entire chunk of 5 y's! The tangent is about our nearest neighbor. We have a trade of 12/5 or 2.4 to 1. Our next, tiny movement will be at this ratio (and then we'll be at a new point, with a new tangent).

General principle: Our neighbors are on a circle, which encourages balance. You get a better deal in the direction of the smaller coordinate: at (x,y) the tradeoff is y:x.
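Here is a small sketch comparing the tangent's promised trade with what the circle actually delivers. The function names are assumptions, and the tangent version needs a nonzero x.

```python
import math

def tangent_gain(x, y, dy):
    """x gained for giving up dy of y, following the tangent at (x, y).
    The tangent slope is -x/y, so the trade ratio is y:x. Requires x != 0."""
    return dy * y / x

def exact_gain(x, y, dy):
    """x gained for giving up dy of y while staying exactly on the circle."""
    radius_squared = x * x + y * y
    return math.sqrt(radius_squared - (y - dy) ** 2) - x

print(exact_gain(0, 13, 1))          # the "chunky" trade: 1 y buys 5 x
print(tangent_gain(5, 12, 0.01))     # tiny trade at (5, 12): a 2.4 to 1 ratio
print(exact_gain(5, 12, 0.01))       # the circle pays slightly less than the tangent says
```

The smaller the step, the closer the two numbers get, which is the sense in which the tangent is "pretty close" to our nearest neighbor.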

Optimizing The Tradeoff

Now we know the tradeoff for any point (x,y) -- let's optimize!

In a boring scenario, we get paid based on pure distance, so every point (or direction to move) is the same.

The exciting scenario: our (x,y) position is an input into some other function which gives us a return! Now we want to maximize that function.

Here's a scenario: Popeye throws cars for cash. He lines up spectators on fences running North and East. The spectators must look straight ahead (they're in neck braces, due to earlier events) but will pay Popeye if they see a car pass in front of them.

Maximizing Even Payouts

Suppose each spectator offers \$1 if they see the car (Payout (x,y) = x + y). Where to throw?

First, assume Popeye has finite energy -- he can throw the car 13 meters. Now let's start somewhere: throwing the car pure North (0, 13):

Ok. What if he threw it slightly East? To (5, 12) let's say?

Clearly better. This should make sense: at (0,13) the tradeoff is great to get more East. We can give up 1 North and get a whopping 5 East, a "profit" of \$4 if we do the trade. We should keep trading as long as it's profitable -- as long as we're out of balance, the circle will reward us for boosting the smaller side. Following a 45 degree angle for 13 units is the ideal:

Neat. A 45-degree throw hits 70.7% of the possible spectators for each side.

Psst. Confused about how a 45-degree throw passes by 70.7% of the spectators on each side? No problem.

A 45-degree throw is along the diagonal of a square. A triangle with sides 1 and 1 has a hypotenuse of:

$$\sqrt{1^2 + 1^2} = \sqrt{2} \approx 1.414$$

And has sides $(1, 1, 1.414)$.

A hypotenuse of $\sqrt{2}$ isn't convenient: it's hard to know what fraction a side is of the whole. We divide the triangle by the length of the hypotenuse ($\sqrt{2}$), making the hypotenuse 1 and the other sides a percentage:

Now we've discovered that a 45-degree throw, with sides $(1, 1, \sqrt{2})$, has the ratio $.707, .707, 1$. 70.7% of the distance along the hypotenuse shows up on each side.

General Technique: Finding the Best Direction

We stumbled upon the way to find the best return:

Pick any starting point / direction
Tweak it: if our return improves, keep the new choice (it's profitable)
Keep tweaking until our return is no longer profitable

In math slang, this is "finding the local maximum". In economics slang, it's finding the point of "zero marginal returns". Popeye calls it Squeezing the Spinach.
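The three steps can be sketched as a tiny hill-climb over the throwing angle. This is only one possible way to do the tweaking, under my own assumptions: the function name `squeeze_the_spinach`, the fixed step size, and the choice to start pure North (90 degrees).

```python
import math

def squeeze_the_spinach(payout, distance=13.0, angle=math.pi / 2, step=0.001):
    """Start at `angle` (radians, measured from East) and keep nudging it
    while the payout improves. Returns the angle where tweaks stop paying."""
    def value(t):
        return payout(distance * math.cos(t), distance * math.sin(t))

    while True:
        current = value(angle)
        if value(angle - step) > current:
            angle -= step            # tilting toward East is profitable
        elif value(angle + step) > current:
            angle += step            # tilting toward North is profitable
        else:
            return angle             # neither tweak helps: we're done

def even_payout(x, y):
    return x + y                     # every spectator pays $1

best = squeeze_the_spinach(even_payout)
print(round(math.degrees(best), 1))  # close to 45 degrees
```

A fixed step like this only finds the answer to within one step (about 0.06 degrees here), and it can only find a local maximum, which is all the technique promises.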

Maximizing Uneven Returns

Now suppose the Northern spectators offer \$2 (Eastern stay at \$1), so P(x,y) = x + 2*y. Should we throw it 100% North?

Not bad. But what about 45 degrees again?

Interesting -- 45 degrees is still better! But... I think we went too far! Shouldn't we favor North since it pays more?

Yep. Let's remember how to Squeeze the Spinach (maximize our returns): start with North and change until it's not profitable:

The payout function means 1 North = 2 Easts (North pays \$2, so 1 unit North = 2 units East)
Trades are profitable if we can beat 1 North for 2 Easts (1 North for 3 Easts, for example, would profit \$1)

So... where are trades better than 1 North for 2 Easts? In the Northern section, where the circle rewards us by throwing Easts at us ("Please, please go East... I'll give you a bunch if you give up a little North").

Remember how circles are about x/y, x & y, x:y, etc.? Well, we have the numbers 1 and 2. (2,1) is in the East section. We want (1,2). Why? At (1,2) we have reached the perfect 1 North = 2 East tradeoff.

Following the direction (1,2) for 13 units is:

Tada! Over 29 smackeroos because we maximized our return.
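To double-check the three throws, here is a short sketch. The helper names `throw` and `payout` are mine; the payout rule and the 13-unit distance come from the scenario above.

```python
import math

def throw(direction, distance=13.0):
    """Landing point after throwing `distance` units along `direction` (East, North)."""
    length = math.hypot(*direction)
    return tuple(distance * d / length for d in direction)

def payout(x, y):
    return x + 2 * y                 # East pays $1, North pays $2

north = payout(*throw((0, 1)))       # 100% North
diagonal = payout(*throw((1, 1)))    # 45 degrees
best = payout(*throw((1, 2)))        # favor North 2:1

print(round(north, 2), round(diagonal, 2), round(best, 2))
```

The (1, 2) throw lands at roughly (5.81, 11.63) and pays $13\sqrt{5} \approx 29.07$.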

The Gradient Principle

We can supercharge this result:

To maximize return, go in each direction proportional to its payoff.

If North pays 2:1 compared to East, your trajectory should favor North by 2:1. In mathier terms:

Payoff(x,y) = ax + by
Best trajectory = (a, b) [in our case, (East, North) => (1, 2)]

And this works in multiple dimensions! Given 3 dimensions, go in a direction (Payoff(x), Payoff(y), Payoff(z)). Vector calculus fans, this is why the gradient is in the direction of greatest increase.

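The gradient for $F(x,y,z)$ is

$$\nabla F = \left( \frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z} \right)$$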

And each partial derivative (dF/dx) is the payoff for moving in that direction.

But does it all balance? Suppose x pays 3, y pays 4, and z pays 5 (at the current position). The 2-dimensional tradeoff trajectories are:

Now for the magic: the combined trajectory

satisfies all 3 requirements! On the x-z plane, x doesn't care about y -- as long as the ratio to z is (3 , ?, 5) you're getting the best tradeoff from the x-z perspective. The pairs are:

(3, ?, 5)
(?, 4, 5)
(3, 4, ?)

You don't need a sudoku master to see (3, 4, 5) satisfies all those proportions.

Still not convinced? Imagine the payoff for y was zero. We don't want to waste energy in our trajectory (3, ?, 5) in a useless direction. But that can't happen, because the y-z tradeoff will be (?, 0, 5) and the x-y tradeoff will be (3, 0, ?). The x-z tradeoff lets y-z and x-y "figure out" what y should be, which is 0.
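A numeric sanity check, as a sketch: I assume a linear payoff $3x + 4y + 5z$ to match the example, estimate the gradient with central differences, and compare the return from one unit of travel in a few directions. All names here are hypothetical.

```python
import math

def payoff(x, y, z):
    return 3 * x + 4 * y + 5 * z

def gradient(f, point, h=1e-6):
    """Estimate (dF/dx, dF/dy, dF/dz) at `point` with central differences."""
    grad = []
    for axis in range(3):
        hi = list(point)
        lo = list(point)
        hi[axis] += h
        lo[axis] -= h
        grad.append((f(*hi) - f(*lo)) / (2 * h))
    return tuple(grad)

def gain_per_unit(direction):
    """Payoff earned by moving exactly 1 unit of distance along `direction`."""
    length = math.sqrt(sum(d * d for d in direction))
    return payoff(*(d / length for d in direction))

print(gradient(payoff, (1.0, 2.0, 3.0)))      # roughly (3, 4, 5)
for direction in [(0, 0, 1), (1, 1, 1), (3, 4, 5)]:
    print(direction, round(gain_per_unit(direction), 3))
```

Going all-in on z (the best single payoff) earns 5 per unit, the even split earns about 6.93, and the proportional direction (3, 4, 5) earns about 7.07.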

Questions I Had That You Might Have Too

Q: I still don't get why this works at all. Somehow 50% in x and 50% in y leads to .7 + .7 = 1.4?

It's a deep question about why space behaves like this. I was going crazy staring at chopsticks on a wall.

Here's my answer: distance is distance. 13 units is 13 units. But in some situations we are "measuring our coordinates" (what are the values of x & y) and not the distance itself.

Cartesian coordinates (x-axis, y-axis) are very inefficient for diagonal motion (i.e., you are measuring the sides of the triangle, not the hypotenuse). When $.707^2 + .707^2 = 1$, it's a measure of how "inefficient" our x & y coordinates are being. We used 70% of each coordinate to represent an object that could have been 100% on one (i.e., if we used polar coordinates).

Q: I have an offshore investment with 200% return, and an onshore one with 5% return. I have \$1000 to spend -- should I split my money?

Heavens, no! Remember, this principle is about distance measurements on a grid with the idea that 50% in x and 50% in y covers "more ground" than 100% in x. In investing 1) money is not on a grid and 2) there's no distance bonus. Putting half your money in each is plain old 0.5 + 0.5 = 1.0. Giving up \$1 of the offshore investment gives you \$1 for the onshore one.

Put all your money in the best investment.

Q: So all this stuff is useless?

Heavens, no! Ask yourself: am I measuring distance on a coordinate system?

Many things are measured in terms of x-y coordinates (physical phenomena, etc.) and do have the Pythagorean distance tradeoff.

But not every graph is the same. Graphs that aren't about distance (like "Money vs. Time") do not get any boost from the Pythagorean theorem. This confused me for a long time: the Pythagorean Theorem works for coordinate distance!

Final Thoughts

The Pythagorean Theorem is so versatile -- it's not about triangles, it covers the nature of distance. I seem to find some new realization when I study it. Really grokking it will help you everywhere, from geometry to vector calculus.

Happy math.

Vector Calculus: Understanding Circulation and Curl

kalid — Mon, 19 Feb 2007 23:02:22 +0000

Circulation is the amount of force that pushes along a closed boundary or path. It's the total "push" you get when going along a path, such as a circle.

A vector field is usually the source of the circulation. If you had a paper boat in a whirlpool, the circulation would be the amount of force that pushed it along as it went in a circle. The more circulation, the more pushing force you have.

Curl is simply the circulation per unit area, circulation density, or rate of rotation (amount of twisting at a single point). Imagine shrinking your whirlpool down smaller and smaller while keeping the force the same: you'll have a lot of power in a small area, so will have a large curl. If you widen the whirlpool while keeping the force the same as before, then you'll have a smaller curl. And of course, zero circulation means zero curl.

Intuition

Circulation is the amount of "pushing" force along a path. Curl is the amount of pushing, twisting, or turning force when you shrink the path down to a single point. Let's use water as an example.

Suppose we have a flow of water and we want to determine if it has curl or not: is there any twisting or pushing force? To test this, we put a paddle wheel into the water and notice if it turns (the paddle is vertical, sticking out of the water like a revolving door -- not like a paddlewheel boat):

If the paddle does turn, it means this field has curl at that point. If it doesn't turn, then there's no curl.

What does it really mean if the paddle turns? Well, it means the water is pushing harder on one side than the other, making it twist. The larger the difference, the more forceful the twist and the bigger the curl. Also, a turning paddle wheel indicates that the field is "uneven" and not symmetric; if the field were even, then it would push on all sides equally and the paddle wouldn't turn at all.

The fact that there is a "twist" means the field is not conservative (this has nothing to do with its political views).

A conservative field is "fair" in the sense that work needed to move from point A to point B, along any path, is the same. For example, consider a river: its field is conservative. Sure, you can get a free ride downstream, but then you have to do work to get back to your starting point. Or, you can do work to move upstream, and get a free ride back. Either way, the amount of work you "put in" is the same as what you get back.

However, in a field with curl (like a whirlpool), you can get a free ride by moving in the direction of the twist. In a whirlpool, you can get a free trip by moving with the current in a circle. If you fight the current and go the wrong way, you have to use energy with no free ride at all.

Conservative fields have zero curl: there are no free twists to push you along. Alternatively, if a field has curl, it is not conservative.

Gravity is another example of a conservative field. Technically, if you lift a rock and then let it fall, the energy you get from falling is the same as what you put in to lift the rock. Theoretically speaking, no energy was gained or lost in this transaction.

Additional Details

To be technical, curl is a vector, which means it has both a magnitude and a direction. The magnitude is simply the amount of twisting force at a point.

The direction is a little more tricky: it's the orientation of the axis of your paddlewheel in order to get maximum rotation. In other words, it is the direction which will give you the most "free work" from the field. Imagine putting your paddlewheel sideways in the whirlpool - it wouldn't turn at all. If you put it in the proper direction, it begins turning.

But wait a minute -- aren't there two directions to get a twisting motion? Couldn't you just turn the paddlewheel "upside down" and get the maximum curl as well?

Yep, you're right. By convention alone, if the paddle wheel is rotating counterclockwise, its curl vector points out of the page. This is a type of right-hand rule: make a fist with your right hand and stick out your thumb. If the circulation/pushing force follows the twisting of your fingers (counterclockwise), then the curl vector will be in the direction of your thumb.

Mathematics

Circulation is the integral of a vector field along a path - you are adding how much the field "pushes" you along a path.

How do we find this? Well, we should expect some type of dot product, because we want to know the amount that one vector (the force) is pushing in the direction of another (the path). So, the two vectors we need are (1) the path vector and (2) the field vector at every point along the path.

If we have a function that defines the position at any time, $\vec{r}(t)$, we can take the time derivative to get the velocity at that position.

The velocity vector is always in the direction of movement -- if you are moving from A to B, the velocity vector will be an arrow from A to B, i.e. your change in position or your direction of movement. So, we can use the velocity to get our direction.

It's important to understand why we aren't using the position vector itself -- it tells us where we are, but not where we're going. We need to know our direction to see how much "push" we are getting: Knowing your position in a river isn't important -- are you going upstream or downstream, and at what angle?

The force vector (2) is defined by the field we are in. No derivatives or other changes are necessary -- every point in the field has some force acting on it.

So, our formula for circulation is:

$$\text{circulation} = \oint \vec{F} \cdot d\vec{r}$$

Remember, velocity is simply the derivative of position (r), so (dr) is a vector giving us our direction. We integrate along the entire path and use the dot product to see how much pushing force is applied. We then sum up these "pushes" to get the total circulation.
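Here is a rough numeric version of "sum up the pushes", as a sketch under assumptions: the path is a counterclockwise circle chopped into short straight steps, and the two sample fields (a whirlpool $(-y, x)$ and a uniform river $(1, 0)$) are my own examples.

```python
import math

def circulation(field, radius=1.0, steps=1000):
    """Approximate the circulation of a 2d field around a circle centered
    at the origin, traveling counterclockwise."""
    total = 0.0
    for n in range(steps):
        t0 = 2 * math.pi * n / steps
        t1 = 2 * math.pi * (n + 1) / steps
        mid = (t0 + t1) / 2
        x, y = radius * math.cos(mid), radius * math.sin(mid)   # where we are
        dx = radius * (math.cos(t1) - math.cos(t0))             # where we're going
        dy = radius * (math.sin(t1) - math.sin(t0))
        fx, fy = field(x, y)
        total += fx * dx + fy * dy                              # dot product: the push
    return total

def whirlpool(x, y):
    return (-y, x)

def river(x, y):
    return (1.0, 0.0)

print(circulation(whirlpool))   # about 2*pi: the field pushes us the whole way around
print(circulation(river))       # about 0: the free ride downstream is paid back upstream
```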

Since curl is the circulation per unit area, we can take the circulation for a small area (letting the area shrink to 0). However, since curl is a vector, we need to give it a direction -- the direction is normal (perpendicular) to the surface with the vector field. The magnitude is the same as before: circulation/area.

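Recall that by convention (a bunch of people agreeing), counterclockwise circulation will give a curl pointing out of the page. Using these facts, we can create the formula for curl:

$$\text{curl} = \frac{\text{circulation}}{\text{area}} \, \vec{n} = \frac{\oint \vec{F} \cdot d\vec{r}}{S} \, \vec{n}$$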

Where (S) is the surface we are considering; the direction of the curl is the normal to the surface.

You'll see fancier equations for curl where the surface shrinks to zero (such as in wikipedia), but recognize the basic intuition -- curl is the circulation per unit area.
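To watch the "shrink the path" step happen, here is a sketch that divides circulation by area for smaller and smaller circles. The field $(0, x^3)$ is a hypothetical example picked because its twist varies from place to place; the function names and step count are also assumptions.

```python
import math

def curl_estimate(field, center, radius, steps=2000):
    """Circulation around a small counterclockwise circle, divided by its area."""
    cx, cy = center
    total = 0.0
    for n in range(steps):
        t0 = 2 * math.pi * n / steps
        t1 = 2 * math.pi * (n + 1) / steps
        mid = (t0 + t1) / 2
        x = cx + radius * math.cos(mid)
        y = cy + radius * math.sin(mid)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        fx, fy = field(x, y)
        total += fx * dx + fy * dy
    return total / (math.pi * radius ** 2)

def uneven_field(x, y):
    return (0.0, x ** 3)     # pushes "up" harder the further right you go

for radius in (1.0, 0.1, 0.01):
    print(radius, round(curl_estimate(uneven_field, (1.0, 0.0), radius), 4))
```

The estimates settle toward 3 as the circle shrinks around the point (1, 0). The value is positive because the net push is counterclockwise, so by the convention above the curl vector points out of the page.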

Parting Thoughts

You'll often see curl of a field (F) written like this:

$$\text{curl}(\vec{F}) = \nabla \times \vec{F}$$

which is a cross product of the del operator ($\nabla$, the operator behind the gradient) and the field (F). This has to do with how curl is actually computed, which will be material for another article (and probably in your textbook already -- see Wikipedia for details).

If I have been successful, you should understand intuitively what circulation and curl mean, and how we got the formulae above. They spring up naturally from our definition of circulation as "pushing force along a path" and curl as "pushing force/area".

Math should be a tool for clearly stating what we already know. Understand the intuition and then tackle the complicated formulas. Happy math.

PS. Have some fun and check out this video of a famous whirlpool. Imagine the circulation on this (go on, imagine):