<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>BetterExplained &#187; Vector Calculus</title>
	<atom:link href="http://betterexplained.com/articles/category/math/vector-calculus/feed/" rel="self" type="application/rss+xml" />
	<link>http://betterexplained.com</link>
	<description>Learn Right, Not Rote.</description>
	<lastBuildDate>Wed, 16 May 2012 00:45:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Vector Calculus: Understanding the Dot Product</title>
		<link>http://betterexplained.com/articles/vector-calculus-understanding-the-dot-product/</link>
		<comments>http://betterexplained.com/articles/vector-calculus-understanding-the-dot-product/#comments</comments>
		<pubDate>Mon, 27 Feb 2012 15:00:31 +0000</pubDate>
		<dc:creator>kalid</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Vector Calculus]]></category>

		<guid isPermaLink="false">http://betterexplained.com/?p=1831</guid>
		<description><![CDATA[I see the dot product as directional multiplication. But multiplication goes beyond <a href="http://betterexplained.com/articles/rethinking-arithmetic-a-visual-guide/">repeated counting</a>: it&#8217;s applying the essence of one item to another.

Normal multiplication combines growth rates: &#8220;3 x 4&#8243; can mean &#8220;Take your 3x growth and make]]></description>
			<content:encoded><![CDATA[<p>I see the dot product as directional multiplication. But multiplication goes beyond <a href="http://betterexplained.com/articles/rethinking-arithmetic-a-visual-guide/">repeated counting</a>: it&#8217;s applying the essence of one item to another.</p>

<p>Normal multiplication combines growth rates: &#8220;3 x 4&#8221; can mean &#8220;Take your 3x growth and make it 4x larger (i.e., 12x)&#8221;. <a href="http://betterexplained.com/articles/understanding-why-complex-multiplication-works/">Complex multiplication</a> lets us combine rotations. <a href="http://betterexplained.com/articles/a-calculus-analogy-integrals-as-multiplication/">Integrals</a> let us do piece-by-piece multiplication.</p>

<p>A vector is &#8220;growth in a direction&#8221;. The dot product lets us apply the directional growth of one vector to another: the result is how much we went along the original path (positive progress, negative, or zero).</p>

<p>Today let&#8217;s build our intuition for how the dot product works.</p>

<h2>Getting the Formula Out of the Way</h2>

<p>You&#8217;ve seen the dot product equation everywhere:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/7093a2c642c17b4be8c2765fdc3d223f.png' title='\displaystyle{\vec{a} \cdot \vec{b} = a_x \cdot b_x + a_y \cdot b_y = |\vec{a}||\vec{b}|\cos(\theta) }' alt='\displaystyle{\vec{a} \cdot \vec{b} = a_x \cdot b_x + a_y \cdot b_y = |\vec{a}||\vec{b}|\cos(\theta) }' align=absmiddle class='tex'></p>

<p>And also the justification: &#8220;Well Billy, the Law of Cosines (you remember that, don&#8217;t you?) says the following calculations are the same, so they are.&#8221; Not good enough &#8212; it doesn&#8217;t click! Beyond the computation, what does it mean?</p>

<p>The goal is to apply one vector to another. Each computation examines this from a rectangular perspective (x- and y-coordinates) or a polar one (magnitudes and angles). The &#8220;blah = foo&#8221; equation above really means &#8220;Here&#8217;s two equivalent ways to &#8216;directionally multiply&#8217; vectors&#8221;.</p>

<p>(Similarly, we can show that <a href="http://betterexplained.com/articles/intuitive-understanding-of-eulers-formula/">Euler&#8217;s formula</a> (e^ix = cos(x) + i*sin(x)) is true because the Taylor series is the same on both sides. Accurate but unsatisfying! Instead, see how both sides can describe the same motion.)</p>
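
<p>If you want to poke at the two computations yourself, here is a minimal Python sketch. The function names and the example vectors are made up for illustration; the only claim is that the rectangular and polar recipes land on the same number.</p>

```python
import math

def dot_rectangular(a, b):
    # Component-by-component: a_x*b_x + a_y*b_y
    return a[0] * b[0] + a[1] * b[1]

def dot_polar(a, b):
    # Magnitudes and the angle between the vectors: |a||b|cos(theta)
    mag_a = math.hypot(a[0], a[1])
    mag_b = math.hypot(b[0], b[1])
    theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])
    return mag_a * mag_b * math.cos(theta)

a = (3.0, 4.0)   # hypothetical example vectors
b = (2.0, 1.0)
print(dot_rectangular(a, b))
print(dot_polar(a, b))   # same value, up to floating-point rounding
```

<p>The rectangular version never computes an angle and the polar version never multiplies components, yet they describe the same &#8220;directional multiplication&#8221;.</p>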

<h2>Seeing Numbers as Vectors</h2>

<p>Let&#8217;s start simple, and see 3 x 4 as a dot product:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/69d0b6157d3c64105256895647620f90.png' title='\displaystyle{(3, 0) \cdot (4,0)}' alt='\displaystyle{(3, 0) \cdot (4,0)}' align=absmiddle class='tex'></p>

<p>The number 3 is &#8220;directional growth&#8221; in a single dimension (x-axis, let&#8217;s say), and 4 is &#8220;directional growth&#8221; in that same direction. 3 x 4 = 12 means 12x growth in that single dimension. Ok.</p>

<p>Now, suppose each number refers to a different dimension? Suppose 3 means &#8220;triple your bananas&#8221; (sigh&#8230; or &#8220;x-axis&#8221;) and 4 means &#8220;quadruple your oranges&#8221; (y-axis). They&#8217;re not the same &#8220;type&#8221; of number: what happens when we apply growth (take the dot product) in the (bananas, oranges) universe?</p>

<ul>
<li>(3,0) is &#8220;Triple your bananas, destroy your oranges&#8221;</li>
<li>(0,4) is &#8220;Destroy your bananas, quadruple your oranges&#8221;</li>
</ul>

<p>Applying (0,4) to (3,0) means &#8220;Destroy your banana growth, quadruple your orange growth&#8221;. But (3, 0) had no orange growth to begin with, so the net result is 0 (&#8220;Destroy all your fruit, buddy&#8221;).</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/d3c7038a36ee1fffede207579a40d786.png' title='\displaystyle{(3, 0) \cdot (0, 4) = 0}' alt='\displaystyle{(3, 0) \cdot (0, 4) = 0}' align=absmiddle class='tex'></p>

<p>See how we&#8217;re &#8220;applying&#8221; and not adding. With addition, we sort of smush the items together: (3,0) + (0, 4) = (3, 4) [a vector which triples your bananas <em>and</em> quadruples your oranges].</p>

<p>&#8220;Application&#8221; is different. We&#8217;re mutating the original vector according to the rules in the second. And the rules are &#8220;Destroy your banana growth <em>rate</em>, and quadruple your orange growth <em>rate</em>&#8221;. And, sadly, this leaves us with nothing.</p>

<p>The final result of this process can be:</p>

<ul>
<li>zero: we don&#8217;t have any growth in the original direction</li>
<li>positive number: we have some growth in the original direction</li>
<li>negative number: we have negative (reverse) growth in the original direction</li>
</ul>
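
<p>The three outcomes above can be sketched in a few lines of Python. The helper name <code>dot</code> and the third, backwards-pointing vector are my own additions for illustration:</p>

```python
def dot(a, b):
    # Apply one vector to another, component by component
    return a[0] * b[0] + a[1] * b[1]

same_direction = dot((3, 0), (4, 0))    # both grow bananas
perpendicular = dot((3, 0), (0, 4))     # bananas vs. oranges
opposite = dot((3, 0), (-4, 0))         # hypothetical reverse growth

print(same_direction, perpendicular, opposite)
```

<p>The sign of the result tells you which of the three cases you are in; the size tells you how much growth survived.</p>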

<h2>Understanding the Calculation</h2>

<p>&#8220;Applying vectors&#8221; is still a bit abstract. I think of it as &#8220;How much energy/push is one vector giving to the other?&#8221;. Here&#8217;s how I visualize it:</p>

<p><strong>Rectangular Coordinates: Component-by-component overlap</strong></p>

<p>Like multiplying complex numbers, see how each x- and y-component interacts:</p>

<p><img src="http://betterexplained.com/wp-content/uploads/dotproduct/dot_product_components.png" alt="Dot Product Components" /></p>

<p>We list out all four combinations (x-x, y-x, x-y, y-y). Since the x- and y-coordinates don&#8217;t affect each other (like holding a bucket sideways under a waterfall &#8212; nothing falls in), the total energy absorption is absorption(x) + absorption(y):</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/fba1cc0231e9178d6a8fc9db46f5fad5.png' title='\displaystyle{\vec{a} \cdot \vec{b} = a_x \cdot b_x + a_y \cdot b_y}' alt='\displaystyle{\vec{a} \cdot \vec{b} = a_x \cdot b_x + a_y \cdot b_y}' align=absmiddle class='tex'></p>

<p><strong>Polar coordinates: Projection</strong></p>

<p>The word &#8220;projection&#8221; is so sterile: I prefer &#8220;along the path&#8221;. How much energy is actually going in our original direction?</p>

<p>Here&#8217;s one way to see it: </p>

<p><img src="http://betterexplained.com/wp-content/uploads/dotproduct/dot_product_rotation.png" alt="Dot Product Rotation" /></p>

<p>Take two vectors, a and b. Rotate our coordinates so b is horizontal: it becomes (|b|, 0), and everything is on this new x-axis. What&#8217;s the dot product now? (It shouldn&#8217;t change just because we tilted our head).</p>

<p>Well, vector a has new coordinates (a1, a2), and we get:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/54426ff2bcbd0c0c19cac25e8eb8b5a1.png' title='\displaystyle{a1 \cdot |\vec{b}| + a2 \cdot 0 = a1 \cdot |\vec{b}|}' alt='\displaystyle{a1 \cdot |\vec{b}| + a2 \cdot 0 = a1 \cdot |\vec{b}|}' align=absmiddle class='tex'></p>

<p>a1 is really &#8220;What is the x-coordinate of a, assuming b is the x-axis?&#8221;. That is |a|cos(&#952;), aka the &#8220;projection&#8221;:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/589231b1e74b52bba5238e289552840c.png' title='\displaystyle{\vec{a} \cdot \vec{b} = |\vec{a}|\cos(\theta)|\vec{b}|}' alt='\displaystyle{\vec{a} \cdot \vec{b} = |\vec{a}|\cos(\theta)|\vec{b}|}' align=absmiddle class='tex'></p>
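
<p>To check the head-tilting argument numerically, here is a rough Python sketch. The <code>rotate</code> helper and the sample vectors are assumptions for illustration: we rotate both vectors so b lies on the x-axis, then compare a1 * |b| with the original component-by-component result.</p>

```python
import math

def rotate(v, angle):
    # Rotate vector v counterclockwise by angle (radians)
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

a = (3.0, 4.0)   # hypothetical example vectors
b = (2.0, 1.0)

# Tilt our head: rotate everything by minus the angle of b, so b is horizontal
tilt = -math.atan2(b[1], b[0])
a_new = rotate(a, tilt)   # (a1, a2)
b_new = rotate(b, tilt)   # approximately (|b|, 0)

original = a[0] * b[0] + a[1] * b[1]
rotated = a_new[0] * math.hypot(b[0], b[1])   # a1 * |b|
print(original, rotated)
```

<p>Both numbers agree (up to rounding), which is the whole point: the dot product doesn&#8217;t change just because we tilted our head.</p>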

<h2>Analogies for the Dot Product</h2>

<p>The common interpretation is &#8220;geometric projection&#8221;, but it&#8217;s so sterile. Here are some analogies that click for me:</p>

<p><strong>Energy Absorption</strong></p>

<p>One vector is the solar rays, the other is where the solar panel is pointing (yes, yes, the normal vector). Larger numbers mean stronger rays or a larger panel. How much energy is absorbed?</p>

<ul>
<li>Energy =  Overlap in direction * Strength of rays * Size of panel</li>
<li><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/4ee8bf2bcf46916dd0d58c8223707f7a.png' title='\displaystyle{Energy = \cos(\theta) \cdot |a| \cdot |b|}' alt='\displaystyle{Energy = \cos(\theta) \cdot |a| \cdot |b|}' align=absmiddle class='tex'></li>
</ul>

<p>If you hold your panel sideways to the sun, no rays hit (cos(&#952;) = 0).</p>
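
<p>A quick sketch of the panel analogy in Python, with made-up numbers for the ray strength and panel size (and, in the spirit of the analogy, no fussing about signs):</p>

```python
import math

def energy(ray_strength, panel_size, angle_degrees):
    # Energy = overlap in direction * strength of rays * size of panel
    overlap = math.cos(math.radians(angle_degrees))
    return overlap * ray_strength * panel_size

for angle in (0, 60, 90):
    # Rounded only to hide floating-point dust near zero
    print(angle, round(energy(5.0, 2.0, angle), 6))
```

<p>Facing the sun head-on gets the full product of the two sizes, a 60-degree tilt gets half of it, and a sideways panel gets nothing.</p>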

<p><img src="http://betterexplained.com/wp-content/uploads/dotproduct/Solar_Panel.png" alt="Solar Panel Dot Product" />
<a href="http://www.flickr.com/photos/knowmybackyard/2394376192/">Photo credit</a></p>

<p>But&#8230; but&#8230; solar rays are leaving the sun, and the panel is facing the sun, and the dot product is negative when vectors are opposed! Take a deep breath, and remember the goal is to embrace the analogy (besides, physicists lose track of negative signs all the time).</p>

<p><strong>Mario-Kart Speed Boost</strong></p>

<p>In Mario Kart, there are &#8220;boost pads&#8221; on the ground that increase your speed (Never played? I&#8217;m sorry.)</p>

<p><img src="http://betterexplained.com/wp-content/uploads/dotproduct/mario_kart_vector.png" alt="Mario Kart Dot Product" />
<a href="http://www.mariokartwii.com/f72/official-mario-kart-wii-model-hacking-new-39114-409.html">Photo source</a></p>

<p>Imagine the red vector is your speed (x and y direction), and the blue vector is the orientation of the boost pad (x and y direction). Larger numbers are more power.</p>

<p>How much boost will you get? For the analogy, imagine the pad multiplies your speed:</p>

<ul>
<li>If you come in going 0, you&#8217;ll get nothing</li>
<li>If you cross the pad perpendicularly, you&#8217;ll get 0 [just like the banana obliteration, it will give you 0x boost in the perpendicular direction]</li>
</ul>

<p>But, if we have some overlap, our x-speed will get an x-boost, and our y-speed gets a y-boost:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/98dd3e634c7022901a8b70f689859752.png' title='\displaystyle{Total = speed_x \cdot boost_x + speed_y \cdot boost_y}' alt='\displaystyle{Total = speed_x \cdot boost_x + speed_y \cdot boost_y}' align=absmiddle class='tex'></p>

<p>Neat, eh? Another way to see it: your incoming speed is |a|, and the max boost is |b|. The amount of boost you actually get (for being lined up with it) is cos(&#952;), for the total |a||b|cos(&#952;).</p>

<p><strong>Physics Physics Physics</strong></p>

<p>The dot product appears all over physics: some field (electric, gravitational) is pulling on some particle. We&#8217;d love to multiply, and we could if everything were lined up. But that&#8217;s never the case, so we take the dot product to account for potential differences in direction.</p>

<p>It&#8217;s all a useful generalization: Integrals are &#8220;multiplication, taking changes into account&#8221; and the dot product is &#8220;multiplication, taking direction into account&#8221;.</p>

<p>And what if your direction is changing? Why, take the <a href="http://en.wikipedia.org/wiki/Line_integral">integral of the dot product</a>, of course!</p>
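
<p>Here is one way that &#8220;integral of the dot product&#8221; might look in code: chop the path into tiny straight steps, take the dot product of the field with each step, and add them up. The constant field and the quarter-circle path are hypothetical choices to keep the sketch small.</p>

```python
import math

def field(x, y):
    # Hypothetical constant field pushing in the +x direction
    return (1.0, 0.0)

def line_integral(path, steps=1000):
    # Sum of (field . small step) along the path, with t running from 0 to 1
    total = 0.0
    for i in range(steps):
        x0, y0 = path(i / steps)
        x1, y1 = path((i + 1) / steps)
        fx, fy = field((x0 + x1) / 2, (y0 + y1) / 2)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

def quarter_circle(t):
    # From (1, 0) to (0, 1) along the unit circle
    angle = t * math.pi / 2
    return (math.cos(angle), math.sin(angle))

result = line_integral(quarter_circle)
print(result)   # close to -1: we moved 1 unit against the field overall
```

<p>Each step points in a slightly different direction, so each gets its own dot product. The path ends up 1 unit &#8220;upstream&#8221; in x, and the total comes out near -1.</p>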

<h2>Onward and Upward</h2>

<p>Don&#8217;t settle for &#8220;Dot product is the geometric projection, justified by the law of cosines&#8221;. Find the analogies that click for you! Happy math.</p>
]]></content:encoded>
			<wfw:commentRss>http://betterexplained.com/articles/vector-calculus-understanding-the-dot-product/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Understanding Pythagorean Distance and the Gradient</title>
		<link>http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/</link>
		<comments>http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/#comments</comments>
		<pubDate>Fri, 04 Nov 2011 16:24:48 +0000</pubDate>
		<dc:creator>kalid</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Vector Calculus]]></category>

		<guid isPermaLink="false">http://betterexplained.com/?p=1460</guid>
		<description><![CDATA[The <a href="http://betterexplained.com/articles/measure-any-distance-with-the-pythagorean-theorem/">Pythagorean Theorem</a> shows how strange our concept of distance is. Using the rule a<sup>2</sup> + b<sup>2</sup> = c<sup>2</sup>, we can trade some &#8220;a&#8221; to get more &#8220;b&#8221;.

Starting with



means &#8220;A 13-inch pizza equals a]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://betterexplained.com/articles/measure-any-distance-with-the-pythagorean-theorem/">Pythagorean Theorem</a> shows how strange our concept of distance is. Using the rule <span class="tex-inline" alt="a^2 + b^2 = c^2">a<sup>2</sup> + b<sup>2</sup> = c<sup>2</sup></span>, we can trade some &#8220;a&#8221; to get more &#8220;b&#8221;.</p>

<p>Starting with</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/4fe780744f65a3a106be7088db2d4cc4.png' title='\displaystyle{13^2 + 0^2 = 13^2}' alt='\displaystyle{13^2 + 0^2 = 13^2}' align=absmiddle class='tex'></p>

<p>means &#8220;A 13-inch pizza equals a 13-inch pizza&#8221;. Sure. But we can trade an inch and get:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/87c51dc809eaac5cee3f2bd3954af2fa.png' title='\displaystyle{12^2 + 5^2 = 13^2}' alt='\displaystyle{12^2 + 5^2 = 13^2}' align=absmiddle class='tex'></p>

<p>Huh? A 12-inch pizza and a 5-inch pizza equal a 13-inch pizza?</p>

<p>The math works (144 + 25 = 169) but, but&#8230; we gave up an inch and got a five-inch pizza!</p>

<p>Let&#8217;s understand why the tradeoff happens, and how to use it.</p>

<h2>Explanation 1: Shaving the Square</h2>

<p>A key insight: <strong>Bigger numbers are harder to square</strong>.</p>

<p><img src="http://betterexplained.com/wp-content/uploads/pythagdistance/shavingthesquare.png" alt="Shaving the Square" /></p>

<p>Imagine laying tiles on a porch &#8212; as your porch grows, the outer layer needs more tiles. Trimming a 13&#215;13 porch to 12&#215;12 frees up 25 tiles, which is enough to make a new 5&#215;5 porch!</p>

<p>I call this &#8220;shaving the square&#8221;. Trimming 1 unit from the outside of a large square has more &#8220;shavings&#8221; which can contribute to a smaller one (trimming an inch from a giant fro can make a sweater for an infant). As we continue to trim, the benefit diminishes because our starting point is smaller and smaller.</p>
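
<p>The diminishing shavings are easy to tabulate. A tiny Python sketch (the function name is mine):</p>

```python
def shavings(side):
    # Tiles freed by trimming 1 unit off a side x side square
    return side ** 2 - (side - 1) ** 2

for side in (13, 12, 5, 2):
    print(side, shavings(side))
```

<p>Trimming the 13&#215;13 square frees 25 tiles, and every later trim frees fewer: the shavings work out to 2 * side - 1, so they shrink along with the square.</p>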

<h2>Explanation 2: Sliding the Chopstick</h2>

<p>A second insight: <strong>Slide a little, pivot a lot</strong>.</p>

<p>Imagine a chopstick wedged in a corner: the length is fixed, and the ends of the chopstick must touch a wall. What&#8217;re the options?</p>

<p>Well, lying along a single wall means 100% for one side (like saying <span class="tex-inline" alt="13^2 + 0^2 = 13^2">13<sup>2</sup> + 0<sup>2</sup> = 13<sup>2</sup></span>). Not that interesting.</p>

<p>By sliding the chopstick (from 13 to 12) we can swing it <em>out</em> by 5 on the other wall!</p>

<p><img src="http://betterexplained.com/wp-content/uploads/pythagdistance/slideandpivot.png" alt="Slide and Pivot" /></p>

<p>You need to try it &#8212; a small slide gives a giant pivot. As we keep sliding, the tradeoff (How much pivot do we get?) changes.</p>

<h2>So What&#8217;s the Tradeoff?</h2>

<p>Time to see how the a/b tradeoff works. First, let&#8217;s use grid coordinates: x &#038; y (horizontal and vertical). Given a fixed distance (13 units, let&#8217;s say), our options lie on the circle where <span class="tex-inline" alt="x^2 + y^2 = 13^2">x<sup>2</sup> + y<sup>2</sup> = 13<sup>2</sup></span>:</p>

<p>A few points:</p>

<ul>
<li>Each possibility is the same distance, but has a different ratio of x to y (100% x, 100% y, or a mix like (12,5))</li>
<li>We can only move to neighboring points on the circle (options at the same distance)</li>
<li>The tradeoff we face is how much &#8220;x&#8221; we get for &#8220;y&#8221; when moving to a neighbor. If we&#8217;re at (0, 13) we could move to (5, 12). This trades 1 y for 5 x&#8217;s.</li>
</ul>

<p>This is the &#8220;chunky&#8221; tradeoff where we&#8217;re using an entire unit at a time. What about .5 units? .01?</p>

<p>Enter the tangent! The <strong>tangent line</strong> shows the trajectory of our current path, the direction to our neighbor. We follow the tangent for a tiny, microscopic amount to get our next neighbor. The tangent is an approximation &#8212; it&#8217;s not pointing exactly at our nearest neighbor, but it&#8217;s pretty close.</p>

<p><strong>The tangent shows the tradeoff you are about to make.</strong></p>

<p>What&#8217;s the actual amount? The line from the center to any point (x,y) has a slope of y/x, and the tangent line at that point has slope -x/y, so the tradeoff is&#8230;getting confused yet?</p>

<p>Less mindless algebra, more intuition:</p>

<ul>
<li>Circles have a tangent line perpendicular to the radius at the current point</li>
<li>If you&#8217;re at (5,12) then tangent slope is some ratio of 5 and 12</li>
<li>Remember &#8220;shaving the square&#8221;: you get a better deal in the direction of the smaller coordinate (increasing a large square is tough).</li>
<li>So, at (5, 12) you&#8217;re &#8220;heavy on the y&#8221; and the trade will favor improving your x: it should be &#8220;trade 5 y&#8217;s for 12 x&#8217;s&#8221;. And why not the other way? It doesn&#8217;t make sense that the more y you have, the <em>easier</em> it is to get y! That&#8217;d spiral off into exponential growth, not a circle.</li>
<li>Lastly, we can&#8217;t trade an entire chunk of 5 y&#8217;s! The tangent is about our nearest neighbor. We have a trade of 12/5 or 2.4 to 1. Our next, tiny movement will be at this ratio (and then we&#8217;ll be at a new point, with a new tangent).</li>
</ul>
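
<p>You can sanity-check the 2.4-to-1 claim by actually taking a microscopic step along the circle. This is a rough Python sketch; the function name and the step size are arbitrary choices of mine.</p>

```python
import math

def tradeoff(x, y, tiny=1e-6):
    # Slide a tiny angle clockwise along the circle through (x, y)
    # and report how much x is gained per unit of y given up.
    radius = math.hypot(x, y)
    angle = math.atan2(y, x) - tiny
    new_x = radius * math.cos(angle)
    new_y = radius * math.sin(angle)
    return (new_x - x) / (y - new_y)

ratio = tradeoff(5, 12)
print(ratio)   # very close to 12/5
```

<p>At (5, 12) the tiny trade comes out at about 2.4 x&#8217;s per y, matching the y:x rule. Try it at (12, 5) and the deal flips to roughly 5/12.</p>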

<p>General principle: Our neighbors are on a circle, which encourages balance. You get a better deal in the direction of the smaller coordinate: at (x,y) the tradeoff is y:x.</p>

<h2>Optimizing The Tradeoff</h2>

<p>Now we know the tradeoff for any point (x,y) &#8212; let&#8217;s optimize!</p>

<p>In a boring scenario, we get paid based on pure distance, so every point (or direction to move) is the same.</p>

<p>The exciting scenario: our (x,y) position is an <em>input</em> into some other function which gives us a return! Now we want to maximize that function.</p>

<p>Here&#8217;s a scenario: Popeye throws cars for cash. He lines up spectators on fences running North and East. The spectators must look straight ahead (they&#8217;re in neck braces, due to earlier events) but will pay Popeye if they see a car pass in front of them.</p>

<p><img src="http://betterexplained.com/wp-content/uploads/pythagdistance/popeyeshow.png" alt="Popeye's show" /></p>

<h2>Maximizing Even Payouts</h2>

<p>Suppose each spectator offers $1 if they see the car (Payout (x,y) = x + y). Where to throw?</p>

<p>First, assume Popeye has finite energy &#8212; he can throw the car 13 meters. Now let&#8217;s start somewhere: throwing the car pure North (0, 13):</p>

<p>P(0,13) = 0 + 13 = $13</p>

<p>Ok. What if he threw it slightly East? To (5, 12) let&#8217;s say?</p>

<p>P(5,12) = 5 + 12 = $17</p>

<p>Clearly better. This should make sense: at (0,13) the tradeoff is <em>great</em> to get more East. We can give up 1 North and get a whopping 5 East, a &#8220;profit&#8221; of $4 if we do the trade. We should keep trading as long as it&#8217;s profitable &#8212; as long as we&#8217;re out of balance, the circle will reward us for boosting the smaller side. Following a 45 degree angle for 13 units is the ideal:</p>

<p>P(13 * 1/sqrt(2), 13 * 1/sqrt(2)) = P(13 * .707, 13 * .707) = 9.2 + 9.2 = $18.4</p>

<p>Neat. A 45-degree throw hits 70.7% of the possible spectators for each side.</p>

<p>Psst. Confused about how we got .707? No problem. Taking sides of 1 and 1 means the hypotenuse squared is 2 (so the hypotenuse itself is sqrt(2)):</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/528ca43e1d8dc4fd0d1486fe586b3366.png' title='\displaystyle{1^2 + 1^2 = 2}' alt='\displaystyle{1^2 + 1^2 = 2}' align=absmiddle class='tex'></p>

<p>But we want a trajectory of length 1, so we can scale it up easily (multiply by 13). So we divide both sides by 2:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/33ca50402c348f07ad8e9d4fb44f4957.png' title='\displaystyle{\frac{1^2}{2} + \frac{1^2}{2} = \frac{2}{2} = 1}' alt='\displaystyle{\frac{1^2}{2} + \frac{1^2}{2} = \frac{2}{2} = 1}' align=absmiddle class='tex'></p>

<p>To get the new &#8220;a&#8221; and &#8220;b&#8221; back out, convert 2 into &#8220;sqrt(2)^2&#8221;:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/e05e6103ff367469f368e872ccd97e6c.png' title='\displaystyle{\frac{1^2}{(\sqrt 2)^2} + \frac{1^2}{(\sqrt 2)^2} = \left( \frac{1}{\sqrt 2} \right)^2 + \left( \frac{1}{\sqrt 2} \right) ^2 = a^2 + b^2 = 1}' alt='\displaystyle{\frac{1^2}{(\sqrt 2)^2} + \frac{1^2}{(\sqrt 2)^2} = \left( \frac{1}{\sqrt 2} \right)^2 + \left( \frac{1}{\sqrt 2} \right) ^2 = a^2 + b^2 = 1}' align=absmiddle class='tex'></p>

<h2>General Technique: Finding the Best Direction</h2>

<p>We stumbled upon the way to find the best return:</p>

<ul>
<li>Pick any starting point / direction</li>
<li>Tweak it: if our return improves, keep the new choice (it&#8217;s profitable)</li>
<li>Keep tweaking until no tweak is profitable</li>
</ul>
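
<p>The tweak-and-keep recipe above could be sketched in Python like this. It is one possible reading, not the only one: I describe each throw by its angle, start pure North with the even $1 payouts, and shrink the tweak size whenever neither direction is profitable. The names <code>squeeze</code> and <code>even_payout</code> are made up.</p>

```python
import math

def even_payout(x, y):
    # Each spectator offers $1
    return x + y

def squeeze(pay, distance=13.0, step=0.01):
    # Start pure North (angle measured from the East fence, in radians)
    angle = math.pi / 2
    best = pay(distance * math.cos(angle), distance * math.sin(angle))
    while step > 1e-9:
        improved = False
        for candidate in (angle - step, angle + step):
            value = pay(distance * math.cos(candidate), distance * math.sin(candidate))
            if value > best:
                # The tweak is profitable: keep the new choice
                angle, best, improved = candidate, value, True
                break
        if not improved:
            step /= 2   # no profitable tweak at this size, try a smaller one
    return angle, best

angle, best = squeeze(even_payout)
print(math.degrees(angle), best)
```

<p>Starting from North, the search drifts East until it settles near 45 degrees, where the payout is about 13 * sqrt(2).</p>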

<p>In math slang, this is &#8220;finding the local maximum&#8221;. In economics slang, it&#8217;s finding the point of &#8220;zero marginal returns&#8221;. Popeye calls it Squeezing the Spinach.</p>

<h2>Maximizing Uneven Returns</h2>

<p>Now suppose the Northern spectators offer $2 (Eastern stay at $1), so P(x,y) = x + 2*y. Should we throw it 100% North?</p>

<p>P(0, 13) = 0 + 2*13 = $26</p>

<p>Not bad. But what about 45 degrees again?</p>

<p>P(9.2, 9.2) = 9.2 + 2*9.2 = $27.6</p>

<p>Interesting &#8212; 45 degrees is still better! But&#8230; I think we went too far! Shouldn&#8217;t we favor North since it pays more?</p>

<p>Yep. Let&#8217;s remember how to Squeeze the Spinach (maximize our returns): start with North and change until it&#8217;s not profitable:</p>

<ul>
<li>The payout function means 1 North = 2 Easts (North pays $2, so 1 unit North = 2 units East)</li>
<li>Trades are profitable if we can beat 1 North for 2 Easts (1 North for 3 Easts, for example, would profit $1)</li>
</ul>

<p>So&#8230; where are trades <em>better</em> than 1 North for 2 Easts? In the Northern section, where the circle rewards us by throwing Easts at us (&#8220;Please, please go East&#8230; I&#8217;ll give you a bunch if you give up a little North&#8221;).</p>

<p>Remember how circles are about x/y, x &#038; y, x:y, etc.? Well, we have the numbers 1 and 2. (2,1) is in the East section. We want (1,2). Why? At (1,2) we have reached the perfect 1 North = 2 East tradeoff.</p>

<p>Following the direction (1,2) for 13 units is:</p>

<p>P(13 * 1/sqrt(5), 13 * 2/sqrt(5)) = P(5.81, 11.63) = 5.81 + 2*11.63 = $29.07</p>

<p>Tada! Over 29 smackeroos because we maximized our return.</p>
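
<p>To compare the three throws side by side, here is a small Python sketch. The <code>throw</code> helper, which scales a direction to 13 units, is my own naming.</p>

```python
import math

def payout(x, y):
    # Northern spectators pay $2, Eastern spectators pay $1
    return x + 2 * y

def throw(direction, distance=13.0):
    # Scale a direction (east, north) to the given throwing distance
    length = math.hypot(direction[0], direction[1])
    return (distance * direction[0] / length, distance * direction[1] / length)

choices = (("North", (0, 1)), ("45 degrees", (1, 1)), ("(1, 2)", (1, 2)))
for name, direction in choices:
    x, y = throw(direction)
    print(name, round(payout(x, y), 2))
```

<p>The (1, 2) direction beats both pure North and the 45-degree throw; its payout works out to exactly 13 * sqrt(5).</p>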

<h2>The Gradient Principle</h2>

<p>We can supercharge this result:</p>

<p><strong>To maximize return, go in each direction proportional to its payoff.</strong></p>

<p>If North pays 2:1 compared to East, your trajectory should favor North by 2:1. In mathier terms:</p>

<ul>
<li>Payoff(x,y) = a*x + b*y</li>
<li>Best trajectory = (a, b)  [in our case, (East, North) => (1, 2)]</li>
</ul>

<p>And this works in multiple dimensions! Given 3 dimensions, go in a direction (Payoff(x), Payoff(y), Payoff(z)). Vector calculus fans, this is why the <a href="http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/">gradient</a> is in the direction of greatest increase.</p>

<p>The gradient for <span class="tex-inline" alt="F(x,y,z)">F(x,y,z)</span> is</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/13c8856e02bc4cf4cea67c6916bf34f9.png' title='\displaystyle{(\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' alt='\displaystyle{(\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' align=absmiddle class='tex'></p>

<p>And each partial derivative (dF/dx) is the payoff for moving in that direction.</p>

<p>But does it all balance? Suppose x pays 3, y pays 4, and z pays 5 (at the current position). The 2-dimensional tradeoff trajectories are:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/2b214d4d423707d97da9b5c7dcdd34dd.png' title='\displaystyle{ (x, y) = (3,4) }' alt='\displaystyle{ (x, y) = (3,4) }' align=absmiddle class='tex'>
<img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/a4ca4bb583d2542ab0eeb05f2faba973.png' title='\displaystyle{ (y, z) = (4, 5) }' alt='\displaystyle{ (y, z) = (4, 5) }' align=absmiddle class='tex'>
<img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/43a3f96057809346b877f071453c6e82.png' title='\displaystyle{ (x, z) = (3, 5) }' alt='\displaystyle{ (x, z) = (3, 5) }' align=absmiddle class='tex'></p>

<p>Now for the magic: the combined trajectory</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/e171550b05778387a538d7d3be4bc823.png' title='\displaystyle{(x,y,z) = (3,4,5)}' alt='\displaystyle{(x,y,z) = (3,4,5)}' align=absmiddle class='tex'></p>

<p>satisfies all 3 requirements! On the x-z plane, x doesn&#8217;t care about y &#8212; as long as the ratio to z is (3 , ?, 5) you&#8217;re getting the best tradeoff from the x-z perspective. The pairs are:</p>

<ul>
<li>(3, ?, 5)</li>
<li>(?, 4, 5)</li>
<li>(3, 4, ?)</li>
</ul>

<p>You don&#8217;t need a sudoku master to see (3, 4, 5) satisfies all those proportions.</p>

<p>Still not convinced? Imagine the payoff for y was zero. We don&#8217;t want to waste energy in our trajectory (3, ?, 5) in a useless direction. But that can&#8217;t happen, because the y-z tradeoff will be (?, 0, 5) and the x-y tradeoff will be (3, 0, ?). The x-z tradeoff lets y-z and x-y &#8220;figure out&#8221; what y should be, which is 0.</p>
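
<p>Here is a minimal Python sketch of the principle for any number of dimensions, assuming fixed payoffs per direction (the function names are hypothetical). It also compares the proportional trajectory against dumping everything into the best-paying axis.</p>

```python
import math

def best_trajectory(payoffs, distance=1.0):
    # Go in each direction proportional to its payoff, scaled to the distance
    length = math.sqrt(sum(p * p for p in payoffs))
    return tuple(distance * p / length for p in payoffs)

def total_payoff(payoffs, position):
    return sum(p * x for p, x in zip(payoffs, position))

payoffs = (3, 4, 5)
best = best_trajectory(payoffs)
all_in_z = (0.0, 0.0, 1.0)   # spend the whole unit on the top-paying axis

print(total_payoff(payoffs, best), total_payoff(payoffs, all_in_z))
print(best_trajectory((3, 0, 5)))   # a zero payoff gets zero motion
```

<p>The proportional trajectory earns sqrt(50), about 7.07, against 5 for going all-in on z. And when y pays nothing, the trajectory spends nothing on y.</p>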

<h2>Questions I Had That You Might Have Too</h2>

<p><strong>Q: I still don&#8217;t get why this works at all. Somehow 50% in x and 50% in y leads to .7 + .7 = 1.4?</strong></p>

<p>It&#8217;s a deep question about <em>why</em> space behaves like this. I was going crazy staring at chopsticks on a wall.</p>

<p>Here&#8217;s my answer: distance is distance. 13 units is 13 units. But in some situations we are &#8220;measuring our coordinates&#8221; (what are the values of x &#038; y) and <em>not</em> the distance itself.</p>

<p>Coordinates with perpendicular axes are very inefficient, especially for diagonal motion (i.e., you are measuring the sides of the triangle, not the hypotenuse). When .707^2 + .707^2 = 1, it&#8217;s a measure of how &#8220;inefficient&#8221; our x &#038; y coordinates are being. We used 70% of each coordinate to represent an object that could have been 100% on one (i.e., if we used polar coordinates).</p>

<p><strong>Q: I have an offshore investment with 200% return, and an onshore one with 5% return. I have $1000 to spend &#8212; should I split my money?</strong></p>

<p>Heavens, no! Remember, this principle is about <em>distance measurements on a grid</em> with the idea that 50% in x and 50% in y covers &#8220;more ground&#8221; than 100% in x. In investing 1) money is not on a grid and 2) there&#8217;s no distance bonus. Putting half your money in each is plain old 0.5 + 0.5 = 1.0. Giving up $1 of the offshore investment gives you $1 for the onshore one.</p>

<p>Put all your money in the best investment.</p>
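
<p>A tiny Python sketch makes the &#8220;no distance bonus&#8221; point concrete. I am assuming the returns are simple percentages on whatever you put in; the function name is made up.</p>

```python
def total_return(offshore_share, money=1000.0):
    # Hypothetical reading of the question: 200% offshore, 5% onshore
    offshore = money * offshore_share
    onshore = money - offshore
    return offshore * 2.00 + onshore * 0.05

for share in (0.0, 0.5, 1.0):
    print(share, total_return(share))
```

<p>The return climbs in a straight line as the offshore share grows, so the split in the middle is just the average of the two extremes. No circle, no bonus for balance.</p>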

<p><strong>Q: So all this stuff is useless?</strong></p>

<p>Heavens, no! Ask yourself: am I measuring distance on a coordinate system?</p>

<p>Many things are measured in terms of x-y coordinates (physical phenomena, etc.) and <em>do</em> have the Pythagorean distance tradeoff.</p>

<p><a href="http://betterexplained.com/articles/types-of-graphs/">But not every graph is the same</a>. Graphs that aren&#8217;t about distance (like &#8220;Money vs. Time&#8221;) do <em>not</em> get any boost from the Pythagorean theorem. This confused me for a long time: the Pythagorean Theorem works for coordinate distance!</p>

<h2>Final Thoughts</h2>

<p>The Pythagorean Theorem is so versatile &#8212; it&#8217;s not about triangles, it covers the nature of distance. I seem to find some new realization when I study it. Really grokking it will help you everywhere, from geometry to vector calculus.</p>

<p>Happy math.</p>
]]></content:encoded>
			<wfw:commentRss>http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Vector Calculus: Understanding Circulation and Curl</title>
		<link>http://betterexplained.com/articles/vector-calculus-understanding-circulation-and-curl/</link>
		<comments>http://betterexplained.com/articles/vector-calculus-understanding-circulation-and-curl/#comments</comments>
		<pubDate>Mon, 19 Feb 2007 23:02:22 +0000</pubDate>
		<dc:creator>kalid</dc:creator>
				<category><![CDATA[Vector Calculus]]></category>

		<guid isPermaLink="false">http://betterexplained.com/articles/vector-calculus-understanding-circulation-and-curl/</guid>
		<description><![CDATA[<strong>Circulation</strong> is the amount of force that pushes along a closed boundary or path. It's the total  "push" you get when going along a path, such as a circle.

A vector field is usually the source of the circulation. If]]></description>
			<content:encoded><![CDATA[<p><strong>Circulation</strong> is the amount of force that pushes along a closed boundary or path. It's the total  "push" you get when going along a path, such as a circle.</p>

<p>A vector field is usually the source of the circulation. If you had a paper boat in a whirlpool, the circulation would be the amount of force that pushed it along as it went in a circle. The more circulation, the more pushing force you have.</p>

<p><strong>Curl</strong> is simply the circulation per unit area, circulation density, or rate of rotation (amount of twisting at a single point). Imagine shrinking your whirlpool down smaller and smaller while keeping the force the same: you'll have a lot of power in a small area, so you'll have a large curl. If you widen the whirlpool while keeping the force the same as before, then you'll have a smaller curl. And of course, zero circulation means zero curl.</p>

<h2>Intuition</h2>

<p>Circulation is the amount of "pushing" force along a path. Curl is the amount of pushing, twisting, or turning force when you shrink the path down to a single point. Let's use water as an example.</p>

<p>Suppose we have a flow of water and we want to determine if it has curl or not: is there any twisting or pushing force? To test this, we put a paddle wheel into the water and notice if it turns (the paddle is <em>vertical</em>, sticking out of the water like a revolving door -- not like a paddlewheel boat). If the paddle does turn, it means this field has curl at that point. If it doesn't turn, then there's no curl.</p>

<p>What does it really mean if the paddle turns? Well, it means the water is pushing harder on one side than the other, making it twist. The larger the difference, the more forceful the twist and the bigger the curl. Also, a turning paddle wheel indicates that the field is "uneven" and not symmetric; if the field were even, then it would push on all sides equally and the paddle wouldn't turn at all.</p>

<p>The fact that there is a "twist" means the field is <strong>not conservative</strong> (this has nothing to do with its political views).</p>

<p>A conservative field is "fair" in the sense that work needed to move from point A to point B, along any path, is the same. For example, consider a river: its field is conservative. Sure, you can get a free ride downstream, but then you have to do work to get back to your starting point. Or, you can do work to move upstream, and get a free ride back. Either way, the amount of work you "put in" is the same as what you get back.</p>

<p>However, in a field with curl (like a whirlpool), you can get a free ride by moving in the direction of the twist. In a whirlpool, you can get a free trip by moving with the current in a circle. If you fight the current and go the wrong way, you have to use energy with no free ride at all.</p>

<p>Conservative fields have zero curl: there are no free twists to push you along. Equivalently, if a field has curl, it is not conservative.</p>

<p>Gravity is another example of a conservative field. Technically, if you lift a rock and then let it fall, the energy you get from falling is the same as what you put in to lift the rock. Theoretically speaking, no energy was gained or lost in this transaction.</p>
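<p>Here is a rough numerical sketch of the "same work along any path" idea. It adds up small pushes \(F \cdot dr\) along two different routes from A to B, once for a uniform gravity-like field and once for a whirlpool-like field. The field definitions, the points A and B, and the helper name <code>work_along</code> are assumptions made up for illustration.</p>

```python
import numpy as np

def work_along(field, points):
    """Approximate the sum of F . dr along a path given as a list of points."""
    points = np.asarray(points, dtype=float)
    steps = np.diff(points, axis=0)               # small displacement vectors dr
    midpoints = (points[:-1] + points[1:]) / 2    # sample the field mid-step
    forces = np.array([field(p) for p in midpoints])
    return float(np.sum(forces * steps))          # add up every dot product

def gravity(p):
    return np.array([0.0, -9.8])                  # uniform downward pull (assumed units)

def whirlpool(p):
    x, y = p
    return np.array([-y, x])                      # counterclockwise swirl

a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
corner = np.array([1.0, 1.0])
t = np.linspace(0.0, 1.0, 1001)[:, None]

straight = a + t * (b - a)                        # direct line from A to B
detour = np.vstack([a + t * (corner - a),         # A up to the corner...
                    corner + t[1:] * (b - corner)])  # ...then across to B

for name, field in [("gravity", gravity), ("whirlpool", whirlpool)]:
    print(name, work_along(field, straight), work_along(field, detour))
```

<p>Under these assumptions, the gravity-like field reports the same work on both routes, while the whirlpool reports different amounts depending on the route taken.</p>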

<h2>Additional Details</h2>

<p>To be technical, curl is a <strong>vector</strong>, which means it has both a magnitude and a direction. The magnitude is simply the amount of twisting force at a point.</p>

<p>The direction is a little more tricky: it's the orientation of the axis of your paddlewheel in order to get maximum rotation. In other words, it is the direction which will give you the most "free work" from the field. Imagine putting your paddlewheel sideways in the whirlpool - it wouldn't turn at all. If you put it in the proper direction, it begins turning.</p>

<p>But wait a minute -- aren't there two directions to get a twisting motion? Couldn't you just turn the paddlewheel "upside down" and get the maximum curl as well?</p>

<p>Yep, you're right. By convention alone, if the paddle wheel is rotating counterclockwise, its curl vector points out of the page. This is a type of right-hand rule: make a fist with your right hand and stick out your thumb. If the circulation/pushing force follows the twisting of your fingers (counterclockwise), then the curl vector will be in the direction of your thumb.</p>

<h2>Mathematics</h2>

<p>Circulation is the integral of a vector field along a path - you are adding how much the field "pushes" you along a path.</p>

<p>How do we find this? Well, we should expect some type of dot product, because we want to know the amount that one vector (the force) is pushing in the direction of another (the path). So, the two vectors we need are (1) the path vector and (2) the field vector at every point along the path.</p>

<p>If we have a function that defines the position at any time, \( r(t) \), we can take the time derivative to get the velocity at that position.</p>

<p>The velocity vector is always in the direction of movement -- if you are moving from A to B, the velocity vector will be an arrow from A to B, i.e. your change in position or your direction of movement. So, we can use the velocity to get our direction.</p>

<p>It's important to understand why we aren't using the position vector itself -- it tells us where we are, but not where we're going. We need to know our direction to see how much "push" we are getting: Knowing your position in a river isn't important -- are you going upstream or downstream, and at what angle?</p>

<p>The force vector (2) is defined by the field we are in. No derivatives or other changes are necessary -- every point in the field has some force acting on it.</p>

<p>So, our formula for circulation is:</p>

<p>Force at position \(r\): \( F(r) \)<br />
Direction at position \(r\): \( dr \)<br />
Total pushing force = Circulation = \( \int F(r) \cdot dr \)</p>

<p>Remember, velocity is simply the derivative of position \(r\), so \(dr\) is a vector giving us our direction. We integrate along the entire path and use the dot product to see how much pushing force is applied. We then sum up these "pushes" to get the total circulation.</p>
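<p>The circulation integral can be approximated by chopping a path into many small steps and summing the dot products. The sketch below does this for a counterclockwise circle. The two sample fields and the function name <code>circulation</code> are assumptions for illustration, not part of any library.</p>

```python
import numpy as np

def circulation(field, radius=1.0, center=(0.0, 0.0), n=2000):
    """Approximate the sum of F(r) . dr around a counterclockwise circle."""
    theta = np.linspace(0.0, 2.0 * np.pi, n + 1)
    xs = center[0] + radius * np.cos(theta)
    ys = center[1] + radius * np.sin(theta)
    dx, dy = np.diff(xs), np.diff(ys)             # dr: small steps along the path
    xm = (xs[:-1] + xs[1:]) / 2                   # midpoint of each step,
    ym = (ys[:-1] + ys[1:]) / 2                   # where the field is sampled
    fx, fy = field(xm, ym)
    return float(np.sum(fx * dx + fy * dy))       # dot product, then sum

def whirlpool(x, y):
    return -y, x                                  # swirls counterclockwise

def river(x, y):
    return np.ones_like(x), np.zeros_like(y)      # uniform flow to the right

print(circulation(whirlpool))   # close to 2*pi for the unit circle
print(circulation(river))       # essentially zero: pushes cancel around the loop
```

<p>The whirlpool pushes you along the whole loop, so the pushes add up. In the uniform river, the help you get on one side of the loop is cancelled by the resistance on the other.</p>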

<p>Since curl is the circulation per unit area, we can take the circulation for a small area (letting the area shrink to 0). However, since curl is a vector, we need to give it a direction -- the direction is normal (perpendicular) to the surface with the vector field. The <strong>magnitude</strong> is the same as before: circulation/area.</p>

<p>Recall that by convention (a bunch of people agreeing), counterclockwise circulation will give a curl pointing out of the page. Using these facts, we can create the formula for curl:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/be2b4ed947fbb623be6d0f4dabce4e7b.png' title='\displaystyle{ Curl = \frac{circulation}{area} = \frac{\int F(r) \cdot dr}{\int S} }' alt='\displaystyle{ Curl = \frac{circulation}{area} = \frac{\int F(r) \cdot dr}{\int S} }' align=absmiddle class='tex' /></p>

<p>Where \(S\) is the surface we are considering; the direction of the curl is the normal to the surface.</p>

<p>You'll see fancier equations for curl where the surface shrinks to zero (such as in <a href="http://en.wikipedia.org/wiki/Curl">Wikipedia</a>), but recognize the basic intuition -- curl is the circulation per unit area.</p>
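<p>To see "circulation per unit area" in numbers, the sketch below divides the circulation around a circle by the circle's area, then shrinks the circle. The field \((0, x^3)\) is a hypothetical example chosen because its twisting varies from place to place. At the point (1, 0) the textbook curl formula gives 3 for it.</p>

```python
import numpy as np

def circulation_per_area(field, center, radius, n=4000):
    """Circulation around a counterclockwise circle, divided by its area."""
    theta = np.linspace(0.0, 2.0 * np.pi, n + 1)
    xs = center[0] + radius * np.cos(theta)
    ys = center[1] + radius * np.sin(theta)
    dx, dy = np.diff(xs), np.diff(ys)             # small steps dr along the loop
    xm = (xs[:-1] + xs[1:]) / 2
    ym = (ys[:-1] + ys[1:]) / 2
    fx, fy = field(xm, ym)
    circ = np.sum(fx * dx + fy * dy)
    return float(circ / (np.pi * radius ** 2))

def field(x, y):
    return np.zeros_like(x), x ** 3               # hypothetical field (0, x^3)

for radius in (1.0, 0.1, 0.01):
    print(radius, circulation_per_area(field, (1.0, 0.0), radius))
```

<p>As the loop shrinks, the ratio settles toward 3. That limiting value is the curl at the point.</p>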

<h2>Parting Thoughts</h2>

<p>You'll often see curl of a field \(F\) written like this:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/845ca3bb44edc34f8aa286d6390a22f5.png' title='\displaystyle{ Curl(F) = \nabla \times F }' alt='\displaystyle{ Curl(F) = \nabla \times F }' align=absmiddle class='tex' /></p>

<p>which is a cross-product of the del operator \(\nabla\) (the same symbol used to write the <a href="http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/">gradient</a>) and the field \(F\). This has to do with how curl is actually computed, which will be material for another article (and probably in your textbook already -- see <a href="http://en.wikipedia.org/wiki/Curl">Wikipedia</a> for details).</p>
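<p>For reference, here is a sketch of what that cross product expands to in Cartesian coordinates. The component names \(F_x, F_y, F_z\) are notation introduced only for this aside:</p>

<p>\[ \nabla \times F = \left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z},\; \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x},\; \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \]</p>

<p>For a flat, counterclockwise whirlpool-style field such as \(F = (-y, x, 0)\), only the last component survives: \( \frac{\partial x}{\partial x} - \frac{\partial (-y)}{\partial y} = 1 + 1 = 2 \). The result \((0, 0, 2)\) points out of the page, which agrees with the right-hand convention described earlier.</p>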

<p>If I have been successful, you should understand intuitively what circulation and curl mean, and how we got the formulae above. They spring up naturally from our definition of circulation as "pushing force along a path" and curl as "pushing force/area".</p>

<p>Math should be a tool for clearly stating what we already know. Understand the intuition and then tackle the complicated formulas. Happy math.</p>

<p><span class="caps">PS.</span> Have some fun and check out this video of a famous whirlpool. Imagine the circulation on this (go on, imagine):</p>

<p><object width="425" height="350"><param name="movie" value="http://www.youtube.com/v/dHol4ICeDoo"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/dHol4ICeDoo" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed></object></p>]]></content:encoded>
			<wfw:commentRss>http://betterexplained.com/articles/vector-calculus-understanding-circulation-and-curl/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>Vector Calculus: Understanding the Gradient</title>
		<link>http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/</link>
		<comments>http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/#comments</comments>
		<pubDate>Sat, 17 Feb 2007 23:24:45 +0000</pubDate>
		<dc:creator>kalid</dc:creator>
				<category><![CDATA[Vector Calculus]]></category>

		<guid isPermaLink="false">http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/</guid>
		<description><![CDATA[The <strong>gradient</strong> is a fancy word for derivative, or the rate of change of a function. It&#8217;s a vector (a direction to move) that


<ul>
<li>Points in the direction of greatest increase of a function (<a href="http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/">intuition on why</a>)</li>
<li>Is </li>&#8230; <a href="http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/" class="read_more">Read article</a></ul>]]></description>
			<content:encoded><![CDATA[<p>The <strong>gradient</strong> is a fancy word for derivative, or the rate of change of a function. It&#8217;s a vector (a direction to move) that</p>


<ul>
<li>Points in the direction of greatest increase of a function (<a href="http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/">intuition on why</a>)</li>
<li>Is zero at a local maximum or local minimum (because there is no single direction of increase)</li>
</ul>



<p>The term gradient (grad) typically refers to the derivative of <strong>multivariable functions</strong>: functions that take more than one variable and return a single value. Yes, you can say a line has a gradient (its slope), but using the term gradient for single-variable functions is unnecessarily confusing. Keep it simple.</p>

<p>&#8220;Gradient&#8221; can refer to gradual changes of color, but we&#8217;ll stick to the math definition if that&#8217;s ok with you. You&#8217;ll see the meanings are related.</p>

<h2>Properties of the Gradient</h2>

<p>Now that we know the gradient is the derivative of a multi-variable function, let&#8217;s derive some properties.</p>

<p>The regular, plain-old derivative gives us the rate of change of a single variable, usually x. For example, dF/dx tells us how much the function F changes for a change in x. But if a function takes multiple variables, such as x and y, it will have multiple derivatives: the value of the function will change when we &#8220;wiggle&#8221; x (dF/dx) and when we wiggle y (dF/dy).</p>

<p>We can represent these multiple rates of change in a vector, with one component for each derivative. Thus, a function that takes 3 variables will have a gradient with 3 components:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/94d6dfca196277bf7d0b144d5bfe139b.png' title='\displaystyle{F(x)}' alt='\displaystyle{F(x)}' align=absmiddle class='tex' /> has one variable and a single derivative: <img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/716645b02d56bd11d78340bcf39da9f7.png' title='\displaystyle{\frac{dF}{dx}}' alt='\displaystyle{\frac{dF}{dx}}' align=absmiddle class='tex' /></p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/d161070d2af20651ee0d8034849f4dcd.png' title='\displaystyle{F(x,y,z)}' alt='\displaystyle{F(x,y,z)}' align=absmiddle class='tex' /> has three variables and three derivatives: <img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/13c8856e02bc4cf4cea67c6916bf34f9.png' title='\displaystyle{(\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' alt='\displaystyle{(\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' align=absmiddle class='tex' /></p>

<p>The gradient of a multi-variable function has a component for each direction.</p>

<p>And just like the regular derivative, the gradient points in the direction of greatest increase. However, now that we have multiple directions to consider (x, y and z), the direction of greatest increase is no longer simply &#8220;forward&#8221; or &#8220;backward&#8221; along the x-axis, like it is with functions of a single variable.</p>

<p>If we have two variables, then our 2-component gradient can specify any direction on a plane. Likewise, with 3 variables, the gradient can specify any direction in 3D space to move to increase our function.</p>

<h2>A Twisted Example</h2>

<p>I&#8217;m a big fan of examples to help solidify an explanation. Suppose we have a magical oven, with coordinates written on it and a special display screen:</p>

<p><img src="/wp-content/uploads/gradient/gradient_microwave_1_1.jpg" alt="gradient_microwave_1_1.jpg" title="gradient_microwave_1_1.jpg" width="500" height="328" border="0" /></p>

<p>We can type any 3 coordinates (like &#8220;3,5,2&#8243;) and the display shows us the <strong>gradient</strong> of the temperature at that point.</p>

<p>The microwave also comes with a convenient clock. Unfortunately, the clock comes at a price &#8212; the temperature inside the microwave varies drastically from location to location. But this was well worth it: we really wanted that clock.</p>

<p>With me so far? We type in any coordinate, and the microwave spits out the gradient at that location.</p>

<p>Be careful not to confuse the coordinates and the gradient. The <strong>coordinates are the current location</strong>, measured on the x-y-z axes. The <strong>gradient is a direction to move</strong> from our current location, such as move up, down, left or right.</p>

<p>Now suppose we are in need of psychiatric help and put the Pillsbury Dough Boy inside the oven because we think he would taste good. He&#8217;s made of cookie dough, right? We place him in a random location inside the oven, and our goal is to cook him as fast as possible. The gradient can help!</p>

<p>The gradient at any location points in the direction of <strong>greatest increase</strong> of a function. In this case, our function measures temperature. So, the gradient tells us which direction to move the doughboy to get him to a location with a higher temperature, to cook him even faster. Remember that the gradient does <strong>not</strong> give us the coordinates of where to go; it gives us the <strong>direction to move</strong> to increase our temperature.</p>

<p>Thus, we would start at a random point like (3,5,2) and check the gradient. In this case, the gradient there is (3,4,5). Now, we wouldn&#8217;t actually move an entire 3 units to the right, 4 units back, and 5 units up. The gradient is just a direction, so we&#8217;d <strong>follow this trajectory for a tiny bit</strong>, and then check the gradient again.</p>

<p>We get to a new point, pretty close to our original, which has its own gradient. This new gradient is the new best direction to follow. We&#8217;d keep repeating this process: move a bit in the gradient direction, check the gradient, and move a bit in the new gradient direction. Every time we nudge along and follow the gradient, we&#8217;d get to a warmer and warmer location.</p>

<p>Eventually, we&#8217;d get to the hottest part of the oven and that&#8217;s where we&#8217;d stay, about to enjoy our fresh cookies.</p>
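<p>The move-a-bit, check-again routine can be sketched in a few lines. The temperature function below is an assumption invented for this example (a single hot spot at (1, 2, 3)), as are the step size and the names <code>temperature_gradient</code> and <code>climb</code>. A real oven would have its own function.</p>

```python
import numpy as np

HOT_SPOT = np.array([1.0, 2.0, 3.0])      # hypothetical hottest point in the oven

def temperature(p):
    return 200.0 - np.sum((p - HOT_SPOT) ** 2)

def temperature_gradient(p):
    return -2.0 * (p - HOT_SPOT)          # one partial derivative per coordinate

def climb(start, step=0.1, iterations=200):
    """Repeatedly take a small step in the gradient direction."""
    p = np.array(start, dtype=float)
    for _ in range(iterations):
        p = p + step * temperature_gradient(p)   # move a bit, then re-check
    return p

final = climb((3, 5, 2))
print(final, temperature(final))
```

<p>Starting from (3,5,2), the point drifts toward the assumed hot spot and the steps shrink as the gradient shrinks.</p>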

<h2>Don&#8217;t eat that cookie!</h2>

<p>But before you eat those cookies, let&#8217;s make some observations about the gradient. That&#8217;s more fun, right?</p>

<p>First, when we reach the hottest point in the oven, what is the gradient there?</p>

<p>Zero. Nada. Zilch. Why? Well, once you are at the maximum location, there is <strong>no direction of greatest increase</strong>. Any direction you follow will lead to a <strong>decrease</strong> in temperature. It&#8217;s like being at the top of a mountain: any direction you move is downhill. A zero gradient tells you to stay put &#8211; you are at the max of the function, and can&#8217;t do better.</p>

<p>But what if there are two nearby maximums, like two mountains next to each other? You could be at the top of one mountain, but have a bigger peak next to you. In order to get to the highest point, you have to go downhill first.</p>

<p>Ah, now we are venturing into the not-so-pretty underbelly of the gradient. Finding the maximum in regular (single variable) functions means we find all the places where the derivative is zero: there is no direction of greatest increase. If you recall, the regular derivative will point to <strong>local</strong> minimums and maximums, and the absolute max/min must be tested from these candidate locations.</p>

<p>The same principle applies to the gradient, a generalization of the derivative. You must find multiple locations where the gradient is zero &#8212; you&#8217;ll have to test these points to see which one is the global maximum. Again, the top of each hill has a zero gradient &#8212; you need to compare the height at each to see which one is higher. Now that we have cleared that up, go enjoy your cookie.</p>
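<p>A small sketch of the two-mountain problem: following the gradient from different starting points can land on different hilltops, so the heights have to be compared afterwards. The landscape below (two bumps, one twice as tall) and the starting points are assumptions chosen for illustration.</p>

```python
import numpy as np

PEAKS = [(np.array([-2.0, 0.0]), 1.0),    # smaller hill: (center, height)
         (np.array([2.0, 0.0]), 2.0)]     # taller hill

def height(p):
    return sum(h * np.exp(-np.sum((p - c) ** 2)) for c, h in PEAKS)

def gradient(p):
    return sum(-2.0 * h * (p - c) * np.exp(-np.sum((p - c) ** 2))
               for c, h in PEAKS)

def climb(start, step=0.1, iterations=500):
    p = np.array(start, dtype=float)
    for _ in range(iterations):
        p = p + step * gradient(p)        # always walk uphill
    return p

starts = [(-3.0, 0.5), (1.0, -0.5)]
summits = [climb(s) for s in starts]      # each climb stops where the gradient vanishes
best = max(summits, key=height)           # compare candidates to find the tallest

for s in summits:
    print(s, height(s))
print("highest summit found:", best)
```

<p>Each climb ends at a zero-gradient point, but only the comparison at the end reveals which one is the higher peak.</p>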

<h2>Mathematics</h2>

<p>We know the definition of the gradient: a derivative for each variable of a function. The gradient symbol is usually an upside-down delta, and called &#8220;del&#8221; (this makes a bit of sense &#8211; delta indicates change in one variable, and the gradient is the change in all variables). Taking our group of 3 derivatives above:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/b74a21aac14e8940dae48cffec6d15d1.png' title='\displaystyle{grad F(x,y,z) = \nabla F(x,y,z) = (\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' alt='\displaystyle{grad F(x,y,z) = \nabla F(x,y,z) = (\frac{dF}{dx},\frac{dF}{dy},\frac{dF}{dz})}' align=absmiddle class='tex' /></p>

<p>Notice how the x-component of the gradient is the partial derivative with respect to x (similar for y and z). For a one variable function, there is no y-component at all, so the gradient reduces to the derivative.</p>

<p>Also, notice how the gradient can itself be a function!</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/02fa0e1c23085f115a6ef75483f32fb0.png' title='\displaystyle{F(x,y,z) = x + y^2 + z^3 }' alt='\displaystyle{F(x,y,z) = x + y^2 + z^3 }' align=absmiddle class='tex' /></p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/ebd3d984ce2301ff90e1acef8710ce35.png' title='\displaystyle{\nabla F(x,y,z) = (1, 2y, 3z^2)}' alt='\displaystyle{\nabla F(x,y,z) = (1, 2y, 3z^2)}' align=absmiddle class='tex' /></p>

<p>If we want to find the direction to move to increase our function the fastest, we plug in our current coordinates (such as 3,4,5) into the equation and get:</p>

<p><img src='http://74.50.62.72/wp-content/plugins/wp-latexrender/pictures/0aefe2f69a70ede6565aa035b0f611d7.png' title='\displaystyle{direction = (1, 2(4), 3(5)^2) = (1, 8, 75)}' alt='\displaystyle{direction = (1, 2(4), 3(5)^2) = (1, 8, 75)}' align=absmiddle class='tex' /></p>

<p>So, this new vector (1, 8, 75) would be the direction we&#8217;d move in to increase the value of our function. In this case, our x-component doesn&#8217;t add much to the value of the function: the partial derivative is always 1.</p>
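<p>As a sanity check, the gradient can also be estimated by wiggling each variable a little and measuring how much the function changes. The sketch below compares that estimate with the formula \((1, 2y, 3z^2)\) at the point (3,4,5). The helper name <code>numerical_gradient</code> and the wiggle size are assumptions for illustration.</p>

```python
import numpy as np

def F(p):
    x, y, z = p
    return x + y ** 2 + z ** 3

def grad_F(p):
    x, y, z = p
    return np.array([1.0, 2.0 * y, 3.0 * z ** 2])   # formula from the text

def numerical_gradient(f, p, h=1e-5):
    """Estimate each partial derivative by wiggling one variable at a time."""
    p = np.array(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        wiggle = np.zeros_like(p)
        wiggle[i] = h
        grad[i] = (f(p + wiggle) - f(p - wiggle)) / (2.0 * h)
    return grad

point = (3.0, 4.0, 5.0)
print(grad_F(point))                  # the formula gives (1, 8, 75)
print(numerical_gradient(F, point))   # the wiggle estimate should agree closely
```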

<p>Obvious applications of the gradient are finding the max/min of multivariable functions. Another less obvious but related application is finding the maximum of a constrained function: a function whose x and y values have to lie in a certain domain, e.g. find the maximum of all points constrained to lie along a circle. Solving this calls for my boy Lagrange, but all in due time, all in due time: enjoy the gradient for now.</p>

<p>The key insight is to recognize the gradient as the generalization of the derivative. <strong>The gradient points in the direction of greatest increase; follow the gradient, and you will reach a local maximum.</strong></p>

<h2>Questions</h2>

<p><b>Why is the gradient perpendicular to lines of equal potential?</b></p>

<p>Lines of equal potential (&#8220;equipotential&#8221;) are the points with the same energy (or value for f(x,y,z)). In the simplest case, a circle represents all items the same distance from the center.</p>
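<p>To see the claim in numbers, take the assumed function \(f(x,y) = x^2 + y^2\), whose lines of equal value are circles. The sketch below picks a point on one circle and compares a tiny step along the circle with a tiny step along the gradient. The angle, radius and step size are arbitrary choices.</p>

```python
import numpy as np

def f(p):
    return p[0] ** 2 + p[1] ** 2              # lines of equal value are circles

def grad_f(p):
    return np.array([2.0 * p[0], 2.0 * p[1]])

theta, radius = 0.7, 2.0                      # any angle picks a point on the circle
point = radius * np.array([np.cos(theta), np.sin(theta)])
tangent = np.array([-np.sin(theta), np.cos(theta)])    # direction along the circle
uphill = grad_f(point) / np.linalg.norm(grad_f(point)) # unit vector along the gradient

step = 1e-3
along_curve = f(point + step * tangent) - f(point)
along_gradient = f(point + step * uphill) - f(point)

print(np.dot(grad_f(point), tangent))   # approximately 0: no component along the circle
print(along_curve)                      # tiny change when sliding along the circle
print(along_gradient)                   # much larger change when following the gradient
```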

<p>The gradient represents the direction of greatest change. If it had any component along the line of equipotential, then that energy would be wasted (as it&#8217;s moving closer to a point at the same energy). When the gradient is perpendicular to the equipotential points, it is moving as far from them as possible (<a href="http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/">this article</a> explains why the gradient is the direction of greatest increase &#8212; it&#8217;s the direction that maximizes the varying tradeoffs inside a circle).</p>]]></content:encoded>
			<wfw:commentRss>http://betterexplained.com/articles/vector-calculus-understanding-the-gradient/feed/</wfw:commentRss>
		<slash:comments>121</slash:comments>
		</item>
	</channel>
</rss>

