The **gradient** is a fancy word for derivative, or the rate of change of a function. It’s a vector (a direction to move) that

- Points in the direction of greatest increase of a function (intuition on why)
- Is zero at a local maximum or local minimum (because there is no single direction of increase)

The term "gradient" is typically used for functions with several inputs and a single output (a scalar field). Yes, you can say a line has a gradient (its slope), but using "gradient" for single-variable functions is unnecessarily confusing. Keep it simple.

“Gradient” can refer to gradual changes of color, but we’ll stick to the math definition if that’s ok with you. You’ll see the meanings are related.

## Properties of the Gradient

Now that we know the gradient is the derivative of a multi-variable function, let’s derive some properties.

The regular, plain-old derivative gives us the rate of change of a single variable, usually x. For example, dF/dx tells us how much the function F changes for a change in x. But if a function takes multiple variables, such as x and y, it will have multiple derivatives: the value of the function will change when we “wiggle” x (dF/dx) and when we wiggle y (dF/dy).

We can represent these multiple rates of change in a vector, with one component for each derivative. Thus, a function that takes 3 variables will have a gradient with 3 components:

- F(x) has one variable and a single derivative: dF/dx
- F(x,y,z) has three variables and three derivatives: (dF/dx, dF/dy, dF/dz)

The gradient of a multi-variable function has a component for each direction.
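To make "one component for each derivative" concrete, here is a minimal sketch that estimates each component by wiggling one variable at a time. The helper `numerical_gradient` and the function `example_f` are made up for illustration; they are not from any library.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # wiggle only the i-th variable up...
        backward[i] -= h  # ...and down, holding the others fixed
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return tuple(grad)


def example_f(x, y, z):
    # Hypothetical function chosen only for illustration.
    # Its exact gradient is (y, x, 2z).
    return x * y + z ** 2


# Three inputs, so the gradient has three components.
print(numerical_gradient(example_f, (1.0, 2.0, 3.0)))
```

At the point (1, 2, 3) the printed components should come out close to (2, 1, 6), matching the exact partial derivatives (y, x, 2z).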

And just like the regular derivative, the gradient points in the direction of greatest increase (here's why: we trade motion in each direction enough to maximize the payoff).

However, now that we have multiple directions to consider (x, y and z), the direction of greatest increase is no longer simply “forward” or “backward” along the x-axis, like it is with functions of a single variable.

If we have two variables, then our 2-component gradient can specify any direction on a plane. Likewise, with 3 variables, the gradient can specify any direction in 3D space to move to increase our function.

## A Twisted Example

I’m a big fan of examples to help solidify an explanation. Suppose we have a magical microwave oven, with coordinates written on it and a special display screen.

We can type any 3 coordinates (like “3,5,2”) and the display shows us the **gradient** of the temperature at that point.

The microwave also comes with a convenient clock. Unfortunately, the clock comes at a price — the temperature inside the microwave varies drastically from location to location. But this was well worth it: we really wanted that clock.

With me so far? We type in any coordinate, and the microwave spits out the gradient at that location.

Be careful not to confuse the coordinates and the gradient. The **coordinates are the current location**, measured on the x-y-z axis. The **gradient is a direction to move** from our current location, such as move up, down, left or right.

Now suppose we are in need of psychiatric help and put the Pillsbury Dough Boy inside the oven because we think he would taste good. He’s made of cookie dough, right? We place him in a random location inside the oven, and our goal is to cook him as fast as possible. The gradient can help!

The gradient at any location points in the direction of **greatest increase** of a function. In this case, our function measures temperature. So, the gradient tells us which direction to move the doughboy to get him to a location with a higher temperature, to cook him even faster. Remember that the gradient does **not** give us the coordinates of where to go; it gives us the **direction to move** to increase our temperature.

Thus, we would start at a random point like (3,5,2) and check the gradient. In this case, the gradient there is (3,4,5). Now, we wouldn’t actually move an entire 3 units to the right, 4 units back, and 5 units up. The gradient is just a direction, so we’d **follow this trajectory for a tiny bit**, and then check the gradient again.

We get to a new point, pretty close to our original, which has its own gradient. This new gradient is the new best direction to follow. We’d keep repeating this process: move a bit in the gradient direction, check the gradient, and move a bit in the new gradient direction. Every time we nudge along and follow the gradient, we’d get to a warmer and warmer location.

Eventually, we’d get to the hottest part of the oven and that’s where we’d stay, about to enjoy our fresh cookies.
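The "move a bit, check the gradient, move a bit more" loop can be sketched in a few lines. The temperature field below is an assumption invented for this sketch (the article never gives a formula for the oven), with its hottest spot placed at (1, 2, 3).

```python
def temperature_gradient(x, y, z):
    # Hypothetical oven: T = 400 - (x-1)^2 - (y-2)^2 - (z-3)^2,
    # so the hottest spot is at (1, 2, 3). This field is made up
    # for the sketch; it is not the one from the article.
    return (-2 * (x - 1), -2 * (y - 2), -2 * (z - 3))


def follow_gradient(start, step=0.1, iterations=200):
    """Repeatedly move a small step in the gradient direction."""
    position = tuple(start)
    for _ in range(iterations):
        grad = temperature_gradient(*position)
        # Nudge each coordinate a little way along the gradient.
        position = tuple(p + step * g for p, g in zip(position, grad))
    return position


hottest = follow_gradient((3.0, 5.0, 2.0))
print(hottest)
```

Starting from (3, 5, 2), the position should settle very close to (1, 2, 3), where the gradient is zero. One design choice to note: the step here is scaled by the gradient itself, so the nudges shrink as we approach the hot spot.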

## Don’t eat that cookie!

But before you eat those cookies, let’s make some observations about the gradient. That’s more fun, right?

First, when we reach the hottest point in the oven, what is the gradient there?

Zero. Nada. Zilch. Why? Well, once you are at the maximum location, there is **no direction of greatest increase**. Any direction you follow will lead to a **decrease** in temperature. It’s like being at the top of a mountain: any direction you move is downhill. A zero gradient tells you to stay put – you are at the max of the function, and can’t do better.

But what if there are two nearby maximums, like two mountains next to each other? You could be at the top of one mountain, but have a bigger peak next to you. In order to get to the highest point, you have to go downhill first.

Ah, now we are venturing into the not-so-pretty underbelly of the gradient. Finding the maximum in regular (single variable) functions means we find all the places where the derivative is zero: there is no direction of greatest increase. If you recall, the regular derivative will point to **local** minimums and maximums, and the absolute max/min must be tested from these candidate locations.

The same principle applies to the gradient, a generalization of the derivative. You must find multiple locations where the gradient is zero — you’ll have to test these points to see which one is the global maximum. Again, the top of each hill has a zero gradient — you need to compare the height at each to see which one is higher. Now that we have cleared that up, go enjoy your cookie.
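Here is a small sketch of the "two mountains" situation. It uses a single-variable landscape for brevity (the same idea carries over to the gradient), and the function `height` is a hypothetical example picked so that it has two peaks of different heights.

```python
def height(x):
    # Hypothetical landscape with two peaks, near x = -1 and x = +1;
    # the 0.3 * x term tilts it so the right-hand peak is taller.
    return -(x ** 2 - 1) ** 2 + 0.3 * x


def slope(x):
    # Derivative of height(x); in 1D the "gradient" is just this number.
    return -4 * x * (x ** 2 - 1) + 0.3


def climb(start, step=0.01, iterations=5000):
    """Follow the slope uphill until it (nearly) vanishes."""
    x = start
    for _ in range(iterations):
        x += step * slope(x)
    return x


# Two starting points end up on two different hilltops.
candidates = [climb(-2.0), climb(2.0)]
best = max(candidates, key=height)
for x in candidates:
    print(f"peak near x = {x:.3f}, height = {height(x):.3f}")
print(f"highest of the candidates: x = {best:.3f}")
```

Both runs stop where the slope is zero, but only one of them is the higher peak. Following the gradient finds candidates; comparing the heights picks the winner.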

## Mathematics

We know the definition of the gradient: a derivative for each variable of a function. The gradient symbol is usually an upside-down delta, called “del” (this makes a bit of sense: delta indicates change in one variable, and the gradient is the change for all variables). Taking our group of 3 derivatives above, we can write the gradient of F as ∇F(x,y,z) = (∂F/∂x, ∂F/∂y, ∂F/∂z).

Notice how the x-component of the gradient is the partial derivative with respect to x (similar for y and z). For a one variable function, there is no y-component at all, so the gradient reduces to the derivative.

Also, notice how the gradient is a function: it takes 3 coordinates as a position, and returns 3 coordinates as a direction.

If we want to find the direction to move to increase our function the fastest, we plug our current coordinates (such as 3,4,5) into the gradient and get a new vector back.

So, this new vector (1, 8, 75) would be the direction we’d move in to increase the value of our function. In this case, our x-component doesn’t add much to the value of the function: the partial derivative is always 1.
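The function behind these numbers is not shown in the text. One function consistent with them is F(x,y,z) = x + y² + z³; treat that choice as an assumption, reconstructed from the result (1, 8, 75) and the remark that the x partial derivative is always 1. Under that assumption, the computation would go:

```latex
\begin{align*}
F(x, y, z) &= x + y^2 + z^3 \\
\nabla F(x, y, z) &= \left( \frac{\partial F}{\partial x},\ \frac{\partial F}{\partial y},\ \frac{\partial F}{\partial z} \right) = (1,\ 2y,\ 3z^2) \\
\nabla F(3, 4, 5) &= (1,\ 2 \cdot 4,\ 3 \cdot 5^2) = (1,\ 8,\ 75)
\end{align*}
```

Notice that the gradient is a formula in x, y and z until we plug in a position; only then does it become a specific direction.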

Obvious applications of the gradient are finding the max/min of multivariable functions. Another less obvious but related application is finding the maximum of a constrained function: a function whose x and y values have to lie in a certain domain, e.g. finding the maximum over all points constrained to lie along a circle. Solving this calls for my boy Lagrange, but all in due time, all in due time: enjoy the gradient for now.

The key insight is to recognize the gradient as the generalization of the derivative. **The gradient points to the direction of greatest increase; keep following the gradient, and you will reach a local maximum.**

## Questions

**Why is the gradient perpendicular to lines of equal potential?**

Lines of equal potential (“equipotential”) are the points with the same energy (or value for F(x,y,z)). In the simplest case, a circle represents all items the same distance from the center.

The gradient represents the direction of greatest change. If it had any component along the line of equipotential, then that energy would be wasted (as it’s moving closer to a point at the same energy). When the gradient is perpendicular to the equipotential points, it is moving as far from them as possible (this article explains why the gradient is the direction of greatest increase — it’s the direction that maximizes the varying tradeoffs inside a circle).
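A quick numerical check of the circle case, as a sketch: take F(x, y) = x² + y² (an illustrative choice, not a function from the article), whose lines of equal value are circles around the origin. If the gradient is perpendicular to the circle, its dot product with the direction along the circle should be zero.

```python
import math


def gradient(x, y):
    # F(x, y) = x^2 + y^2 (illustrative choice); its level curves are circles.
    return (2 * x, 2 * y)


def tangent_to_circle(x, y):
    # Direction along the circle through (x, y): the radius rotated 90 degrees.
    return (-y, x)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


# Sample a few points on the circle of radius 2 and test perpendicularity.
for angle in (0.3, 1.2, 2.5):
    point = (2 * math.cos(angle), 2 * math.sin(angle))
    print(dot(gradient(*point), tangent_to_circle(*point)))
```

Each printed value should be zero (up to rounding): at every sampled point, none of the gradient is "wasted" moving along the circle.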

## Other Posts In This Series

- Vector Calculus: Understanding the Dot Product
- Vector Calculus: Understanding the Cross Product
- Vector Calculus: Understanding Flux
- Vector Calculus: Understanding Divergence
- Vector Calculus: Understanding Circulation and Curl
- Vector Calculus: Understanding the Gradient
- Understanding Pythagorean Distance and the Gradient

meFebruary 22, 2007 at 10:43 ami like it… well explained.

JaneMarch 2, 2007 at 10:00 amSuper!!!

ChrisMarch 21, 2007 at 8:01 pmYou are the man! Nice work!

KalidApril 1, 2007 at 1:07 pmThanks, glad it was helpful for you.

gauravJune 9, 2007 at 2:18 ami was always looking for conceptual and practical examples and yes i finally got.

HarryJune 10, 2007 at 4:04 pmAwsome!

PaloAugust 15, 2007 at 5:05 pmwell you made a good explanation, that even a not-so-smart guy gets it, but i think you missed the obvious -> WHY does gradient show the direction of the greatest increase.

I think that the principle of the gradient is quite easy, but understanding why does it work the way it does is a bit tricky and you should have focued on it more.

It would be interesting if you would somehow add it to this good article. Inspiration http://mathforum.org/library/drmath/view/68326.html

good luck !

KalidAugust 15, 2007 at 5:51 pmHi Palo, that’s a great point! I’ve been feeling a bit guilty, if you can imagine it, because I’ve lacked that explanation :)

I’m probably going to do a separate article on the reason *why* the gradient points in the direction of greatest increase — I have another explanation that it works well with. Thanks for the link and feedback!

John GabrielSeptember 16, 2007 at 6:47 amYour introduction is not quite correct:

You claim: “Points in the direction of greatest increase of a function”.

Why? It can also point in the direction of greatest decrease of a function.

A gradient is one or more directional derivatives. These derivatives are considered in a particular direction. In the case of single variable calculus, we generally talk about a directional derivative when we consider multiples of the x unit vector, i.e. k*(1,0). To consider the y unit vector, we deal with the partial derivatives with respect to y in a given direction. In three dimensions, the 3 partial derivatives form what we now call a ‘gradient’.

So in fact it is incorrect to call this a slope or anything else except to say that it describes the partial derivatives of a point in the direction of a given vector in space.

Does this make sense? Please visit my blog for some more interesting reading.

http://mathphile.blogspot.com/

KalidSeptember 19, 2007 at 11:49 amHi John, thanks for writing. You’re right, the formal definition of a gradient is a set of directional derivatives.

But when thinking about the intuitive meaning, I think it’s ok to consider the gradient as a vector that “points” in the direction of greatest increase (i.e. if you follow that direction your function will tend towards a local maximum).

Unless I’m mistaken, the gradient vector always points in the direction of greatest increase (greatest decrease would be in the opposite direction).

John GabrielSeptember 27, 2007 at 7:21 pmWhat I was saying is that it points either one way or the other, it is not restricted to the direction of greatest increase. As a simple example, consider what happens when you differentiate a parabola: You set the derivative equal to 0 and then you determine that it has either a maximum or a minimum at its turning point. It is not always a maximum just as it is not always a minimum. Think I have explained this correctly now.

sqibOctober 5, 2007 at 3:41 pmgood john you have done a great job.

KalidOctober 5, 2007 at 10:23 pmHi John, thanks for the clarification. I’d still politely disagree and say that in general, the gradient points in the direction of greatest increase :).

In the case of 2 dimensions, the gradient/slope only gives a forward or backward direction. A positive slope means travel “forward” and a negative slope means travel “backwards”.

Consider f(x) = x^2, a regular parabola. The gradient is zero at the minimum (x=0), and there is no *single* direction to go. At x = -1, the slope is negative, which means travel “backwards” (to x = -2) to increase your value. Similarly, at x = 1, you travel forward (to x = 2) to increase your value.

But, as you mention, strange things can happen when the derivative = 0. It can mean you are at a local maximum (no way to improve), or at a local minimum (no single direction to improve your position — forward or back will help). I consider the corner case of zero an exception to the general rule / intuition that the gradient is “the direction to follow” if you want to improve your function.

liviuFebruary 9, 2017 at 3:43 pmf(x)=x^2 is ONE dimension not two dimension

Prakash ShresthaJune 16, 2017 at 6:51 amIT is actually 2 dimensional. f(x)=y=x^2

ShenandoahSeptember 19, 2017 at 6:18 pmPrakash: the way you defined the function, f maps from R onto R. It takes a one-dimensional input x and gives a one-dimensional output f(x). That is, f(8)=64 is a scalar. An example of a two-dimensional function would be g(x) = (x^2 , x+4), since it outputs a two-dimensional vector.

The GRAPH of the function f over its domain x makes a two-dimensional drawing. This is probably what you were thinking of. But it is possible to graph a function in any number of dimensions.

VidhyaOctober 10, 2007 at 9:09 amWonderful explanation!

KalidOctober 10, 2007 at 10:17 amThanks Vidhya, glad you liked it.

bihazoOctober 21, 2007 at 6:21 amhi john keep it up you done a great job

TravisDecember 2, 2007 at 5:53 pmThanks a bunch! I didn’t think it could be this simple to find the maximum increase at a point, so I thought I’d look it up. Thanks to your great explaination, it turn out it was as easy as it seemed it should be. Great job! Thanks!

Travis

KalidDecember 2, 2007 at 7:27 pmAwesome, glad it worked for you :)

caitlynDecember 5, 2007 at 11:09 pmthanks!!!!

KalidDecember 6, 2007 at 1:03 amHi Caitlyn, you’re welcome.

DerekDecember 10, 2007 at 12:27 pmThanks! The sadistic microwave example helped a lot.

KalidDecember 10, 2007 at 4:54 pmAwesome, glad it was useful :).

John GabrielDecember 11, 2007 at 7:32 amHello Kalid,

Did not read your reply for some time. Am sorry you do not agree. :-)

Let me give you an example:

Suppose we are dealing with pressure and height in a certain ‘cubic’ area. Suppose that the middle of the cube height is 0 meters. Also suppose that we have a whirlpool generated in the cube such that the pressure rate increases as we go below the middle of the cube. Anything below is negative height and anything above is positive height. Now, as one rises higher in the cube, the pressure decreases. If we find the gradient, then according to your definition (and many others’), the gradient vector for the rate of greatest increase will point below the middle of the cube, not above. But above the middle we find the greatest ‘decrease’ in rate of pressure. In this example, greatest increase points downwards and greatest decrease upwards.

It would probably be better to define gradient as a vector that points in a direction of greatest increase or decrease. Its additive inverse will point in the direction of greatest decrease or increase respectively. For most physical phenomena, your definition would generally be true. But what happens when you have an anomaly?

Make sense?

srashtitomaMay 31, 2016 at 3:48 amVery nice explain

John GabrielDecember 11, 2007 at 7:34 amI do not believe I have the best answer to this question but like yourself, I am a believer in trying to find the best possible explanation. Once again, I like your website. Keep up the good work Kalid!

John GabrielJanuary 16, 2008 at 1:06 pmOkay, I think I have the best answer. If f is a real-valued function, then del(f) or gradient of f points to the greatest increase, whereas -del(f) points to the greatest decrease.

For once planet math has some decent information on this since I last checked:

http://planetmath.org/encyclopedia/Gradient.html

I do not endorse everything Planet Math publishes but this particular information appears to be correct. In any event, it clears up the previous confusion I think.

KalidJanuary 16, 2008 at 7:11 pmHi John, thanks for the comment! Yes, that’s an important distinction to make: the positive gradient is the greatest increase, and the negative gradient is the greatest decrease. Thanks for helping clarify :).

JaredMarch 24, 2008 at 6:42 pmThank you!

BigmouthMay 24, 2008 at 6:20 pmThis actually makes sense to me. Thanks!

KalidMay 24, 2008 at 8:10 pm@Jared, Bigmouth: Cool, glad it was helpful!

AnonymousJune 6, 2008 at 1:44 pmdid not grasp the idea

KalidJune 6, 2008 at 1:45 pmBe more specific. The gradient is the direction to move that gives you the biggest increase.

ShaheenJune 13, 2008 at 9:21 pmIt helps me a lot. But I have some doubt still now.Is it the same concept for gradient of each vertex in a triangle mesh?

Thanks so much.

JohnnyTJune 14, 2008 at 5:15 pmKalid

Thanks for the great explanations! I thought I was math-retarded for some time; however your writings actually make sense to me!

Take care!

Johnny T

KalidJune 14, 2008 at 6:39 pm@Shaheen: Thanks, glad you enjoyed it. I’m not sure I understand the question: in a triangle mesh, you could measure the gradient at each vertex to find the “best” direction to move. Again, not sure if this is your question.

@Johnny T: Thank you for the comment! Yes, when a subject seems difficult (as vector calculus was for me) sometimes it’s just because the explanation wasn’t clicking properly. Thanks for dropping by.

wali khanJuly 3, 2008 at 4:40 amwell done,excellent explaination with solid examples

KalidJuly 3, 2008 at 8:30 pmThanks Wali, glad you enjoyed it.

j.sathish kumarSeptember 17, 2008 at 3:11 amthanks

but i have some doubts.how the differentaion gives the maximum space rate of change. as per my understandings differentiation only is difference between two point in the region say p1 and p2.can u clarify

leonOctober 3, 2008 at 10:05 pmThanks a lot for explaining the concept.

sophieNovember 4, 2008 at 2:08 pmi was having so much trouble understanding this and now its all clear thank you so much!

KalidNovember 4, 2008 at 3:55 pm@leon, sophie: Thanks, glad you enjoyed it!

Ryan JohnsonNovember 10, 2008 at 12:33 amJesus. This was a lot better explained than in my text book and by my professor. I thought we were using the gradient as the normal vector but I really doubted that it could be that.

KalidNovember 10, 2008 at 1:16 pm@Ryan: Thanks! I struggled with this concept for a while also.

Ranjeet KumarJanuary 15, 2009 at 11:07 amthanks ! this explanation made me clear how to find the direction of smallest change.It is just the 90 degree rotation of gradiant(the direction of largest change).

Shakeel AhmedJanuary 20, 2009 at 9:19 pmThanks very much for your effort

BillMarch 20, 2009 at 12:09 amUm — in your microwave example, aren’t you pushing the doughboy out the back of the microwave? (Just wanted to understand the concept). I love these essays, btw, keep them coming!

HeheheMarch 20, 2009 at 8:38 amI loved the microwave analogy.also thanks for clarifying the upsidedown delta now everything makes more sense

RAHULApril 10, 2009 at 1:54 amstil im confused between scalar field and vector field….

aradhita chattopadhyayJuly 16, 2009 at 10:00 pmhow can such a mathematical expression denote the max change? pls i didnt understand the relation of this with mathematics. pls reply sir.

nat2_bam2August 2, 2009 at 2:30 amthank you soo much!!

its a big help for our project…

Can we have your number?hehe

KalidAugust 3, 2009 at 6:29 pm@Rahul: A scalar field returns a single value (x), but a vector field returns multiple values (x,y,z). Usually the multiple values (x,y,z) are taken as a “direction” to follow.

@aradhita: Hi, that’s a question I need to get into in a later post.

@nat2_bam2: Thanks!

MigsAugust 22, 2009 at 10:57 pmHi kalid! i read your explanation. oh this is very helpful! by the way can you give an example on how to apply this on a situation of the classic “mountain and mountain climber” problem? hope you will reply. thanks again your explanations were clear

KalidAugust 27, 2009 at 3:32 pm@Migs: Great question. The classic “mountain climber” problem is when a function gives the height of the mountain (z) at a certain position (x,y), so z = f(x,y).

The gradient at any position x,y will give you the direction of the _greatest increase_ in z. That is, the gradient will point in the “most uphill” direction. Following the gradient will give you the steepest path to the top of the mountain (technically, the top of the nearest local maximum). Hope this helps!

vigneshSeptember 5, 2009 at 2:30 ambeautiful…well said

akanshaSeptember 19, 2009 at 8:55 amthanks a lot for the wonderful explanation!!!

KalidSeptember 20, 2009 at 12:19 am@akansha: You’re welcome!

anonymousOctober 14, 2009 at 7:20 pmVery nice! Keep up. Thanks a lot

FlorenciaOctober 23, 2009 at 9:58 amVery nice article!!

Hope to see how to find the maximum of a constrained function soon!!

Thanks a lot!!

KalidOctober 23, 2009 at 12:56 pm@Florencia: Glad you liked it! Thanks for the suggestion.

abNovember 4, 2009 at 8:32 amVery good explanation by the way. So if you are on a landscape given by z=cosy-cosx and u want to get from (0,0,0) to (4pi,0,0) by moving in the direction of the gradient in the positive x-direction how would u explain that? What would that path look like?

P-FNovember 16, 2009 at 1:40 pmThanks for the great explanation. Another topic that would be very interesting for you to cover is the Jacobian, which causes pain for many, many students (including myself).

KalidNovember 17, 2009 at 5:16 pm@P-F: Thanks for the note — I think the Jacobian, and linear algebra in general, would be great to cover. I’ve forgotten a lot of it and am looking to relearning :).

Mark SoricDecember 12, 2009 at 9:03 amJust wondering something. In that case of f(x,y) = X^2 + y^2, a paraboloid – how can the gradient by perpendicular to the tangent plane at all point and only have components in x and y…

gradF(X,Y) = 2x + 2y

How can it point in any other direction other than parallel to the xy plane?

I’m lost here.

prabuFebruary 16, 2010 at 2:25 amthank you kalil. wonderful explanation.

KalidFebruary 16, 2010 at 5:57 pm@prabu: Glad it helped!

AshrafulMay 6, 2010 at 12:19 amIt was a great explanation! But I have a specific problem with gradients. Is there any functions that cant be expressed as gradient of any parameter? What could be the properties of that function?

AshrafulMay 6, 2010 at 12:33 amMay I could be more specific about my previous problem. If a function is constant in all direction, is it possible to express the function as gradient?

KalidMay 8, 2010 at 10:31 pmI’m not sure if I understand the question — the gradient of a constant function would be a 0 vector [perhaps technically (0,0)], that is, there is no direction of greatest increase. If it helps, think of the gradient in terms of a derivative (the derivative of a constant function is 0).

KinarAugust 30, 2010 at 10:45 amMath professional!

AnonymousSeptember 5, 2010 at 3:42 pmThank you for getting to the heart of why del is required and how to intuitively understand it. Its the first time I understand it so well despite reading so much about it before!

AnonymousSeptember 6, 2010 at 8:58 pmdamn! i got it now :-)

AnonymousSeptember 6, 2010 at 9:01 pmmath is so beautiful :-)

KalidSeptember 13, 2010 at 2:24 pm@Anonymous: Agreed :).

bob clearSeptember 10, 2010 at 10:33 pmWOW! great explanation…. thanks dude.:D

KalidSeptember 13, 2010 at 2:17 pm@bob: Thanks!

JoseNovember 2, 2010 at 7:01 amGreat explanation helped me explain my brother! Nice job! Gonna bookmark it for further needs I might have with it.

jsNovember 2, 2010 at 11:05 pmgreat explanation and example

KalidNovember 3, 2010 at 6:27 pm@js: Thanks!

shreedharNovember 4, 2010 at 5:00 pmhey, explained really well. But still you didn’t provide any sign of why the gradient would always point in the direction of maximum increase…

Nick PellatzNovember 14, 2010 at 9:22 pmI don’t usually comment on blogs, but this is a great explanation. Way better than my text book. A+++++++++

KalidNovember 21, 2010 at 10:53 pm@shreedhar: Thanks — I’d like to cover that in a follow-up article. I need to get a nice, intuitive explanation for it first ;).

@Nick: Thanks, glad it helped.

Al PaquetteNovember 22, 2010 at 6:20 amMan! I just love this kind of explanation. It’s so clear and concise, and it shows me that the author really understands the concept himself.

All mathematics should be taught this way. Go from the specific to the general (abstract). Not the other way around, which is the path usually followed by the type who wants to show off his prowess with math symbols and equations.

AnonymousFebruary 17, 2011 at 4:31 pmnice explanation

AnonymousFebruary 17, 2011 at 4:32 pmdon’t eat that cookie!

KalidFebruary 17, 2011 at 4:44 pm@Al: Thank you! I think one of the big problems in math teaching (especially) is just trying to get things explained without the professor’s “prowess” getting in the way, as you say.

@Anonymous: How could you eat cookies when there’s gradients to be studied? :)

PandiaFebruary 18, 2011 at 6:24 pmNice work!! Thanks man:)

KalidFebruary 22, 2011 at 10:25 am@Pandia: No prob!

GatonFebruary 25, 2011 at 8:49 pmThanks!

MarwaMarch 2, 2011 at 9:03 amThanks alot,I loved your way explaining this, very helpful indeed.

Keep it up.

KalidMarch 2, 2011 at 3:03 pm@Marwa: Thanks, glad it helped!

Vitor P.F.March 9, 2011 at 2:07 amGreat !! Congrats

AlexMarch 27, 2011 at 10:39 pmAwesome! I was cracking my head trying to figure out HW, only to realize how basic it was after reading through ur page. Thanks!

dianneApril 5, 2011 at 8:20 amgoodluck with my exam tom. ^^

AnonymousApril 5, 2011 at 2:26 pmIt wasn’t a bad explanation but I wish you had explained ‘why’ the gradient is the perpendicular vector of the function its derivatives were derived from. This still bothers me a little.

Also, if we have a function with three variables, shouldn’t the independent variable be considered? By considered I mean, if I have a function F(x, y z), then I am saying that w = F(x, y, z), and this function can not be graphed since it has 4 dimentions. A normal F(x, y) can be graphed since you considered the Z, X, and Y of the graph.

From the book I read, I interpreted that the original function has a constant value for ‘w’, hence producing a graph with a new function F2(x, y). However I still didn’t see the math that proves that the gradient of the function F(x, y, z) is actually the vector that is perpendicular to the surface of the graph from which its derivatives were derived from. If you could prove this, it would be really helpful.

mohamedMay 6, 2011 at 1:44 amthx man very much

I understood it totally from u

my regards

BurtonMay 16, 2011 at 8:14 amThank you very much! This made perfect sense and it really helped me out.

KalidMay 17, 2011 at 7:28 am@Burton: Thanks!

panchitoJune 16, 2011 at 4:00 amCool! Thanks

DonglinJune 29, 2011 at 9:56 amgood explain, it solved my problem

AnonymousJune 30, 2011 at 11:07 pmkalid , are u professor

shrikantJuly 20, 2011 at 1:28 amsuch wonderful explanation……..wow

KalidJuly 20, 2011 at 4:41 pm@shrikant: Thanks!

jayakumar.gSeptember 9, 2011 at 11:15 pmso very easy method

MrigehSeptember 14, 2011 at 7:45 amI love you!

KalidSeptember 14, 2011 at 8:48 am@jayakumar.g: Glad it helped

@Mrigeh: :)

hyaaOctober 6, 2011 at 10:16 pmi want to ask, once knowing the maximum rate of change of temperature in your’s microwave example, how we can attain that particular place without moving our coordinates positions as mentioned by microwave for example when we choose coordinates (3,5,2) we obtain gradient as (3,4,5). now from where we get the information that which coordinates should be selected next time that gives us maximum gradient? should we choose (3,4,5) coordinates?

KalidOctober 6, 2011 at 11:41 pm@hyaa: I’m sorry, I don’t think I understand the question. The gradient gives you the direction (not coordinates) of the greatest increase in your current value. You have to follow the gradient for a bit, get to a new point, get the gradient there, follow it for a bit… and so on to maximize your value.

Think of the gradient as a compass which points towards your greatest increase. A compass doesn’t give you the coordinates of North, but tells you how to get there from your current position. Hope that helps.

ZitaOctober 13, 2011 at 8:58 pmHi, I still have a question. If there is a function h(x,y)to denote the height of a mountain at position(x,y). Can I use the knowledge of gradient to locate the top of the mountain and how?

kalidNovember 4, 2011 at 9:26 am@Zita: Yep — you start at any point, and keep following the gradient of h to find the top.

maxNovember 5, 2011 at 1:52 pmgreat, really good thank you,it would be comprehensive if you explain that ‘why’ the gradient is the perpendicular vector of the function its derivatives were derived from

kalidNovember 7, 2011 at 6:59 pm@max: Great question. Going to add it as a Q & A at the end of the article.

Patel AnkitNovember 15, 2011 at 9:10 pmExcellent explanation, I think if you provide your ebook for free of cost it would really be helpful for the poorer students to strengthen thier grass-roots. :)

GOOD JOB, KEEP IT UP………….

ChicoFebruary 16, 2012 at 7:54 pmConsider the directional derivative, f_u.

f_u = f_x u_1 + f_y u_2 (it takes some effort to see this definition of f_u)

=grad(f) dot u (u is a unit vector)

=|grad(f)| cos@ (@ is the angle between grad(f) and u)

Thus, it is clear that the directional derivative, f_u, is maxed when cos@=1.

It follows that @=0 and the directional derivative, f_u, is attained when u is in the direction of the gradient. Therefore, the gradient does indeed give the direction of greatest increase.

Note that f_u is minimized when cos@=-1. Thus, @=pi, and u is in the opposite direction of the gradient. QED

ps

I am a nerdy math professor who likes demonstrating mathematical prowess. Thanks for the microwave intuition builder. My students are going to like that.
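
Chico's argument, written out in standard notation. This is only a restatement: θ stands in for the @ used in the comment, and u = (u_1, u_2) is assumed to be a unit vector.

```latex
\begin{aligned}
f_u &= f_x u_1 + f_y u_2 \\
    &= \nabla f \cdot \mathbf{u} \\
    &= \lVert \nabla f \rVert \, \lVert \mathbf{u} \rVert \cos\theta
     = \lVert \nabla f \rVert \cos\theta
     \qquad (\lVert \mathbf{u} \rVert = 1) \\
\max_{\mathbf{u}} f_u &= \lVert \nabla f \rVert \quad \text{at } \theta = 0, \qquad
\min_{\mathbf{u}} f_u = -\lVert \nabla f \rVert \quad \text{at } \theta = \pi
\end{aligned}
```

So the steepest slope you can get from a point is the length of the gradient, and you get it by stepping along the gradient.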

kalidFebruary 25, 2012 at 11:13 pm@Chico: Awesome, thanks for sharing! I like that a lot — lining up with the gradient (out of all possible directional derivatives) will give you the best return (cosine = 1). That clicks for me.

Glad you enjoyed the microwave intuition, I love searching for little analogies.

DenizFebruary 21, 2012 at 1:06 pmI already knew this but you gave me a better intuition of it and I like your style of writing! Thank you! :)

kalidFebruary 26, 2012 at 1:07 am@Deniz: Thanks! And you’re welcome :).

beant singhMay 23, 2012 at 4:01 amah highly informative and excellently defined

kalidMay 23, 2012 at 5:33 am@beant: Great, glad it helped!

madhuJuly 5, 2012 at 2:54 amthat cookie tasted awsome

madhuJuly 5, 2012 at 3:38 amcould u explain why does gradient is zero for local minimum?

kalidJuly 6, 2012 at 3:23 pmMadhu: Those cookies are delicious :). The gradient behaves like the first derivative (rate of change). In regular calculus, df/dx = 0 means your function is not changing at that point [so you may be at a max or min].

Similarly, when the gradient is zero, it means your function is not changing when you move.

AnonymousOctober 13, 2012 at 7:39 amwow gr8

YogeshJanuary 5, 2013 at 9:02 amHey, could you explain WHY gradient points in the direction of maximum increase?

I mean given that this article is mindblowing but I still cant get Why maximum increase..

Also, could you explain a vector field? In case of scalar field, what I imagine is as follows: Consider the 3D space, and for each point my scalar function returns a value.. And with a intensity proportional to that value, a black point appears(Greyscale).. I cant picture anything about a vector field though.. Please consider..

And again, thanks for all the Vector calculus explanations..

kalidJanuary 6, 2013 at 7:34 pmHi Yogesh, thanks for the note.

This article explains why the gradient points in the direction of greatest increase: http://betterexplained.com/articles/vector-calculus-understanding-the-gradient

The essence is “you have a circle of possible directions, the individual derivatives (df/dx and df/dy) give you the tradeoff as you change directions, so find the direction that makes the best use of that tradeoff”.

A vector field is tricky. Imagine your same 3d space, but instead of a point (a single value) imagine that there is wind blowing through it. Each position in your space could feel a different “push” (strength and angle) from the wind. For example, in a cyclone, the push might be in a circle (each individual point is pushed in a different instantaneous direction), with a dead spot in the middle. With a steady wind, every point might feel the same “push”.
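
The wind picture can be sketched in a few lines of Python. This is a simplified 2D illustration, not anything from the article: the function names `cyclone`, `steady_wind` and `strength` are made up here, and the particular formulas are just one plausible way to model each kind of wind.

```python
import math

def cyclone(x, y):
    """Push felt at (x, y): circulates counter-clockwise around the origin."""
    return (-y, x)

def steady_wind(x, y):
    """Same push everywhere: a constant wind blowing in the +x direction."""
    return (1.0, 0.0)

def strength(vec):
    """Magnitude of a 2D push."""
    return math.hypot(vec[0], vec[1])

# Each position gets its own vector (cyclone) or the same vector (steady wind).
for point in [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]:
    print(point, "cyclone:", cyclone(*point), "steady:", steady_wind(*point))
```

In this model the cyclone's push has zero strength at the origin (the dead spot) and grows as you move outward, while the steady wind returns the same vector no matter where you stand.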

someoneMarch 29, 2013 at 2:21 amYou are a gift to mankind! Well to me you are.

JoseMay 29, 2013 at 8:04 pmPeople confused by gradient pointing towards maximally increasing direction, or perpendicularity of gradient to equipotential.

When you have a function, you know what I mean by function, something that looks like a hill, whether its a 2d hill you drew on the paper or a 3d hill you made out of clay on your table, stick your finger at a single point on the hill. Now ask “what is the change height of the hill as I move across the hill a little bit in a “flat” direction?” In the 2d case, you go left or right, the only flat direction. You trace the line one inch to the right (right = positive), following it, and you find the line goes up (up = positive) one inch, so you say the slope is one inch (up) per inch (to the right) near that point. The “per” means divide. Positive / positive = positive. Going back to the left (negative) you say you went down (negative) one inch. Neg/Neg = Positive. The gradient for this 2D example, since its defined a vector pointing along x, can only point along the x axis, but which way? Left or right? The gradient also has a magnitude, which can be very positive, a little positive, a little negative, or very negative. Its magnitude is whatever you got for the slope. When the slope was positive, the gradient will point along x, for sure, but along the positive direction. When the slope was negative, the gradient still points along x, but towards the negative direction (left). So the gradient points along x towards the direction of increasing height, whichever way it is. If the increasing height was to the left of your finger, you would find the slope was negative there after defining left to be negative and up to be positive, and the gradient would point negative along x, saying “look at me, I’m pointing the direction you travel along x to see the hill rise.”

In the 3d case, the clay on your table, you stick your finger and soon realize you get different slopes depending on if you move your finger forwards, backwards, left, right, or diagonal. Lets say you move your finger to the left and it goes up. Then the portion of the gradient pointing to the left is the slope your finger measured, since your finger went up moving that direction a component of the gradient points somewhat in that direction. Then you go back to where you started from. You go forward, and find your finger went up even more than it did going left. Lets say your finger went up two inches going an inch forward, and it went up one inch going an inch to the left. We know a portion of the gradient points forwards (cause your finger went up, not down), and about half that portion (you went up an inch, not two inches like going forwards) points to the left. So the gradient points very forward, and a little left (2 inch up per inch forward, pointing forward, 1 inch up per inch left, pointing left). You will notice this is obviously uphill. It turns out it is EXACTLY uphill. If you go 2 forward for every 1 left you move near that point, you will gain the most height possible. When you gain the most height possible, you are moving exactly perpendicular to the direction you would not gain height if you were moving, aka the equipotential path, or path of same height.
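
Jose's clay-hill numbers can be checked by brute force. The sketch below assumes the hill is locally flat enough that the slope in any direction is just a blend of the two measured slopes (2 inches per inch forward, 1 inch per inch left); the names `rise_per_inch`, `best_angle` and so on are invented for this example.

```python
import math

SLOPE_FORWARD = 2.0  # inches of rise per inch moved forward (from the comment)
SLOPE_LEFT = 1.0     # inches of rise per inch moved left (from the comment)

def rise_per_inch(angle):
    """Slope when stepping at `angle` radians, measured from forward toward left."""
    return SLOPE_FORWARD * math.cos(angle) + SLOPE_LEFT * math.sin(angle)

# Try every direction in 0.1-degree steps and keep the steepest one.
angles = [math.radians(tenth / 10.0) for tenth in range(3600)]
best_angle = max(angles, key=rise_per_inch)

gradient_angle = math.atan2(SLOPE_LEFT, SLOPE_FORWARD)  # direction of (2, 1)
flat_angle = gradient_angle + math.pi / 2               # perpendicular to it

print("steepest direction found (deg):", round(math.degrees(best_angle), 1))
print("gradient direction (deg):      ", round(math.degrees(gradient_angle), 1))
print("slope along the gradient:      ", rise_per_inch(gradient_angle))
print("slope perpendicular to it:     ", rise_per_inch(flat_angle))
```

The sweep lands on the "2 forward for every 1 left" direction to within the 0.1-degree grid, and the slope at right angles to it comes out as zero up to rounding, which is the path of same height.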

Baljeet KaurJuly 25, 2013 at 10:05 pmits awsome…it clearly explains the actual physical significance of a gradient..thank you..:)

kalidJuly 26, 2013 at 12:47 amHi Baljeet, glad it helped!

jayakrishnan.k.jSeptember 3, 2013 at 7:47 amBRILLIANT EXPLANATION! Thanks a lot………

manjuSeptember 18, 2013 at 3:21 amthks for complete explanation

Kevin @ http://kldavenport.comNovember 11, 2013 at 10:43 amI stumbled across your site looking for one specific aspect of gradients and ended up reading the whole post. You did a great job of distilling these concepts Kalid.

KalidNovember 11, 2013 at 9:06 pmThanks Kevin, I appreciate it!

RickDecember 26, 2013 at 9:54 pmI went over to Wikipedia and read an article on a similar topic. It was so much more difficult to understand, and Wikipedia is easier than most math texts. It makes me so angry that most math books seem to go out of their way to make mathematics unnecessarily difficult. Maybe, with more people beginning to write internet articles like this, the math obfuscators won’t be able to get away with it much longer. I look forward to the day when students realize that math can be the easiest class in school.

Thanks and keep up the good work Kalid!

KalidDecember 27, 2013 at 3:47 pmThanks Rick! I can relate to the feeling of frustration, it’s what drove me to write (after having an aha moment, it quickly turned to: “Why couldn’t they explain it like that in the first place?”).

I do think math has the potential to become the easiest subject. Its objectivity, which could be seen as offputting, is a great indicator of when something has truly clicked. As a result, we can quickly determine whether an analogy is helping solve the problem before us.

Appreciate the support :).

Marco PMarch 18, 2014 at 3:34 amLovely!

KhasudMarch 20, 2014 at 6:35 amJazak Allah khair

Qammar AbbasMarch 21, 2014 at 5:02 amAn easy and interesting approach. Hats off!

ATHUL P ANANDApril 18, 2014 at 4:24 pmawesme dude!!gud work!!!well explained…ws really helpful! :)

lashiweApril 29, 2014 at 7:35 pmThank u so much,your explanations really helped…- must confess am not all that good in mathematics I could really use some more help.but thank you

KalidMay 19, 2014 at 11:08 amAnother explanation that I posted on reddit:

I imagine a scalar field like a grid. Off in the distance is a billboard showing the amount of money we’ll get for being at a certain position on the grid (value of the function).

From where we’re standing, we can take a step in any direction. The “work” is the same (i.e., I moved 1 unit) but the payoff (net amount of cash we gain) can differ drastically depending on if we go North, West, South, NorthEast, etc.

The gradient is the “payoff” in the sense it points to the direction of greatest reward (if it’s zero, it means you are already at the max reward, i.e. any step you take will diminish your earnings).

A surface like “z = 3” is basically saying “show me all the positions in this grid where the reward is $3”. On the grid a path is drawn, highlighting all the positions of this equal payoff. If you started following this path, your payoff would never change.

But… what would the gradient be? The gradient is pointing in the direction of greatest increase, so should have nothing in common with the path of 0 increase. In other words, the projection of the gradient onto the surface should be 0, i.e. it’s normal. The gradient has zero inclination for you to go anywhere near the path of zero gains.

Ok, fine, that’s what the gradient should do. How do we show it actually maximizes the payoff?

Here’s how I figure it: on a circle (showing all possible paths), we can basically make any tradeoff of x and y that we want. At 45-degrees we can trade them 1-for-1, at higher values we can get 2 units x-distance for 1 unit of y distance, or 10 to 1, or a million to 1 (at angles close to 0 or 90).

The direction of the gradient is calculated to maximize the tradeoff based on dz/dx and dz/dy, i.e. it figures out how much reward we get for moving in each direction and allocates “effort” appropriately.

If we get $20 for moving the x direction and $10 for moving in the y direction, then our direction should favor x, but only at a 1:2 tradeoff. I.e., if we can trade 1 x for 3 y’s then we should keep trading (adjusting the angle) until we get 1x for 2y’s.

In other words, you can prove that the gradient direction is the direction which maximizes z assuming you are moving 1 infinitesimal unit and are getting rewarded by dz/dx and dz/dy. And by definition, this profit-maximizing direction would not waste any energy along the profit-maintaining path, where the net change in z is 0: along that path, the gains from moving in x and y cancel exactly (the equal-valued path must not change the amount of z).
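
The $20 / $10 example can be worked out directly. The sketch assumes the payoff is locally linear, with dz/dx = 20 and dz/dy = 10, and that we take one unit step at angle θ from the x-axis; P(θ) is a name introduced here for the payoff of that step.

```latex
\begin{aligned}
P(\theta) &= 20\cos\theta + 10\sin\theta
  && \text{payoff for a unit step at angle } \theta \\
\frac{dP}{d\theta} &= -20\sin\theta + 10\cos\theta = 0
  \;\Longrightarrow\; \tan\theta = \tfrac{10}{20} = \tfrac{1}{2} \\
(\cos\theta, \sin\theta) &= \tfrac{1}{\sqrt{5}}\,(2, 1) \;\parallel\; (20, 10) = \nabla z
  && \text{taking the solution with } \cos\theta > 0 \\
P_{\max} &= \sqrt{20^2 + 10^2} = 10\sqrt{5} \approx 22.36 \\
\nabla z \cdot (-1, 2) &= -20 + 20 = 0
  && \text{no payoff along the equal-payoff path}
\end{aligned}
```

At the best point on the unit circle, (x, y) = (2, 1)/√5, the circle's slope is dy/dx = -x/y = -2: giving up 1 unit of x buys 2 units of y, which is the "1x for 2y's" exchange rate described above.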

RobJune 2, 2014 at 3:47 pmReally great explanation! The only thing is that in your definition of the ‘del’ the partial derivatives should have a lower-case delta (∂) instead of ‘d’ :)

AjinkyaJune 4, 2014 at 9:31 pmReally helpful…

I would be really glad if you could tell me the derivation for the formula for gradient of a scalar in terms of the nabla/del operator…

Or if you could tell a link where I can find it…

pariJune 29, 2014 at 10:38 amHi kalid, it is a great explanation at lest for people like me.

I think one should get this overview before getting into the actual concept.

thanks a lot!

NaveenJuly 10, 2014 at 1:39 amThanks a bunch !

RajmohanAugust 25, 2014 at 9:41 amvery good article

AymunSeptember 6, 2014 at 7:03 pmMashaAllah you posted this in 2007and to this date people are getting benefit from it. Loved it. Thank you

AnonymousOctober 7, 2014 at 2:36 amwhat is green’s theorem? can anyone explain.

sanuOctober 17, 2014 at 3:32 pmBefore reading this, I wasted 2 days for this gradient!!

but the world of mathematics is very very very interesting when people like you teaching us.

thanks a lot….

OsamaOctober 26, 2014 at 11:07 amAwesome work But if you can help me with this problem,? My professor assigned me “physical significance of gradient” that would be my first assignment confused what to submitt? ?? (application/physical significance)

Amir sultanOctober 29, 2014 at 8:40 amGood explanation

Taps patelDecember 3, 2014 at 10:17 amThank you so much

J ScribDecember 10, 2014 at 10:12 pmThanks a lot. this really helped me understand it better!

ManikantaDecember 13, 2014 at 9:31 pmA gradient is Vector differentiation operator applied on a scalar function. Not strictly the derivative of vector functions as you said in opening paragraph, both are different. However, gradient can be treated as a derivative of a special vector for which all the vector components have same function, whose magnitude is the scalar function and directed in the direction of f_x=f_y=f_z from notation f(vector) = (f_x,f_y,f_z).

ManikantaDecember 13, 2014 at 10:13 pmIam adding this Just to clarify the things,

1. Derivative of a vector function is called divergence. reference:http://ocw.mit.edu/ans7870/18/18.013a/textbook/HTML/chapter09/section01.html,

2. Gradient of a vector function is called Jacobian. reference:http://en.wikipedia.org/wiki/Gradient#Gradient_of_a_vector

omoJanuary 17, 2015 at 3:59 pmNice woke it realy open but wish ther was a summarie of it

AqeelFebruary 22, 2015 at 1:31 amhow to find max rate of change of scalar field

VU KhanFebruary 22, 2015 at 10:48 pmPick an area that has small hill. Starting at the bottom of the hill

1. Describe how to follow the gradient path to the top.

2. In particular, describe how to determine the direction in which the gradient points at a given point on the hill.

3. What should you do if you encounter a wall ?

4. Discuss whether the gradient path is the shortest path, the quickest path or the easiest path.

Tim McIntyreMarch 6, 2015 at 12:29 amHas your page been changed? I used it last year as a reference for a course I am teaching on fields. This year a student immediately picked out an issue with the statement “The term gradient (grad) typically refers to the derivative of vector functions”. It refers to the derivative of scalar functions. All your examples are scalars – temperature, height and “x+y^2+z^3”. Could this be corrected please?

TimMarch 6, 2015 at 12:37 amI’d suggest it would be better to write “The term gradient (grad) typically refers to the derivative of a scalar field”. The term “vector function” is somewhat confusing.

kalidMarch 9, 2015 at 12:58 amHi Tim, great feedback — I just updated the article. (I’d incorrectly used “vector function” as a function that took a vector as input, vs. a vector-valued one that returned a vector.) Thanks!

TimMarch 9, 2015 at 1:02 amHi Kalid,

thanks for the quick correction! Just in time for Thursday’s lecture.

Tim

MinieMarch 13, 2015 at 2:37 amI have been trying to ‘reload’ these ‘memories’ because I am planning to study again. I really needed this website. Thank you :)

DharApril 1, 2015 at 9:22 pmThank you so much for this example! I’ve been struggling the way my professor explains it, but this just [email protected]

Vysakh KApril 27, 2015 at 11:58 amHi, My doubt is regarding the gradient on an equipotential surface. Without becoming zero why is the gradient vector, pointing perpendicular to the surface? I’m not understanding why travelling perpendicular will move in the direction of maximum. Please Explain.

neha somanJune 27, 2015 at 11:15 pmI really liked the way you explained the gradient of a function with the help of the example! Keep up the good work and keep posting such good stuff!

Thank You

KazzAugust 1, 2015 at 9:57 pmLove u bro for thisss!! Physics now seem like a kid game after visiting ur blog.. . Really awesome tactics to learn the tough topics. Thanks again, u saved me from cramming.

AdvaitAugust 5, 2015 at 7:21 pmSuper helpful! Thank you thank you thank you!! :D

RahulAugust 16, 2015 at 1:20 amWill you please write an article on the total dervative or material derivative of a function?

furqanAugust 28, 2015 at 11:17 pmGreat

zahidSeptember 6, 2015 at 11:40 pmAs. . dF=del(F) dot Dr…

It means change in function is equal to self dot dr ..where dr is directional derivative ..

Sir I just want to know what is the physical interpretation of directional derivative …

Also if we move in the direction perpendicular to gradient then why the change is zero .. As in practical cases it is not soso..

Plzzz explain it

HabibSeptember 21, 2015 at 8:02 pmNice article

Aakash KharbandaSeptember 23, 2015 at 9:44 amThanks Bro

Great Job

ChelseaNovember 5, 2015 at 5:44 pmThank you for all these articles! they really help :)

AshDecember 2, 2015 at 12:23 amYou said when we input a particular coordinate, we get the direction in which we have to move. The output is a set like (1, 3, 5). Is it a vector joining origin to that point? If yes, then hows it used to reach the maximum?

kalidDecember 3, 2015 at 1:12 am@Ash: The output, such as (1, 3, 5), is the direction to move from the coordinates it was calculated from (it is not a line from the origin to that point). This represents the best direction to move. So, you start at the origin, calculate the gradient, follow it a bit, and get to point A (Ax, Ay, Az). You recalculate the gradient from A, move a bit, and get to B. Then to C, D, etc. Keep going until the gradient is 0 (nowhere else to go to maximize your current function). Hope that helps.
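
The "calculate, follow it a bit, recalculate" loop can be sketched as a tiny gradient-ascent routine. Everything here is an assumption made for illustration: the helper names `gradient` and `climb`, the step size, the stopping tolerance, and the example function `hill` (a single smooth peak placed at (2, 1, 3)).

```python
def gradient(f, point, h=1e-6):
    """Estimate the gradient of f at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        ahead = list(point)
        behind = list(point)
        ahead[i] += h
        behind[i] -= h
        grad.append((f(*ahead) - f(*behind)) / (2 * h))
    return grad

def climb(f, start, step=0.1, tolerance=1e-6, max_steps=10_000):
    """Follow the gradient uphill until it is (nearly) zero."""
    point = list(start)
    for _ in range(max_steps):
        grad = gradient(f, point)
        if sum(g * g for g in grad) ** 0.5 < tolerance:
            break  # gradient is ~0: nowhere better to go from here
        # Move a little in the direction the gradient points, then repeat.
        point = [p + step * g for p, g in zip(point, grad)]
    return point

def hill(x, y, z):
    """Hypothetical example function with a single peak at (2, 1, 3)."""
    return -((x - 2) ** 2 + (y - 1) ** 2 + (z - 3) ** 2)

top = climb(hill, start=(0.0, 0.0, 0.0))
print([round(c, 3) for c in top])
```

Starting from the origin, the loop walks toward the peak and stops once the gradient has shrunk to almost nothing. On a function with several hills this would only find the top of whichever hill the starting point sits on.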

David C BauerApril 8, 2016 at 11:38 pmJust used this for a review. Still fresh. Great job.

Rahul SainiApril 14, 2016 at 11:13 amThis is the first time i am reading something on this site and i am impressed , great work .

DavidApril 15, 2016 at 3:12 pmNice Clear Concise

Manu RApril 18, 2016 at 11:55 amWith reference to the great example(loved it), what happens when the dough boy reaches a local minima? Won’t the gradient be zero at that time? How to find the direction of greatest increase at that point? Should we take double gradient?

kalidApril 21, 2016 at 12:45 pmWhen the gradient is zero, it’s either a local max, local min, or a saddle/inflection point. From there, you need other tests to figure out which it is (similar to the 2nd derivative test in Calculus I). A bit more here:

http://tutorial.math.lamar.edu/Classes/CalcIII/RelativeExtrema.aspx
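
For a function of two variables, the standard follow-up test looks at the second partial derivatives at the point where the gradient is zero. A minimal sketch, with a made-up function name `classify_critical_point`; you supply the second derivatives yourself:

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second-derivative test for f(x, y) at a point where the gradient is zero.

    fxx, fyy, fxy are the second partial derivatives evaluated at that point.
    """
    d = fxx * fyy - fxy ** 2  # determinant of the Hessian matrix
    if d > 0 and fxx > 0:
        return "local min"
    if d > 0 and fxx < 0:
        return "local max"
    if d < 0:
        return "saddle"
    return "inconclusive"  # d == 0: this test cannot decide

print(classify_critical_point(2, 2, 0))    # f = x^2 + y^2 at the origin
print(classify_critical_point(-2, -2, 0))  # f = -(x^2 + y^2) at the origin
print(classify_critical_point(2, -2, 0))   # f = x^2 - y^2 at the origin
print(classify_critical_point(0, 0, 0))    # f = x^4 + y^4 at the origin
```

The last case really is a minimum, but the test returns "inconclusive" for it, which is why other checks are sometimes needed.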

JohnApril 21, 2016 at 11:42 amI understand that when you’re at a local max there’s clearly no direction that can increase the function but I’m finding it hard to understand why the gradient at a local min is also zero. Is this trying to imply that all directions at a minimum increase the function uniformly (i.e the same amount) therefore there is no direction that will increase the function value the most?

kalidApril 21, 2016 at 11:51 amExactly. The various derivatives “vote” for a direction to move. (Basically, you are moving in a direction proportional to its contribution, check out http://betterexplained.com/articles/understanding-pythagorean-distance-and-the-gradient/).

At a minimum, every direction (x, y, z) is screaming equally loud you need to move in its direction. It’s like the donkey stuck between two piles of hay, unsure of which to pick. At a maximum, every direction (x, y, z) is quiet, saying not to move. So you sit still.
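
A quick numerical check of the minimum case, using a made-up bowl that is deliberately steeper along y than along x (the names `bowl` and `partials` are invented for this sketch). It suggests that each partial derivative is zero at the bottom, even though the function rises in every direction and not by the same amount.

```python
def bowl(x, y):
    """Hypothetical bowl with its minimum at the origin, steeper along y."""
    return x ** 2 + 4 * y ** 2

def partials(f, x, y, h=1e-5):
    """Central-difference estimates of (df/dx, df/dy)."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

print("gradient at the minimum:", partials(bowl, 0.0, 0.0))
print("gradient at (1, 1):     ", partials(bowl, 1.0, 1.0))

# A small step away from the minimum raises the function in every direction,
# but only by an amount proportional to the step squared.
for step in (0.1, 0.01):
    print(step, bowl(step, 0.0), bowl(0.0, step))
```

Shrinking the step by 10 shrinks the rise by about 100, so the rise per unit of distance goes to zero as the step gets small. That is what a zero gradient records: no first-order gain in any direction, at a minimum just as at a maximum.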

JohnApril 21, 2016 at 12:36 pmThank you very much for the confirmation. I will also be giving that linked article a read :)

Crazy chickenMay 14, 2016 at 9:12 pmThank you so much because of this i got full marks in my SATS TEST. You’re the man. Without you in would fail

EdmundMay 30, 2016 at 11:33 pmreally really well done. This is amazing,

shubhrantJune 3, 2016 at 7:03 amit was great, superb thank you

DércioJuly 22, 2016 at 12:50 pmVery well explained.

SamyakAugust 1, 2016 at 5:32 amIt was truly awesome

atriAugust 31, 2016 at 10:07 amOutstanding…!!

Jennifer LarsenSeptember 24, 2016 at 6:26 amThank you, my text book just showed a relief map of Yosemite and something about perpendicular. I thought I was formulating problems to explain modern art.

Abdul WaheedOctober 14, 2016 at 8:22 pmThank U so much.!!

VijayNovember 1, 2016 at 5:57 amWell explained ..keep the good job

arunNovember 1, 2016 at 4:49 pmbravo!!!! its was keep going without boring and explained in a lucid manner. I would like to ask you where from you get those illustrations man, those are really awesome. but if you add more example problems related to the content it would be even super.

AndreiNovember 12, 2016 at 9:38 pmKeep it up! ^.^ I see a big potential in your explanations.

Anjumol k sNovember 29, 2016 at 3:49 amSuperbly told

RajDecember 14, 2016 at 10:28 pmVery Nice Article!

Very well written!

AdityaDecember 19, 2016 at 5:02 amBrilliant! THanks

Nika TsogiaidzeDecember 30, 2016 at 7:31 amPerfectly explained. Thank you!

tong sin keongJanuary 15, 2017 at 4:47 amThank you for the nice exposition!!

LekhaJanuary 16, 2017 at 12:12 pmBest Explanation ever

SamJanuary 18, 2017 at 9:00 pmThis was a great explanation

BillFebruary 14, 2017 at 9:59 amBest explanation I’ve read! Nice!

nil96February 14, 2017 at 11:32 am_/_

Abdulrahman Haje katimFebruary 24, 2017 at 6:20 amBest explanation I have seen eveeeer.

ChadMarch 16, 2017 at 12:01 pmExcellent!

bradenMarch 16, 2017 at 7:24 pmthanks! what a tasty example. Thank you for explaining this difficult concept for me

Max LeeApril 17, 2017 at 4:18 amI like your explanation.

Prof.Dr Mircea OrasanuOctober 8, 2017 at 7:16 amThe physical significance of the divergence of a vector field is the rate at which “density” exits a given region of space. The definition of the divergence therefore follows naturally by noting that, in the absence of the creation or destruction of matter, the density within a region of space can change only by having it flow into or out of the region. By measuring the net flux of content passing through a surface surrounding the region of space, it is therefore immediately possible to say how the density of the interior has changed. This property is fundamental in physics, where it goes by the name “principle of continuity.”

Many problems in Meteorology and Oceanography have circular symmetries which make them much easier to deal with in a polar coordinate system. As we learned in our discussion of vectors, the results of a physical problem are independent of our choice of coordinate system. The coordinate system choice is a matter of convenience in the calculation of the problem (in the case of rectangular versus polar coordinates – it is mainly a matter of how much trigonometry we wish to endure). It also allows us to examine matters from a different point of view, from which we can gain better insight into the nature of the physical problem.

For example, an extremely simple but surprisingly accurate representation of atmospheric flow is the Geostrophic wind, where the Coriolis force is balanced by the pressure gradient force in the horizontal plane.

Prof.Dr Mircea OrasanuOctober 8, 2017 at 7:17 amThe resulting flow is axi-symmetric (no dependence on ) and, by examining this result in polar coordinates, we can reduce two differential equations to just one. If we examine the more general equation of horizontal motion in polar coordinates, (where the h subscript indicates just the and terms, we can come to a better physical understanding on what it means to neglect the advection term in the Geostrophic approximation.

AfshinNovember 28, 2017 at 6:51 amExcellent explanation

gaurav kumarJanuary 12, 2018 at 9:53 amwonderful””very nicely explained

CeyhunFebruary 20, 2018 at 1:24 pmI liked that and I understood, really. And I am commenting somewhere first, cause you deserved that.

Johanne ChampagneMarch 20, 2018 at 7:56 pmVery well explained. You’re an excellent teacher!

Darsigny Marie-PierJune 29, 2018 at 7:49 amhahahaha amazing thanks!

DEBABRATA BANERJEEJuly 12, 2018 at 1:59 amSir I read the A Twisted Example. It is really wonderful.

As gradient give the direction in which function increases more then why the direction of the gradient perpendicular to the surface???