Alexander Fufaev

Gradient and How to Calculate the Directional Derivative

Table of contents
  1. Necessary ingredient: scalar function. Here you will learn what scalar functions are and what role they play in forming the gradient.
  2. Gradient in one dimension. Here you will learn that the gradient in 1D is simply a derivative.
  3. Gradient in two dimensions. Here you will learn how derivatives in one dimension are turned into a multidimensional derivative and what role the Nabla operator plays.
  4. Gradient in three dimensions.
  5. Gradient points to the steepest ascent. Here you will learn how to use the directional derivative to understand why the gradient vector points in the direction of the steepest ascent.
  6. Calculate directional derivative in 4 steps. Here you will learn how to use the gradient to determine the slope in any direction.
  7. Exercises with Solutions.

The gradient of a scalar function \(f\) is a multidimensional derivative of the function \(f\). In a three-dimensional case, you get the gradient by applying the Nabla operator \(\nabla\) to the scalar function \(f\). You will understand what is meant by this after the lesson.

Necessary ingredient: scalar function

A three-dimensional scalar function \(f\) takes three arguments \(x\), \(y\), and \(z\) and spits out a number (a scalar).

In physics, the three variables \(x,y,z\) are usually spatial coordinates. In general, they do not have to be spatial coordinates. They can also represent other quantities, such as a radial distance and two angles: \(r,\theta,\varphi\). For the explanation of the gradient, we assume here that \(x,y,z\) represent spatial coordinates.

The function \(f(x,y,z)\) assigns a number to each point \((x,y,z)\) in space. For example, \(f(x,y,z)\) could be a temperature function \(T(x,y,z)\) that assigns a temperature \(T(x,y,z)\) to each position in space.

Example: Temperature distribution in the room

The temperature in the room is described by the following function as an example:

Then, for example, the temperature \(T\) at the location \((x,y,z) = (1,2,5)\) is:

It could then be 12 degrees Celsius, for example. At the location \((x,y,z) = (0,0,0)\), on the other hand, the temperature is:

You can also see from this example that a function such as the one in Eq. 1 need not depend explicitly on \(x\) (or on \(y\) or \(z\)), and yet the scalar function can still be higher-dimensional. By writing \(x,y,z\) in the argument of \(T(x,y,z)\), we imply that it is a three-dimensional scalar function.
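As a minimal sketch of what a three-dimensional scalar function does, the following Python snippet evaluates a temperature field at two points. The formula inside `temperature` is a made-up placeholder chosen purely for illustration. It is not the function from the example above.

```python
def temperature(x, y, z):
    """Hypothetical temperature field in degrees Celsius (illustration only)."""
    # This formula does not contain x at all, yet the function still
    # takes three arguments and assigns one number to every point.
    return 20.0 - 2.0 * y + 0.5 * z**2


# Each point in space gets exactly one number (a scalar).
print(temperature(1, 2, 5))
print(temperature(0, 0, 0))

# Changing only x leaves the value untouched, because x never appears.
print(temperature(7, 2, 5) == temperature(1, 2, 5))
```

The point of the sketch is only the signature: three numbers go in, one number comes out.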

To better illustrate a scalar function \( f \), we assume that it depends on two spatial coordinates \((x,y)\), that is, it is two-dimensional. Then you can think of \( f(x,y) \) as a curved surface with mountains and valleys (sort of a landscape).

Graph of a Two-Dimensional Scalar Function
Example of a scalar function \(f(x,y)\) with mountains and valleys.

Gradient in one dimension

You are at some location \((x,y)\) on this landscape (function) and want to know what the slope would be if you were walking in, say, the \(x\) direction. The slope in the \(x\) direction is the partial derivative of the function with respect to \(x\):

\[ \frac{\partial f(x,y)}{\partial x} \tag{2} \]

How do we find the slope of the landscape \(f(x,y)\) if we go in the \(y\) direction? We differentiate \(f(x,y)\) with respect to \(y\) and get the slope in the \(y\) direction:

\[ \frac{\partial f(x,y)}{\partial y} \tag{3} \]

Depending on the location \((x,y)\) we are at, the slopes in Eqs. 2 and 3 are of course different:

  • On a mountainside that falls away in the \(x\) and \(y\) directions, it goes steeply downhill - that is, both slopes are negative and rather large in magnitude.

  • On the side of a valley that rises in the \(x\) and \(y\) directions, it goes steeply uphill - that is, both slopes are positive and rather large.

  • On a plateau, the landscape is flat - that is, the slopes are zero in both directions.

Example: Slope in \(x\) direction

Given is a landscape described by the following scalar function:

We want to know how steep it is when we go in \(x\) direction. To do this, we determine the derivative with respect to \(x\):

So the slope in the \(x\) direction calculated in Eq. 7 increases linearly with \(x\) and does not depend on the \(y\) location. If we want to determine the slope at the location \((x,y) = (2,1)\), we insert this location into Eq. 7:

What is the gradient in one dimension?

The gradient of a function \(f(x)\) in one dimension is the derivative of \(f\) with respect to \(x\). In one dimension, the gradient is a pure number, namely the slope in the \(x\) direction.
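If you want to check such a slope numerically, a central difference is one simple option. The sketch below assumes a hypothetical landscape `f(x, y) = x**2 + 3*y`, picked only because its slope in the \(x\) direction grows linearly with \(x\) and does not depend on \(y\). The helper names `slope_x` and `slope_y` and the step size `h` are assumptions as well.

```python
def f(x, y):
    # Hypothetical landscape (not the function from the example above).
    return x**2 + 3.0 * y


def slope_x(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative w.r.t. x."""
    return (func(x + h, y) - func(x - h, y)) / (2.0 * h)


def slope_y(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative w.r.t. y."""
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h)


# Slope in x direction at (2, 1): analytically 2*x = 4.
print(slope_x(f, 2.0, 1.0))
# The same x-slope at a different y, because the x-slope ignores y.
print(slope_x(f, 2.0, 10.0))
# Slope in y direction: analytically 3 everywhere.
print(slope_y(f, 2.0, 1.0))
```

While computing `slope_x`, the variable `y` is held fixed. That is exactly what a partial derivative means.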

Gradient in two dimensions

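What if we want to consider not only the slope in the \(x\) direction, but also the slope in the \(y\) direction? Then we have to consider two derivatives of the function \(f(x,y)\):

\[ \frac{\partial f(x,y)}{\partial x}, \qquad \frac{\partial f(x,y)}{\partial y} \tag{9} \]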

We can consider the derivatives separately or we can combine them to create a new mathematical object that is indispensable in physics and mathematics, namely the gradient in more than one dimension. In our case here: In two dimensions.

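To do this, we write the two derivatives from Eq. 9 in a column vector:

\[ \begin{bmatrix} \frac{\partial f(x,y)}{\partial x} \\ \frac{\partial f(x,y)}{\partial y} \end{bmatrix} \tag{10} \]

To keep Eq. 10 a bit more compact, we omit the \((x,y)\) dependence, but keep in mind that \(f\) may still depend on \(x\) and \(y\).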

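By writing the derivatives in a column, we have obtained a vector function (also called a vector field). This vector has a magnitude and a direction. But before we look at which direction the two-dimensional gradient in Eq. 10 points, let's rewrite it a bit. To do this, we pull the function \(f\) out of the vector:

\[ \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} ~=~ \begin{bmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \end{bmatrix} f \tag{11} \]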

The "vector" of derivatives in Eq. 11 is a so-called operator. Stand-alone derivatives, of course, do not make much sense. An operator takes effect only when it is applied to a function, as in this case to the scalar function \(f\). The operator in Eq. 11 is called the Nabla operator \(\nabla\):

\[ \nabla ~=~ \begin{bmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \end{bmatrix} \tag{12} \]

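Thus, we can also write the two-dimensional gradient in Eq. 11 as follows:

\[ \nabla f ~=~ \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} \tag{13} \]

Since the gradient in Eq. 13 is a vector (more precisely: a vector field), it is also called the gradient vector or gradient field.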

Gradient in three dimensions

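In physics, which describes our three-dimensional world, the gradient is usually also three-dimensional and looks like this:

\[ \nabla f(x,y,z) ~=~ \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \\ \frac{\partial f}{\partial z} \end{bmatrix} \tag{14} \]

Here we have simply added the \(z\) dependence to the function \(f\) and added the derivative of \(f\) with respect to \(z\) as the third component of the gradient.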

Example #1: Gradient of a scalar function

Given is a two-dimensional scalar function \( \varphi(x,y) = x^2 + 5xy \). 'Two-dimensional' means that the function does not depend on three spatial coordinates \(x,y,z\) as usual, but only on the two coordinates \(x,y\), and that it is visualized in a plane. In this way, the gradient can be illustrated more easily (see illustration) because the gradient vector lies in a 2D plane. Because the scalar function is two-dimensional, the Nabla operator can be reduced to two components:

\[ \nabla ~=~ \begin{bmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \end{bmatrix} \tag{15} \]

Differentiating the scalar function with respect to the two variables results in the following gradient field:

\[ \nabla \varphi(x,y) ~=~ \begin{bmatrix} 2x + 5y \\ 5x \end{bmatrix} \tag{16} \]

Vector plot of the gradient of the scalar function x²+5xy
This is what the gradient field of the example function looks like.
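The gradient of \( \varphi(x,y) = x^2 + 5xy \) can be cross-checked numerically. The sketch below compares the analytic components \(2x + 5y\) and \(5x\) with central-difference estimates. The helper name `numerical_gradient`, the step size `h`, and the sample points are assumptions made for this illustration.

```python
def phi(x, y):
    return x**2 + 5.0 * x * y


def grad_phi(x, y):
    """Analytic gradient of phi: (d phi / dx, d phi / dy)."""
    return (2.0 * x + 5.0 * y, 5.0 * x)


def numerical_gradient(func, x, y, h=1e-6):
    """Central-difference estimate of the two-dimensional gradient."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return (dfdx, dfdy)


# Compare both versions at a few arbitrary sample points.
for point in [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5)]:
    print(point, grad_phi(*point), numerical_gradient(phi, *point))
```

Each printed pair of vectors should agree up to small rounding errors. Evaluating `grad_phi` on a grid of points gives the arrows of a vector plot like the one above.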
Example #2: Homogeneous gradient field

Given is a scalar function \( \varphi(x) = x \) that depends only on one spatial coordinate \(x\). Form the gradient:

\[ \nabla \varphi ~=~ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \tag{17} \]

Gradient of a Linear Scalar Function (Homogeneous Vector Field)
The gradient field of the example function is a homogeneous vector field.

The vector field is constant at each point because the derivative of the above scalar function with respect to \(x\) is a constant. The vector field also has no contribution in the \(y\) direction because the scalar function does not depend on \(y\) and the derivative with respect to \(y\) is zero.

Example #3: Linear gradient field

Given is a scalar function \( \varphi(x) = x^2 \) that depends only on the coordinate \(x\). Form the gradient:

\[ \nabla \varphi ~=~ \begin{bmatrix} 2x \\ 0 \end{bmatrix} \tag{18} \]

Vector Plot of the Gradient Field of the Scalar Function x²
The gradient field of the example function increases linearly with \(x\).

The vector field increases linearly in the \(x\) direction. As in Example #2, the vector field has no contribution in the \(y\) direction.

Gradient points to the steepest ascent

To understand why the gradient vector \( \nabla f \) points in the direction of the steepest ascent, we use the so-called directional derivative. For this we take a unit vector \(\boldsymbol{v}\) pointing in any direction. The only important thing is that it is a unit vector, that is, it is normalized.

Remember that at a given location the magnitude \( |\nabla f| \) is a fixed number that we cannot change. Why not? Because \(f\) is fixed. As I said before, you can think of the function \(f\) as a landscape with rigid hills and valleys. On this surface there is the unit vector \( \boldsymbol{v} \), whose direction we can change and thus scan the landscape \(f\).

The slope of the landscape along the direction of \( \boldsymbol{v} \) is obtained with the help of the following scalar product:

\[ \nabla f \cdot \boldsymbol{v} \tag{19} \]

The scalar product in Eq. 19 is the directional derivative of the function \(f\) in the direction of \(\boldsymbol{ v }\).

The result of this scalar product is a pure number, namely the slope in the direction of \(\boldsymbol{ v }\). For example, if you choose \(\boldsymbol{ v }\) as the unit vector in the \(x\) direction, \(\boldsymbol{ v } = \boldsymbol{\hat e}_{\text x}\), then the scalar product in Eq. 19 gives the slope in the \(x\) direction.

The question is, how does Eq. 19 tell us the direction of the steepest ascent? We could try out different unit vectors \(\boldsymbol{ v }\) in the scalar product. The steepest ascent is in the direction of the unit vector \(\boldsymbol{ v }\) that yields the largest scalar product in Eq. 19.

Instead of laboriously inserting all conceivable unit vectors into Eq. 19, there is a smarter way to find the maximum slope. To do this, we rewrite the scalar product in Eq. 19 using the geometric definition of the scalar product. For two vectors \(\boldsymbol{a}\) and \(\boldsymbol{b}\) enclosing the angle \(\theta\), it reads:

\[ \boldsymbol{a} \cdot \boldsymbol{b} ~=~ |\boldsymbol{a}| \, |\boldsymbol{b}| \, \cos(\theta) \tag{20} \]

Geometric Interpretation of the Scalar Product of Two Vectors
Scalar product by using an angle.

Thus we can write the directional derivative in Eq. 19 as follows:

\[ \nabla f \cdot \boldsymbol{v} ~=~ |\nabla f| \, v \, \cos(\theta) \tag{21} \]

Here the non-bold \(v\) is the magnitude of the vector \(\boldsymbol{v}\). Now it is much easier to determine the maximum slope. We have assumed that the vector \( \boldsymbol{v} \) is normalized, so it has the magnitude \( v = 1\). Thus Eq. 21 becomes:

\[ \nabla f \cdot \boldsymbol{v} ~=~ |\nabla f| \, \cos(\theta) \tag{22} \]

So far so good. The only way we can vary the slope is through the angle \(\theta\) enclosed by the vectors \(\nabla f\) and \(\boldsymbol{v}\). The cosine in Eq. 22 has its largest value at \(\theta=0\): \( \cos(0) ~=~ 1 \). So we set \(\theta=0\) to get the largest slope:

\[ \nabla f \cdot \boldsymbol{v} ~=~ |\nabla f| \tag{23} \]

By setting \(\theta=0\), we have aligned the vectors \( \nabla \, f \) and \( \boldsymbol{v} \) parallel to each other (see Illustration 5). That means the vector \( \boldsymbol{v} \) now points in the same direction as the vector \( \nabla \, f \). We have also made the scalar product in Eq. 21, that is, the slope, as large as possible. And as you can see in Eq. 23, the maximum slope is the magnitude \(|\nabla f|\) of the gradient vector. Consequently, the gradient vector \(\nabla f\) points in the direction of the steepest ascent!
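The "trying around" with many unit vectors can also be done by brute force. The following sketch scans unit vectors \( \boldsymbol{v} = (\cos\alpha, \sin\alpha) \) around a full circle and records which one yields the largest scalar product with the gradient. As the landscape it uses \( x^2 + 5xy \) from Example #1, evaluated at the arbitrarily chosen point \((1,2)\). The function name `best_direction` and the number of scan steps are assumptions.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = x**2 + 5*x*y (the function from Example #1).
    return (2.0 * x + 5.0 * y, 5.0 * x)


def best_direction(gradient, steps=3600):
    """Scan unit vectors v = (cos a, sin a); return (largest slope, its angle)."""
    gx, gy = gradient
    best_slope, best_angle = -math.inf, 0.0
    for k in range(steps):
        a = 2.0 * math.pi * k / steps
        # Directional derivative: scalar product of gradient and unit vector.
        slope = gx * math.cos(a) + gy * math.sin(a)
        if slope > best_slope:
            best_slope, best_angle = slope, a
    return best_slope, best_angle


g = grad_f(1.0, 2.0)
slope, angle = best_direction(g)

# Largest scanned slope next to the gradient magnitude |grad f|.
print(slope, math.hypot(g[0], g[1]))
# Angle of the best unit vector next to the angle of the gradient itself.
print(angle, math.atan2(g[1], g[0]))
```

Up to the resolution of the scan, the largest slope matches \( |\nabla f| \), and the best unit vector points along \( \nabla f \).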

Calculate directional derivative in 4 steps

If you don't want to calculate the steepest ascent as in Eq. 23, but the slope in any direction \(\boldsymbol{v}\), then you have to calculate the following:

\[ \nabla f \cdot \frac{\boldsymbol{v}}{v} \tag{24} \]

Here we no longer assume that the vector \(\boldsymbol{v}\) is normalized. Therefore, we must always normalize it by dividing by the magnitude \(|\boldsymbol{v}| =: v \) of the vector. Follow these 4 steps to calculate the directional derivative in any direction:

  1. Calculate the gradient \( \nabla f \) of a given scalar function \( f \).

  2. Normalize the vector \( \boldsymbol{ v } \) by dividing it by its magnitude: \( \boldsymbol{ v } / v \).

  3. Form the scalar product between the gradient \( \nabla f \) and the normalized vector \(\boldsymbol{v}/v\).

  4. Insert specific values for the variables \( x, y, z \). By doing this you set the specific position \( (x, y, z) \) where you want to calculate the slope in the direction of \( \boldsymbol{v} \).
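The four steps above can be sketched as a small Python function. The gradient is passed in as a function, because step 1 is done by hand. The scalar function used here, \( f(x,y,z) = xy + z^2 \), is a hypothetical example and not the one from this lesson. The name `directional_derivative` is an assumption too.

```python
import math


def directional_derivative(grad, v, point):
    """Slope of f at `point` in the direction of `v` (v need not be normalized)."""
    # Step 1 is done by hand: `grad` returns the gradient components.
    # Step 2: normalize the direction vector.
    magnitude = math.sqrt(sum(c * c for c in v))
    if magnitude == 0.0:
        raise ValueError("the direction vector must not be the zero vector")
    unit = [c / magnitude for c in v]
    # Step 4 is applied before step 3 here: evaluating the gradient at the
    # point first gives the same number and avoids symbolic algebra.
    g = grad(*point)
    # Step 3: scalar product of the gradient and the normalized vector.
    return sum(gi * ui for gi, ui in zip(g, unit))


def grad_f(x, y, z):
    # Hand-computed gradient of the hypothetical f(x, y, z) = x*y + z**2.
    return (y, x, 2.0 * z)


result = directional_derivative(grad_f, (1.0, 1.0, 0.0), (0.0, 1.0, 0.0))
print(result, 1.0 / math.sqrt(2.0))
```

Swapping steps 3 and 4 is only a convenience for numerical work. On paper you can form the scalar product first and insert the point afterwards, as described in the list.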

Example: Calculate directional derivative

You want to find out how the following function changes in the direction \( \boldsymbol{ v } = (1,1,0) \) at the position \( (0,1,0) \):

Step #1: Calculate the gradient \( \nabla f \). To do this, you differentiate the scalar function given in Eq. 25 partially with respect to each coordinate \(x,y,z\). The derivatives then form the three components of the gradient field \( \nabla f \):

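Step #2: Normalize the vector \( \boldsymbol{v} \) by dividing it by its magnitude \( v = \sqrt{1^2 + 1^2 + 0^2} = \sqrt{2} \):

\[ \frac{\boldsymbol{v}}{v} ~=~ \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \tag{27} \]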

Step #3: Then you calculate the directional derivative by forming the scalar product of the gradient in Eq. 26 and the unit vector in Eq. 27:

Step #4: Insert the point \( (0,1,0) \) for \(x,y,z\) into Eq. 28 to find the slope in the direction of \(\boldsymbol{v}\) at that position:

Exercises with Solutions


Exercise #1: Gradient of the magnitude \(r\) of a position vector

Given the magnitude of a position vector \( |\boldsymbol{r}| = \sqrt{x^{2}+y^{2}+z^{2}} \), determine the gradient of \( |\boldsymbol{r}| \).

Solution to Exercise #1

The magnitude of the position vector is a scalar function. Apply the gradient operator to it, that is, differentiate the function \( |\boldsymbol{r}| \) with respect to each component \(x\), \(y\) and \(z\): \[ \boldsymbol{\nabla}{|\boldsymbol{r}|}\left(x,y,z \right) ~=~ \left[ \frac{ \partial{|\boldsymbol{r}|} }{ \partial{x} }, \frac{ \partial{|\boldsymbol{r}|} }{ \partial{y} }, \frac{ \partial{|\boldsymbol{r}|} }{ \partial{z} } \right] \]

Compute each derivative, then you get:

  • 1st component: \[ \frac{ \partial }{ \partial{x} } \, \sqrt{x^{2}+y^{2}+z^{2}} ~=~ \frac{1}{2}\frac{2x}{ \sqrt{x^{2}+y^{2}+z^{2}} } \]
  • 2nd component: \[ \frac{ \partial }{ \partial{y} } \, \sqrt{x^{2}+y^{2}+z^{2}} ~=~ \frac{1}{2}\frac{2y}{ \sqrt{x^{2}+y^{2}+z^{2}} } \]
  • 3rd component: \[ \frac{ \partial }{ \partial{z} } \, \sqrt{x^{2}+y^{2}+z^{2}} ~=~ \frac{1}{2}\frac{2z}{ \sqrt{x^{2}+y^{2}+z^{2}} } \]

The factors 2 and \( \frac{1}{2} \) cancel out. Then you have the following vector field (where \( \frac{1}{\sqrt{x^{2}+y^{2}+z^{2}}} \) was factored out): \[ \boldsymbol{\nabla}{|\boldsymbol{r}|}\left(x,y,z \right) ~=~ \frac{1}{\sqrt{x^{2}+y^{2}+z^{2}}} \left( x, y, z \right) \]

Here, \( (x,y,z) \) is the position vector \( \boldsymbol{r} \). So the gradient is: \[ \boldsymbol{\nabla}{|\boldsymbol{r}|} ~=~ \frac{\boldsymbol{r}}{|\boldsymbol{r}|} ~=:~ \boldsymbol{\hat{r}} \]

The result is a unit vector \( \boldsymbol{\hat{r}} \) in the direction of \( \boldsymbol{r} \).
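If you want to convince yourself of this result, you can compare it with a numerical gradient. The sketch below does this at the arbitrarily chosen point \((1,-2,2)\), where \( |\boldsymbol{r}| = 3 \). The helper `numerical_gradient` and its step size `h` are assumptions for this illustration.

```python
import math


def r_mag(x, y, z):
    return math.sqrt(x * x + y * y + z * z)


def grad_r_mag(x, y, z):
    """Analytic result of the exercise: r / |r| (not defined at the origin)."""
    r = r_mag(x, y, z)
    return (x / r, y / r, z / r)


def numerical_gradient(func, p, h=1e-6):
    """Central-difference gradient of func at the point p = (x, y, z)."""
    grad = []
    for i in range(len(p)):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        grad.append((func(*plus) - func(*minus)) / (2.0 * h))
    return tuple(grad)


p = (1.0, -2.0, 2.0)
analytic = grad_r_mag(*p)
numeric = numerical_gradient(r_mag, p)
length = math.sqrt(sum(c * c for c in analytic))

print(analytic)
print(numeric)
print(length)  # length of the gradient vector
```

Both vectors should agree up to rounding, and the length of the gradient should come out as 1, as expected for a unit vector.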

Exercise #2: Gradient of 1/r and 1/|r-r'|

Gradients like \( \boldsymbol{\nabla}\frac{1}{r} \) and \( \boldsymbol{\nabla}\frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} \) occur frequently in physics. Calculate these two gradients:

  1. Compute the gradient of \( \frac{1}{r} \).
  2. Compute the gradient of \( \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} \)

Solution to Exercise #2.1

Generally, the gradient for \( \frac{1}{r} \) is: \[ \nabla \, \frac{1}{r} (x,y,z) ~=~ \left[ \frac{ \partial{r^{-1}} }{ \partial{x} }, \frac{ \partial{r^{-1}} }{ \partial{y} }, \frac{ \partial{r^{-1}} }{ \partial{z} } \right] \]

In three dimensions, the magnitude \( r \) is given by: \[ r ~=~ \sqrt{ x^2 ~+~ y^2 ~+~ z^2 } \]

Differentiate \( \frac{1}{r} \) with respect to all three variables:

  • 1st component: \[ \frac{\partial \, r^{-1}}{\partial \, x} ~=~ -\left(x^{2}+y^{2}+z^{2}\right)^{-3/2} \, x \]
  • 2nd component: \[ \frac{\partial \, r^{-1}}{\partial \, y} ~=~ -\left(x^{2}+y^{2}+z^{2}\right)^{-3/2} \, y \]
  • 3rd component: \[ \frac{\partial \, r^{-1}}{\partial \, z} ~=~ -\left(x^{2}+y^{2}+z^{2}\right)^{-3/2} \, z \]

So the gradient of \( \frac{1}{r} \) is given by (where \( \left(x,y,z\right) \) is factored out and \( \left(x,y,z\right) ~=~ \boldsymbol{r} \)): \[ \nabla \, \frac{1}{r} ~=~ -\frac{\boldsymbol{r}}{ \left(x^{2}+y^{2}+z^{2}\right)^{3/2} } ~=~ -\frac{\boldsymbol{r}}{r^{3}} \]

Solution to Exercise #2.2

To compute the gradient of \( \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} \), you first need to determine whether the gradient acts on \( \boldsymbol{r} \) or \( \boldsymbol{r}'\). If nothing is specified, such as by notation \( \nabla_{r'} \), then Nabla \( \nabla \) refers to \( \boldsymbol{r} \).

So, the goal is to compute the following three derivatives: \[ \nabla \, \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} ~=~ \left[\begin{array}{c} \frac{\partial}{\partial x} |\boldsymbol{r}-\boldsymbol{r}'|^{-1} \\ \frac{\partial}{\partial y} |\boldsymbol{r}-\boldsymbol{r}'|^{-1} \\ \frac{\partial}{\partial z} |\boldsymbol{r}-\boldsymbol{r}'|^{-1} \end{array}\right] \]

With \( \boldsymbol{r}(x,y,z) \) and \( \boldsymbol{r}'(x',y',z') \): \[ \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} ~=~ ((x-x')^2 ~+~ (y-y')^2 ~+~ (z-z')^2 )^{-\frac{1}{2}} \]

The 1st component is the derivative with respect to \( x \): \[ \frac{\partial}{\partial x} \left( (x-x')^2 ~+~ (y-y')^2 ~+~ (z-z')^2 \right)^{-1/2} ~=~ -\frac{1}{2} \cdot \left( (x-x')^2 ~+~ (y-y')^2 ~+~ (z-z')^2 \right)^{-3/2} \cdot 2(x-x') \]

The factor 2 cancels out and the expression can be written more neatly: \[ \frac{\partial}{\partial x}|\boldsymbol{r}-\boldsymbol{r}'|^{-1} ~=~ -\frac{x-x'}{|\boldsymbol{r}-\boldsymbol{r}'|^3} \]

Proceed analogously with the 2nd: \[ \frac{\partial}{\partial y}|\boldsymbol{r}-\boldsymbol{r}'|^{-1} ~=~ -\frac{y-y'}{|\boldsymbol{r}-\boldsymbol{r}'|^3} \] and 3rd component: \[ \frac{\partial}{\partial z}|\boldsymbol{r}-\boldsymbol{r}'|^{-1} ~=~ -\frac{z-z'}{|\boldsymbol{r}-\boldsymbol{r}'|^3} \]

So, the result is: \[ \nabla \, \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} ~=~ -\frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|^3} \left(\begin{array}{c} x-x' \\ y-y' \\ z-z' \end{array}\right) ~=~ -\frac{\boldsymbol{r} - \boldsymbol{r}'}{|\boldsymbol{r}-\boldsymbol{r}'|^3} \]
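Both results of this exercise can be checked numerically in one go, since setting \( \boldsymbol{r}' = 0 \) turns \( \frac{1}{|\boldsymbol{r}-\boldsymbol{r}'|} \) into \( \frac{1}{r} \). The sketch below differentiates with respect to the unprimed coordinates only, matching the convention stated above. The function names, the step size `h`, and the sample points are assumptions.

```python
import math


def inv_distance(r, r_prime):
    """1 / |r - r'| for two points given as (x, y, z)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_prime)))
    return 1.0 / d


def grad_inv_distance(r, r_prime):
    """Analytic result: -(r - r') / |r - r'|**3."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_prime)))
    return tuple(-(a - b) / d**3 for a, b in zip(r, r_prime))


def numerical_gradient_wrt_r(r, r_prime, h=1e-6):
    """Central differences in x, y, z; the primed point is held fixed."""
    grad = []
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        diff = inv_distance(plus, r_prime) - inv_distance(minus, r_prime)
        grad.append(diff / (2.0 * h))
    return tuple(grad)


origin = (0.0, 0.0, 0.0)

# Exercise 2.1: r' = 0 gives the gradient of 1/r, here at r = (1, 2, 2).
print(grad_inv_distance((1.0, 2.0, 2.0), origin))
print(numerical_gradient_wrt_r((1.0, 2.0, 2.0), origin))

# Exercise 2.2: a shifted pair of points with the same difference r - r'.
print(grad_inv_distance((2.0, 3.0, 1.0), (1.0, 1.0, -1.0)))
print(numerical_gradient_wrt_r((2.0, 3.0, 1.0), (1.0, 1.0, -1.0)))
```

The two pairs of points have the same difference vector \( \boldsymbol{r}-\boldsymbol{r}' = (1,2,2) \), so all four printed vectors should agree up to rounding. This reflects that the gradient depends only on \( \boldsymbol{r}-\boldsymbol{r}' \).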

Exercise #3: Directional derivative of the magnitude function

Given is the following magnitude function: \[ |\boldsymbol{r}| ~=~ r ~=~ \sqrt{ x^2 ~+~ y^2 ~+~ z^2 } \]

Determine the directional derivative at the point [1,0,1] in the direction \( \boldsymbol{v} = [2,2,1] \).

Solution to Exercise #3

To compute the slope (derivative) of the function \( |\boldsymbol{r}| \) at a certain point in a certain direction, you use the following directional derivative formula: \[ \frac{ \partial{|\boldsymbol{r}|} }{ \partial{\boldsymbol{v}_n} } ~=~ \boldsymbol{\nabla} \, |\boldsymbol{r}| ~\cdot~ \boldsymbol{v}_n \] where the direction vector \( \boldsymbol{v} \) must be normalized! Its magnitude is \( |\boldsymbol{v}| = \sqrt{2^2 + 2^2 + 1^2} = 3 \), so: \( \boldsymbol{v}_n ~=~ \frac{1}{3} [2,2,1] \).

You've already calculated the gradient of \( |\boldsymbol{r}| \) in Exercise #1: \[ \boldsymbol{\nabla} \, |\boldsymbol{r}| ~=~ \frac{1}{ \sqrt{x^2 ~+~ y^2 ~+~ z^2} } [x,y,z] \]

Substitute the point [1,0,1] into the gradient where you want to determine the directional derivative: \[ \boldsymbol{\nabla} \, |\boldsymbol{r}| ~=~ \frac{1}{\sqrt{2}} [1,0,1] \]

Then form the scalar product of the gradient vector and the normalized direction in which you want to compute the slope: \[ \frac{ \partial{|\boldsymbol{r}|} }{ \partial{\boldsymbol{v}_{n}} } ~=~ \frac{1}{\sqrt{2}}[1,0,1] ~\cdot~ \frac{1}{3} [2,2,1] \]

This yields the slope: \[ \frac{ \partial{|\boldsymbol{r}|} }{ \partial{\boldsymbol{v}_n} } ~=~ \frac{1}{3\sqrt{2}} \left( 2 + 0 + 1 \right) ~=~ \frac{1}{\sqrt{2}} \]
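The arithmetic of this solution can be reproduced with a few lines of Python. This is only a sketch that follows the solution step by step. The variable names are chosen for this illustration.

```python
import math

point = (1.0, 0.0, 1.0)   # where the slope is evaluated
v = (2.0, 2.0, 1.0)       # direction, not yet normalized

# Gradient of |r| at the point: r / |r| (result of Exercise #1).
r = math.sqrt(sum(c * c for c in point))
gradient = tuple(c / r for c in point)

# Normalize the direction vector: |v| = 3.
v_mag = math.sqrt(sum(c * c for c in v))
v_n = tuple(c / v_mag for c in v)

# Directional derivative: scalar product of gradient and unit vector.
slope = sum(g * n for g, n in zip(gradient, v_n))

print(slope, 1.0 / math.sqrt(2.0))
```

Both printed numbers should agree up to rounding, which confirms the slope \( \frac{1}{\sqrt{2}} \approx 0.707 \).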