Sherman–Morrison for Integral Equations

In this post, I want to discuss two of my favorite topics in applied mathematics: the Sherman–Morrison formula and integral equations. As a bridge between these ideas, we’ll use an integral equation analog of the Sherman–Morrison formula to derive the solution for the Laplace equation with Dirichlet boundary conditions in the 2D disk.

Laplace’s Equation

Suppose we have a thin, flat (two-dimensional) plate of homogeneous material and we measure the temperature at the border. What is the temperature inside the material? The solution to this problem is described by Laplace’s equation, one of the most ubiquitous partial differential equations in physics. Let $u(x,y)$ denote the temperature of the material at point $(x,y)$ . Laplace’s equation states that, at any point $(x,y)$ on the interior of the material,

(1) $\begin{equation*} \frac{\partial^2}{\partial x^2} u(x,y) + \frac{\partial^2}{\partial y^2} u(x,y) = 0. \end{equation*}$

Laplace’s equation (1) and the specification of the temperature on the boundary form a well-posed mathematical problem in the sense that the temperature is uniquely determined at each point $(x,y)$ .¹ We call this problem the Laplace Dirichlet problem since the boundary conditions

$\begin{equation*} u(x,y) \quad \text{is specified for $(x,y)$ on the boundary} \end{equation*}$

are known as Dirichlet boundary conditions.

The Double Layer Potential

Another area of physics where the Laplace equation (1) appears is the study of electrostatics. In this case, $u(x,y)$ represents the electric potential at the point $(x,y)$ . The Laplace Dirichlet problem is to find the electric potential in the interior of the region with knowledge of the potential on the boundary.

The electrostatic application motivates a different way of thinking about the Laplace equation. Consider the following question:

How would I place electric charges on the boundary to produce the electric potential $u(x,y)$ at each point $(x,y)$ on the boundary?

This is a deliciously clever question. If I were able to find an arrangement of charges answering the question, then I could calculate the potential $u(x,y)$ at each point $(x,y)$ in the interior by adding up the contribution to the electric potential of each element of charge on the boundary. Thus, I can reduce the problem of finding the electric potential at each point $(x,y)$ in the 2D region to finding a charge distribution on the 1D boundary to that region.

We shall actually use a slight variant of this charge distribution idea which differs in two ways:

Rather than placing simple charges on the boundary of the region, we place charge dipoles.²
Since we are considering a two-dimensional problem, we use a different formula for the electric potential than given by Coulomb’s law for charges in 3D. Also, since we are interested in solving the Laplace Dirichlet problem in general, we can choose a convenient dimensionless system of units. We say that the potential at a point $(x,y)$ induced by a unit “charge” at the origin is given by $1/2\pi \cdot \ln \sqrt{x^2+y^2}$ .

With these modifications, our new question is as follows:

How would I place a density of “charge” dipoles $\phi(x,y)$ on the boundary to produce the electric potential $u(x,y)$ at each point $(x,y)$ on the boundary?

We call this function $\phi(x,y)$ the double layer potential for the Laplace Dirichlet problem. One can show the double layer potential satisfies a certain integral equation. To write down this integral equation, let’s introduce some more notation. Let $R$ be the region of interest and $\partial R$ its boundary. Denote points $(x,y)$ concisely as vectors $\mathbf{r} = (x,y)$ , with the length of $\mathbf{r}$ denoted $\|\mathbf{r}\| = \sqrt{x^2+y^2}$ . The double layer potential satisfies

(2) $\begin{equation*} \frac{1}{2} \phi(\mathbf{r}) + \frac{1}{2\pi} \int_{\partial R} \phi(\mathbf{r}') \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') = u(\mathbf{r}), \end{equation*}$

where the integral is taken over the surface $\partial R$ of the region $R$ ; $\partial / \partial \nu_{\mathbf{r}'}$ denotes the directional derivative taken in the direction normal (perpendicular) to the surface at the point $\mathbf{r}'$ . Note we choose a unit system for $\phi$ which hides physical constants particular to the electrostatic context, since we are interested in applying this methodology to the Laplace Dirichlet problem in general (possibly non-electrostatic) applications.

There’s one last ingredient: How do we compute the electric potential $u(\mathbf{r})$ at points $\mathbf{r}$ in the interior of the region? This is answered by the following formula:

(3) $\begin{equation*} u(\mathbf{r}) = \frac{1}{2\pi} \int_{\partial R} \phi(\mathbf{r}') \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}'). \end{equation*}$

The integral equation (2) is certainly nothing to sneeze at. Rather than trying to comprehend it in its full glory, we shall focus on a special case for the rest of our discussion. Suppose the region $R$ is a circular disk with radius $r$ centered at $0$ . The the partial derivative in the integrand in (2) then is readily computed for points $\mathbf{r}$ and $\mathbf{r}'$ both on the boundary $\partial R$ of the circle:

$\begin{equation*} \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| = \frac{1}{2r}. \end{equation*}$

Substituting in (2) then gives

(4) $\begin{equation*} \frac{1}{2} \phi(\mathbf{r}) + \frac{1}{4\pi r} \int_{\partial R} \phi(\mathbf{r}') \, dS(\mathbf{r}') = u(\mathbf{r}). \end{equation*}$

The Sherman–Morrison Formula

We are interested in solving the integral equation (4) to obtain an expression for the double-layer potential $\phi$ , as this will give us a solution formula $u(\mathbf{r})$ for the Laplace Dirichlet problem. Ultimately, we accomplish this by using a clever trick. In an effort to make this trick seem more self-evident and less of a “rabbit out of a hat”, I want to draw an analogy to a seemingly unrelated problem: rank-one updates to linear systems of equations and the Sherman–Morrison formula.³ For a more general perspective on the Sherman–Morrison formula and its generalization to the Sherman–Morrison–Woodbury formula, you may be interested in the following post of mine on Schur complements and block Gaussian elimination.

Suppose we want to solve the system of linear equations

(5) $\begin{equation*} (A + uv^\top) x = b, \end{equation*}$

where $A$ is an $n\times n$ square matrix and $u$ , $v$ , and $b$ are length- $n$ vectors. We are ultimately interested in finding $x$ from $b$ . To gain insight into this problem, it will be helpful to first carefully considered the problem in reverse: computing $b$ from $x$ . We could, of course, perform this computation by forming the matrix $A + uv^\top$ in memory and multiplying it with $x$ , but there is a more economical way:

Form $\alpha = v^\top x$ .
Compute $b = \alpha u + Ax$ .

Standing back, observe that we now have a system of $n+1$ equations for unknowns $x$ and $\alpha$ . Specifically, our first equation can be rewritten as

$\begin{equation*} -\alpha + v^\top x = 0 \end{equation*}$

which combined with the second equation

$\begin{equation*} \alpha u + Ax = b \end{equation*}$

gives the $n+1$ by $n+1$ system⁴

(6) $\begin{equation*} \begin{bmatrix} - 1 & v^\top \\ u & A \end{bmatrix} \begin{bmatrix} \alpha \\ x \end{bmatrix} = \begin{bmatrix} 0 \\ b \end{bmatrix}. \end{equation*}$

The original equation for $x$ (5) can be derived from the “lifted” equation (6) by applying Gaussian elimination and eliminating the first row of the linear system (6). But now that we have the lifted equation (6), one can naturally wonder what would happen if we instead used Gaussian elimination to eliminate the last $n$ rows of (6); this will give us an equation for $\alpha = v^\top x$ which we can solved without first computing $x$ . Doing this so-called block Gaussian elimination yields

$\begin{equation*} (-1-v^\top A^{-1}u)\alpha = -v^\top A^{-1} b. \end{equation*}$

Solving this, we deduce that

$\begin{equation*} \alpha = \frac{v^\top A^{-1} b}{1 + v^\top A^{-1}u}. \end{equation*}$

From the equation $\alpha u + Ax = b$ , we have that

$\begin{equation*} x = (A+uv^\top)^{-1}b = A^{-1}b - \alpha A^{-1} u = A^{-1}b - \frac{A^{-1}uv^\top A^{-1} b}{1 + v^\top A^{-1}u}. \end{equation*}$

Since this formula holds for every vector $b$ , we deduce the famous Sherman–Morrison formula

$\begin{equation*} (A+uv^\top)^{-1}= A^{-1} - \frac{A^{-1}uv^\top A^{-1}}{1 + v^\top A^{-1}u}. \end{equation*}$

This example shows how it can be conceptually useful to lift a linear system of equations by adding additional variables and equations and then “do Gaussian elimination in a different order”. The same insight shall be useful in solving integral equations like (4).

Solving for the Double Layer Potential

Let’s try repeating the playbook we executed for the rank-one-updated linear system (5) and apply it to the integral equation (4). We are ultimately interested in computing $\phi(\cdot)$ from $u(\cdot)$ but, as we did last section, let’s first consider the reverse. To compute $u(\cdot)$ from $\phi(\cdot)$ , we first evaluate the integral

$\begin{equation*} \alpha = \int_{\partial R} \phi(\mathbf{r}') \, dS(\mathbf{r}'). \end{equation*}$

Substituting this into (4) gives the system of equations

(7) $\begin{equation*} \frac{1}{4\pi r} \alpha + \frac{1}{2} \phi(\mathbf{r}) = u(\mathbf{r}), \end{equation*}$

(8) $\begin{equation*} -\alpha + \int_{\partial R} \phi(\mathbf{r}') \, dS(\mathbf{r}') = 0. \end{equation*}$

In order to obtain (4) from (7) and (8), we add $1/4\pi r$ times equation (8) to equation (7). Following last section, we now instead eliminate $\phi(\mathbf{r})$ from equation (8) using equation (7). To do this, we need to integrate equation (7) in order to cancel the integral in equation (8):

$\begin{equation*} \frac{1}{4\pi r} \alpha \underbrace{\int_{\partial R} \, dS(\mathbf{r}')}_{=2\pi r} + \frac{1}{2} \int_{\partial R} \phi(\mathbf{r}') \, dS(\mathbf{r}') = \frac{1}{2} \alpha + \frac{1}{2} \int_{\partial R} \phi(\mathbf{r}') \, dS(\mathbf{r}') = \int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}'). \end{equation*}$

Adding $2$ times this integrated equation to equation (8) yields

$\begin{equation*} 2\alpha = 2\int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}') \implies \alpha = \int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}'). \end{equation*}$

Thus plugging this expression for $\alpha$ into equation (7) yields

$\begin{equation*} \phi(\mathbf{r}) = 2u(\mathbf{r}) - \frac{1}{2\pi r} \alpha = 2u(\mathbf{r}) - \frac{1}{2\pi r} \int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}'). \end{equation*}$

We’ve solved for our double layer potential!

As promised, the double layer potential can be used to give a solution formula (known as the Poisson integral formula) for the Laplace Dirichet problem. The details are a mechanical, but also somewhat technical, exercise in vector calculus identities. We plug through the details in the following extra section.

Poisson Integral Formula

Let’s finish this up by using the double layer to derive a solution formula for the electric potential $u(\mathbf{r})$ at a point $\mathbf{r}$ in the interior of the region. To do this, we use equation (3):

(9) $\begin{align*} u(\mathbf{r}) &= \frac{1}{2\pi} \int_{\partial R} \left( 2u(\mathbf{r}') - \frac{1}{2\pi r} \int_{\partial R} u(\mathbf{r}'') \, dS(\mathbf{r}'') \right) \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') \\ &= \frac{1}{\pi} \int_{\partial R} u(\mathbf{r}')\frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') - \frac{1}{4\pi^2 r} \int_{\partial R} \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln\|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}')\int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}'). \end{align*}$

We now need to do a quick calculation which is somewhat technical and not particularly enlightening. We evaluate $-\frac{1}{4\pi}\int_{\partial R} \frac{\partial}{\partial \nu_{\mathbf{r}'}} \frac{1}{|\mathbf{r}-\mathbf{r}'|} \, dS(\mathbf{r}')$ using the divergence theorem:

$\begin{align*} \frac{1}{2\pi}\int_{\partial R} \frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') &= \frac{1}{2\pi}\int_{\partial R} \nu_{\mathbf{r}'}\cdot \nabla_{\mathbf{r}'} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') = \frac{1}{2\pi}\int_R \nabla_{\mathbf{r}'}^2 \ln\|\mathbf{r}-\mathbf{r}'\| \, d\mathbf{r}' \\ &= \int_R \delta(\mathbf{r}'-\mathbf{r}) \, d\mathbf{r}' = 1. \end{align*}$

We denote $\nabla$ for the gradient, $\nu_{\mathbf{r}'}$ for the normal vector to $\partial R$ at the point $\mathbf{r}'$ , $\cdot$ for the dot product, $\nabla^2$ for the Laplace operator, and $\delta$ for the Dirac delta “function”. The last equality holds because the function $v(\mathbf{r}') = \tfrac{1}{2\pi} \ln \| \mathbf{r} - \mathbf{r}'\|$ is a so-called fundamental solution for Laplace equation in the sense that $\nabla_{\mathbf{r}'}^2 v(\mathbf{r}') = \delta(\mathbf{r}'-\mathbf{r})$ . Therefore, (9) simplifies to

$\begin{equation*} \mathbf{u}(\mathbf{r}) = \frac{1}{\pi} \int_{\partial R} u(\mathbf{r}')\frac{\partial}{\partial \nu_{\mathbf{r}'}} \ln \|\mathbf{r}-\mathbf{r}'\| \, dS(\mathbf{r}') - \frac{1}{2\pi r} \int_{\partial R} u(\mathbf{r}') \, dS(\mathbf{r}'). \end{equation*}$

Computing the boundary derivative for the spherical region centered at the origin with radius $r$ , we obtain the formula

$\begin{equation*} \mathbf{u}(\mathbf{r}) = \frac{1}{2\pi r} \int_{\partial R} \left( 2\frac{r^2 - \mathbf{r} \cdot \mathbf{r}'}{\|\mathbf{r} - \mathbf{r}'\|^2} - 1 \right) u(\mathbf{r}') \, dS(\mathbf{r}') = \frac{r^2 - \| \mathbf{r} \|^2}{2\pi r} \int_{\partial R} \frac{u(\mathbf{r}')}{\|\mathbf{r} - \mathbf{r}'\|^2} \, dS(\mathbf{r}'). \end{equation*}$

We’ve succeeded at deriving a solution formula for $u(\mathbf{r})$ for points $\mathbf{r}$ in the interior of the disk in terms of $u(\mathbf{r}')$ for points $\mathbf{r}'$ on the boundary of the disk. This is known as the Poisson integral formula for the disk in two dimensions. This formula can be generalized to balls in higher dimensions, though this proof technique using “Sherman–Morrison” fails to work in more than two dimensions.

Sherman–Morrison for Integral Equations

Having achieved our main goal of deriving a solution formula for the 2D Laplace Dirichlet problem for a circular domain, I want to take a step back to present the approach from two sections ago in more generality. Consider a more general integral equation of the form

(10) $\begin{equation*} a \, \phi(\mathbf{x}) + \int_\Omega K(\mathbf{x},\mathbf{y}) \phi(\mathbf{y}) \, d\mathbf{y} = f(\mathbf{x}), \quad \mathbf{x} \in \Omega, \end{equation*}$

where $\Omega$ is some region in space, $K(\cdot,\cdot)$ , $f(\cdot)$ , and $\phi(\cdot)$ are functions of one or two arguments on $\Omega$ , and $a\ne 0$ is a nonzero constant. Such an integral equation is said to be of the second kind. The integral equation for the Laplace Dirichlet problem (2) is of this form with $\Omega = \partial R$ , $a = 1/2$ , $K(\mathbf{x},\mathbf{y}) = \tfrac{\partial}{\partial \nu_{\mathbf{y}}} \ln \|\mathbf{x} - \mathbf{y}\|$ , and $f(\mathbf{x}) = u(\mathbf{x})$ . We say the kernel $K(\cdot,\cdot)$ is separable with rank $k$ if $K(\cdot,\cdot)$ can be expressed in the form

$\begin{equation*} K(\mathbf{x},\mathbf{y}) = g_1(\mathbf{x})h_1(\mathbf{y}) + g_2(\mathbf{x})h_2(\mathbf{y}) + \cdots + g_k(\mathbf{x})h_k(\mathbf{y}). \end{equation*}$

With the circular domain, the Laplace Dirichlet integral equation (2) is separable with rank $k = 1$ .⁵ We shall focus on the second kind integral equation (10) assuming the kernel is separable with rank $1$ (for simplicity, we set $a = 1$ ):

(11) $\begin{equation*} \phi(\mathbf{x}) + g(\mathbf{x}) \int_\Omega h(\mathbf{y}) \phi(\mathbf{y}) \, d\mathbf{y} = f(\mathbf{x}), \quad \mathbf{x} \in \Omega. \end{equation*}$

Let’s try and write this equation in a way that’s more similar to the linear system of equation (5). To do this, we make use of linear operators defined on functions:

Let $\operatorname{Id}$ denote the identity operator on functions: It takes as inputs function $\phi(\cdot)$ and outputs the function $\phi(\cdot)$ unchanged.
Let $I_h$ denote the “integration against $h$ operator”: It takes as input a function $\phi(\cdot)$ and outputs the number $\int_\Omega h(\mathbf{y})\phi(\mathbf{y}) \, d\mathbf{y}$ .

With these notations, equation (11) can be written as

$\begin{equation*} (\operatorname{Id} + g I_h) \phi = f. \end{equation*}$

Using the same derivation which led to the Sherman–Morrison formula for linear systems of equations, we can apply the Sherman–Morrison formula to this integral equation in operator form, yielding

$\begin{equation*} \phi = \left( \operatorname{Id}^{-1} - \frac{\operatorname{Id}^{-1}g I_h \operatorname{Id}^{-1}}{1 + I_h \operatorname{Id}^{-1} g} \right)f = f - \frac{g (I_h f)}{1+I_h g}. \end{equation*}$

Therefore, the solution to the integral equation (11) is

$\begin{equation*} \phi(\mathbf{x}) = f(\mathbf{x}) - \frac{\int_\Omega h(\mathbf{y}) f(\mathbf{y}) \, d\mathbf{y}}{1 + \int_\Omega h(\mathbf{y})g(\mathbf{y}) \, d\mathbf{y} } g(\mathbf{x}). \end{equation*}$

This can be interpreted as a kind of Sherman–Morrison formula for the integral equation (11).

One can also generalize to provide a solution formula for the second-kind integral equation (10) for a separable kernel with rank $k$ ; in this case, the natural matrix analog is now the Sherman–Morrison–Woodbury identity rather than the Sherman–Morrison formula. Note that this solution formula requires the solution of a $k\times k$ system of linear equations. One can use this as a numerical method to solve second-kind integral equations: First, we approximate $K$ by a separable kernel of a modest rank $k$ and then compute the exact solution of the resulting integral equation with the approximate kernel.⁶

My goal in writing this post was to discuss two topics which are both near and dear to my heart, integral equations and the Sherman–Morrison formula. I find the interplay of these two ideas to be highly suggestive. It illustrates the power of the analogy between infinite-dimensional linear equations, like differential and integral equations, and finite-dimensional ones, which are described by matrices. Infinite dimensions certainly do have their peculiarities and technical challenges, but it can be very useful to first pretend infinite-dimensional linear operators (like integral operators) are matrices, do calculations to derive some result, and then justify these computations rigorously post hoc.⁷

Laplace’s Equation

The Double Layer Potential

The Sherman–Morrison Formula

Solving for the Double Layer Potential

Sherman–Morrison for Integral Equations

Leave a Reply Cancel reply