In this post, I want to discuss two of my favorite topics in applied mathematics: the Sherman–Morrison formula and integral equations. As a bridge between these ideas, we’ll use an integral equation analog of the Sherman–Morrison formula to derive the solution for the Laplace equation with Dirichlet boundary conditions in the 2D disk.
Laplace’s Equation
Suppose we have a thin, flat (two-dimensional) plate of homogeneous material and we measure the temperature at the border. What is the temperature inside the material? The solution to this problem is described by Laplace’s equation, one of the most ubiquitous partial differential equations in physics. Let denote the temperature of the material at point . Laplace’s equation states that, at any point on the interior of the material,
(1)
Laplace’s equation (1) and the specification of the temperature on the boundary form a well-posed mathematical problem in the sense that the temperature is uniquely determined at each point .1A well-posed problem is also required to depend continuously on the input data which, in this case, are the boundary temperatures. Indeed, the Laplace problem with boundary data is well-posed in this sense. We call this problem the Laplace Dirichlet problem since the boundary conditions
are known as Dirichlet boundary conditions.
The Double Layer Potential
Another area of physics where the Laplace equation (1) appears is the study of electrostatics. In this case, represents the electric potential at the point . The Laplace Dirichlet problem is to find the electric potential in the interior of the region with knowledge of the potential on the boundary.
The electrostatic application motivates a different way of thinking about the Laplace equation. Consider the following question:
How would I place electric charges on the boundary to produce the electric potential at each point on the boundary?
This is a deliciously clever question. If I were able to find an arrangement of charges answering the question, then I could calculate the potential at each point in the interior by adding up the contribution to the electric potential of each element of charge on the boundary. Thus, I can reduce the problem of finding the electric potential at each point in the 2D region to finding a charge distribution on the 1D boundary to that region.
We shall actually use a slight variant of this charge distribution idea which differs in two ways:
- Rather than placing simple charges on the boundary of the region, we place charge dipoles.2The reason for why this modification works better is an interesting question, but answering it properly would take us too far afield from the goals of this article.
- Since we are considering a two-dimensional problem, we use a different formula for the electric potential than given by Coulomb’s law for charges in 3D. Also, since we are interested in solving the Laplace Dirichlet problem in general, we can choose a convenient dimensionless system of units. We say that the potential at a point induced by a unit “charge” at the origin is given by .
With these modifications, our new question is as follows:
How would I place a density of “charge” dipoles on the boundary to produce the electric potential at each point on the boundary?
We call this function the double layer potential for the Laplace Dirichlet problem. One can show the double layer potential satisfies a certain integral equation. To write down this integral equation, let’s introduce some more notation. Let be the region of interest and its boundary. Denote points concisely as vectors , with the length of denoted . The double layer potential satisfies
(2)
where the integral is taken over the surface of the region ; denotes the directional derivative taken in the direction normal (perpendicular) to the surface at the point . Note we choose a unit system for which hides physical constants particular to the electrostatic context, since we are interested in applying this methodology to the Laplace Dirichlet problem in general (possibly non-electrostatic) applications.
There’s one last ingredient: How do we compute the electric potential at points in the interior of the region? This is answered by the following formula:
(3)
The integral equation (2) is certainly nothing to sneeze at. Rather than trying to comprehend it in its full glory, we shall focus on a special case for the rest of our discussion. Suppose the region is a circular disk with radius centered at . The the partial derivative in the integrand in (2) then is readily computed for points and both on the boundary of the circle:
Substituting in (2) then gives
(4)
The Sherman–Morrison Formula
We are interested in solving the integral equation (4) to obtain an expression for the double-layer potential , as this will give us a solution formula for the Laplace Dirichlet problem. Ultimately, we accomplish this by using a clever trick. In an effort to make this trick seem more self-evident and less of a “rabbit out of a hat”, I want to draw an analogy to a seemingly unrelated problem: rank-one updates to linear systems of equations and the Sherman–Morrison formula.3In accordance with Stigler’s law of eponymy, the Sherman–Morrison formula was actually discovered by William J. Duncan five years before Sherman, Morrison, and Woodbury. For a more general perspective on the Sherman–Morrison formula and its generalization to the Sherman–Morrison–Woodbury formula, you may be interested in the following post of mine on Schur complements and block Gaussian elimination.
Suppose we want to solve the system of linear equations
(5)
where is an square matrix and , , and are length- vectors. We are ultimately interested in finding from . To gain insight into this problem, it will be helpful to first carefully considered the problem in reverse: computing from . We could, of course, perform this computation by forming the matrix in memory and multiplying it with , but there is a more economical way:
- Form .
- Compute .
Standing back, observe that we now have a system of equations for unknowns and . Specifically, our first equation can be rewritten as
which combined with the second equation
gives the by system4This “state space approach” of systematically writing out a matrix–vector multiply algorithm and then realizing this yields a larger system of linear equations was initially taught to be by my mentor Shiv Chandrasekaran; this approach has much more powerful uses, such as in the theory of rank-structured matrices.
(6)
The original equation for (5) can be derived from the “lifted” equation (6) by applying Gaussian elimination and eliminating the first row of the linear system (6). But now that we have the lifted equation (6), one can naturally wonder what would happen if we instead used Gaussian elimination to eliminate the last rows of (6); this will give us an equation for which we can solved without first computing . Doing this so-called block Gaussian elimination yields
Solving this, we deduce that
From the equation , we have that
Since this formula holds for every vector , we deduce the famous Sherman–Morrison formula
This example shows how it can be conceptually useful to lift a linear system of equations by adding additional variables and equations and then “do Gaussian elimination in a different order”. The same insight shall be useful in solving integral equations like (4).
Solving for the Double Layer Potential
Let’s try repeating the playbook we executed for the rank-one-updated linear system (5) and apply it to the integral equation (4). We are ultimately interested in computing from but, as we did last section, let’s first consider the reverse. To compute from , we first evaluate the integral
Substituting this into (4) gives the system of equations
(7)
(8)
In order to obtain (4) from (7) and (8), we add times equation (8) to equation (7). Following last section, we now instead eliminate from equation (8) using equation (7). To do this, we need to integrate equation (7) in order to cancel the integral in equation (8):
Adding times this integrated equation to equation (8) yields
Thus plugging this expression for into equation (7) yields
We’ve solved for our double layer potential!
As promised, the double layer potential can be used to give a solution formula (known as the Poisson integral formula) for the Laplace Dirichet problem. The details are a mechanical, but also somewhat technical, exercise in vector calculus identities. We plug through the details in the following extra section.
Sherman–Morrison for Integral Equations
Having achieved our main goal of deriving a solution formula for the 2D Laplace Dirichlet problem for a circular domain, I want to take a step back to present the approach from two sections ago in more generality. Consider a more general integral equation of the form
(10)
where is some region in space, , , and are functions of one or two arguments on , and is a nonzero constant. Such an integral equation is said to be of the second kind. The integral equation for the Laplace Dirichlet problem (2) is of this form with , , , and . We say the kernel is separable with rank if can be expressed in the form
With the circular domain, the Laplace Dirichlet integral equation (2) is separable with rank .5E.g., set and . We shall focus on the second kind integral equation (10) assuming the kernel is separable with rank (for simplicity, we set ):
(11)
Let’s try and write this equation in a way that’s more similar to the linear system of equation (5). To do this, we make use of linear operators defined on functions:
- Let denote the identity operator on functions: It takes as inputs function and outputs the function unchanged.
- Let denote the “integration against operator”: It takes as input a function and outputs the number .
With these notations, equation (11) can be written as
Using the same derivation which led to the Sherman–Morrison formula for linear systems of equations, we can apply the Sherman–Morrison formula to this integral equation in operator form, yielding
Therefore, the solution to the integral equation (11) is
This can be interpreted as a kind of Sherman–Morrison formula for the integral equation (11).
One can also generalize to provide a solution formula for the second-kind integral equation (10) for a separable kernel with rank ; in this case, the natural matrix analog is now the Sherman–Morrison–Woodbury identity rather than the Sherman–Morrison formula. Note that this solution formula requires the solution of a system of linear equations. One can use this as a numerical method to solve second-kind integral equations: First, we approximate by a separable kernel of a modest rank and then compute the exact solution of the resulting integral equation with the approximate kernel.6A natural question is why one might want to solve an integral equation formulation of a partial differential equations like the Laplace or Helmholtz equation. An answer is that that formulations based on second-kind integral equations tend to lead to systems of linear equations which much more well-conditioned as compared to other methods like the finite element method. They have a number of computational difficulties as well, as the resulting linear systems of equations are dense and may require elaborate quadrature rules to accurately compute.
My goal in writing this post was to discuss two topics which are both near and dear to my heart, integral equations and the Sherman–Morrison formula. I find the interplay of these two ideas to be highly suggestive. It illustrates the power of the analogy between infinite-dimensional linear equations, like differential and integral equations, and finite-dimensional ones, which are described by matrices. Infinite dimensions certainly do have their peculiarities and technical challenges, but it can be very useful to first pretend infinite-dimensional linear operators (like integral operators) are matrices, do calculations to derive some result, and then justify these computations rigorously post hoc.7The utility of this technique is somewhat of an open secret among some subset of mathematicians and scientists, but such heuristics are usually not communicated to students explicitly, at least in rigorous mathematics classes.