My Favorite Proof of the Cauchy–Schwarz Inequality

The Cauchy–Schwarz inequality has many proofs. Here is my favorite, taken from Chapter 3 of The Schur complement and its applications; the book is edited by Fuzhen Zhang, and this chapter was contributed by him as well. Let x,y \in \real^n be vectors and form the positive semidefinite matrix

    \[A = \onebytwo{x}{y}^\top \onebytwo{x}{y} = \twobytwo{\norm{x}^2}{y^\top x}{x^\top y}{\norm{y}^2}.\]
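One way to see that A is positive semidefinite: writing B = \onebytwo{x}{y} for the n \times 2 matrix with columns x and y, we have

    \[z^\top A z = z^\top B^\top B z = \norm{Bz}^2 \ge 0 \quad \text{for every } z \in \real^2.\]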

Since A is positive semidefinite, its determinant is nonnegative:

    \[\det(A) = \norm{x}^2\norm{y}^2 - |x^\top y|^2 \ge 0.\]

Rearrange to obtain the Cauchy–Schwarz inequality

    \[|x^\top y| \le \norm{x}\norm{y}.\]

Equality occurs if and only if A is singular, that is, has rank at most one, which occurs if and only if x and y are scalar multiples of one another.
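If you want to see the construction in action, here is a small NumPy sketch (purely illustrative; the variable names are my own) that forms the Gram matrix for random vectors and checks that its nonnegative determinant delivers Cauchy–Schwarz:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# Form the 2x2 Gram matrix A = [x y]^T [x y]
B = np.column_stack([x, y])        # n-by-2 matrix with columns x and y
A = B.T @ B                        # [[||x||^2, x^T y], [y^T x, ||y||^2]]

# A is positive semidefinite, so its determinant is nonnegative ...
print(np.linalg.det(A) >= -1e-12)  # True (up to floating-point error)

# ... which rearranges to the Cauchy-Schwarz inequality
print(abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y))  # True
```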

I like this proof because it is perhaps the simplest example of the (block) matrix technique for proving inequalities. Using this technique, one proves inequalities about scalars (or matrices) by embedding them in a clever way into a (larger) matrix. Here is another example of the matrix technique, adapted from the proof of Theorem 12.9 of these lecture notes by Joel Tropp. Jensen’s inequality is a far-reaching and very useful inequality in probability theory. Here is one special case of the inequality.

Proposition (Jensen’s inequality for the inverse): Let x_1,\ldots,x_n be strictly positive numbers. Then the inverse of their average is no bigger than the average of their inverses:

    \[\left( \frac{1}{n} \sum_{i=1}^n x_i \right)^{-1} \le \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}.\]

To prove this result, embed each x_i into the 2\times 2 matrix \twobytwo{x_i}{1}{1}{1/x_i}. This matrix is positive semidefinite because it is the outer product vv^\top of the vector v = (\sqrt{x_i}, 1/\sqrt{x_i})^\top with itself. Averaging these positive semidefinite matrices, we observe that

    \[A = \twobytwo{\frac{1}{n} \sum_{i=1}^n x_i}{1}{1}{\frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}}\]

is positive semidefinite as well. Thus, its determinant is nonnegative:

    \[\det(A) = \left(\frac{1}{n} \sum_{i=1}^n x_i\right) \left(\frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}\right) - 1 \ge 0.\]

Rearrange to obtain

    \[\left( \frac{1}{n} \sum_{i=1}^n x_i \right)^{-1} \le \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}.\]

Remarkably, we have proven a purely scalar inequality by appealing to matrix theory.
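As before, a short NumPy sketch (again just for illustration) verifies the 2\times 2 embedding numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 10.0, size=20)   # strictly positive numbers x_1, ..., x_n

# Average of the rank-one matrices [[x_i, 1], [1, 1/x_i]]
A = np.mean([np.array([[xi, 1.0], [1.0, 1.0 / xi]]) for xi in x], axis=0)

# Each summand is PSD, so the average A is PSD and det(A) >= 0 ...
print(np.linalg.det(A) >= -1e-12)     # True (up to floating-point error)

# ... which rearranges to Jensen's inequality for the inverse
print(1.0 / np.mean(x) <= np.mean(1.0 / x))   # True
```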

The matrix technique for proving inequalities is very powerful. Check out Chapter 3 of The Schur complement and its applications for many more examples.

