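The Cauchy–Schwarz inequality has many proofs. Here is my favorite, taken from Chapter 3 of The Schur complement and its applications; the book is edited by Fuzhen Zhang, and this chapter was contributed by him as well. Let $x, y \in \mathbb{R}^n$ be vectors, assemble the matrix $B = \begin{bmatrix} x & y \end{bmatrix}$, and form the Gram matrix

\[ A = B^\top B = \begin{bmatrix} \|x\|^2 & x^\top y \\ y^\top x & \|y\|^2 \end{bmatrix}. \]

Since $A$ is a Gram matrix, it is positive semidefinite (indeed, $z^\top A z = \|Bz\|^2 \ge 0$ for every $z \in \mathbb{R}^2$). Therefore, its determinant is nonnegative:

\[ \det(A) = \|x\|^2 \|y\|^2 - (x^\top y)^2 \ge 0. \]

Rearrange to obtain the Cauchy–Schwarz inequality

\[ |x^\top y| \le \|x\| \, \|y\|. \]

Equality occurs if and only if $\det(A) = 0$, that is, if and only if $A$ has rank at most one, which occurs if and only if one of $x$ and $y$ is a scalar multiple of the other.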
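To make the determinant argument concrete, here is a minimal numerical sketch in plain Python. The helper names `gram_2x2` and `det_2x2` are my own illustrative choices, not anything from the book; the sketch simply forms the 2×2 Gram matrix of two real vectors and evaluates its determinant, which is the gap between the two sides of the squared Cauchy–Schwarz inequality.

```python
def gram_2x2(x, y):
    """Return the 2x2 Gram matrix [[x.x, x.y], [y.x, y.y]] of two real vectors."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return [[dot(x, x), dot(x, y)],
            [dot(y, x), dot(y, y)]]


def det_2x2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]

# det = ||x||^2 ||y||^2 - (x.y)^2, nonnegative by positive semidefiniteness
gap = det_2x2(gram_2x2(x, y))

# When y is a scalar multiple of x, the Gram matrix is singular and the gap vanishes
gap_parallel = det_2x2(gram_2x2(x, [2.0 * t for t in x]))

print(gap, gap_parallel)
```

For these particular vectors the gap is strictly positive, while for the parallel pair it is zero, matching the equality condition.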
I like this proof because it is perhaps the simplest example of the (block) matrix technique for proving inequalities. Using this technique, one proves inequalities about scalars (or matrices) by embedding them in a clever way into a (larger) matrix.

Here is another example of the matrix technique, adapted from the proof of Theorem 12.9 of these lecture notes by Joel Tropp. Jensen’s inequality is a far-reaching and very useful inequality in probability theory. Here is one special case of the inequality.
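Proposition (Jensen’s inequality for the inverse): Let $x_1, \ldots, x_n$ be strictly positive numbers. Then the inverse of their average is no bigger than the average of their inverses:

\[ \left( \frac{1}{n} \sum_{i=1}^n x_i \right)^{-1} \le \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}. \]

To prove this result, embed each $x_i$ into a positive semidefinite matrix

\[ \begin{bmatrix} x_i & 1 \\ 1 & 1/x_i \end{bmatrix} = v_i v_i^\top, \qquad v_i = \begin{bmatrix} \sqrt{x_i} \\ 1/\sqrt{x_i} \end{bmatrix}. \]

Taking the average of all such matrices, we observe that

\[ \begin{bmatrix} \frac{1}{n} \sum_{i=1}^n x_i & 1 \\ 1 & \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i} \end{bmatrix} \]

is positive semidefinite as well, since an average of positive semidefinite matrices is positive semidefinite. Thus, its determinant is nonnegative:

\[ \left( \frac{1}{n} \sum_{i=1}^n x_i \right) \left( \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i} \right) - 1 \ge 0. \]

Rearrange to obtain

\[ \left( \frac{1}{n} \sum_{i=1}^n x_i \right)^{-1} \le \frac{1}{n} \sum_{i=1}^n \frac{1}{x_i}. \]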
Remarkably, we have proven a purely scalar inequality by appeals to matrix theory.
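The same embedding can be sanity-checked numerically. The sketch below, again in plain Python with a hypothetical helper name `averaged_embedding` of my own choosing, averages the 2×2 matrices with diagonal entries x and 1/x and off-diagonal entries 1, then compares the determinant with zero. It is only an illustration of the argument on one sample, not a substitute for the proof.

```python
def averaged_embedding(xs):
    """Average the 2x2 matrices [[x, 1], [1, 1/x]] over strictly positive xs."""
    if not xs or any(x <= 0 for x in xs):
        raise ValueError("xs must be a nonempty sequence of strictly positive numbers")
    n = len(xs)
    mean = sum(xs) / n                       # average of the x_i
    mean_inv = sum(1.0 / x for x in xs) / n  # average of the inverses 1/x_i
    return [[mean, 1.0], [1.0, mean_inv]]


xs = [1.0, 2.0, 4.0]
M = averaged_embedding(xs)

# Determinant of the averaged matrix: (mean of x_i) * (mean of 1/x_i) - 1
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Rearranged form: inverse of the average versus average of the inverses
lhs = 1.0 / M[0][0]
rhs = M[1][1]

print(det, lhs, rhs)
```

A nonnegative determinant here is exactly the statement that `lhs` does not exceed `rhs`.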
The matrix technique for proving inequalities is very powerful. Check out Chapter 3 of The Schur complement and its applications for many more examples.
I recall many basic results in linear algebra (which may include the basic results surrounding positive semidefiniteness) require Cauchy-Schwarz to prove, which makes me wonder if there is circular reasoning at work in this proof.