Neat Randomized Algorithms: RandDiag for Rapidly Diagonalizing Normal Matrices

Consider two complex-valued square matrices A\in\complex^{n\times n} and B\in\complex^{n\times n}. The first matrix A is Hermitian, being equal A = A^* to its conjugate transpose A^*. The other matrix B is non-Hermitian, B \ne B^*. Let’s see how long it takes to compute their eigenvalue decompositions in MATLAB:

>> A = randn(1e3) + 1i*randn(1e3); A = (A+A')/2;
>> tic; [V_A,D_A] = eig(A); toc % Hermitian matrix
Elapsed time is 0.415145 seconds.
>> B = randn(1e3) + 1i*randn(1e3);
>> tic; [V_B,D_B] = eig(B); toc % non-Hermitian matrix
Elapsed time is 1.668246 seconds.

We see that it takes 4\times longer to compute the eigenvalues of the non-Hermitian matrix B as compared to the Hermitian matrix A. Moreover, the matrix V_A of eigenvectors for a Hermitian matrix A = V_AD_AV_A^{-1} is a unitary matrix, V_A^*V_A = V_AV_A^* = I.

There are another class of matrices with nice eigenvalue decompositions, normal matrices. A square, complex-valued matrix C is normal if C^*C = CC^*. The matrix V_C of eigenvectors for a normal matrix C = V_C D_C V_C^{-1} is also unitary, V_C^*V_C = V_CV_C^* = I. An important class of normal matrices are unitary matrices themselves. A unitary matrix U is always normal since it satisfies U^*U = UU^* = I.

Let’s see how long it takes MATLAB to compute the eigenvalue decomposition of a unitary (and thus normal) matrix:

>> U = V_A;                     % unitary, and thus normal, matrix
>> tic; [V_U,D_U] = eig(U); toc % normal matrix
Elapsed time is 2.201017 seconds.

Even longer than it took to compute an eigenvalue decomposition of the non-normal matrix B! Can we make the normal eigenvalue decomposition closer to the speed of the Hermitian eigenvalue decomposition?

Here is the start of an idea. Every square matrix C has a Cartesian decomposition:

    \[C = H + iS, \quad H = \frac{C+C^*}{2}, \quad S = \frac{C-C^*}{2i}.\]

We have written C as a combination of its Hermitian part H and i times its skew-Hermitian part S. Both H and S are Hermitian matrices. The Cartesian decomposition of a square matrix is analogous to the decomposition of a complex number into its real and imaginary parts.

For a normal matrix C, the Hermitian and skew-Hermitian parts commute, HS = SH. We know from matrix theory that commuting Hermitian matrices are simultaneously diagonalizable, i.e., there exists Q such that H = QD_HQ^* and S = QD_SQ^* for diagonal matrices D_H and D_S. Thus, given access to such Q, C has eigenvalue decomposition

    \[C = Q(D_H+iD_S)Q^*.\]

Here’s a first attempt to turn this insight into an algorithm. First, compute the Hermitian part H of C, diagonalize H = QD_HQ^*, and then see if Q diagonalizes C. Let’s test this out on a 2\times 2 example:

>> C = orth(randn(2) + 1i*randn(2)); % unitary matrix
>> H = (C+C')/2;                     % Hermitian part
>> [Q,~] = eig(H);
>> Q'*C*Q                            % check to see if diagonal
ans =
  -0.9933 + 0.1152i  -0.0000 + 0.0000i
   0.0000 + 0.0000i  -0.3175 - 0.9483i

Yay! We’ve succeeded at diagonalizing the matrix C using only a Hermitian eigenvalue decomposition. But we should be careful about declaring victory too early. Here’s a bad example:

>> C = [1 1i;1i 1]; % normal matrix
>> H = (C+C')/2;
>> [Q,~] = eig(H);
>> Q'*C*Q           % oh no! not diagonal
ans =
   1.0000 + 0.0000i   0.0000 + 1.0000i
   0.0000 + 1.0000i   1.0000 + 0.0000i

What’s going on here? The issue is that the Hermitian part H = I for this matrix has a repeated eigenvalue. Thus, H has multiple different valid matrices of eigenvectors. (In this specific case, every unitary matrix Q diagonalizes H.) By looking at H alone, we don’t know which Q matrix to pick which also diagonalizes S.

He and Kressner developed a beautifully simple randomized algorithm called RandDiag to circumvent this failure mode. The idea is straightforward:

  1. Form a random linear combination M = \gamma_1 H + \gamma_2 S of the Hermitian and skew-Hermitian parts of A, with standard normal random coefficients \gamma_1 and \gamma_2.
  2. Compute Q that diagonalizes M.

That’s it!

To get a sense of why He and Kressner’s algorithm works, suppose that H has some repeated eigenvalues and S has all distinct eigenvalues. Given this setup, it seems likely that a random linear combination of S and H will also have all distinct eigenvalues. (It would take a very special circumstances for a random linear combination to yield two eigenvalues that are exactly the same!) Indeed, this intuition is a fact: With 100% probability, Q diagonalizing a Gaussian random linear combination of simultaneously diagonalizable matrices H and S also diagonalizes H and S individually.

MATLAB code for RandDiag is as follows:

function Q = rand_diag(C)
   H = (C+C')/2; S = (C-C')/2i;
   M = randn*H + randn*S;
   [Q,~] = eig(M);
end

When applied to our hard 2\times 2 example from before, RandDiag succeeds at giving a matrix that diagonalizes C:

>> Q = rand_diag(C);
>> Q'*C*Q
ans =
   1.0000 - 1.0000i  -0.0000 + 0.0000i
  -0.0000 - 0.0000i   1.0000 + 1.0000i

For computing the matrix of eigenvectors for a 1000\times 1000 unitary matrix, RandDiag takes 0.4 seconds, just as fast as the Hermitian eigendecomposition did.

>> tic; V_U = rand_diag(U); toc
Elapsed time is 0.437309 seconds.

He and Kressner’s algorithm is delightful. Ultimately, it uses randomness in only a small way. For most coefficients a_1,a_2 \in \real, a matrix Q diagonalizing a_1 H + a_2 S will also diagonalize A = H+iS. But, for any specific choice of a_1,a_2, there is a possibility of failure. To avoid this possibility, we can just pick a_1 and a_2 at random. It’s really as simple as that.

References: RandDiag was proposed in A simple, randomized algorithm for diagonalizing normal matrices by He and Kressner (2024), building on their earlier work in Randomized Joint Diagonalization of Symmetric Matrices (2022) which considers the general case of using random linear combinations to (approximately) simultaneous diagonalize (nearly) commuting matrices. RandDiag is an example of a linear algebraic algorithm that uses randomness to put the input into “general position”; see Randomized matrix computations: Themes and variations by Kireeva and Tropp (2024) for a discussion of this, and other, ways of using randomness to design matrix algorithms.

Leave a Reply

Your email address will not be published. Required fields are marked *