I am excited to share that my paper Does block size matter in randomized block Krylov low-rank approximation? has recently been released on arXiv. In that paper, we study the randomized block Krylov iteration (RBKI) algorithm for low-rank approximation. Existing results show that RBKI is efficient at producing rank-$k$ approximations with a large block size $b \ge k$ or a small block size $b = 1$, but these guarantees degrade for intermediate block sizes $1 < b < k$. Yet these intermediate block sizes are often the most efficient in practice. In our paper, we close this theoretical gap, showing that RBKI is efficient for any block size $b$. Check out the paper for details!
In our paper, the core technical challenge is understanding the condition number of a random block Krylov matrix of the form

\[K = \begin{bmatrix} \Omega & A\Omega & A^2\Omega & \cdots & A^{q-1}\Omega \end{bmatrix},\]

where $A$ is the input matrix and $\Omega$ is a random test matrix.
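To make this concrete, here is a minimal NumPy sketch that forms such a block Krylov matrix for a random symmetric matrix $A$ and a Gaussian starting block $\Omega$, then reports its condition number. (The dimensions, spectrum, and parameter values are illustrative choices of mine, not the setup from the paper.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, b, q = 100, 4, 5  # dimension, block size, number of Krylov blocks (illustrative)

# A random symmetric test matrix with geometrically decaying eigenvalues
eigvals = 0.9 ** np.arange(n)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T

Omega = rng.standard_normal((n, b))  # Gaussian starting block

# Block Krylov matrix K = [Omega, A @ Omega, ..., A^(q-1) @ Omega]
blocks = [Omega]
for _ in range(q - 1):
    blocks.append(A @ blocks[-1])
K = np.hstack(blocks)

print("condition number of K:", np.linalg.cond(K))
```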
Evaluating a Polynomial and Vandermonde Matrices
Let us begin with one of the humblest but most important characters in mathematics, the univariate polynomial

\[p_a(u) = a_1 + a_2 u + a_3 u^2 + \cdots + a_t u^{t-1} = \sum_{j=1}^t a_j u^{j-1}.\]
Given a polynomial, we often wish to evaluate it at a set of inputs. Specifically, let $\lambda_1, \ldots, \lambda_s$ be $s$ (distinct) input locations. If we evaluate $p_a$ at each location, we obtain a list of (output) values $f_1, \ldots, f_s$, each given by the formula
      ![Rendered by QuickLaTeX.com \[p_a(\lambda_i) = \sum_{j=1}^t \lambda_i^{j-1} a_j.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-eaaf53acb1f30e1ced80af4704800e02_l3.png)
Every linear transformation between vectors can be realized as a matrix–vector product, and the matrix for this coefficients-to-values map is called a Vandermonde matrix $V$. It is given by the formula
      ![Rendered by QuickLaTeX.com \[V= \begin{bmatrix} 1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{t-1} \\ 1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{t-1} \\ 1 & \lambda_3 & \lambda_3^2 & \cdots & \lambda_3^{t-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \lambda_s & \lambda_s^2 & \cdots & \lambda_s^{t-1} \end{bmatrix} \in \complex^{s\times t}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-2d100f803f43a4910ba6c03d2edde384_l3.png)
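As a quick sanity check of the coefficients-to-values map, here is a small NumPy sketch with made-up locations and coefficients; conveniently, `np.vander` with `increasing=True` builds exactly the matrix above:

```python
import numpy as np

lam = np.array([0.5, 1.0, 1.5, 2.0])  # locations lambda_1, ..., lambda_s (distinct)
a = np.array([1.0, -2.0, 0.5, 3.0])   # polynomial coefficients a_1, ..., a_t

V = np.vander(lam, increasing=True)   # rows: 1, lam_i, lam_i^2, ..., lam_i^(t-1)
f = V @ a                             # values f_i = p_a(lam_i)

# Compare against direct evaluation of the polynomial (coefficients low-to-high)
f_direct = np.polynomial.polynomial.polyval(lam, a)
assert np.allclose(f, f_direct)
```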
Interpolating by a Polynomial and Inverting a Vandermonde Matrix
Going forward, let us set $s = t$ so the number of locations equals the number of coefficients. The Vandermonde matrix maps the vector of coefficients $a = (a_1,\ldots,a_t)$ to the vector of values $f = (f_1,\ldots,f_t) = Va$. Its inverse $V^{-1}$ maps a set of values $f$ to a set of coefficients $a = V^{-1}f$ defining a polynomial $p_a$ which interpolates the values $f$:

\[p_a(\lambda_i) = f_i \quad \text{for each } i = 1,\ldots,t.\]
To solve the polynomial interpolation problem with Vandermonde matrices, we can do the following. Given values $f = (f_1,\ldots,f_t)$, we first solve the linear system of equations $Va = f$, obtaining a vector of coefficients $a = V^{-1}f$. Then, define the interpolating polynomial

 (1)   \[q(u) = \sum_{j=1}^t a_j u^{j-1} \quad \text{where } a = V^{-1}f.\]
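In code, this recipe is a single linear solve. A minimal sketch with made-up data (for many locations, more numerically stable interpolation schemes are preferable, as discussed later in this post):

```python
import numpy as np

lam = np.array([0.5, 1.0, 1.5, 2.0])  # distinct locations
f = np.array([2.0, -1.0, 0.0, 4.0])   # target values

V = np.vander(lam, increasing=True)
a = np.linalg.solve(V, f)             # coefficients of the interpolant

# q(u) = a_1 + a_2 u + ... + a_t u^(t-1) reproduces the data
assert np.allclose(np.polynomial.polynomial.polyval(lam, a), f)
```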
Lagrange Interpolation
Inverting the Vandermonde matrix is one way to solve the polynomial interpolation problem, but the problem can also be solved directly. To do so, first notice that we can construct a special polynomial $(u-\lambda_2)(u-\lambda_3)\cdots(u-\lambda_t)$ that is zero at the locations $\lambda_2,\ldots,\lambda_t$ but nonzero at the first location $\lambda_1$. (Remember that we have assumed that $\lambda_1,\ldots,\lambda_t$ are distinct.) Further, by rescaling this polynomial to

\[\ell_1(u) = \frac{(u-\lambda_2)(u-\lambda_3)\cdots(u-\lambda_t)}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_t)},\]

we obtain a polynomial that takes the value one at $\lambda_1$ and zero at the remaining locations. Repeating this construction for each location, we obtain the Lagrange polynomials

 (2)   \[\ell_i(u) = \frac{\prod_{k\ne i}(u-\lambda_k)}{\prod_{k\ne i}(\lambda_i-\lambda_k)} \quad \text{for } i = 1,\ldots,t.\]

These polynomials satisfy the delta property
      ![Rendered by QuickLaTeX.com \[\ell_i(\lambda_j) = \delta_{ij} = \begin{cases} 1, & i = j, \\ 0, & i\ne j. \end{cases}\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-64c4effeb0f12225a76059d89ccce692_l3.png)
With the Lagrange polynomials in hand, the polynomial interpolation problem is easy. To obtain a polynomial whose values are $f_1,\ldots,f_t$, simply multiply each Lagrange polynomial $\ell_i$ by the value $f_i$ and sum up, obtaining

 (3)   ![Rendered by QuickLaTeX.com \[q(u) = \sum_{i=1}^t f_i \ell_i(u).\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-c1c5043718b5df795a883a6a60d26aaf_l3.png)

Indeed, using the delta property of the Lagrange polynomials, we can check that $q$ interpolates the data:

      ![Rendered by QuickLaTeX.com \[q(\lambda_j) = \sum_{i=1}^t f_i \ell_i(\lambda_j) = \sum_{i=1}^t f_i \delta_{ij} = f_j. \]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-90032b377ebe250ef36dccba2186b6ce_l3.png)
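The Lagrange form (3) is equally easy to implement directly. This sketch (my own helper, not a library routine) evaluates $q$ via the Lagrange polynomials and checks it against the Vandermonde solve (1):

```python
import numpy as np

lam = np.array([0.5, 1.0, 1.5, 2.0])
f = np.array([2.0, -1.0, 0.0, 4.0])

def lagrange_eval(u, lam, f):
    """Evaluate the Lagrange interpolant q(u) = sum_i f_i * ell_i(u)."""
    total = 0.0
    for i in range(len(lam)):
        others = np.delete(lam, i)
        ell_i = np.prod(u - others) / np.prod(lam[i] - others)
        total += f[i] * ell_i
    return total

# Compare against the coefficients obtained from the Vandermonde system
V = np.vander(lam, increasing=True)
a = np.linalg.solve(V, f)
u = 1.234  # arbitrary evaluation point
assert np.isclose(lagrange_eval(u, lam, f),
                  np.polynomial.polynomial.polyval(u, a))
```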
From Lagrange to Vandermonde via Elementary Symmetric Polynomials
We now have two ways of solving the polynomial interpolation problem, the Vandermonde way (1) and the Lagrange way (3). Ultimately, the difference between these formulas is one of basis: The Vandermonde formula (1) expresses the interpolating polynomial $q$ as a linear combination of the monomials $1, u, \ldots, u^{t-1}$, and the Lagrange formula (3) expresses $q$ as a linear combination of the Lagrange polynomials $\ell_1,\ldots,\ell_t$. To convert between these formulas, we just need to express the Lagrange polynomial basis in the monomial basis.
To do so, let us examine the Lagrange polynomials more closely. Consider first the case $t = 5$, and consider the fourth unnormalized Lagrange polynomial

\[(u-\lambda_1)(u-\lambda_2)(u-\lambda_3)(u-\lambda_5).\]

Expanding out this product in powers of $u$, we obtain

\[\begin{aligned} u^4 &- (\lambda_1+\lambda_2+\lambda_3+\lambda_5)u^3 + (\lambda_1\lambda_2+\lambda_1\lambda_3+\lambda_1\lambda_5+\lambda_2\lambda_3+\lambda_2\lambda_5+\lambda_3\lambda_5)u^2 \\ &- (\lambda_1\lambda_2\lambda_3+\lambda_1\lambda_2\lambda_5+\lambda_1\lambda_3\lambda_5+\lambda_2\lambda_3\lambda_5)u + \lambda_1\lambda_2\lambda_3\lambda_5. \end{aligned}\]

The coefficients are, up to alternating signs, sums of all products of one, two, three, and four of the locations $\lambda_1, \lambda_2, \lambda_3, \lambda_5$.
Indeed, these expressions are special. Up to a plus-or-minus sign, they are called the elementary symmetric polynomials of the locations $\lambda_1, \lambda_2, \lambda_3, \lambda_5$. Specifically, given numbers $\mu_1,\ldots,\mu_k$, the $j$th elementary symmetric polynomial $e_j(\mu_1,\ldots,\mu_k)$ is defined as the sum of all products $\mu_{i_1}\mu_{i_2}\cdots\mu_{i_j}$ of $j$ of the values, i.e.,

\[e_j(\mu_1,\ldots,\mu_k) = \sum_{1\le i_1 < i_2 < \cdots < i_j \le k} \mu_{i_1}\mu_{i_2}\cdots\mu_{i_j}.\]
The elementary symmetric polynomials appear all the time in mathematics. In particular, they appear (up to sign) as the coefficients of the characteristic polynomial of a matrix and feature heavily in the theory of determinantal point processes. For our purposes, the key observation will be that the elementary symmetric polynomials appear whenever one expands out a product of linear functions like $(u+\mu_1)(u+\mu_2)\cdots(u+\mu_k)$:

Lemma 1 (Expanding a product of linear functions). It holds that

![Rendered by QuickLaTeX.com \[(u + \mu_1) (u + \mu_2) \cdots (u+\mu_k) = \sum_{j=0}^k e_{k-j}(\mu_1,\ldots,\mu_k) u^j.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-fe26dd7ba2e215fba4fea5cd28147cb2_l3.png)
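Lemma 1 is easy to verify numerically. The sketch below computes each $e_j$ by brute force over all $j$-element subsets and compares against the coefficients of the expanded product (computed with NumPy's polynomial utilities):

```python
import numpy as np
from itertools import combinations
from math import prod

mu = [2.0, 3.0, 5.0, 7.0]  # arbitrary numbers mu_1, ..., mu_k
k = len(mu)

def esp(j, vals):
    """j-th elementary symmetric polynomial: sum of all j-fold products."""
    return sum(prod(c) for c in combinations(vals, j))

# Coefficients of (u + mu_1)(u + mu_2)...(u + mu_k), lowest degree first
coeffs = np.polynomial.polynomial.polyfromroots([-m for m in mu])

for j in range(k + 1):
    # The coefficient of u^j is e_{k-j}(mu_1, ..., mu_k)
    assert np.isclose(coeffs[j], esp(k - j, mu))
```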
Using this fact, we obtain an expression for the Lagrange polynomials in the monomial basis. Let $\lambda_{-i} = (\lambda_1,\ldots,\lambda_{i-1},\lambda_{i+1},\ldots,\lambda_t)$ denote the list of locations with $\lambda_i$ removed. Then the $i$th Lagrange polynomial is given by
      ![Rendered by QuickLaTeX.com \[\ell_i(u) = \frac{\sum_{j=1}^t e_{t-j}(-\lambda_{-i})u^{j-1}}{\prod_{k\ne i} (\lambda_i - \lambda_k)}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-15d6cb602dcf916e14363b25a013870d_l3.png)
Indeed, we can write the interpolating polynomial as
      ![Rendered by QuickLaTeX.com \[q(u) = \sum_{i=1}^t f_i\ell_i(u) = \sum_{i=1}^t \sum_{j=1}^t f_i \frac{e_{t-j}(-\lambda_{-i})}{\prod_{k\ne i} (\lambda_i - \lambda_k)} u^{j-1}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-17012971c8949a309b08ae230bf7fa94_l3.png)
Exchanging the order of summation and collecting terms by powers of $u$, we obtain

      ![Rendered by QuickLaTeX.com \[q(u) = \sum_{j=1}^t \sum_{i=1}^t f_i \frac{e_{t-j}(-\lambda_{-i})}{\prod_{k\ne i} (\lambda_i - \lambda_k)} u^{j-1} = \sum_{j=1}^t \left(\sum_{i=1}^t \frac{e_{t-j}(-\lambda_{-i})}{\prod_{k\ne i} (\lambda_i - \lambda_k)}\cdot f_i \right) u^{j-1}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-10960bce48c6a124c253db4c1272f2e2_l3.png)

Matching coefficients with $q(u) = \sum_{j=1}^t a_j u^{j-1}$, we read off

      ![Rendered by QuickLaTeX.com \[a_j = \sum_{i=1}^t \frac{e_{t-j}(-\lambda_{-i})}{\prod_{k\ne i} (\lambda_i - \lambda_k)}\cdot f_i.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-2bb3d1280c99ca124084f031ed343679_l3.png)

Since $a = V^{-1}f$ and this identity holds for every vector of values $f$, relabeling indices gives the entries of the inverse Vandermonde matrix:

 (4)   \[(V^{-1})_{ij} = \frac{e_{t-i}(-\lambda_{-j})}{\prod_{k\ne j}(\lambda_j - \lambda_k)}.\]
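Formula (4) can be checked entry-by-entry against a numerical inverse. A sketch, reusing the brute-force elementary symmetric polynomial helper from before:

```python
import numpy as np
from itertools import combinations
from math import prod

lam = np.array([0.5, 1.0, 1.5, 2.0])
t = len(lam)

def esp(j, vals):
    """j-th elementary symmetric polynomial (redefined for self-containment)."""
    return sum(prod(c) for c in combinations(list(vals), j))

Vinv = np.linalg.inv(np.vander(lam, increasing=True))

for i in range(t):      # 0-based row index; the text's i is (i + 1)
    for j in range(t):  # 0-based column index
        lam_minus_j = np.delete(lam, j)
        entry = esp(t - (i + 1), -lam_minus_j) / np.prod(lam[j] - lam_minus_j)
        assert np.isclose(Vinv[i, j], entry)
```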
Vandermonde Matrices are Merely Exponentially Ill-Conditioned
Vandermonde matrices are notoriously ill-conditioned, meaning that small changes to the values $f$ can cause large changes to the coefficients $a = V^{-1}f$. On its face, this might seem to suggest that the problem of polynomial interpolation itself is ill-conditioned, but this is too hasty a conclusion. After all, it is only the mapping from values $f$ to coefficients $a$ in the monomial basis that is ill-conditioned. Fortunately, there are much better, more numerically stable bases for representing a polynomial, such as the Chebyshev polynomials.

But these more stable methods of polynomial interpolation and approximation are not the subject of this post: Here, our task will be to characterize just how ill-conditioned the computation of $a = V^{-1}f$ is. To characterize this ill-conditioning, we will utilize the condition number of the matrix $V$. Given a norm $\norm{\cdot}$, the condition number of $V$ is defined to be

\[\kappa(V) = \norm{V} \cdot \norm{\smash{V^{-1}}}.\]

Throughout, we will work with the 1-norm $\norm{x}_1 = \sum_{i=1}^t |x_i|$, whose induced matrix norm is the maximum absolute column sum:

 (5)   \[\norm{B}_1 = \max_{1\le j\le t} \sum_{i=1}^t |B_{ij}|.\]
Bounding $\norm{V}_1$ is straightforward. Indeed, setting $M = \max_{1\le i\le t} |\lambda_i|$ and using (5), we compute
      ![Rendered by QuickLaTeX.com \[\norm{V}_1= \max_{1\le j \le t} \sum_{i=1}^t |\lambda_i|^{j-1} \le \max_{1\le j \le t} tM^{j-1} = t\max\{1,M^{t-1}\}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-ec5f488cb8233df69dd22a00e9bae784_l3.png)
The harder task is bounding $\norm{\smash{V^{-1}}}_1$. Fortunately, we have already done most of the hard work needed to bound this quantity. Using our expression (4) for the entries of $V^{-1}$ and the formula (5) for the 1-norm, we have
      ![Rendered by QuickLaTeX.com \[\norm{\smash{V^{-1}}}_1= \max_{1\le j \le t} \sum_{i=1}^t |(V^{-1})_{ij}| = \max_{1\le j \le t}\frac{ \sum_{i=1}^t |e_{t-i}(-\lambda_{-j})|}{\prod_{k\ne j} |\lambda_j- \lambda_k|}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-b431915bf4dd9e0a3087cd91f5e3533a_l3.png)
By the triangle inequality, $|e_{t-i}(-\lambda_{-j})| \le e_{t-i}(|\lambda_{-j}|)$, where $|\lambda_{-j}|$ denotes the list $\lambda_{-j}$ with each entry replaced by its absolute value. Therefore,
      ![Rendered by QuickLaTeX.com \[\norm{\smash{V^{-1}}}_1\le \max_{1\le j \le t}\frac{ \sum_{i=1}^t e_{t-i}(|\lambda_{-j}|)}{\prod_{k\ne j} |\lambda_j- \lambda_k|}.\]](https://www.ethanepperly.com/wp-content/ql-cache/quicklatex.com-34c4b502a43c91513781e51c25190ad1_l3.png)
Finally, invoking Lemma 1 with $u = 1$ gives the identity $\sum_{i=1}^t e_{t-i}(|\lambda_{-j}|) = \prod_{k\ne j}(1+|\lambda_k|)$, and we arrive at Gautschi's bound:

 (6)   \[\norm{\smash{V^{-1}}}_1 \le \max_{1\le j\le t} \prod_{k\ne j} \frac{1+|\lambda_k|}{|\lambda_j - \lambda_k|}.\]
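Here is a quick numerical check of the bound (6) on made-up locations:

```python
import numpy as np

lam = np.array([0.5, 1.0, 1.5, 2.0])
t = len(lam)

# Exact 1-norm of the inverse Vandermonde matrix
Vinv_norm1 = np.linalg.norm(np.linalg.inv(np.vander(lam, increasing=True)), 1)

# Gautschi's bound: max over j of prod_{k != j} (1 + |lam_k|) / |lam_j - lam_k|
bound = max(
    np.prod((1 + np.abs(np.delete(lam, j))) / np.abs(lam[j] - np.delete(lam, j)))
    for j in range(t)
)

print(f"norm(V^-1)_1 = {Vinv_norm1:.3f} <= {bound:.3f}")
assert Vinv_norm1 <= bound
```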
Often, it is helpful to simplify Gautschi's bound a bit. Setting $M = \max_{1\le i\le t}|\lambda_i|$ as above, the numerator is bounded as $\prod_{k\ne j}(1+|\lambda_k|) \le (1+M)^{t-1}$. To bound the denominator, let $\delta = \min_{j\ne k}|\lambda_j - \lambda_k|$ be the smallest distance between two locations, so that $\prod_{k\ne j}|\lambda_j - \lambda_k| \ge \delta^{t-1}$. With these two estimates, we can weaken the bound (6) to obtain

\[\norm{\smash{V^{-1}}}_1 \le \left(\frac{1+M}{\delta}\right)^{t-1}.\]

Combining with our bound on $\norm{V}_1$, we obtain a bound on the condition number $\kappa_1(V) = \norm{V}_1 \norm{\smash{V^{-1}}}_1$. We summarize these conclusions in a theorem:

Theorem 2 (Gautschi's bound, simplified). Introduce $M = \max_{1\le i\le t}|\lambda_i|$ and $\delta = \min_{j\ne k}|\lambda_j - \lambda_k|$. Then

\[\norm{\smash{V^{-1}}}_1 \le \left(\frac{1+M}{\delta}\right)^{t-1}\]

and

\[\kappa_1(V) \le t\max\{1, M^{t-1}\} \cdot \left(\frac{1+M}{\delta}\right)^{t-1}.\]
Gautschi’s bound suggests that Vandermonde matrices can be very ill-conditioned, which is disappointing. But Gautschi’s bound also shows that Vandermonde matrices are merely exponentially ill-conditioned; that is, their condition number grows no worse than exponentially in the number of locations $t$. The fact that Vandermonde matrices are only exponentially ill-conditioned plays a crucial role in our analysis of randomized block Krylov iteration.
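To see this exponential growth concretely, the following sketch computes $\kappa_1(V)$ for equispaced locations on $[0,1]$ (for which $M = 1$ and $\delta = 1/(t-1)$) alongside the bound from Theorem 2:

```python
import numpy as np

# Condition number growth for equispaced locations in [0, 1] (illustrative)
for t in [5, 10, 15, 20]:
    lam = np.linspace(0.0, 1.0, t)
    V = np.vander(lam, increasing=True)
    # Note: for larger t, the computed inverse itself becomes inaccurate
    kappa1 = np.linalg.norm(V, 1) * np.linalg.norm(np.linalg.inv(V), 1)
    M, delta = 1.0, 1.0 / (t - 1)
    bound = t * max(1.0, M ** (t - 1)) * ((1 + M) / delta) ** (t - 1)
    print(f"t = {t:2d}: kappa_1 = {kappa1:.3e}, Theorem 2 bound = {bound:.3e}")
```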
Gautschi’s Bound as a Robust Version of the Fundamental Theorem of Algebra
The fundamental theorem of algebra implies that a nonzero polynomial of degree $t-1$ has at most $t-1$ roots. Consequently, given $t$ distinct locations, such a polynomial must be nonzero at at least one of them. But how nonzero must the polynomial be at that location? How large must it be? On this question, the fundamental theorem of algebra is silent. However, Gautschi’s bound provides an answer.
To answer this question, we ask: What is the minimum possible size $\norm{f}_1$ of the vector of values $f = (p_a(\lambda_1),\ldots,p_a(\lambda_t))$? Well, if we set all the coefficients $a$ to zero, then $f = 0$ as well. So to avoid this trivial case, we should enforce a normalization condition on the coefficient vector $a$, say $\norm{a}_1 = 1$. With this setting, we are ready to compute. Begin by observing that

\[1 = \norm{a}_1 = \norm{\smash{V^{-1}f}}_1 \le \norm{\smash{V^{-1}}}_1 \norm{f}_1.\]

Rearranging, we obtain the lower bound

\[\norm{f}_1 \ge \frac{1}{\norm{\smash{V^{-1}}}_1}.\]

In fact, this argument establishes a general fact about the minimum amount an invertible matrix can stretch a vector.
Proposition 3 (Minimum stretch). For a vector norm $\norm{\cdot}$ and its induced operator norm, it holds for any square invertible matrix $B$ that

\[\min_{\norm{x} = 1} \norm{Bx} = \frac{1}{\norm{\smash{B^{-1}}}}.\]
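A numerical illustration of Proposition 3 in the 1-norm: random unit vectors never stretch by less than $1/\norm{\smash{B^{-1}}}_1$, and the minimum is attained by a normalization of the column of $B^{-1}$ with largest 1-norm:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))  # a random matrix, invertible almost surely

Binv = np.linalg.inv(B)
min_stretch = 1 / np.linalg.norm(Binv, 1)

# Random vectors with ||x||_1 = 1 never stretch less than the minimum
for _ in range(1000):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x, 1)
    assert np.linalg.norm(B @ x, 1) >= min_stretch - 1e-12

# The minimum is attained: take the column j* maximizing ||B^{-1} e_j||_1,
# normalized; then B x = e_{j*} / ||B^{-1} e_{j*}||_1 has 1-norm 1/||B^{-1}||_1
jstar = np.argmax(np.abs(Binv).sum(axis=0))
x = Binv[:, jstar] / np.linalg.norm(Binv[:, jstar], 1)
assert np.isclose(np.linalg.norm(B @ x, 1), min_stretch)
```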
Using this result, we obtain the lower bound $\norm{f}_1 \ge 1/\norm{\smash{V^{-1}}}_1$ on the values $f = Va$ of a polynomial with coefficients normalized so that $\norm{a}_1 = 1$. Combining with Gautschi’s bound gives the following robust version of the fundamental theorem of algebra:
Theorem 4 (Robust fundamental theorem of algebra). Fix a polynomial $p_a(u) = \sum_{j=1}^t a_j u^{j-1}$ with coefficients normalized so that $\norm{a}_1 = 1$, and fix distinct locations $\lambda_1,\ldots,\lambda_t$. Define $M = \max_{1\le i\le t}|\lambda_i|$ and $\delta = \min_{j\ne k}|\lambda_j - \lambda_k|$. Then

\[\sum_{i=1}^t |p_a(\lambda_i)| = \norm{f}_1 \ge \left(\frac{\delta}{1+M}\right)^{t-1}.\]
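Finally, a numerical check of Theorem 4 with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.5, 1.0, 1.5, 2.0])  # distinct locations
t = len(lam)

a = rng.standard_normal(t)
a /= np.linalg.norm(a, 1)             # normalize so ||a||_1 = 1

values = np.vander(lam, increasing=True) @ a  # f_i = p_a(lam_i)

M = np.max(np.abs(lam))
delta = min(abs(lam[j] - lam[k]) for j in range(t) for k in range(j + 1, t))

# The sum of absolute values is at most exponentially small in t
assert np.sum(np.abs(values)) >= (delta / (1 + M)) ** (t - 1)
```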
Thus, at $t$ distinct locations, a degree-$(t-1)$ polynomial must be nonzero at at least one point. In fact, the sum of the absolute values at these $t$ locations can be no worse than exponentially small in $t$.