A software engineer's circuitous journey to calculate eigenvalues
Or, how to calculate the eigenvalues and eigenvectors of a complex matrix using a routine that only works on real matrices.
By Robert Smith
If we have a complex matrix, how do we calculate its eigenvalues and eigenvectors using a procedure that only works on real matrices? This post recounts my journey of solving that problem, and where the problem came from in the first place.
TL;DR: If you came here just looking for an algorithm, scroll to the bottom for a pseudocode listing.
Why?
MAGICL is a Common Lisp library for doing matrix arithmetic. To make a long story short, there's a desire to reduce MAGICL's dependence on foreign libraries (e.g., LAPACK) and instead use pure Common Lisp routines. Except implementing numerical linear algebra is difficult, and the MAGICL maintainers usually have more important things to work on. So, instead of writing routines from scratch out of textbooks, we sometimes resort to mechanically translating an old distribution of LAPACK, written in FORTRAN 77, into Common Lisp. Due to the age of these routines, I personally think it's prudent to minimize their usage.
One routine that's generally reliable, both as FORTRAN 77 code and as its mechanically translated Common Lisp counterpart, is DGEEV, a LAPACK routine to compute eigenvalues and eigenvectors of a general real matrix of double-precision floats. (We'll call a set of eigenvectors and the corresponding eigenvalues the eigensystem.) This routine is made nice in MAGICL and exposed as the Lisp function MAGICL:EIG.
The MAGICL:EIG function, however, is required to work with both real and complex matrices, yet DGEEV only works with reals. So, we are left with two reasonable options:
- Get mechanical translation of complex-matrix BLAS and LAPACK functions working so that we can call the complex-matrix counterpart of DGEEV, called ZGEEV, or
- Figure out how to use only DGEEV to somehow compute the eigensystem of a complex matrix.
Because I like puzzles, and because I'm loath to add to the existing, virtually unmaintainable pile of 50,000 lines of mechanically translated code, I opted for the latter option.
Complex numbers as matrices
Complex numbers look like pairs of real numbers that have funny rules for addition and multiplication. More precisely, complex numbers form a two-dimensional real vector space, and multiplication by a fixed complex number is in fact an $\mathbb{R}$-linear map on that space. We thus immediately conclude that multiplication by $a+bi$ can be written as a $2\times 2$ real matrix. This is to say that the multiplication map can be represented as

$$a + bi \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$

(This representation is not unique. We could also exchange the positions of $b$ and $-b$.)

One can verify that matrices of this form can be both added and multiplied, and the arithmetic works exactly as it does for complex numbers.
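As a quick sanity check, here is a small NumPy sketch (the name `to_mat` is my own) showing that the representation respects both addition and multiplication:

```python
import numpy as np

def to_mat(z):
    """Represent the complex number z = a+bi as the real matrix [[a, b], [-b, a]]."""
    a, b = z.real, z.imag
    return np.array([[a, b], [-b, a]])

z, w = 3 + 4j, -1 + 2j
# The representation commutes with complex addition and multiplication.
assert np.allclose(to_mat(z) + to_mat(w), to_mat(z + w))
assert np.allclose(to_mat(z) @ to_mat(w), to_mat(z * w))
```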
Complex matrices as real matrices
We have given a way to identify elements of $\mathbb{C}$ with certain $2\times 2$ real matrices. We can apply the same identification entry-wise to a complex matrix: replace each complex entry of an $n\times n$ complex matrix with its corresponding $2\times 2$ real block, yielding a $2n\times 2n$ real matrix. For example, this is a transformation from an element of $\mathbb{C}^{2\times 2}$ to an element of $\mathbb{R}^{4\times 4}$.
Due to how matrix arithmetic works with block matrices (which these real matrices essentially are), we at least get ordinary addition and multiplication in this representation.
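Here is a minimal NumPy sketch of this embedding (the name `realify` is my own), along with a check that block arithmetic does give us ordinary addition and multiplication:

```python
import numpy as np

def realify(U):
    """Embed an n x n complex matrix as a 2n x 2n real matrix by
    expanding each entry a+bi into the 2x2 block [[a, b], [-b, a]]."""
    n = U.shape[0]
    V = np.empty((2 * n, 2 * n))
    V[0::2, 0::2] = U.real   # top-left entry of each block
    V[0::2, 1::2] = U.imag   # top-right
    V[1::2, 0::2] = -U.imag  # bottom-left
    V[1::2, 1::2] = U.real   # bottom-right
    return V

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
# The embedding is a ring homomorphism on matrices.
assert np.allclose(realify(A + B), realify(A) + realify(B))
assert np.allclose(realify(A @ B), realify(A) @ realify(B))
```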
Does this mean we can now just apply any linear algebra routine to these real matrices and get the same answer as we would for complex matrices? Unfortunately not, and eigensystems are a perfect example of why. How many eigenvalues (at most) does an operator on an $n$-dimensional space have? At most $n$. But its real counterpart is a $2n\times 2n$ matrix, which may have up to $2n$ eigenvalues.
Some experimentation
I was experimenting with computing eigensystems of complex matrices and their corresponding real variants, and noticed a pattern. If the complex matrix had a real eigenvalue, it would also show up as an eigenvalue (of double multiplicity) of the real matrix. If the complex matrix had a complex eigenvalue, then it would show up as an eigenvalue of the real matrix, along with its complex conjugate.
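This pattern is easy to reproduce. A small NumPy sketch (the `realify` embedding and its name are my own) on a matrix with one real and one complex eigenvalue:

```python
import numpy as np

def realify(U):
    """Expand each entry a+bi of U into the real 2x2 block [[a, b], [-b, a]]."""
    n = U.shape[0]
    V = np.empty((2 * n, 2 * n))
    V[0::2, 0::2], V[0::2, 1::2] = U.real, U.imag
    V[1::2, 0::2], V[1::2, 1::2] = -U.imag, U.real
    return V

U = np.array([[1.0 + 2.0j, 0.0], [0.0, 3.0]])  # eigenvalues: 1+2i and 3
mus = np.linalg.eigvals(realify(U))
# The real matrix's spectrum is {1+2i, 1-2i, 3, 3}: the complex eigenvalue
# appears with its conjugate, the real one with double multiplicity.
assert np.allclose(np.sort_complex(mus), [1 - 2j, 1 + 2j, 3, 3])
```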
This led me to the following conjecture:
Conjecture: Let $U$ be an $n\times n$ complex matrix and let $V$ be its $2n\times 2n$ real counterpart. Then the eigenvalues of $V$ are exactly the eigenvalues of $U$ together with their complex conjugates.
With this conjecture and my chin up, I could implement a routine to compute eigenvalues of a complex matrix $U$ from the output of DGEEV on $V$.

First, let $\mu_1, \ldots, \mu_{2n}$ be the eigenvalues of $V$ as computed by DGEEV. Second, recall that the non-real eigenvalues of a real matrix always come in conjugate pairs, and if the conjecture is true, each such pair contains exactly one eigenvalue of $U$. Both members of a pair share the same real part.

This fact isn't computationally useless; it can be verified as a sanity check in code immediately, because there is no ambiguity in the real parts, if the conjecture is true.

The imaginary part is a little trickier, since there is ambiguity stemming from uncertainty around which conjugate is actually an eigenvalue of $U$: given the pair $\{a+bi, a-bi\}$, which one belongs to $U$? (The ordering of the eigenvalues returned by DGEEV tells us nothing here.)

Though asymptotically inefficient, the values for the imaginary parts can be determined by testing each candidate, e.g., by checking which choice makes $U - \mu I$ singular.
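As a sketch of that (inefficient) disambiguation step, one could judge each candidate by how close $U - \mu I$ is to singular, via its smallest singular value. The helper name here is my own:

```python
import numpy as np

def pick_conjugate(U, mu):
    """Of mu and its conjugate, return whichever is (numerically) an
    eigenvalue of U, judged by the smallest singular value of U - mu*I."""
    I = np.eye(U.shape[0])
    candidates = [mu, np.conj(mu)]
    # The smallest singular value is ~0 exactly when the shifted matrix is singular.
    sigmas = [np.linalg.svd(U - c * I, compute_uv=False)[-1] for c in candidates]
    return candidates[int(np.argmin(sigmas))]

U = np.array([[1.0 + 2.0j, 0.0], [0.0, 3.0 - 1.0j]])  # eigenvalues: 1+2i, 3-i
assert pick_conjugate(U, 1 - 2j) == 1 + 2j
assert pick_conjugate(U, 3 + 1j) == 3 - 1j
```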
While perhaps a neat conjecture, this wasn’t very satisfying to me. I simultaneously hadn’t fully solved what I set out to solve (finding the eigensystem, not just the eigenvalues), and I had a nagging conjecture that exposed my lack of knowledge about the problem.
Proving the conjecture
Ultimately, I had to go back to basics, ask my friends and family for help, and try to break through. I figured that while the conjecture wasn’t ultimately very computationally useful on its own, if I could prove it, maybe it would give me enough insight mathematically and computationally to come up with something better.
After some trial and error, I settled on trying to use the following fact.
Fact: Let $A$ and $B$ be square matrices. Then the eigenvalues of the block-diagonal matrix $A \oplus B$ (the direct sum of $A$ and $B$) are the eigenvalues of $A$ together with the eigenvalues of $B$.

This is readily seen by computing the characteristic polynomial. Let $p_M(\lambda) = \det(M - \lambda I)$. Since the determinant of a block-diagonal matrix is the product of the determinants of its blocks, $p_{A\oplus B}(\lambda) = p_A(\lambda)\,p_B(\lambda)$, so the roots of $p_{A\oplus B}$ are the roots of $p_A$ together with the roots of $p_B$.

This ended up being a crucial insight, as we'll see.
Another fact I needed was the following. As a matter of notation, let $\bar{A}$ denote the entry-wise complex conjugate of $A$.

Fact: Let $A$ be a square complex matrix. Then the eigenvalues of $\bar{A}$ are the complex conjugates of the eigenvalues of $A$.

This is seen by, again, looking at the characteristic polynomial and using the properties of complex conjugation: $p_{\bar A}(\bar\lambda) = \det(\bar A - \bar\lambda I) = \overline{\det(A - \lambda I)} = \overline{p_A(\lambda)}$, which vanishes exactly when $p_A(\lambda)$ does.
These facts were enough for me to refine the conjecture:
Conjecture (redux): Let $U$ be an $n\times n$ complex matrix and let $V$ be its $2n\times 2n$ real counterpart. Then $V$ is similar to the direct sum $U \oplus \bar{U}$.
This new conjecture is equivalent to the old one by way of those two facts.
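Both facts can be spot-checked numerically; a small NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

# Fact 1: the spectrum of the direct sum A (+) B is the spectra of A and B combined.
direct_sum = np.zeros((5, 5), dtype=complex)
direct_sum[:3, :3] = A
direct_sum[3:, 3:] = B
lhs1 = np.sort_complex(np.linalg.eigvals(direct_sum))
rhs1 = np.sort_complex(np.concatenate([np.linalg.eigvals(A), np.linalg.eigvals(B)]))
assert np.allclose(lhs1, rhs1)

# Fact 2: the spectrum of the entry-wise conjugate is the conjugated spectrum.
lhs2 = np.sort_complex(np.linalg.eigvals(np.conj(A)))
rhs2 = np.sort_complex(np.conj(np.linalg.eigvals(A)))
assert np.allclose(lhs2, rhs2)
```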
Now, things started to look good. If I could find a similarity transform taking $V$ to $U \oplus \bar{U}$, the new conjecture would be proved.

Since we are "allowed" to work over $\mathbb{C}$ (similarity over $\mathbb{C}$ preserves eigenvalues just as well), the transform itself doesn't have to be real, even though $V$ is.

Again, after trial and error, I found that the unitary matrix

$$Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}$$

diagonalizes our $2\times 2$ representation of a complex number:

$$Q^{-1}\begin{pmatrix} a & b \\ -b & a \end{pmatrix}Q = \begin{pmatrix} a+bi & 0 \\ 0 & a-bi \end{pmatrix}.$$
Not until I constructed this matrix, let's call it $Q$, did I have a big "aha" moment. I was hitherto so focused on the (wrong) idea that our complex-to-real transformation was unique or canonical. I "knew" that we could choose either position for $b$ and $-b$, but I hadn't internalized what that freedom was telling me: the real representation carries both $a+bi$ and its conjugate $a-bi$ at once.
At this point, it was obvious what to do. This was the critical insight.
Our matrix $V$ is a block matrix, and we can apply $Q$ to every $2\times 2$ block at once with the block-diagonal matrix $T = I_n \otimes Q$, i.e., $n$ copies of $Q$ along the diagonal. Multiplication of these matrices, $T^{-1} V T$, will be a block matrix equivalent to substituting each disjoint $2\times 2$ block of $V$, representing the entry $a+bi$ of $U$, with

$$\begin{pmatrix} a+bi & 0 \\ 0 & a-bi \end{pmatrix}.$$

We're still not where we want to be. This transformed matrix will be a checkerboard pattern of entries of $U$ and entries of $\bar{U}$: entries of $U$ at the 1st, 3rd, 5th, ... rows and columns, and entries of $\bar{U}$ at the 2nd, 4th, 6th, ... ones.

We can recover the matrix for $U \oplus \bar{U}$ by conjugating with the permutation matrix $P$ that gathers the odd-positioned rows and columns first and the even-positioned ones last.

And with that, we have a similarity transform:

$$(TP)^{-1}\, V\, (TP) = U \oplus \bar{U}.$$
Since eigenvalues are preserved under similarity, we’ve proved the conjecture.
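The whole chain can be verified numerically. In the sketch below, `realify`, `Q`, `T`, and `P` are my own names for these objects, and we check that conjugating the real embedding really does produce the direct sum of $U$ and its conjugate:

```python
import numpy as np

def realify(U):
    """Expand each entry a+bi of U into the real 2x2 block [[a, b], [-b, a]]."""
    n = U.shape[0]
    V = np.empty((2 * n, 2 * n))
    V[0::2, 0::2], V[0::2, 1::2] = U.real, U.imag
    V[1::2, 0::2], V[1::2, 1::2] = -U.imag, U.real
    return V

rng = np.random.default_rng(2)
n = 3
U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = realify(U)

Q = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)
T = np.kron(np.eye(n), Q)               # apply Q to every 2x2 block
perm = list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2))
P = np.eye(2 * n)[:, perm]              # un-interleave the checkerboard

S = T @ P
D = np.linalg.inv(S) @ V @ S            # should be the direct sum U (+) conj(U)
expected = np.zeros((2 * n, 2 * n), dtype=complex)
expected[:n, :n] = U
expected[n:, n:] = np.conj(U)
assert np.allclose(D, expected)
```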
Revisiting the computation
We have proved the conjecture, but does that actually get us any further in our quest to compute the eigensystem of a complex matrix using an algorithm for real matrices?
It does; we now know that the eigenvalues of $V$ are exactly the eigenvalues of $U$ together with their complex conjugates. But similarity relates more than eigenvalues; it relates eigenvectors, too.
If we return to thinking about direct sums, then the eigenvectors of a matrix $A \oplus B$ are the eigenvectors of $A$ (padded with zeros at the bottom) together with the eigenvectors of $B$ (padded with zeros at the top).

As such, in our block-diagonal basis, the eigenvectors of $U \oplus \bar{U}$ split into eigenvectors of $U$ and eigenvectors of $\bar{U}$, and the eigenvectors of $V$ are these vectors carried back through the similarity transform.
All that’s left to determine is: Which eigenvector is the right one without doing a costly similarity transform?
To do this, we "disembed" the eigenvector from the vector space of $V$. Suppose DGEEV hands us a conjugate pair of eigenvectors $x$ and $\bar{x}$ of $V$, where these correspond to eigenvalues $\mu$ and $\bar{\mu}$. Two things stand out:

- The second vector is the entry-wise conjugate of the first vector, directly suggesting they're each drawn from either of the eigenvector sets of $U$ or $\bar{U}$, and
- each pair of entries in each vector corresponds to our basis vectors of our ambient vector space.
Also, notice the resemblance between pairs of entries in our first ("true") eigenvector and our complex number representation: a pair of real coordinates plays the role of a single complex coordinate.

Taking either vector, we wish to annihilate the "wrong" one and send the "right" one to the space of $\mathbb{C}^n$. We do this by mapping each pair of entries $(x_{2k}, x_{2k+1})$, for integers $0 \le k < n$, to the single complex entry

$$y_k = x_{2k} - i\,x_{2k+1}.$$

If $y$ is the zero vector, we picked the "wrong" member of the conjugate pair; otherwise, $y$ is an eigenvector of $U$ with eigenvalue $\mu$.
The final pseudocode
In summary, our algorithm to compute the eigensystem of a complex matrix is as follows.
INPUT:
n : an integer, the dimension of the problem
U : an n x n matrix of complex numbers
OUTPUT:
Lambda : a list of complex numbers, eigenvalues of U
Y : a list of complex n-vectors, eigenvectors of U
Step 1:
V : a 2n x 2n matrix of real numbers
Let V = a block matrix constructed by
expanding each element a+bi of U
into a matrix [a, b; -b, a]
Step 2:
Mu : a list of complex numbers
X : a list of complex vectors of dimension n
Let Mu, X = eigenvalues and eigenvectors of V
using a program to compute eigensystems
of real matrices
Step 3:
Initialize Lambda = [] and Y = []
For mu, x in Mu, X:
y : a complex n-vector
For k from 0 to n-1:
Let y[k] = x[2*k] - i*x[2*k+1]
If y is a non-zero vector:
Push mu onto Lambda
Push y onto Y
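Here is a working sketch of this pseudocode in NumPy (the names are my own). Conveniently, `np.linalg.eig` on a real array dispatches to LAPACK's DGEEV under the hood, so the real-matrix eigensolver in Step 2 is literally the routine discussed above:

```python
import numpy as np

def realify(U):
    """Step 1: expand each entry a+bi of U into the real block [[a, b], [-b, a]]."""
    n = U.shape[0]
    V = np.empty((2 * n, 2 * n))
    V[0::2, 0::2], V[0::2, 1::2] = U.real, U.imag
    V[1::2, 0::2], V[1::2, 1::2] = -U.imag, U.real
    return V

def eig_via_dgeev(U, tol=1e-10):
    """Eigenvalues and eigenvectors of a complex matrix U, using only a
    real-matrix eigensolver (np.linalg.eig on a real array calls DGEEV)."""
    mus, xs = np.linalg.eig(realify(U))   # Step 2: real-matrix eigensystem
    Lambda, Y = [], []
    for mu, x in zip(mus, xs.T):          # Step 3: disembed each eigenvector
        y = x[0::2] - 1j * x[1::2]        # pairs of entries -> complex entries
        if np.linalg.norm(y) > tol:       # zero means the "wrong" conjugate
            Lambda.append(mu)
            Y.append(y / np.linalg.norm(y))
    return Lambda, Y

rng = np.random.default_rng(3)
U = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Lambda, Y = eig_via_dgeev(U)
# Each recovered (mu, y) is a genuine eigenpair of U ...
assert all(np.allclose(U @ y, mu * y) for mu, y in zip(Lambda, Y))
# ... and the recovered eigenvalues match a direct complex eigensolve.
assert np.allclose(np.sort_complex(np.array(Lambda)),
                   np.sort_complex(np.linalg.eig(U)[0]))
```

Note that for a matrix with real eigenvalues (double multiplicity in the embedding), this sketch may report duplicate eigenpairs, just as the pseudocode would.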
Much ado about nothing?
While this was an interesting puzzle, and leads to working code, was it all worth it? Honestly, I’m not sure it’s the best engineering decision. What happens when we need singular-value decomposition, or some other goofy matrix algorithm? It’s difficult to imagine the complex embedding trick will work well.
On the other hand, it saves us from using more antiquated FORTRAN 77 code than we need to. :)
Thanks to Juan Bello-Rivas, Erik Davis, Bryan Fong, Brendan Pawlowski, and Eric Peterson for insightful discussions.