A tutorial quantum interpreter in 150 lines of Lisp
By Robert Smith
Simulating a universal, gate-based quantum computer on a classical computer has many uses and benefits. The top benefit is the ability to inspect the amplitudes of the system’s state directly. However, while the mathematics is very well understood, implementing a general-purpose simulator has largely been folk knowledge. In this tutorial, we show how to build an interpreter for a general-purpose quantum programming language called $\mathscr{L}$, capable of executing most kinds of quantum circuits found in literature. It is presented economically, allowing its implementation to take fewer than 150 lines of self-contained Common Lisp code. The language $\mathscr{L}$ is very simple to extend, making the interpreter ripe for testing different kinds of behavior, such as noise models.
Introduction
Simulating the workings of an ideal quantum computer has many important applications, such as algorithms research and quantum program debugging. A variety of quantum computer simulators exist, both free and commercial. However, while the concept of the simulation of quantum computers is generally well understood at a high level, the devil is in the details when it comes to implementation.
Quantum computer simulators found in the wild often have many limitations. The most prevalent limitation is the number of qubits an operator can act on. Usually, one-qubit gates and controlled one-qubit^{1} gates are allowed, but nothing more. While these together are sufficient for universal quantum computation, it leaves much to be desired when studying quantum algorithms.
In this post, we present an implementation of a fully general quantum programming language interpreter, allowing measurement as well as arbitrary unitary operators on an arbitrary number of arbitrarily indexed qubits. The implementation weighs in at under 150 lines^{2} of code in Common Lisp, though the ideas make implementation simple in other languages as well. All of the code from this tutorial can be found on GitHub.
This tutorial is aimed at a quantum computing beginner who has some familiarity with the fundamentals of linear algebra and computer programming. Beyond those subjects, this tutorial is relatively self-contained. We also aim this tutorial at practitioners of quantum computing, who are interested in the brass tacks of simulation, with all of the details filled out. To such practitioners, the bulk of this document will be easy to skim, since we recapitulate topics such as qubits and unitary operators.
A note about Common Lisp
We use Common Lisp, because it is an excellent platform for both exploratory and high-performance computing. One of the fastest and most flexible quantum simulators out there, the Quantum Virtual Machine, is written entirely in Common Lisp.
We wrote this article so that it would be easy to follow along with a Common Lisp implementation. The code has no dependencies, and should work in any ANSI-compliant implementation (I hope).
With that said, this article was also written with portability in mind. Since no especially Lisp-like features are used, the code should be easy to port to Python or even C. At minimum, your language should support complex numbers and arrays.
A note to experienced quantum computing practitioners
This section is written for experienced practitioners of quantum computing who happened upon this post, and can be skipped.
In this post, we opt to simulate a quantum circuit the “Schrödinger” way, that is, by evolving a wavefunction explicitly. For a circuit of width $n$, we walk through the mathematics of how to interpret a $k$-qubit gate $g \in \mathsf{SU}(2^k)$ for $k\le n$, specified to act on a $k$-tuple of numbered qubits—corresponding to each qubit’s position in the tensor product which forms the Hilbert space of the system—as a full operator $g'\in\mathsf{SU}(2^n)$. We do this by providing an explicit construction of the matrix in the computational basis of the system.
An alternative approach would have been to describe the action of $g$ on an $n$-qubit wavefunction by way of careful manipulation of indexes, i.e., to effectively permute and partition our wavefunction into $2^{n-k}$ groups of $2^k$-dimensional vectors corresponding to the subsystem of qubits being operated on. The major benefit of this approach is efficiency.
As a first introduction to a computer science graduate, I find this explanation lacking in two ways:
- It underemphasizes that a gate like $\mathsf{CNOT}$, typically written as a $4\times 4$ matrix $\mathsf{I}\oplus\mathsf{X}$, in a quantum circuit truly is a linear operator on the Hilbert space of the entire system. “It’s just linear algebra; here’s the matrix and here’s the vector” is a point I want to drive home.
- It requires significant labor to both explain and prove the correctness of the method for a reader without significant experience in tensor algebra, contractions, Einstein notation, and so on.
The approach of this post can be used as a basis to follow up with more efficient techniques, without relinquishing a strong mathematical foundation. We are very careful to not be handwavy, and to not conflate the different vector spaces at play. We hope that you’ll find this approach agreeable, even if it sacrifices some efficiency.
The Language $\mathscr{L}$
We wish to construct an interpreter for a small quantum programming language named $\mathscr{L}$. This language supports both of the fundamental operations of a quantum computer: gates and measurements.
A gate is an operation that modifies a quantum state. (What a quantum state is exactly we will delve into later.) Because quantum states are large compared to the physical resources used to construct them, gates represent the “powerful” operations of a quantum computer.
A measurement is an observation and collapse of the quantum state, producing one bit (i.e., $0$ or $1$) of classical information per qubit. Measurements represent the only way in which one can extract information from our simulated quantum computer, and indeed, in most programming models for real quantum computers.
In some sense, one might think of the language $\mathscr{L}$ as the simplest nontrivial quantum programming language. A program in $\mathscr{L}$ is just a sequence of gates and measurements. The syntax is as follows:
Non-Terminal     Definition

program      :=  ( instruction* )
instruction  :=  ( GATE matrix qubit+ )
              |  ( MEASURE )
matrix       :=  a complex matrix #2A( … )
qubit        :=  a nonnegative integer
Spaces and newlines are ignored, except to delimit the tokens of our language.
We borrow Common Lisp’s two-dimensional array syntax for the syntax of matrices. In Common Lisp, the matrix $\left(\begin{smallmatrix}1 & 2\\3 & 4\end{smallmatrix}\right)$ is written #2A((1 2) (3 4)). We also borrow the syntax for complex numbers: $1-2i$ is written #C(1 -2).
An example program might be one to construct and subsequently measure two qubits labeled 2 and 5 in a Bell state configuration:
(
(GATE #2A((0.70710677 0.70710677) (0.70710677 -0.70710677)) 2)
(GATE #2A((1 0 0 0) (0 1 0 0) (0 0 0 1) (0 0 1 0)) 2 5)
(MEASURE)
)
We will model the semantics of $\mathscr{L}$ operationally, by way of an abstract machine. The abstract machine for $\mathscr{L}$ is called $M_n$, where $n$ is a positive but fixed^{3} number of qubits. The state of the machine $M_n$ is the pair $(v, b)$ where $v$ is a quantum state, and $b$ is an $n$bit measurement register.
The quantum state is an element of the set
$$\{v\in\mathbb{C}^{2^n} \mid \Vert v\Vert=1\}.$$
In other words, $v$ is a unit vector of dimension $2^n$ over the complex numbers. We will discuss this from first principles in the next section.
The measurement register is an element of the set $\{0,1\}^n$, i.e., a sequence of $n$ bits, which we realize as a nonnegative integer. The $k$th leastsignificant bit of this integer represents the last observation of the qubit numbered as $k$. We will discuss this in detail as well.
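For example, if the measurement register holds #b110, then qubit $0$ was last observed as $0$, while qubits $1$ and $2$ were observed as $1$. A small helper along these lines (hypothetical; the interpreter itself does not need it) reads one qubit’s observation out of the register with LDB:

```lisp
;; Hypothetical helper (not part of the interpreter): extract qubit
;; K's last observation from the measurement register B, where B is a
;; nonnegative integer viewed as a sequence of bits.
(defun measured-bit (b k)
  (ldb (byte 1 k) b))
```

For the register #b110, this returns 0 for qubit 0 and 1 for qubits 1 and 2.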
In Common Lisp, it suffices to create a structure machine which holds these two pieces of state.
(defstruct machine
  quantum-state
  measurement-register)
Typically, the machine is initialized with each classical bit of the measurement register set to $0$, and each qubit starting in the zero-state. (However, for the purposes of algorithm study or debugging, the machine may be initialized with any valid state.)
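As a sketch of such an initialization, a two-qubit machine in the default state could be built with the constructor that DEFSTRUCT generates (the structure definition is repeated here so the snippet stands alone):

```lisp
;; The structure from the text, repeated so this snippet is
;; self-contained. DEFSTRUCT generates MAKE-MACHINE,
;; MACHINE-QUANTUM-STATE, and MACHINE-MEASUREMENT-REGISTER for us.
(defstruct machine
  quantum-state
  measurement-register)

;; A 2-qubit machine in the default state: the |00> wavefunction and
;; an all-zero measurement register.
(defvar *machine*
  (make-machine :quantum-state #(1.0d0 0.0d0 0.0d0 0.0d0)
                :measurement-register 0))
```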
The precise way in which the language $\mathscr{L}$ is interpreted on $M_n$ is what we describe in this tutorial. Before that, however, we find it most important to describe what exactly a quantum state is, and how to represent it on a computer.
The Quantum State
Where does one qubit live?
Quantum computers are usually just a collection of interacting computational elements called qubits. A single qubit has two distinguished states: $\ket{0}$ and $\ket{1}$. If the qubit has a name like $q$, then we label the states $\ket{0}_q$ and $\ket{1}_q$.
The funny notation is called Dirac notation or braket notation. It happens to be a convenient notation for doing calculations in quantum mechanics, and we just use it for consistency with other texts. The ket $\ket{\cdots}$, as a physicist would call it, doesn’t add any special significance, except to denote that the quantity is a vector. One can actually put anything inside the brackets. In usual linear algebra, one often writes $\mathbf{e}_i$ to denote a basis vector, where in quantum mechanics, one just writes the subscript in a ket $\ket{i}$, dropping the $\mathbf{e}$ entirely. If the notation throws you off, and you’d like to think in more traditional written linear algebra notation, you can always replace $\ket{x}$ with $\vec x$, and you’ll be safe.
These distinguished states $\ket{0}$ and $\ket{1}$ are understood to be orthonormal basis vectors in a vector space whose scalars are complex numbers $\mathbb{C}$. As such, a qubit can be $\ket{0}$, $\ket{1}$, or a superposition $\alpha\ket 0 + \beta\ket 1$, where $\alpha$ and $\beta$ are complex numbers. The numbers $\alpha$ and $\beta$ are called probability amplitudes, because $\vert\alpha\vert^2$ (resp. $\vert\beta\vert^2$) represent the probability of the qubit being observed in the $\ket 0$ (resp. $\ket 1$) state. Since they represent probabilities, there’s an additional constraint, namely that the probabilities add to one: $\vert\alpha\vert^2 + \vert\beta\vert^2=1$.
To those unfamiliar, it may not be obvious why we’ve opted to use the language of linear algebra. Why do we consider a qubit as being a linear combination? Why do we suppose that the observable states are orthonormal vectors? Why can’t we simply say that a qubit is just a pair of complex numbers and move on?
The reason for this is scientific, and not mathematical. It turns out that the best theory of quantum mechanics we have is one which describes transformations between states as being linear. In fact, the evolution of a quantum mechanical system is not only described by an operation that is just linear, but also reversible. These conditions—linear, reversible, and length-preserving—give rise to a special class of transformations called unitary operators, which naturally lead us to the discussion of vector spaces over complex numbers^{4}.
We will discuss the nature of these operations in more depth when we consider how to implement gates later on. For now, however, it’s sufficient to think of a qubit named $q$ as something that lives in a complex, twodimensional vector space, which we will call $$B_q := \operatorname{span}_{\mathbb{C}}\{\ket 0_q, \ket 1_q\}.$$ (We will use this $B_q$ notation a few times throughout this tutorial. Remember it!) We also understand that this space is equipped^{5} with a way to calculate lengths of vectors—the usual norm
$$ \left\Vert\alpha\ket{0}+\beta\ket{1}\right\Vert = \sqrt{\vert\alpha\vert^2+\vert\beta\vert^2}. $$
Many qubits
Roughly speaking, a single qubit can be described by two probabilities. How do we deal with more?
Suppose we have two qubits named $X$ and $Y$. As a pair, quantum mechanics tells us that they can interact. Practically, what that means is that their states can be correlated in some way. If they’ve interacted, knowing information about $X$ might give us a clue about what $Y$ might be. One wellknown example of this is the Bell state, which can be summarized as follows:
Qubit $X$  Qubit $Y$  Prob. Amp.  Probability 

$\ket 0_X$  $\ket 0_Y$  $1/\sqrt{2}$  $50\%$ 
$\ket 0_X$  $\ket 1_Y$  $0$  $0\%$ 
$\ket 1_X$  $\ket 0_Y$  $0$  $0\%$ 
$\ket 1_X$  $\ket 1_Y$  $1/\sqrt{2}$  $50\%$ 
Here, we have an example of a nonfactorizable state; qubits $X$ and $Y$ are correlated to each other dependently. If we know $X$ is in the $\ket 0_X$ state, then we necessarily know that $Y$ is in the $\ket 0_Y$ state. Such a correlation means it’s not possible to express the probabilities independently. It might be tempting to think that one can simply think of $X$ having a $50\%$ probability of being in either basis state, and $Y$ having a $50\%$ probability of being in either state—facts which are certainly true—but considering those independently would give us a different distribution of probabilities of the system:
Qubit $X$  Qubit $Y$  Probability 

$\ket 0_X$  $\ket 0_Y$  $P(X=\ket 0_X)P(Y=\ket 0_Y)=25\%$ 
$\ket 0_X$  $\ket 1_Y$  $P(X=\ket 0_X)P(Y=\ket 1_Y)=25\%$ 
$\ket 1_X$  $\ket 0_Y$  $P(X=\ket 1_X)P(Y=\ket 0_Y)=25\%$ 
$\ket 1_X$  $\ket 1_Y$  $P(X=\ket 1_X)P(Y=\ket 1_Y)=25\%$ 
This state is called factorizable because we can express each probability as a product of probabilities pertaining to the original qubits, i.e., each probability has a form that looks like $P(X)P(Y)$. Note that here, knowing something about $X$ gives us no information about $Y$, since they’re completely independent. With that said, it should be emphasized that factorizable states are perfectly valid states, but they don’t represent the entirety of possible states.
If qubits $X$ and $Y$ live in the linear spaces $B_X$ and $B_Y$ respectively, then the composite space is written $B_X\otimes B_Y$. This is called a tensor product, which is a way to combine spaces with the above structure. Formally, if we have an $m$-dimensional vector space $V:=\operatorname{span}\{v_1,\ldots,v_m\}$ and an $n$-dimensional vector space $W:=\operatorname{span}\{w_1,\ldots,w_n\}$, then their tensor product $T:=V\otimes W$ will be an $mn$-dimensional vector space $\operatorname{span}\{t_1,\ldots,t_{mn}\}$, where each $t_i$ is a formal combination of basis vectors from $V$ and $W$. (There are of course $mn$ different combinations of $v$’s and $w$’s.) To give an example without all the abstraction, consider $V$ with a basis $\{\vec x, \vec y, \vec z\}$ and $W$ with a basis $\{\vec p, \vec q\}$. Then $V\otimes W$ will have a basis
$$ \left\{ \begin{array}{lll} \vec x\otimes\vec p, & \vec y\otimes\vec p, & \vec z\otimes\vec p, \\ \vec x\otimes\vec q, & \vec y\otimes\vec q, & \vec z\otimes\vec q\hphantom{,} \end{array} \right\}. $$
An example vector in the space $V\otimes W$ might be
$$ i(\vec x\otimes\vec p) - 2(\vec y\otimes\vec p) + 3 (\vec z\otimes\vec p) + \frac{1}{4}(\vec x\otimes\vec q) - \sqrt{5}(\vec y\otimes\vec q) + e^{6\pi}(\vec z\otimes\vec q), $$
assuming these vector spaces are over $\mathbb{C}$.
Intuitively, a tensor product “just” gives us a way to associate a number with each possible combination of basis vectors. In our case, we need to associate a probability amplitude with each combination of distinguished qubit basis states. We need this ability since—as we’ve established—we need to consider every possible holistic outcome of a collection of qubits, as opposed to the outcomes of the qubits independently. (The former constitute both factorizable and nonfactorizable states, while the latter only include factorizable states.)
Bitstring notation and a general quantum state
If we have qubits $X$, $Y$, and $Z$, then they’ll live in the space $B_X\otimes B_Y\otimes B_Z$, which we’ll call $Q_3$. It will be massively inconvenient to write the basis vectors as, for example, $\ket 0_X\otimes \ket 1_Y\otimes\ket 1_Z$, so we instead use the shorthand $\ket{011}$ when the space has been defined. This is called bitstring notation. A general element $\ket\psi$ of $Q_3$ can be written $$\psi_0\ket{000}+\psi_1\ket{001}+\psi_2\ket{010}+\psi_3\ket{011}+\psi_4\ket{100}+\psi_5\ket{101}+\psi_6\ket{110}+\psi_7\ket{111}.$$ There are two substantial benefits from using bitstring notation. These benefits are much more thoroughly explained in this paper—which was a precursor to this very blog post.
The first benefit is that the names of the qubits—$X$, $Y$, and $Z$—have been abstracted away. They’re now just positions in a bitstring, and we can canonically name the qubits according to their position. We record positions from the right starting from zero, so $X$ is in position $2$, $Y$ is in position $1$, and $Z$ is in position $0$.
The second benefit is one relevant to how we implement quantum states on a computer. As written, the probability amplitude $\psi_i$ has an index $i$ whose binary expansion matches the bitstring of the basis vector whose scalar component is $\psi_i$. This is no accident. The main outcome of this is that we can use a nonnegative integer as a way of specifying a bitstring, which also acts as an index into an array of probability amplitudes. So for instance, the above state can be written further compactly as $$\ket\psi=\sum_{i=0}^7\psi_i\ket i.$$ Here, $\ket i$ refers to the $i$th bitstring in lexicographic (“dictionary”) order, or equivalently, the binary expansion of $i$ as a bitstring.
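As a quick illustration of this index-to-bitstring correspondence, Common Lisp’s FORMAT directive ~B prints an integer in binary, zero-padded to the qubit count:

```lisp
;; Index 5 in a 3-qubit state names the basis vector |101>:
;; ~3,'0B prints an integer in binary, zero-padded to width 3.
(format nil "~3,'0B" 5) ; => "101"
```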
Since qubits live in a two-dimensional space, $n$ qubits will live in a $2^n$-dimensional space. With a great deal of work, we’ve come to our most general^{6} representation of an $n$-qubit system: $$\sum_{i=0}^{2^n-1}\psi_i\ket i,$$ where $\vert\psi_i\vert^2$ gives us the probability of observing the bitstring $\ket i$, implying $$\sum_{i=0}^{2^n-1}\vert\psi_i\vert^2=1.$$
On a computer, representing a quantum state for an $n$-qubit system is simple: It’s just an array of $2^n$ complex numbers. An index $i$ into the array represents the probability amplitude $\psi_i$, which is the scalar component of $\ket{i}$. So, for instance, the state $\ket{000}$ in a 3-qubit system is represented by an array whose first element is $1$ and the rest $0$. Here is a function to allocate a new quantum state of $n$ qubits, initialized to be in the $\ket{\ldots 000}$ state:
(defun make-quantum-state (n)
  (let ((s (make-array (expt 2 n) :initial-element 0.0d0)))
    (setf (aref s 0) 1.0d0)
    s))
Sometimes, given a quantum state, or even an operator on a quantum state, we will want to recover how many qubits the state represents, or the operator acts on. In both cases, the question reduces to determining the number of qubits that a dimension represents. Since our dimensions are always powers of two, we need to compute the equivalent of a binary logarithm. In Common Lisp, we can compute this by computing the number of bits an integer takes to represent using integer-length. The number $2^n$ is always a 1 followed by $n$ 0’s, so the length of $2^n$ in binary is $n+1$.
(defun dimension-qubits (d)
  (1- (integer-length d)))
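A quick sanity check ties these two functions together (their definitions are repeated so the snippet is self-contained): a 3-qubit state has $2^3 = 8$ amplitudes, and dimension-qubits recovers the $3$.

```lisp
;; Definitions repeated from the text so this check stands alone.
(defun make-quantum-state (n)
  (let ((s (make-array (expt 2 n) :initial-element 0.0d0)))
    (setf (aref s 0) 1.0d0)
    s))

(defun dimension-qubits (d)
  (1- (integer-length d)))

;; Round trip: 3 qubits -> 8 amplitudes -> 3 qubits.
(dimension-qubits (length (make-quantum-state 3))) ; => 3
```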
Evolving the quantum state
Since the quantum state is a vector, the principal way we change it is
through linear operators represented as matrices. As our quantum
program executes, we say that the quantum state
evolves. Matrix–vector multiplication is accomplished with
applyoperator
and matrix–matrix multiplication is accomplished
with composeoperators
. There is nothing special about these
functions; they are the standard textbook algorithms.
(defun apply-operator (matrix column)
  (let* ((matrix-size (array-dimension matrix 0))
         (result (make-array matrix-size :initial-element 0.0d0)))
    (dotimes (i matrix-size)
      (let ((element 0))
        (dotimes (j matrix-size)
          (incf element (* (aref matrix i j) (aref column j))))
        (setf (aref result i) element)))
    (replace column result)))
(defun compose-operators (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (let* ((l (array-dimension B 1))
           (result (make-array (list m l) :initial-element 0)))
      (dotimes (i m result)
        (dotimes (k l)
          (dotimes (j n)
            (incf (aref result i k)
                  (* (aref A i j)
                     (aref B j k)))))))))
These functions will sit at the heart of the interpreter, which will be elaborated upon in the section about gates.
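As a small sanity check of the matrix–vector product (the routine is repeated here so the snippet runs on its own), applying the matrix $\left(\begin{smallmatrix}0&1\\1&0\end{smallmatrix}\right)$ to the state $\ket 0$ should produce $\ket 1$:

```lisp
;; APPLY-OPERATOR repeated from the text so this check is self-contained.
(defun apply-operator (matrix column)
  (let* ((matrix-size (array-dimension matrix 0))
         (result (make-array matrix-size :initial-element 0.0d0)))
    (dotimes (i matrix-size)
      (let ((element 0))
        (dotimes (j matrix-size)
          (incf element (* (aref matrix i j) (aref column j))))
        (setf (aref result i) element)))
    (replace column result)))

;; The NOT-like matrix swaps the two amplitudes, mapping |0> to |1>.
;; Note that APPLY-OPERATOR overwrites COLUMN in place and returns it.
(apply-operator #2A((0 1) (1 0)) (vector 1.0d0 0.0d0)) ; => #(0.0d0 1.0d0)
```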
Measurement
Already, through the construction of our quantum state, we’ve discussed the idea that the probability amplitudes imply a probability of observing a state. Measurement then amounts to looking at a quantum state as a discrete probability distribution and sampling from it.
Measurement in quantum mechanics is sideeffectful; observation of a quantum state also simultaneously collapses that state. This means that when we measure a state to be a bitstring, then the state will also become that bitstring, zeroing out every other component in the process.
We thus implement the process of measurement in two steps: The sampling of the state followed by its collapse.
(defun observe (machine)
  (let ((b (sample (machine-quantum-state machine))))
    (collapse (machine-quantum-state machine) b)
    (setf (machine-measurement-register machine) b)
    machine))
Note that we’ve recorded our observation into the measurement register. We now proceed to define what we mean by sample and collapse.
How shall we sample? This is a classic problem in computer science. If we have $N$ events $\{0, 1,\ldots,N-1\}$, such that event $e$ has probability $P(e)$, then we can sample as follows. Consider the partial sums defined by the recurrence $S(0)=0$ and $S(k)=S(k-1) + P(k-1)$. If we draw a random number $r$ uniformly from $[0,1)$, then we wish to find the $k$ such that $S(k)\leq r < S(k+1)$. Such a $k$ will be a sampling of our events according to the imposed probability distribution.
We can implement this simply by computing successive partial sums, until our condition is satisfied. In fact, we can be a little bit more resourceful. We can find when $r-S(k+1)<0$, which amounts to successive updates $r\leftarrow r-P(k)$.
With a quantum system, we have $P(\ket i) = \vert\psi_i\vert^2$, and the sampled $k$ is the bitstring $\ket k$ we find.
Let’s do an example. Suppose we have a quantum state
$$ \sqrt{0.2}\ket{00} - \sqrt{0.07}\ket{01} + \sqrt{0.6}\ket{10} + \sqrt{0.13}\ket{11}. $$
Then our discrete probability distribution is:
$$ P(\ket{00}) = 0.2\qquad P(\ket{01}) = 0.07\qquad P(\ket{10}) = 0.6\qquad P(\ket{11}) = 0.13 $$
Next, suppose we draw a random number $r = 0.2436$. We first check if $r < 0.2$. It’s not, so $\ket{00}$ is not our sample. Subtract it from $r$ to get $r = 0.0436$. Next check if $r < 0.07$. Yes, so our sample is $\ket{01}$.
The implementation is straightforward:
(defun sample (state)
  (let ((r (random 1.0d0)))
    (dotimes (i (length state))
      (decf r (expt (abs (aref state i)) 2))
      (when (minusp r) (return i)))))
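One easy check on sample: if all of the probability mass sits on a single basis state, the random draw is irrelevant and that state’s index must always be returned. (The definition is repeated so the snippet is self-contained.)

```lisp
;; SAMPLE repeated from the text so this check stands alone.
(defun sample (state)
  (let ((r (random 1.0d0)))
    (dotimes (i (length state))
      (decf r (expt (abs (aref state i)) 2))
      (when (minusp r) (return i)))))

;; All probability mass on |01>: the sample must be index 1.
(sample #(0.0d0 1.0d0 0.0d0 0.0d0)) ; => 1
```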
Collapsing to $\ket k$ is simply zeroing out the array and setting $\psi_k$ to $1$.
(defun collapse (state basis-element)
  (fill state 0.0d0)
  (setf (aref state basis-element) 1.0d0))
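For instance, collapsing a uniform two-qubit state to basis element $2$, i.e. $\ket{10}$, leaves a one-hot array (definition repeated so the snippet is self-contained):

```lisp
;; COLLAPSE repeated from the text so this check stands alone.
(defun collapse (state basis-element)
  (fill state 0.0d0)
  (setf (aref state basis-element) 1.0d0))

;; Collapse a uniform two-qubit state to basis element 2, i.e. |10>.
(defvar *state* (vector 0.5d0 0.5d0 0.5d0 0.5d0))
(collapse *state* 2) ; *state* is now #(0.0d0 0.0d0 1.0d0 0.0d0)
```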
Gates
Gates as matrices
Gates are the meat of most quantum algorithms. They represent the “hard work” a quantum computer does. As previously described, a gate $g$ is a transformation that is linear, invertible, and lengthpreserving.

- Linear: $g(a\ket\psi+b\ket\phi)=ag(\ket\psi)+bg(\ket\phi)$.
- Invertible: There is always an operation $h$ that can cancel out the effect of $g$: $h(g(\ket\psi))=g(h(\ket\psi))=\ket\psi$.
- Length-Preserving: $\Vert g(\ket\psi)\Vert = \Vert\ket\psi\Vert$.
These ideas are captured by an overarching idea called a linear isometry, which comes from the Greek word isometria, with isos meaning “equal” and metria meaning “measuring”. As with all linear transformations, we can write them out as a matrix with respect to a particular basis. Matrices representing linear isometries are called unitary matrices^{7}.
The simplest gate must be identity, a gate which does nothing.
$$ \mathsf{I} := \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} $$
In Common Lisp, this would be defined as
(defparameter +I+ #2A((1 0)
                      (0 1)))
which we will make use of later. Just a notch higher in complexity would be the quantum analog of a Boolean “NOT”. This is called the $\mathsf{X}$ gate:
$$ \mathsf{X} := \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}. $$
This has the effect of mapping $\mathsf{X}\ket 0=\ket 1$, which means directly that $\mathsf{X}\ket 1=\ket 0$ and therefore it is its own inverse: $\mathsf{X}\mathsf{X} = \mathsf{I}$ so $\mathsf{X}=\mathsf{X}^{1}$.
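We can verify this identity numerically: composing $\mathsf{X}$ with itself should yield the identity matrix. A self-contained check, repeating the matrix–matrix multiplication routine from earlier:

```lisp
;; COMPOSE-OPERATORS repeated from the text so this check stands alone.
(defun compose-operators (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (let* ((l (array-dimension B 1))
           (result (make-array (list m l) :initial-element 0)))
      (dotimes (i m result)
        (dotimes (k l)
          (dotimes (j n)
            (incf (aref result i k)
                  (* (aref A i j)
                     (aref B j k)))))))))

;; X composed with itself is the identity: X = X^(-1).
(compose-operators #2A((0 1) (1 0)) #2A((0 1) (1 0))) ; => #2A((1 0) (0 1))
```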
We suggest re-reviewing how one interprets a matrix as an explicit mapping of each element of the basis, as it helps make sense of gates. In this tutorial, gate matrices are always specified in terms of the bitstring basis
$$ \{\ket{\ldots000}, \ket{\ldots001}, \ket{\ldots010}, \ket{\ldots011}, \ldots\}. $$
We again refer the reader to this paper for an indepth discussion about this basis.
In the rest of this section, the whole goal is to be able to apply gates to our quantum state. There are two cases of pedagogical and operational interest: the one-qubit gate and the many-qubit gate. We will write two functions to accomplish each of these, in order to implement a general function called apply-gate for applying any kind of gate on any collection of qubits for any quantum state.
(defun apply-gate (state U qubits)
  (assert (= (length qubits) (dimension-qubits (array-dimension U 0))))
  (if (= 1 (length qubits))
      (%apply-1Q-gate state U (first qubits))
      (%apply-nQ-gate state U qubits)))
Gates on multi-qubit machines
If we are working with the machine $M_n$, then our space is $2^n$-dimensional, and as such, our matrices would be written out as $2^n\times 2^n$ arrays of numbers. If we can write out such a matrix, then applying it is as simple as a matrix–vector multiplication. For instance, for a $4$-qubit machine, an $\mathsf{X}$ on qubit $0$ would be written
$$ \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}, $$
which could be readily applied to a $16$element quantum state vector. It is easy to verify that this will swap the components of $\ket{\ldots 0}$ with the corresponding components of $\ket{\ldots 1}$.
But as should be plainly obvious from the obnoxious amount of paper wasted by writing out this matrix, it would be better if we could simply generate this matrix with just three pieces of information: the gate matrix $g=\left(\begin{smallmatrix}0 & 1\\1 & 0\end{smallmatrix}\right)$, the qubit index $i=0$, and the size of the machine $n=4$. This is a process we will call lifting.
Lifting requires a fundamental tool for constructing operators on spaces that were formed out of tensor products. If we have two finite-dimensional vector spaces $U$ and $V$, and operators $f$ and $g$ on the spaces respectively, then it seems reasonable to consider how $f$ and $g$ transform $U\otimes V$. In some sense, applying $f$ and $g$ “in parallel” on $U\otimes V$ corresponds to a new linear operator $h$. If $f$ and $g$ are matrices, then $h$ is defined by a block matrix
$$ \begin{equation} h_{i,j} = f_{i,j} g. \label{eq:kron} \end{equation} $$
More specifically, let $0 \leq i,j < \dim U$. The matrix $h$ will be an array of $\dim U \times \dim U$ copies of $g$, where the entries of the $(i,j)$th blocks are multiplied by the single scalar $f_{i,j}$. This will lead to a matrix with $(\dim U)(\dim V)$ rows and columns, which is exactly the dimension of $U\otimes V$. Incidentally, we write $h$ as $f\otimes g$, and this combination of operators is called the Kronecker product^{8}. As code:
(defun kronecker-multiply (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (destructuring-bind (p q) (array-dimensions B)
      (let ((result (make-array (list (* m p) (* n q)))))
        (dotimes (i m result)
          (dotimes (j n)
            (let ((Aij (aref A i j))
                  (y (* i p))
                  (x (* j q)))
              (dotimes (u p)
                (dotimes (v q)
                  (setf (aref result (+ y u) (+ x v))
                        (* Aij (aref B u v))))))))))))
As a matter of terminology, remember that tensor products combine vector spaces, and Kronecker products combine operator matrices.
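As a concrete check (with the Kronecker routine repeated so the snippet stands alone), $\mathsf{I}\otimes\mathsf{X}$ should be a block-diagonal matrix with a copy of $\mathsf{X}$ in each diagonal block, exactly the pairwise-swap pattern of the large matrix written out above:

```lisp
;; KRONECKER-MULTIPLY repeated from the text so this check stands alone.
(defun kronecker-multiply (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (destructuring-bind (p q) (array-dimensions B)
      (let ((result (make-array (list (* m p) (* n q)))))
        (dotimes (i m result)
          (dotimes (j n)
            (let ((Aij (aref A i j))
                  (y (* i p))
                  (x (* j q)))
              (dotimes (u p)
                (dotimes (v q)
                  (setf (aref result (+ y u) (+ x v))
                        (* Aij (aref B u v))))))))))))

;; I (x) X: a copy of X in each diagonal block.
(kronecker-multiply #2A((1 0) (0 1)) #2A((0 1) (1 0)))
;; => #2A((0 1 0 0)
;;        (1 0 0 0)
;;        (0 0 0 1)
;;        (0 0 1 0))
```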
Single-qubit gates and gates on adjacent qubits
From here, we can very easily lift one-qubit gates to machines with any number of qubits. A gate $g$ on qubit $i$ in an $n$-qubit machine is just $g$ applied to qubit $i$ and the identity $\mathsf{I}$ on all other qubits. Writing this out as a Kronecker product, we have
$$ \begin{equation} \operatorname{lift}(g, i, n) := \underbrace{\mathsf{I} \otimes \mathsf{I} \otimes \cdots}_{n-i-1\text{ factors}} \otimes g \otimes \underbrace{\cdots \otimes \mathsf{I}}_{i\text{ factors}}, \label{eq:liftone} \end{equation} $$
where there are a total of $n$ factors, and $g$ is positioned $i$ factors from the right.
This concept generalizes to higher-dimensional operators which act on index-adjacent qubits. In other words, if $g$ is a $k$-qubit operator specifically acting on qubits
$$ (i+k-1, i+k-2, \ldots, i+2, i+1, i), $$
then the lifting operator from \eqref{eq:liftone} is much the same:
$$ \begin{equation} \operatorname{lift}(g, i, n) := \underbrace{\mathsf{I} \otimes \mathsf{I} \otimes \cdots}_{n-i-k\text{ factors}} \otimes g \otimes \underbrace{\cdots \otimes \mathsf{I}}_{i\text{ factors}}. \label{eq:liftmany} \end{equation} $$
It must be emphasized one last time: This only works for multi-qubit operators that act on qubits that are index-adjacent. We will get to how to work with non-adjacent qubits shortly, but first we will turn this into code.
For simplicity, we create a way to iterate a Kronecker product multiple times, that is, compute
$$ \underbrace{g\otimes \cdots \otimes g}_{n\text{ factors}}, $$
which is usually simply written $g^{\otimes n}$. We must use care when handling the case when we are “Kronecker exponentiating” by a nonpositive number, so that $f\otimes g^{\otimes 0} = f$.
(defun kronecker-expt (U n)
  (cond
    ((< n 1) #2A((1)))
    ((= n 1) U)
    (t (kronecker-multiply (kronecker-expt U (1- n)) U))))
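A quick check (definitions repeated so the snippet is self-contained): exponentiating by $0$ yields the $1\times 1$ identity, and $\mathsf{I}^{\otimes 2}$ is the $4\times 4$ identity.

```lisp
;; Definitions repeated from the text so this check stands alone.
(defparameter +I+ #2A((1 0)
                      (0 1)))

(defun kronecker-multiply (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (destructuring-bind (p q) (array-dimensions B)
      (let ((result (make-array (list (* m p) (* n q)))))
        (dotimes (i m result)
          (dotimes (j n)
            (let ((Aij (aref A i j))
                  (y (* i p))
                  (x (* j q)))
              (dotimes (u p)
                (dotimes (v q)
                  (setf (aref result (+ y u) (+ x v))
                        (* Aij (aref B u v))))))))))))

(defun kronecker-expt (U n)
  (cond
    ((< n 1) #2A((1)))
    ((= n 1) U)
    (t (kronecker-multiply (kronecker-expt U (1- n)) U))))

;; The zero case acts as a multiplicative unit, and I (x) I is the
;; 4x4 identity.
(kronecker-expt +I+ 0) ; => #2A((1))
```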
With kronecker-expt, we can write lift following \eqref{eq:liftmany}:
(defun lift (U i n)
  (let ((left (kronecker-expt +I+ (- n i (dimension-qubits
                                          (array-dimension U 0)))))
        (right (kronecker-expt +I+ i)))
    (kronecker-multiply left (kronecker-multiply U right))))
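We can sanity-check lift on a two-qubit machine (all definitions repeated so the snippet is self-contained; the +X+ name for the $\mathsf{X}$ matrix is ours): lifting $\mathsf{X}$ at qubit $0$ should give $\mathsf{I}\otimes\mathsf{X}$, while lifting it at qubit $1$ should give $\mathsf{X}\otimes\mathsf{I}$.

```lisp
;; Definitions repeated from the text so this check stands alone.
(defparameter +I+ #2A((1 0)
                      (0 1)))

;; The X gate matrix from earlier; the +X+ name is our own.
(defparameter +X+ #2A((0 1)
                      (1 0)))

(defun kronecker-multiply (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (destructuring-bind (p q) (array-dimensions B)
      (let ((result (make-array (list (* m p) (* n q)))))
        (dotimes (i m result)
          (dotimes (j n)
            (let ((Aij (aref A i j))
                  (y (* i p))
                  (x (* j q)))
              (dotimes (u p)
                (dotimes (v q)
                  (setf (aref result (+ y u) (+ x v))
                        (* Aij (aref B u v))))))))))))

(defun kronecker-expt (U n)
  (cond
    ((< n 1) #2A((1)))
    ((= n 1) U)
    (t (kronecker-multiply (kronecker-expt U (1- n)) U))))

(defun dimension-qubits (d)
  (1- (integer-length d)))

(defun lift (U i n)
  (let ((left (kronecker-expt +I+ (- n i (dimension-qubits
                                          (array-dimension U 0)))))
        (right (kronecker-expt +I+ i)))
    (kronecker-multiply left (kronecker-multiply U right))))

;; lift at qubit 0 of 2 gives I (x) X; at qubit 1 it gives X (x) I.
(lift +X+ 0 2) ; => #2A((0 1 0 0) (1 0 0 0) (0 0 0 1) (0 0 1 0))
```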
Multi-qubit gates on non-adjacent qubits
In this section, we assume we are working on a multi-qubit machine $M_n$ with $n\ge 2$.
The general idea
So far, we’ve managed to get away with lifting operators that act on either a single qubit, or a collection of index-adjacent qubits. This has been more-or-less trivial, because we can tack on a series of identity operators by way of Kronecker products to simulate “doing nothing” to the other qubits. However, if we want to apply a multi-qubit gate to a collection of qubits that aren’t index-adjacent, we have to be a little more clever.
The way we accomplish this is by swapping qubits around so that we can move in and out of index-adjacency. In fact, for a given gate acting on a given collection of qubits, we aim to compute an operator $\Pi$ which moves these qubits into index-adjacency, so that we can compute
$$ \begin{equation} \Pi^{-1} \operatorname{lift}(g, 0, n) \Pi. \label{eq:upq} \end{equation} $$
This recipe requires many ingredients, each of which we describe in detail.
Swapping two qubits
To start, we need some way to swap the state of two qubits. We can do this with the $\mathsf{SWAP}$ operator:
$$ \mathsf{SWAP} := \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}. $$
In Common Lisp, we define this in the same way we defined +I+.
(defparameter +SWAP+ #2A((1 0 0 0)
(0 0 1 0)
(0 1 0 0)
(0 0 0 1)))
The $\mathsf{SWAP}$ operator takes two qubits and swaps their state. What does this mean in a system of correlations, where qubit state isn’t strictly compartmentalized (i.e., factorized)? Swapping is equivalent to swapping the component of $\ket{01}$ with the component of $\ket{10}$, which are the only two distinguishable correlations^{9}. Still, in a multi-qubit system, we can’t immediately arbitrarily swap two qubits with the tools we’ve developed. What we can do is swap index-adjacent qubits. In particular, we can define the transpositions
$$ \tau_i := \operatorname{lift}(\mathsf{SWAP}, i, n),\qquad \text{with }0\leq i < n - 1. $$
The transposition $\tau_i$ swaps qubit $i$ with qubit $i+1$. This is our first ingredient.
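Concretely, applying the $\mathsf{SWAP}$ matrix exchanges the $\ket{01}$ and $\ket{10}$ amplitudes and leaves $\ket{00}$ and $\ket{11}$ alone. A quick Python check (matrices as nested lists; the amplitudes are unnormalized integers purely for illustration):

```python
# SWAP exchanges the |01> and |10> amplitudes of a two-qubit state.
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Amplitudes of |00>, |01>, |10>, |11>; unnormalized, for illustration only.
state = [1, 2, 3, 4]
print(matvec(SWAP, state))  # [1, 3, 2, 4]
```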
Rearranging qubits to be index-adjacent
The second ingredient is a way to rearrange our qubits so that they are index-adjacent. Suppose we have a three-qubit operator $g$ which acts on qubits $(2, 4, 3)$ in a machine of $n=5$ qubits. The space in which the quantum state of $M_5$ lives is
$$ B_4 \otimes B_3 \otimes B_2 \otimes B_1 \otimes B_0, $$
but we need to rearrange our state vector as if we’ve moved $B_2\to B_0$, $B_4\to B_1$, and $B_3\to B_2$ so that our substate sits index-adjacent. In combinatorics, this permutation is written in two-line notation
$$ \begin{pmatrix} 0 & 1 & 2 & 3 & 4\\ 3 & 4 & 0 & 2 & 1 \end{pmatrix}. $$
Here, we’ve made a few arbitrary decisions. First, we’ve decided to remap a $k$-qubit operator to the $B_{k-1}\otimes\cdots\otimes B_1\otimes B_0$ subspace. Any other index-adjacent subspace would work, but this simplifies the code. Second, we see that $0\mapsto 3$ and $1\mapsto 4$, but it doesn’t matter so much where they map to, as long as $2$, $4$, and $3$ are mapped correctly.
There’s no sense in writing the first line in two-line notation, so we just write the permutation compactly as $34021$. As a quantum operator, we write this as $\Pi_{34021}$.
The question is: How can we write $\Pi_{34021}$ as familiar operators? It is a well-known fact in combinatorics that any permutation can be decomposed into a composition of swaps, and every swap can be decomposed into a series of adjacent transpositions. We leave this as an exercise^{10}, but we will show the code of our implementation.
We start with a function which takes a permutation written as a list, like (3 4 0 2 1), and converts it to a list of (possibly non-adjacent) transpositions to be applied left-to-right, represented as cons cells ((0 . 3) (1 . 4) (2 . 3)).
(defun permutation-to-transpositions (permutation)
  (let ((swaps nil))
    (dotimes (dest (length permutation) (nreverse swaps))
      (let ((src (elt permutation dest)))
        (loop :while (< src dest) :do
          (setf src (elt permutation src)))
        (cond
          ((< src dest) (push (cons src dest) swaps))
          ((> src dest) (push (cons dest src) swaps)))))))
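To see the decomposition at work outside of Lisp, here is a direct Python transcription (our own port, not from the original source), applied to the running example. The chase loop skips elements that earlier transpositions have already placed; the src < dest branch of the Lisp cond can never fire after that loop, so it is omitted here.

```python
def permutation_to_transpositions(permutation):
    # Decompose a permutation (one-line notation, as a list) into
    # transpositions (dest, src) to be applied left-to-right.
    swaps = []
    for dest in range(len(permutation)):
        src = permutation[dest]
        # Chase the image past elements already placed by earlier swaps.
        while src < dest:
            src = permutation[src]
        if src > dest:
            swaps.append((dest, src))
        # src == dest means this element is already in place.
    return swaps

print(permutation_to_transpositions([3, 4, 0, 2, 1]))
# [(0, 3), (1, 4), (2, 3)]
```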
Next, we convert these transpositions as cons cells to adjacent transposition indexes. This is straightforward. If we are swapping $(a,b)$ with $a<b$, then we transpose $(a, a+1)$, then $(a+1, a+2)$, and so on until $(b-1, b)$, followed by a reversal of each except $(b-1, b)$. We can simply write this chain of adjacent transpositions as $(a, a+1, \ldots, b-1, \ldots, a+1, a)$. In this example, we’d have the transposition indexes (0 1 2 1 0 1 2 3 2 1 2).
(defun transpositions-to-adjacent-transpositions (transpositions)
  (flet ((expand-cons (c)
           (if (= 1 (- (cdr c) (car c)))
               (list (car c))
               (let ((trans (loop :for i :from (car c) :below (cdr c)
                                  :collect i)))
                 (append trans (reverse (butlast trans)))))))
    (mapcan #'expand-cons transpositions)))
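Again as a cross-check, a Python transcription of ours that reproduces the example’s transposition indexes:

```python
def transpositions_to_adjacent_transpositions(transpositions):
    # Expand each transposition (a, b), a < b, into the chain of adjacent
    # transposition indexes (a, a+1, ..., b-1, ..., a+1, a).
    result = []
    for a, b in transpositions:
        if b - a == 1:
            result.append(a)
        else:
            chain = list(range(a, b))             # a, a+1, ..., b-1
            result.extend(chain + chain[-2::-1])  # back down, omitting b-1
    return result

print(transpositions_to_adjacent_transpositions([(0, 3), (1, 4), (2, 3)]))
# [0, 1, 2, 1, 0, 1, 2, 3, 2, 1, 2]
```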
These are indexes $i_1, i_2, \ldots$ such that $\Pi = \cdots \tau_{i_2}\tau_{i_1}$.
The last ingredient we need is inverting $\Pi$. If we have $\Pi$ represented as a sequence of $\tau$, then we simply reverse the list of $\tau$.
Using transpositions to implement multiqubit gates
With all of these, we write what is perhaps the most important function of our interpreter.
(defun %apply-nQ-gate (state U qubits)
  (let ((n (dimension-qubits (length state))))
    (labels ((swap (i)
               (lift +swap+ i n))
             (transpositions-to-operator (trans)
               (reduce #'compose-operators trans :key #'swap)))
      (let* ((U01 (lift U 0 n))
             (from-space (append (reverse qubits)
                                 (loop :for i :below n
                                       :when (not (member i qubits))
                                         :collect i)))
             (trans (transpositions-to-adjacent-transpositions
                     (permutation-to-transpositions
                      from-space)))
             (to->from (transpositions-to-operator trans))
             (from->to (transpositions-to-operator (reverse trans)))
             (Upq (compose-operators to->from
                                     (compose-operators U01
                                                        from->to))))
        (apply-operator Upq state)))))
A few quick notes for comprehension:

- The value of (swap i) is $\tau_i$ fully lifted.
- The one-line zinger that defines transpositions-to-operator takes a list of transposition indexes and converts it into a unitary operator. It does so by doing what’s known in functional programming as a map-reduce, first mapping $i\mapsto\tau_i$ and then reducing by operator composition.
- The variable from-space contains the permutation $p$ that encodes the space in which we’d like to act. This permutation is calculated based off of the qubits argument.
- The variables from->to and to->from represent $\Pi_p$ and $\Pi^{-1}_p$ respectively.
- The variable Upq is our fully lifted operator, exactly by way of \eqref{eq:upq}.
The function %apply-nQ-gate is what allows our interpreter to be so general. Making the interpreter more efficient is ultimately an exercise in making this function more efficient.
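Since this function is the heart of the interpreter, it is worth sanity-checking the whole recipe end to end. The following is a Python mirror of ours (hypothetical helper names, wildly unoptimized); it applies $\mathsf{CNOT}$ with control qubit 0 and target qubit 2 on a three-qubit machine, and confirms that $\ket{001}$ maps to $\ket{101}$, even though the two qubits are not index-adjacent.

```python
# A Python mirror of %apply-nQ-gate; all names here are our own, and
# matrices are nested lists, so this is a sketch, not the Lisp source.
from functools import reduce

I = [[1, 0], [0, 1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def kron_expt(U, n):
    return reduce(kron, [U] * n) if n >= 1 else [[1]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lift(U, i, n):
    k = len(U).bit_length() - 1          # qubits U acts on
    return kron(kron_expt(I, n - i - k), kron(U, kron_expt(I, i)))

def perm_to_trans(perm):
    swaps = []
    for dest in range(len(perm)):
        src = perm[dest]
        while src < dest:
            src = perm[src]
        if src > dest:
            swaps.append((dest, src))
    return swaps

def trans_to_adjacent(transpositions):
    out = []
    for a, b in transpositions:
        chain = list(range(a, b))
        out.extend(chain + chain[-2::-1] if b - a > 1 else [a])
    return out

def apply_nq_gate(state, U, qubits):
    n = len(state).bit_length() - 1
    tau = lambda i: lift(SWAP, i, n)
    def trans_to_op(trans):
        return reduce(matmul, map(tau, trans), kron_expt(I, n))
    U01 = lift(U, 0, n)
    from_space = list(reversed(qubits)) + [i for i in range(n)
                                           if i not in qubits]
    trans = trans_to_adjacent(perm_to_trans(from_space))
    to_from = trans_to_op(trans)
    from_to = trans_to_op(list(reversed(trans)))
    Upq = matmul(to_from, matmul(U01, from_to))   # Π⁻¹ · lift(U,0,n) · Π
    return matvec(Upq, state)

# CNOT, control qubit 0, target qubit 2: |001> (index 1) -> |101> (index 5).
state = [0] * 8
state[1] = 1
print(apply_nq_gate(state, CNOT, [0, 2]))  # [0, 0, 0, 0, 0, 1, 0, 0]
```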
The only thing left to do is integrate all of the topics discussed hitherto into an interpreter!
An interpreter
The driver loop
The bulk of the interpreter has been written. We’ve described the semantics of the two instructions of interest: MEASURE and GATE. Now we create the interpreter itself, which is just a driver loop to read and execute these instructions, causing state transitions of our abstract machine. If we see a GATE, we call apply-gate. If we see a MEASURE, we call observe.
(defun run-quantum-program (qprog machine)
  (loop :for (instruction . payload) :in qprog
        :do (ecase instruction
              ((GATE)
               (destructuring-bind (gate &rest qubits) payload
                 (apply-gate (machine-quantum-state machine) gate qubits)))
              ((MEASURE)
               (observe machine)))
        :finally (return machine)))
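The same driver-loop shape is easy to see in any language. Here is a Python sketch in which apply_gate and observe are stub handlers of our own invention, purely to show the dispatch; they merely log which handler each instruction reaches.

```python
# Sketch of the driver loop's dispatch; apply_gate/observe are stubs
# passed in by the caller, not the real quantum-state implementations.
def run_quantum_program(qprog, machine, apply_gate, observe):
    for instruction, *payload in qprog:
        if instruction == "GATE":
            gate, *qubits = payload
            apply_gate(machine["quantum_state"], gate, qubits)
        elif instruction == "MEASURE":
            observe(machine)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return machine

log = []
machine = {"quantum_state": [1, 0], "measurement_register": 0}
run_quantum_program(
    [("GATE", "H", 0), ("MEASURE",)],
    machine,
    apply_gate=lambda state, gate, qubits: log.append(("GATE", gate, qubits)),
    observe=lambda m: log.append(("MEASURE",)))
print(log)  # [('GATE', 'H', [0]), ('MEASURE',)]
```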
Efficiency
Performance-focused individuals will have noticed that this interpreter is pretty costly, in many ways. The biggest cost is also unavoidable: the fact that our state grows exponentially with the number of qubits. Real, physical quantum computers avoid this cost, which makes them alluring machines to both study and construct.
However, even with this unavoidable cost, this interpreter has been implemented for ease of understanding and not machine efficiency. Writing a faster interpreter amounts to avoiding the construction of the lifted operator matrices. This can be done with very careful index wrangling and sensitivity to data types and allocation. This is how the high-performance Quantum Virtual Machine is implemented.
Examples
What good is writing an interpreter if we don’t write any programs worth interpreting? Here are a few examples of programs.
Bell state
The Bell state is one which we’ve explored earlier. It is a two-qubit state $$\frac{1}{\sqrt{2}}(\ket {00} + \ket {11}).$$ Here’s a program to generate one, using two new gates, the controlled-not gate $\mathsf{CNOT}$ and the Hadamard gate $\mathsf{H}$.
(defparameter +H+ (make-array '(2 2) :initial-contents (let ((s (/ (sqrt 2))))
                                                         (list (list s s)
                                                               (list s (- s))))))
(defparameter +CNOT+ #2A((1 0 0 0)
                         (0 1 0 0)
                         (0 0 0 1)
                         (0 0 1 0)))
(defun bell (p q)
`((GATE ,+H+ ,p)
(GATE ,+CNOT+ ,p ,q)))
Greenberger–Horne–Zeilinger state
The Greenberger–Horne–Zeilinger state, or GHZ state, is a generalization of the Bell state on more than two qubits, namely $$\frac{1}{\sqrt{2}}(\ket{0\ldots 000} + \ket{1\ldots 111}).$$ This is accomplished by executing a chain of controlled-not gates:
(defun ghz (n)
  (cons `(GATE ,+H+ 0)
        (loop :for q :below (1- n)
              :collect `(GATE ,+CNOT+ ,q ,(1+ q)))))
The quantum Fourier transform
The ordinary discrete Fourier transform of a complex vector is a unitary operator, and as such, it can be encoded as a quantum program. We will write a program which computes the Fourier transform of the probability amplitudes of an input quantum state (a time-domain signal), producing a new quantum state whose amplitudes represent components in the frequency domain. This is the central subroutine of Shor’s algorithm, a quantum algorithm that factors integers faster than any known classical method.
First, we will need a gate called the controlled-phase gate $\mathsf{CPHASE}(\theta)$:
(defun cphase (angle)
  (make-array '(4 4) :initial-contents `((1 0 0 0)
                                         (0 1 0 0)
                                         (0 0 1 0)
                                         (0 0 0 ,(cis angle)))))
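As a check on cis: $\operatorname{cis}\theta = e^{i\theta}$, so $\mathsf{CPHASE}(\pi)$ is the controlled-Z gate, whose only effect is to negate the $\ket{11}$ amplitude. A quick Python sketch, with cmath.exp playing the role of Lisp’s cis:

```python
import cmath
import math

def cphase(angle):
    # Controlled-phase gate; cmath.exp(1j*angle) plays the role of cis.
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, cmath.exp(1j * angle)]]

# CPHASE(pi) is controlled-Z: its bottom-right entry is e^{i*pi} = -1
# (up to floating-point rounding).
print(cphase(math.pi)[3][3])
```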
Now, we can generate the quantum Fourier transform recursively.
(defun qft (qubits)
  (labels ((bit-reversal (qubits)
             (let ((n (length qubits)))
               (if (< n 2)
                   nil
                   (loop :repeat (floor n 2)
                         :for qs :in qubits
                         :for qe :in (reverse qubits)
                         :collect `(GATE ,+swap+ ,qs ,qe)))))
           (%qft (qubits)
             (destructuring-bind (q . qs) qubits
               (if (null qs)
                   (list `(GATE ,+H+ ,q))
                   (let ((cR (loop :with n := (1+ (length qs))
                                   :for i :from 1
                                   :for qi :in qs
                                   :for angle := (/ pi (expt 2 (- n i)))
                                   :collect `(GATE ,(cphase angle) ,q ,qi))))
                     (append
                      (qft qs)
                      cR
                      (list `(GATE ,+H+ ,q))))))))
    (append (%qft qubits) (bit-reversal qubits))))
The program for a three-qubit quantum Fourier transform, (qft '(0 1 2)), looks like this:
(
(GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 2)
(GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.0d0 1.0d0))) 1 2)
(GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 1)
(GATE #2A((1 0 0 0) (0 0 1 0) (0 1 0 0) (0 0 0 1)) 1 2)
(GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.7071067811865476d0 0.7071067811865475d0))) 0 1)
(GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.0d0 1.0d0))) 0 2)
(GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 0)
(GATE #2A((1 0 0 0) (0 0 1 0) (0 1 0 0) (0 0 0 1)) 0 2)
)
(Recall that #C(0 1) represents the complex number $i$.)
We can see the quantum Fourier transform in action by computing the Fourier transform of $\ket{000}$. Here is a transcript of this calculation:
CL-USER> (run-quantum-program
          (qft '(0 1 2))
          (make-machine :quantum-state (make-quantum-state 3)
                        :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0))
   :MEASUREMENT-REGISTER 0)
Indeed, one can verify that the classical Fourier transform of the vector $[1,0,0,0,0,0,0,0]$ is a vector with eight components equal to about $0.35355$.
$ python
Python 2.7.16 (default, May 23 2023, 14:13:27)
[GCC 8.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.fft.fft([1,0,0,0,0,0,0,0], norm="ortho")
array([0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j,
0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j])
Example transcript
Here is an example transcript downloading and using this software, using Steel Bank Common Lisp.
$ git clone https://github.com/stylewarning/quantum-interpreter.git
Cloning into 'quantum-interpreter'...
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (10/10), done.
Unpacking objects: 100% (10/10), done.
remote: Total 10 (delta 2), reused 5 (delta 0), pack-reused 0
$ cd quantum-interpreter/
$ sbcl --noinform
* (load "qsim.lisp")
T
* (load "examples.lisp")
T
* (run-quantum-program (bell 0 1)
                       (make-machine :quantum-state (make-quantum-state 2)
                                     :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(0.7071067690849304d0 0.0d0 0.0d0 0.7071067690849304d0)
   :MEASUREMENT-REGISTER 0)
* (run-quantum-program (qft '(0 1 2))
                       (make-machine :quantum-state (make-quantum-state 3)
                                     :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(#C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
#C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0))
   :MEASUREMENT-REGISTER 0)
* (defun flip-coin ()
    (machine-measurement-register
     (run-quantum-program
      `((GATE ,+H+ 0) (MEASURE))
      (make-machine :quantum-state (make-quantum-state 1)
                    :measurement-register 0))))
FLIP-COIN
* (loop :repeat 10 :collect (flipcoin))
(1 1 0 1 1 0 0 1 0 1)
* (quit)
Source code
The source code in this tutorial is published under the BSD 3-clause license. The complete listing and most up-to-date source code can be found on GitHub.
Ports in other languages
Others have written this quantum interpreter in other languages. Here’s a list of ports that people have shared with me:
- Aistis Raulinaitis’s implementation in OCaml
- Graham Enos’s implementation in Rust with a writeup
- Marco Rubin’s implementation in Python, no dependencies
- Francesco Morri’s implementation in Python

A controlled one-qubit gate is a kind of two-qubit gate. ↩︎

It’s actually 124 SLOC, and it has not been “code golfed”. If we wanted to make an even tinier quantum interpreter, we could—but brevity for its own sake is not the point. ↩︎

With only a little bit of extra work, mostly bookkeeping, we could make $n$ finite but unbounded during the execution of a program by having instead a collection of so-called quantum registers. These would be realized instead by a collection of $v$’s, which are opportunistically combined when entanglement occurs. ↩︎

For that matter, why complex numbers, and not just real-valued probabilities? The reason is that a complex number of unit norm can be written as $e^{i\theta}$, where $\theta$ is called the phase. Phases are a wavelike property, and allow the complex probability amplitudes to interfere. Interference is a known and understood phenomenon of quantum mechanical systems, and in fact is critical to the function of a quantum computer. ↩︎

Spaces with all of these properties, including a way to calculate distances, are called Hilbert spaces. ↩︎

The fact of the matter is that we can actually get more general by having classical probability distributions of these states, which leads one to so-called “density operators”. This is extremely useful when studying imperfect quantum computers which have noisy operations. ↩︎

While we won’t use this fact in our interpreter, even though it would be useful for error checking, it is very easy to check if a matrix is unitary. First, we compute another matrix $h$ which is the conjugate-transpose of $g$. The conjugate-transpose of a matrix is just the transpose of a matrix with each complex entry conjugated. Once we have this matrix, we check that $hg$ is an identity matrix. The matrix $g$ is unitary if and only if $hg=gh=\mathsf{I}$. ↩︎

Unfortunately, the definition \eqref{eq:kron} seems somewhat arbitrary and out of nowhere. Fortunately, there is a much more “first principles” approach to understanding the tensor product and the Kronecker product, starting with how we map a pair of vectors $v\in V$ and $w\in W$ to a vector $v\otimes w\in V\otimes W$. Such an approach is much more satisfying to a mathematician, and even essential to understanding the “true nature” of the tensor product, but perhaps less so to a curious implementer. ↩︎

There is no sense in moving $\ket{00}$ or $\ket{11}$ to accomplish a swap-like operation, since we identify each qubit’s respective $\ket 0$ identically, and each $\ket 1$ identically. ↩︎

If you’re not particularly keen to figure out the math yourself, you might consult Lemma 14.1 of these lecture notes. You’re also welcome to just take my word for it! ↩︎