An introduction to quantum probability, quantum mechanics, and quantum
computation
Greg Kuperberg*
UC Davis
(Dated: October 8, 2007)
Quantum mechanics is one of the most surprising
sides of modern physics. Its basic precepts require
only undergraduate or early graduate mathemat-
ics; but because quantum mechanics is surprising,
it is more difficult than these prerequisites suggest.
Moreover, the rigorous and clear rules of quantum
mechanics are sometimes confused with the more difficult
and less rigorous rules of quantum field theory.
Many working mathematicians have an excellent
intuitive grasp of two parent theories of quantum
mechanics, namely classical mechanics and proba-
bility theory. The empirical interpretations of both
of these theories, above and beyond their mathe-
matical formalism, have been a great source of ideas
in mathematics, even for many questions that have
nothing to do with physics or practical statistics. For
example, the probabilistic method of Erdős and others
[?] is a fundamental method in combinatorics
to show the existence of combinatorial objects. In
principle, the precepts of quantum mechanics could
be similarly influential; there could easily be one
or more kinds of “quantum probabilistic method”.
But in practice the precepts of quantum mechan-
ics are not very familiar to most mathematicians.
Two subdisciplines of mathematics that have assim-
ilated these precepts are mathematical physics and
operator algebras. However, much of the intention
of mathematical physics is the converse of our pur-
pose, to apply mathematics to problems in physics.
The theory of operator algebras is close to the spirit
of this article; in this theory what we call quantum
probability is often called “non-commutative proba-
bility”.
Recently quantum computation has entered as a
new reason for both mathematicians and computer
scientists to learn the precepts of quantum mechan-
ics. Just as randomized algorithms can be moder-
ately faster than deterministic algorithms for some
computational problems, quantum algorithms can
be moderately faster or sometimes much faster than
their classical and randomized alternatives. Quan-
tum algorithms can only run on a new kind of com-
puter called a quantum computer. As of this writ-
ing, convincing quantum computers do not exist.
*Electronic address: greg@math.ucdavis.edu
Nonetheless, theoretical results suggest that quan-
tum computers are possible rather than impossi-
ble. Entirely apart from technological implications,
quantum computation is a beautiful subject that
combines mathematics, physics, and computer sci-
ence.
This article is an introduction to quantum prob-
ability theory, quantum mechanics, and quan-
tum computation for the mathematically prepared
reader. Chapters ?? and ?? depend on Section 1
but not on each other, so the reader who is inter-
ested in quantum computation can go directly from
Chapter 1 to Chapter ??.
This article owes a great debt to the textbook on
quantum computation by Nielsen and Chuang [4],
and to the Feynman Lectures, Vol. III [2]. An-
other good textbook written for physics students is
by Sakurai [5].
Exercises
These exercises are meant to illustrate how empir-
ical interpretations can lead to solutions of problems
in pure mathematics.
1. The probabilistic method: The Ramsey num-
ber R(n) is defined as the least R such that if
a simple graph Γ has R vertices, then either it
or its complement must have a complete sub-
graph with n vertices. By considering random
graphs, show that
$$R(n) \ge 2^{(n-1)/2} \, (2(n!))^{1/n}.$$
(The proof can be described as a counting argument.
However, a solution phrased in terms
of probabilistic existence is more in the spirit
of these notes.)
2. Angular momentum: Let S be a smooth surface
of revolution about the z-axis in $\mathbb{R}^3$, and
let $\vec{p}(t)$ be a geodesic arc on S, parameterized
by length, that begins at the point (1,0,0) at
t = 0. Show that $\vec{p}(t)$ never reaches any point
within $|p'_y(0)|$ of the vertical axis.
3. Kirchhoff’s laws: Suppose that a unit square is
tiled by finitely many smaller squares. Show
that the edge lengths are uniquely determined
by the combinatorial structure of the tiling,
and that they are rational. (Hint: Build the
unit square out of material with unit resistivity
with a battery connected to the top and bottom
edges. Cut slits along the vertical edges of
the tiles and affix zero-resistance wires to the
horizontal edges. Each square becomes a unit
resistor in an electrical network.)
1. QUANTUM PROBABILITY
The precepts of quantum mechanics are neither
a set of physical forces nor a geometric model for
physical objects. Rather, they are a generalization
of classical probability theory that modifies the ef-
fects of physical forces. If you have firmly accepted
classical probability, it is tempting to suppose that
quantum mechanics is a set of probabilistic objects,
in effect a special case of probability rather than a
generalization. But this is not true in any reasonable
sense; quantum probability violates certain inequal-
ities that hold in classical probability (Section ??).
It is also tempting to view quantum mechanics as
a deterministic dynamical system that produces
classical probabilities and is otherwise hidden. This
interpretation is not reasonable either.
In physics courses, quantum mechanics is usu-
ally defined in terms of operators acting on Hilbert
spaces. A state of a system is a vector of its Hilbert
space, the vector evolves by unitary operators, the
vector is measured by Hermitian operators, and the
measured values have probability distributions.
Although we will discuss the vector-state model,
we will emphasize the non-commutative probability
model from operator algebras. In this model, a sys-
tem can be fully quantum, or fully classical, or things
in between. The fully quantum case corresponds to
the vector-state model, but even in this case, the
general state is described by an operator rather than
a vector. The states that can be described by vectors
are called pure; the others are mixed states.
The vector-state model of quantum mechanics
was originally known as matrix mechanics and is
due to Heisenberg. The historical alternative is
Schrödinger’s wave mechanics. Wave mechanics is
best understood as a special case of matrix mechan-
ics, and we will describe it this way. The probabilis-
tic interpretation of quantum mechanics is due to
Max Born and is known as the Copenhagen inter-
pretation (Section ??).
Since classical probability is a major analogy for
us, it is reviewed in Section ??. The point is that a
classical probabilistic system (or measurable space)
is an algebra of random variables that satisfies rel-
evant axioms. One of the restrictions on the alge-
bra is commutativity: If x and y are two real- or
complex-valued random variables, then xy and yx
are the same random variable. In quantum prob-
ability, this commutative algebra is replaced by a
non-commutative algebra called a von Neumann al-
gebra. The remaining definitions stay as much the
same as possible.
We will mostly consider finite-dimensional quan-
tum systems. These are enough to show most of
the basic ideas of quantum probability, just as finite
or combinatorial probability is enough to show most
of the basic ideas of classical probability. Infinite-
dimensional quantum systems are discussed in Sec-
tion ??.
To summarize, quantum probability is the most
natural non-commutative generalization of classical
probability. In this author’s opinion, this description
does the most to demystify quantum probability and
quantum mechanics.
1.1. Quantum superpositions
We will begin by discussing part of the pure-state
model of quantum mechanics in order to show the
inadequacy of classical probability.
A pure state of a quantum mechanical system can
be described as a vector of a complex vector space
H. If the system is finite, then we can say that the
vector space is $\mathbb{C}^n$. It will be convenient to label
the basis of this vector space by an arbitrary finite
set A rather than by the numbers from 1 to n; we
can then denote the vector space $\mathbb{C}^A$. The general
state space H is not just a vector space but a Hilbert
space, meaning that it has a positive-definite Hermitian
inner product 〈·|·〉. When H is $\mathbb{C}^n$ or $\mathbb{C}^A$, then
it has the standard inner product
$$\langle\phi|\psi\rangle = \sum_{a \in A} \overline{\phi_a}\,\psi_a.$$
In quantum theory, the traditional notation is |ψ〉
(a “ket”) for a vector ψ and 〈ψ| (a “bra”) for the
corresponding dual vector
$$\langle\psi| = \psi^* = \langle\psi|\cdot\rangle.$$
This notation is due to Dirac [1] and is called “bra-
ket” notation. Recall also that a linear map from a
Hilbert space to itself is called an operator.
In finite quantum mechanics, as in classical probability,
we can define a physical object by specifying a
finite set A of independent configurations. In information
theory (both quantum and classical), the object
is often called “Alice”. Classically, the set of all
normalized states of Alice is the simplex $\Delta_A$ spanned
by A in the vector space $\mathbb{R}^A$ (see Section ??). I.e.,
a general state has the form
$$\mu = \sum_{a \in A} p_a [a]$$
for probabilities $p_a \ge 0$ that sum to 1. (For unnormalized
states, the sum need not be 1.) The number
$p_a$ is interpreted as the probability that Alice is in
state a. Quantumly, Alice’s set of pure states is the
vector space $\mathbb{C}^A$. In other words, a state of Alice is
a vector
$$|\psi\rangle = \sum_{a \in A} \alpha_a |a\rangle$$
with complex coefficients $\alpha_a$ that are called amplitudes.
The square norm $|\alpha_a|^2$ is interpreted as the
probability that Alice is in the configuration |a〉. The
total probability is therefore the sum
$$\langle\psi|\psi\rangle = \sum_{a \in A} |\alpha_a|^2.$$
The state |ψ〉 is normalized if this sum is 1. The
phase of $\alpha_a$ (i.e., its argument or angle as a complex
number) has no direct probabilistic interpretation,
but it becomes important when we consider operators
on |ψ〉. While the relative phase of two coordinates
$\alpha_a$ and $\alpha_{a'}$ is indirectly measurable, it will
turn out that the global phase of |ψ〉 is not mea-
surable, i.e., it is not empirical. Indeed, the global
phase of |ψ〉 is absent from the operator formalism
that we will define in Section 1.3.
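As a small numerical illustration of the rule that the square norm of an amplitude is a probability, here is a minimal Python sketch. The function names and the example state are illustrative assumptions, not part of the text. It computes the configuration probabilities of a vector state and shows that multiplying the state by a global phase leaves them unchanged.

```python
import cmath

def probabilities(psi):
    """Return {configuration: |amplitude|^2} for a vector state given as a dict."""
    return {a: abs(alpha) ** 2 for a, alpha in psi.items()}

def total_probability(psi):
    """Return <psi|psi>, the sum of the squared norms of the amplitudes."""
    return sum(probabilities(psi).values())

# Hypothetical example state on the configuration set A = {"a", "b"}.
psi = {"a": 0.6, "b": 0.8j}

# The same state multiplied by an arbitrary global phase e^{0.7 i}.
phase = cmath.exp(0.7j)
psi_rotated = {a: phase * alpha for a, alpha in psi.items()}

print(probabilities(psi))
print(probabilities(psi_rotated))
```

Both printed dictionaries agree up to rounding, which is one way to see that the global phase carries no probabilistic information on its own.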
The state |ψ〉 is also called a quantum superposi-
tion, an amplitude function, or a wave function. This
last name is motivated by the fact that |ψ〉 typically
satisfies a wave equation in infinite quantum me-
chanics (Example ?? and Section ??). It also pre-
dates the Copenhagen interpretation and arguably
distracts from it.
If A and B (“Alice” and “Bob”) are the configu-
ration sets of two classical systems, then an empiri-
cally allowed map from Alice’s state to Bob’s state
is given by a stochastic linear map
$$M : \mathbb{R}^A \to \mathbb{R}^B,$$
also called a Markov map. The property that M
is linear is the classical superposition principle: dis-
joint probabilities add. In addition, in order to be
stochastic, M must have non-negative entries (so that
probabilities remain non-negative) and its column sums
must be 1 (to conserve probability).
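The two conditions on a stochastic map can be checked mechanically. The following Python sketch is only an illustration: the helper names and the example matrix are assumptions. It tests non-negativity and unit column sums, and applies a map to a distribution.

```python
def is_stochastic(M, tol=1e-12):
    """Check that M (a list of rows) has non-negative entries and unit column sums."""
    rows, cols = len(M), len(M[0])
    nonnegative = all(M[b][a] >= 0 for b in range(rows) for a in range(cols))
    unit_columns = all(
        abs(sum(M[b][a] for b in range(rows)) - 1) <= tol for a in range(cols)
    )
    return nonnegative and unit_columns

def apply_map(M, p):
    """Apply the linear map M to the probability vector p."""
    return [sum(M[b][a] * p[a] for a in range(len(p))) for b in range(len(M))]

# Hypothetical Markov map from a two-state Alice to a two-state Bob.
M = [[0.9, 0.2],
     [0.1, 0.8]]
p = [0.5, 0.5]          # Alice's distribution
q = apply_map(M, p)     # Bob's distribution; total probability is conserved
print(q, sum(q))
```

Linearity of `apply_map` is the classical superposition principle in this finite setting: the output for a mixture of inputs is the same mixture of the outputs.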
In the quantum case, an empirical transition from
Alice’s vector states to Bob’s vector states is a linear
map
$$U : \mathbb{C}^A \to \mathbb{C}^B.$$
The requirement that U is linear is the quantum su-
perposition principle. It appears to contradict the
classical superposition principle, and it is thus an apparent
paradox of quantum probability. (However,
the treatment in Section 1.3 reconciles the two sides
of this paradox.) The entries of U are also called
amplitudes, just as the entries of a stochastic map
are also probabilities. Since we have posited that
$|\alpha_a|^2$ is a probability, U conserves total probability
if and only if
$$\|U\psi\| = \|\psi\|$$
for all $\psi \in \mathbb{C}^A$; i.e., if U is a unitary embedding. If
A = B or at least |A| = |B|, then U is a unitary
operator.
It will be convenient to consider maps that preserve
or decrease probability. Such maps are called
extinction processes; they model random walks that
can terminate, experiments that can be scratched,
etc. A classical map M of this kind is called substochastic.
The corresponding quantum condition is
$$\|U\psi\| \le \|\psi\|,$$
and such a U is called subunitary.
Figure 1: An idealized two-slit experiment. (The path segments are labeled with amplitudes i/2, except for one segment labeled −i/2.)
One traditional, idealized setting for the quantum
superposition principle is a diffraction apparatus
known as the two-slit experiment. Figure 1 shows
the basic idea: A laser emits photons that can travel
through either of two slits in a grating and then may
(or may not) reach a detector. The source has a single
state (the state set A has one element), while the
grating has two states and there are two detectors
(B and C each have two elements). The transitions
for each photon, as it passes from A to B to C, are
described by two subunitary matrices
$$U : \mathbb{C}^A \to \mathbb{C}^B \qquad V : \mathbb{C}^B \to \mathbb{C}^C.$$
We can choose the matrices to be
$$U = \begin{pmatrix} \frac{i}{2} \\ \frac{i}{2} \end{pmatrix} \qquad V = \begin{pmatrix} \frac{i}{2} & \frac{i}{2} \\ \frac{i}{2} & -\frac{i}{2} \end{pmatrix},$$
so that
$$VU = \begin{pmatrix} -\frac{1}{2} \\ 0 \end{pmatrix}.$$
The total amplitude of the photon reaching the top
detector is −1/2 and the probability is 1/4; this case
is called constructive interference. The total amplitude
reaching the bottom detector is 0, so the photon
never reaches it; this case is called destructive interference.
On the other hand, if one of the slits is
blocked, then we can discard one of the states in B,
with the result that each detector is reached with
probability 1/16. The classical superposition principle
would dictate a probability of 1/8 for each detector
with both slits open; thus it is violated.
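The arithmetic of the two-slit example can be reproduced in a few lines of Python. This is only a sketch: the helper names are assumptions, while the matrices U and V are the ones chosen in the text. It computes the detector probabilities with both slits open and with each slit blocked in turn, and compares the result with the classical prediction obtained by adding the one-slit probabilities.

```python
def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

h = 0.5j                      # the amplitude i/2 attached to each path segment
U = [[h], [h]]                # source -> two slits
V = [[h, h], [h, -h]]         # slits -> top and bottom detectors

def detector_probabilities(U, V):
    """Return the probabilities of reaching the top and bottom detectors."""
    slit_amplitudes = matvec(U, [1])
    detector_amplitudes = matvec(V, slit_amplitudes)
    return [abs(z) ** 2 for z in detector_amplitudes]

both_open = detector_probabilities(U, V)
top_slit_only = detector_probabilities([[h], [0]], V)
bottom_slit_only = detector_probabilities([[0], [h]], V)

# Classical superposition would add the two one-slit probabilities.
classical = [t + b for t, b in zip(top_slit_only, bottom_slit_only)]

print(both_open, top_slit_only, bottom_slit_only, classical)
```

With both slits open the probabilities are 1/4 and 0, whereas the classical sum is 1/8 for each detector.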
Figure 2: An angle-dependent detector in the two-slit
experiment.
A natural reaction to the violation of classical su-
perposition is to try to determine which slit the pho-
ton went through. For instance, the detector could
be sensitive to the angle that the photon comes in,
as in Figure 2. Or there could be a detector at one of
the slits that notices that the photon passed through
it. But in any such circumstance, the two paths then
result in different final states (of the experiment as
a whole) rather than in the same state. Thus the
final state vector is
$$|\psi\rangle = \begin{pmatrix} -\frac{1}{4} \\ \pm\frac{1}{4} \end{pmatrix}$$
and its total probability is
$$\langle\psi|\psi\rangle = \|\psi\|^2 = \frac{1}{8},$$
regardless of the phases of path segments to and
from the slits. The lesson is that amplitudes of dif-
ferent trajectories of an object only add when there
is no evidence of which trajectory it took. If the
trajectory is recorded at all, the probabilities add.
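The contrast between adding amplitudes and adding probabilities can be sketched as follows in Python. The function names are assumptions; the amplitudes are those of the two-slit matrices U and V above. When the path is unrecorded, the two path amplitudes to a detector are summed before squaring; when the path is recorded, the two paths end in different final states, so their squared norms are summed instead.

```python
h = 0.5j
U = [h, h]                    # amplitudes from the source to slit 0 and slit 1
V = [[h, h], [h, -h]]         # V[detector][slit]

def path_amplitudes(detector):
    """Amplitudes of the two paths (one per slit) that end at the given detector."""
    return [V[detector][slit] * U[slit] for slit in range(2)]

def probability_unrecorded(detector):
    """No which-path evidence: amplitudes add, then the norm is squared."""
    return abs(sum(path_amplitudes(detector))) ** 2

def probability_recorded(detector):
    """Which-path evidence exists: the paths are distinct final states, so probabilities add."""
    return sum(abs(z) ** 2 for z in path_amplitudes(detector))

for detector in range(2):
    print(detector, probability_unrecorded(detector), probability_recorded(detector))
```

The recorded probabilities are 1/8 for each detector regardless of the signs of the path amplitudes, in agreement with the total probability computed in the text.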
If we want to see quantum superposition, it is not
enough to wittingly or unwittingly ignore such evidence.
Rather, if the two trajectories induce different
states of the universe, so that some observer
could in principle distinguish them, then they obey
classical superposition. Moreover, the effect is not
the result of interaction between photons; photons
do not interact with each other.¹ Indeed, the laser
could be tuned to fire only one photon at a time.
¹ More precisely, significant photon-photon interactions require
the extreme energies of cosmic rays and particle accelerators.
Of course, a two-slit experiment is only an idealization
of a real experiment; however, it is very similar
to many actual experiments and even routine
demonstrations. Note also that diffraction experiments
can portray any operator and therefore any
process in quantum mechanics, just as any classical
stochastic map can be modelled by balls falling
through chutes. The two-slit experiment can be
demonstrated with photons, or electrons or even
molecules, but it really describes general probabilistic
rules. See Sections 1.5 and ?? for more discussion.
Examples 1.1.1. A qubit is a two-state quantum
object with configuration set {0,1}. Two of its
quantum superpositions are:
$$|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}} \qquad |-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}$$
Both of these states have probability 1/2 of being in
either configuration |0〉 or |1〉, but they are different
states. This is demonstrated by the effect of a
unitary operator H called the Hadamard gate:
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
It exchanges |0〉 with |+〉 and |1〉 with |−〉.
The spin state of a spin-1/2 particle is a two-state
system which is important in physics. (Electrons,
protons, and neutrons are all spin-1/2 particles.) The
conventional orthonormal basis is |↑〉 (spin up) and
|↓〉 (spin down). The names of the states refer to
the property of the electron spinning (according to
the right-hand rule) about a vertical axis in these
two states. Even though a rotated electron is still
an electron, neither this configuration set nor any
other is preserved by rotations. The resolution of
this paradox is that rotated states appear as superpositions.
For example, the spin left and spin right
states are analogous to |+〉 and |−〉:
$$|{\rightarrow}\rangle = \frac{|{\uparrow}\rangle + |{\downarrow}\rangle}{\sqrt{2}} \qquad |{\leftarrow}\rangle = \frac{|{\uparrow}\rangle - |{\downarrow}\rangle}{\sqrt{2}}.$$
Although we will soon switch to a more general
model of quantum probability, we can say a few
words about presenting this vector space model in a
basis-independent form. As we said, a quantum object
can be assigned any Hilbert space H rather than
the standard finite-dimensional vector space $\mathbb{C}^A$ for
a configuration set A. Since states evolve by unitary
operators, we can conjugate the standard basis of
$\mathbb{C}^A$ by any unitary operator, to conclude that any
orthonormal basis of any Hilbert space H can be
called a configuration set. Likewise, if we accept a
real-valued function f on A as a random variable,
then we can model it by a diagonal matrix D whose
entries are the values of f. We can then conjugate
that too by a unitary operator U, to conclude that
any Hermitian operator $H = UDU^{-1}$ represents a
real-valued random variable. If the eigenvalues of
H are 0 and 1, so that H = P is a Hermitian pro-
jection, then this corresponds to a Boolean random
variable. These are the basic rules of vector-state
quantum probability, with the exception of the cru-
cial tensor product rule for joint states (Section 1.5).
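To make the conjugation step concrete, the following Python sketch takes the normalized Hadamard gate of Examples 1.1.1 as the unitary operator U and the diagonal matrix D = diag(1, −1) as a random variable on the configuration set {0,1}. The helper names and the choice of D are illustrative assumptions. Conjugating D by the Hadamard gate produces a Hermitian matrix whose values ±1 are attached to the rotated configurations |+〉 and |−〉.

```python
import math

S = 1 / math.sqrt(2)

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(A, v):
    """Multiply a matrix by a column vector."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

HADAMARD = [[S, S], [S, -S]]
KET0, KET1 = [1, 0], [0, 1]
PLUS, MINUS = [S, S], [S, -S]

# The Hadamard gate exchanges |0> with |+> and |1> with |->.
print(matvec(HADAMARD, KET0), matvec(HADAMARD, PLUS))

# A diagonal random variable with values +1 and -1 on the configurations 0 and 1.
D = [[1, 0], [0, -1]]

# H D H^{-1}; the Hadamard gate is its own inverse, so H^{-1} = H here.
X = matmul(matmul(HADAMARD, D), HADAMARD)
print(X)                      # numerically [[0, 1], [1, 0]]
print(matvec(X, PLUS), matvec(X, MINUS))
```

The result is the matrix with off-diagonal entries 1, for which |+〉 and |−〉 are eigenvectors with eigenvalues +1 and −1.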
Exercises
1. Suppose that the lengths of the entries of a
complex matrix U are all fixed, but the phases
are all chosen uniformly randomly. (If you like,
you can also suppose that for any choice of the
amplitudes, U is subunitary.) Show that on
average, each entry of U|ψ〉 satisfies the clas-
sical superposition principle.
2. If U is a matrix, then the matrix
$$M_{ab} = |U_{ab}|^2$$
can be called the dephasing of U. The dephasing of
a unitary matrix is always doubly stochastic,
meaning that the entries are non-negative and
the rows and columns sum to 1. Find a 3 × 3
doubly stochastic matrix which is not the
dephasing of any unitary matrix.
3. Show that every n × k subunitary matrix U
can be extended to an (n+k) × (n+k) unitary
matrix V:
$$V = \begin{pmatrix} U & * \\ * & * \end{pmatrix}.$$
Show that V cannot usually have order less
than n+k.
4. If $U_1, U_2, \ldots, U_n$ are unitary operators, then
each entry of their product
$$U = U_n \cdots U_2 U_1$$
can be expressed as a sum of products of entries
of the factors:
$$\langle a_n|U_n \cdots U_2 U_1|a_0\rangle = \sum_{a_1,\ldots,a_{n-1}} \langle a_n|U_n|a_{n-1}\rangle \cdots \langle a_2|U_2|a_1\rangle \langle a_1|U_1|a_0\rangle.$$
Such an expansion is interpreted as path summation;
it is the same idea as a sum over histories
in classical probability.
For example, let n = 4 and let each
$$U_k = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}.$$
Find the amplitudes of the 16 paths and group
them according to how they sum.
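5. In general for a spin-1/2 particle, the state
$$|\vec{v}\rangle = \alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle$$
spins in the direction
$$\vec{v} = (2\,\mathrm{Re}\,\bar{\alpha}\beta,\ 2\,\mathrm{Im}\,\bar{\alpha}\beta,\ |\alpha|^2 - |\beta|^2).$$
Check that this is a unit vector when $|\vec{v}\rangle$ is
normalized, and that every unit vector in $\mathbb{R}^3$ is
achieved. This formula is therefore a surjective
function from the unit 3-sphere $S^3 \subset \mathbb{C}^2$ to the
2-sphere $S^2 \subset \mathbb{R}^3$. What is its usual name in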
mathematics?
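The path summation of Exercise 4 can be checked numerically. The Python sketch below is an illustration under assumptions: the helper names and the three example factors (a normalized Hadamard gate, a diagonal phase matrix, and the Hadamard gate again) are hypothetical and are deliberately not the matrices of the exercise. It enumerates all intermediate configurations and compares the sum over paths with the corresponding entry of the ordinary matrix product.

```python
from itertools import product
import math

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ordered_product(factors):
    """Return U_n ... U_2 U_1 for factors = [U_1, U_2, ..., U_n]."""
    result = factors[0]
    for U in factors[1:]:
        result = matmul(U, result)
    return result

def path_sum(factors, start, end):
    """Sum the amplitudes of all paths start = a_0 -> a_1 -> ... -> a_n = end."""
    dim = len(factors[0])
    total = 0
    for middle in product(range(dim), repeat=len(factors) - 1):
        path = (start,) + middle + (end,)
        amplitude = 1
        for k, U in enumerate(factors):
            amplitude *= U[path[k + 1]][path[k]]   # <a_{k+1}| U_{k+1} |a_k>
        total += amplitude
    return total

S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]
PHASE = [[1, 0], [0, 1j]]                 # hypothetical diagonal unitary
FACTORS = [HADAMARD, PHASE, HADAMARD]     # U_1, U_2, U_3

U = ordered_product(FACTORS)
for a0 in range(2):
    for a3 in range(2):
        print(a0, a3, path_sum(FACTORS, a0, a3), U[a3][a0])
```

Each printed pair agrees up to rounding; grouping the individual path amplitudes by endpoint shows which paths reinforce and which cancel.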
1.2. A classical review
Since our intention is to generalize classical prob-
ability, we will review some of the notions of this
theory in the finite case.
A classical probabilistic system is most commonly
modelled by a Boolean algebra Ω of random variables.
(In the infinite case, it should be a σ-algebra;
see Section ??.) That the algebra is Boolean means
that it is an algebra over Z/2, and that every element
is an idempotent, $x^2 = x$. The elements of
Ω are called events. They correspond to random
variables that take the values 1 and 0, or equivalently
true and false, or yes and no. (Multiplication
is Boolean AND, while adding 1 is Boolean complementation.)
A state or distribution of Ω is then a
function ρ from Ω to [0,∞] such that:
• ρ(x) ≥ 0, and
• $\rho\bigl(\sum_j x_j\bigr) = \sum_j \rho(x_j)$ when $x_j x_k = 0$ for all
j ≠ k.
The value
$$P[x] = \rho(x)$$
represents the probability of the event x (the probability
that x is true). So the axioms say that probabilities
are non-negative, and probabilities of disjoint
events add. The state ρ is normalized if ρ(1) = 1,
which means that the total probability is 1.
If the algebra Ω is finite, then it is isomorphic to
$(\mathbb{Z}/2)^A$, the (Z/2)-valued functions on some finite
set A or the algebra of subsets of A. The set A is
the set of configurations of Ω.
The complex-valued random variables over Ω or
A form an algebra denoted $L^\infty(\Omega)$ or $\mathbb{C}^A = \ell^\infty(A)$.
This algebra is generated (as an algebra over C) by
the elements of Ω, with addition in Z/2 forgotten
and multiplication retained. In other words, if xy =
z in Ω, then this is also imposed as a relation in
$L^\infty(\Omega)$. (Technically, $L^\infty(\Omega)$ is only the bounded
random variables and is a Banach-space completion
of the algebra so generated, but these concerns are
only important in the infinite case; see Section ??.)
The state ρ extends to a linear functional on $L^\infty(\Omega)$,
so that
$$E[x] = \rho(x)$$
now represents the expected value of x as a random
variable. Also, $L^\infty(\Omega)$ has an involution $x \mapsto x^*$ that
conjugates the coefficients and values of x and that
will be crucial later. The element $x^*$ is called the
adjoint of x and it is also written $x^\dagger$ in the physics
literature.
An equivalent formulation is to write axioms for
an algebra M which can be recognized as $L^\infty(\Omega)$ for
some Boolean algebra Ω. In this approach, M is a
commutative, positive-definite ∗-algebra. By definition:
• M is an associative algebra over the complex
numbers C.
• M has an anti-linear anti-automorphism ∗:
$$(\alpha x)^* = \bar{\alpha} x^* \qquad (x+y)^* = x^* + y^* \qquad (xy)^* = y^* x^*$$
• M is positive-definite, meaning that if $x^* x = 0$,
then x = 0.
• M is commutative; xy = yx for all x and y.
If M is finite-dimensional, then these axioms imply
that M is isomorphic to $\mathbb{C}^A$ for a finite set A, so that
it is indeed equivalent to the other axiom set. (If M
is infinite-dimensional, then these axioms should be
strengthened, as we will discuss in Section ??.)
Here are some other important definitions related
to M.
• An element x ∈ M is self-adjoint if $x = x^*$; it
is positive, or x ≥ 0, if $x = y^* y$ for some y; and
it is Boolean if it is self-adjoint and if $x = x^2$.
• A state is a dual vector $\rho \in M^\#$ which is positive
on positive elements: ρ(x) ≥ 0 if x ≥ 0.
The state ρ is normalized if ρ(1) = 1. (We
write $M^\#$ instead of $M^*$ for the dual space
because ∗ is already used internally to M.)
If you know or suppose that $M \cong \mathbb{C}^A$, then it is not
hard to show that the self-adjoint elements are the
real-valued random variables $\mathbb{R}^A$, the positive elements
are the non-negative random variables $\mathbb{R}^A_{\ge 0}$,
and the Boolean variables are the 0-1-valued variables
$\{0,1\}^A = (\mathbb{Z}/2)^A = \Omega$.
It is also not hard to show that the two definitions
of a state are equivalent. Indeed, the set of normalized
states of M or Ω is the simplex $\Delta_A \subset \mathbb{R}^A$ that
consists of convex sums of elements of A. This simplex
is shown for a two-state system (a randomized
bit) and a three-state system (a randomized trit) in
Figure ??, together with an example element in each
case. To support this picture, we define [a] to be the
state which is definitely a. For example, if a bit is 1
with probability p, then its state is
$$\rho = (1-p)[0] + p[1].$$
The notion of assigning a state or probability distribution
to a probabilistic system has two different
empirical interpretations, and the distinction between
them will be important.² In the frequentist
interpretation, the state of an object is always
a configuration a ∈ A, although you may not know
which one; and a distribution ρ is a summary of
which configuration you witness in repeated trials.
In the Bayesian interpretation, the state of an object
is a probabilistic state $\rho \in \Delta_A$, which however is
observer-dependent; it represents the observer’s rational
belief about which configuration a ∈ A will be
witnessed, whether or not repeated trials are possible.
Frequentism and Bayesianism are mathematically
equivalent. They are only different philosophically,
or they may lead to different practical advice.
However, quantum probability requires a degree
of Bayesianism. Although frequentism will remain
valid in some contexts, strict frequentism is untenable
as the fundamental interpretation of quantum
probability. So it is good practice to think of
a randomized bit, for example, as living in an in-
termediate state between 0 and 1, i.e., a classical
superposition.
² Actually there are several variations of both interpretations
in this endless debate in statistics. We have chosen a fairly
aggressive flavor of frequentism and a fairly conservative
flavor of Bayesianism. This is not entirely fair, but it serves
our pedagogical goals.
Finally, one fundamental operation on states, especially
in the Bayesian interpretation, is the notion
of a conditional state. If p is a Boolean random
variable in M and ρ is a state, then as we said, the
probability of p is P[p] = ρ(p). If p is witnessed by
an observer who knows or believes the prior state ρ,
then afterwards M has an updated state $\rho_p$ given
by the formula
$$\rho_p(x) = \frac{\rho(px)}{\rho(p)}.$$
This is the state ρ conditioned on p, i.e., what ρ
becomes given that p was witnessed. The formula
is not meaningful if ρ(p) = 0, which is to say, if
p is impossible. We also define the unnormalized
conditional state
$$\rho|_p(x) = \rho(px), \qquad (1)$$
which is well-defined regardless of the probability of
p. This state is the empirical posterior state if we
view the measurement of p as an extinction process,
by declaring extinction if p is false.
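In the finite classical case these two formulas reduce to ordinary conditioning of a probability distribution. The following Python sketch is a minimal illustration: the function names and the fair-die example are assumptions, and exact fractions are used to avoid rounding. It represents a state as a dictionary of probabilities on the configuration set and a Boolean random variable as a dictionary of 0/1 values.

```python
from fractions import Fraction

def expectation(rho, x):
    """Return rho(x), the expected value of the random variable x in the state rho."""
    return sum(rho[a] * x[a] for a in rho)

def condition_unnormalized(rho, p):
    """Return the unnormalized conditional state, i.e. x -> rho(p x), as a distribution."""
    return {a: rho[a] * p[a] for a in rho}

def condition(rho, p):
    """Return the conditional state, i.e. x -> rho(p x) / rho(p), as a distribution."""
    prob_p = expectation(rho, p)
    if prob_p == 0:
        raise ValueError("cannot condition on an impossible event")
    return {a: w / prob_p for a, w in condition_unnormalized(rho, p).items()}

# Hypothetical example: a fair die, conditioned on the event "the face is even".
die = {face: Fraction(1, 6) for face in range(1, 7)}
even = {face: 1 if face % 2 == 0 else 0 for face in die}

posterior = condition(die, even)
print(posterior)
print(sum(condition_unnormalized(die, even).values()))   # total weight equals P[even]
```

The total weight of the unnormalized state is the probability of the event, which is why it fits the extinction-process picture: the missing weight is the probability that the event was false.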
Exercises
1.3. Algebras and states
In this section we will define quantum probability
as non-commutative probability. This is the other
end from Section 1.1; it makes quantum probability
seem as similar as possible to classical probability.
We will conclude by showing that the two descriptions
are equivalent.
We chose the previous section’s axioms for an algebra
M of complex random variables so that they
could be made quantum simply by dropping commutativity.
To review, M is a positive-definite ∗-algebra
if
• M is an associative algebra over the complex
numbers C.
• M has an anti-linear anti-automorphism ∗:
$$(\alpha x)^* = \bar{\alpha} x^* \qquad (x+y)^* = x^* + y^* \qquad (xy)^* = y^* x^*$$
• M is positive-definite, meaning that if $x^* x = 0$,
then x = 0.
If M is finite-dimensional, then these axioms are
adequate; they are the main definition of a finite
quantum system.
We can also repeat these related definitions without
changes:
• An element x ∈ M is self-adjoint if $x = x^*$.
Such elements form a real vector space $M_{sa}$.
• An element x ∈ M is positive, or x ≥ 0, if
$x = y^* y$ for some y. If M is finite-dimensional,
the positive elements form a cone $M_+$.
• An element p ∈ M is Boolean if it is self-adjoint
and if $p = p^2$. Such a p is also called a
self-adjoint projection. The Boolean elements
form a set $M_{bool}$.
• A dual vector $\rho \in M^\#$ has an adjoint defined
by $\rho^*(x) = \overline{\rho(x^*)}$, and it is self-adjoint if $\rho^* = \rho$.
The set of self-adjoint dual vectors is the
real vector space $M^{sa}$.
• A state is a dual vector $\rho \in M^\#$ which is positive
on positive elements: ρ(x) ≥ 0 if x ≥ 0.
The set of states is a dual cone $M^+$.
• The state ρ is normalized if ρ(1) = 1. The set
of normalized states is the state region $M^\Delta$.
These definitions yield the following elementary
inclusions:
$$M_{bool} \subset M_+ \subset M_{sa} \subset M$$
$$M^\Delta \subset M^+ \subset M^{sa} \subset M^\#.$$
Classically (i.e., if M is commutative), $M_{sa}$ and
$M_{bool}$ are both closed under multiplication, so that
$M_{sa}$ is a real algebra and $M_{bool}$ is a Boolean algebra.
However, quantumly neither one is closed under
multiplication, so that at first glance, $M_{sa}$ is only a
real vector space and $M_{bool}$ is only a set. Actually,
$M_{bool}$ has somewhat more structure than that; see
Exercise ??. The most important extra structure
at the moment is that M is partially ordered with
respect to positivity: x ≥ y if x − y ≥ 0. The inherited
partial orderings of $M_{sa}$ and $M_{bool}$ are both
important.
If M is finite-dimensional, then we can classify
its structure using the Artin-Wedderburn theorem, because
its positive-definite structure implies that it is
semisimple (Exercise ??). Since the complex numbers
are algebraically closed, the theorem says that
M is isomorphic to a direct sum of matrix algebras:
$$M \cong \bigoplus_k M_{n_k}.$$
In particular, if M is a matrix algebra $M_n$, then it
is as non-commutative as possible. We will call such
an M and the system that it models fully quantum.
In basis-independent form, a fully quantum
system M is the algebra B(H) of operators on a
finite-dimensional Hilbert space H. (The “B” is for
“bounded”, although in this context all operators
are bounded; see Section ??.)
If M is fully quantum, then we can use the matrix
trace to convert a state ρ from a dual vector on M
to an element:
$$\rho(x) = \mathrm{Tr}(\rho x).$$
(Actually this works in general, using the sum of the
traces of the matrix summands.) Then ρ is positive
as a dual vector if and only if it is positive as an
element of M, if and only if it is a positive semi-definite
Hermitian matrix (Exercise ??). Also ρ is normalized
if and only if Tr(ρ) = 1. Because such a ρ is
a matrix and because its diagonal entries are probabilities,
physicists also call it a density matrix or (in
basis-independent form) a density operator. In this
terminology, “density” means probability density, as
in a probability distribution. The diagonal entries
of a density matrix are in fact probabilities of con-
figuration (Section ??).
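The trace formula ρ(x) = Tr(ρx) is easy to evaluate by hand or by machine. The following Python sketch is illustrative only: the helper names and the particular density matrix are assumptions. It computes the expectation of a ±1-valued observable and the probability of a configuration for a qubit state.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

def expectation(rho, x):
    """Evaluate the state rho on the element x as Tr(rho x)."""
    return trace(matmul(rho, x))

# Hypothetical density matrix: Hermitian, positive semi-definite, unit trace.
RHO = [[0.75, 0.25],
       [0.25, 0.25]]

IDENTITY = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]          # a random variable with values +1 and -1
P0 = [[1, 0], [0, 0]]          # the Boolean variable "the configuration is 0"

print(expectation(RHO, IDENTITY))   # normalization: 1.0
print(expectation(RHO, Z))          # 0.5
print(expectation(RHO, P0))         # 0.75, the first diagonal entry of RHO
```

The last line illustrates the remark that the diagonal entries of a density matrix are the probabilities of the configurations.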
Example 1.3.1. The 2×2 matrix algebra $M_2$, or
a system that it models, is a second and better definition
of a qubit. The Pauli spin matrices are a
convenient basis for $(M_2)_{sa}$:
$$\sigma_0 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \sigma_1 = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
$$\sigma_2 = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_3 = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
A state ρ of $M_2$ is positive and normalized if and
only if it is of the form
$$\rho = \frac{I + aX + bY + cZ}{2},$$
where a, b, and c are three real numbers that satisfy
$$a^2 + b^2 + c^2 \le 1.$$
The state region $M_2^\Delta$ is therefore a geometric sphere
in the affine space of unit-trace, 2×2 Hermitian
matrices. It is known as the Bloch sphere.
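The condition on a, b, and c can be checked directly, because the matrix (I + aX + bY + cZ)/2 has eigenvalues (1 ± r)/2 with r the Euclidean length of (a, b, c). The Python sketch below is an illustration under assumptions: the function names and the sample points are hypothetical. It builds the matrix from its Bloch coordinates and tests positivity.

```python
import math

def bloch_state(a, b, c):
    """Return rho = (I + aX + bY + cZ)/2 as a 2x2 matrix (list of rows)."""
    return [[(1 + c) / 2, (a - 1j * b) / 2],
            [(a + 1j * b) / 2, (1 - c) / 2]]

def eigenvalues(a, b, c):
    """Eigenvalues (1 - r)/2 and (1 + r)/2 of bloch_state(a, b, c), with r = |(a, b, c)|."""
    r = math.sqrt(a * a + b * b + c * c)
    return ((1 - r) / 2, (1 + r) / 2)

def is_state(a, b, c, tol=1e-12):
    """The matrix is a (normalized) state exactly when its smaller eigenvalue is >= 0."""
    return eigenvalues(a, b, c)[0] >= -tol

rho = bloch_state(0.6, 0.0, 0.8)      # hypothetical point on the boundary sphere
print(rho, eigenvalues(0.6, 0.0, 0.8))
print(is_state(0.0, 0.0, 0.0))        # center of the ball: True
print(is_state(1.0, 1.0, 0.0))        # outside the ball: False
```

Points with r = 1 give one eigenvalue equal to 0, so the matrix has rank 1; points with r < 1 give two positive eigenvalues.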
As in the example of a qubit, an important difference
between quantum probability and classical
probability is that the state region $M^\Delta$ is not a simplex
(except in the commutative case). But it is always
convex, because it is defined by linear equalities
and inequalities. This convex structure allows classical
superpositions in a quantum setting. Empirically,
if we have two states $\rho_1$ and $\rho_2$ of a quantum
system, and if we prepare a new state ρ by choosing
$\rho_1$ with probability p and $\rho_2$ with probability 1 − p,
then
$$\rho = p\rho_1 + (1-p)\rho_2.$$
(This formula is consistent with the fact that all
probabilities are linear in ρ.)
Figure 3: A Josephson junction qubit: superconducting
aluminum on a silicon chip [3].
But what about quantum superpositions? If M is
fully quantum, then they are also present, in a different
guise from classical superpositions. To help separate
the terminology, classical superpositions are in gen-
eral called mixtures, while quantum superpositions
(when they are defined) are often just called super-
positions.
In general, if K is a convex set in a real vector
space, then a point in K is extremal if
it is not a convex combination of two other points
in K. An elementary theorem in convex geometry
states that if K is compact and finite-dimensional,
then every point in K is a convex linear combination
of its extremal points. If $K = M^\Delta$, then the
extremal points are those states that are not mix-
tures. These states are called pure and other states
are called mixed. By the theorem, every mixed state
is a mixture of pure states. However, in a fully quan-
tum system, the representation of a state as a mix-
ture is never unique (Exercise ??).
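A small numerical illustration of this non-uniqueness, as a sketch in plain Python for a single qubit. The helper names `projector` and `mix` are hypothetical. It builds the same density matrix as an equal mixture of $|0\rangle$ and $|1\rangle$ and as an equal mixture of $(|0\rangle + |1\rangle)/\sqrt{2}$ and $(|0\rangle - |1\rangle)/\sqrt{2}$.

```python
import math

def projector(v):
    """Rank-1 density matrix |v><v| of a vector v in C^2."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def mix(p, rho1, rho2):
    """Classical mixture p*rho1 + (1-p)*rho2."""
    return [[p * rho1[i][j] + (1 - p) * rho2[i][j] for j in range(2)]
            for i in range(2)]

s = 1 / math.sqrt(2)
zero, one = [1.0, 0.0], [0.0, 1.0]
plus, minus = [s, s], [s, -s]

mix_z = mix(0.5, projector(zero), projector(one))
mix_x = mix(0.5, projector(plus), projector(minus))

# Both mixtures give the matrix I/2 up to rounding error.
same_state = all(abs(mix_z[i][j] - mix_x[i][j]) < 1e-12
                 for i in range(2) for j in range(2))
print(same_state)
```

Since all probabilities are linear in $\rho$, no measurement can distinguish the two preparations.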
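If $\mathcal{M} = \mathcal{M}_n$ is fully quantum, then a state
$\rho$ is pure if and only if it has rank 1 as a matrix
(Exercise ??). It then has the form
\[
\rho = \psi \otimes \psi^* = |\psi\rangle\langle\psi|
\]
for some vector $\psi \in \mathbb{C}^n$, since it is also Hermitian.
If $\rho$ is normalized, then in addition $\psi$ is normalized, by
the relation
\[
\operatorname{Tr} \rho = \langle\psi|\psi\rangle.
\]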
In basis-independent form, if $\mathcal{M} = \mathcal{B}(\mathcal{H})$,
and if a state $\rho$ on $\mathcal{M}$ is pure, then it is described
by a vector $|\psi\rangle \in \mathcal{H}$. A configuration set of
$\mathcal{M}$ is, by definition, any orthonormal basis of
$\mathcal{H}$. Any state $|\psi\rangle$ is a complex linear
combination of the configurations, and such a linear combination
can be called a quantum superposition.
To summarize, a pure state of a fully quantum $\mathcal{M}$ is
represented by a vector in a Hilbert space, and it is a quantum
superposition of any orthonormal basis of configurations. However,
the transformation from a vector state $|\psi\rangle$ to the
corresponding density operator $\rho = |\psi\rangle\langle\psi|$ is
non-linear and erases the global phase of $|\psi\rangle$. Therefore
empirical probabilities are a non-linear function of the vector
state, and the global phase of a vector state is not directly
empirical. (But relative phases are empirical, so the global phase
of $|\psi\rangle$ is indirectly empirical, if $|\psi\rangle$ is used
as a summand of another vector.)
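The distinction between global and relative phase can be seen in a short computation. This is a sketch in plain Python with hypothetical helper names `density` and `close`; the phase angle 0.7 is an arbitrary choice.

```python
import cmath
import math

def density(v):
    """Density matrix |v><v| of a vector v."""
    return [[v[i] * v[j].conjugate() for j in range(len(v))]
            for i in range(len(v))]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a)))

s = 1 / math.sqrt(2)
phase = cmath.exp(0.7j)                # an arbitrary unit complex number

psi = [s, s]                           # (|0> + |1>)/sqrt(2)
psi_global = [phase * s, phase * s]    # the whole vector times a phase
psi_relative = [s, phase * s]          # phase on one summand only

global_phase_erased = close(density(psi), density(psi_global))
relative_phase_visible = not close(density(psi), density(psi_relative))
print(global_phase_erased, relative_phase_visible)
```

The first comparison holds because the phase cancels in $|\psi\rangle\langle\psi|$; the second fails because the off-diagonal entries of the density matrix carry the relative phase.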
Example 1.3.2. Since the state region of a qubit is a geometric
sphere, every boundary point is a pure state
$|\psi\rangle\langle\psi|$. It is not hard to check that any two
opposite points form a (line) basis of the Hilbert space
$\mathbb{C}^2$. In the context of quantum computation, the standard
basis of $\mathbb{C}^2$ is called $|0\rangle$ and $|1\rangle$, and
these states are assigned to the top and bottom of the Bloch
sphere, as in Figure ??. Another basis is
\[
|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}
\qquad
|-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}.
\]
These states
In the middle is the uniform state (also called the
maximally mixed or maximum entropy state) ρ =
I/2.
Although probabilities are nonlinear functions of vector states
$|\psi\rangle$, there are several important operations on a quantum
system $\mathcal{M}$ which are linear on vector states. We can call
such operations coherent; they serve to justify the quantum
superposition principle in Section 1.1.
The most important coherent operation is an algebra isomorphism.
Automorphisms and isomorphisms in general are the model of
reversible dynamical systems and reversible physical
transformations. The wrinkle is that algebra isomorphisms, and more
generally algebra homomorphisms, are contravariant, meaning that
they transfer states backward. If Alice and Bob have algebras
$\mathcal{M}_A$ and $\mathcal{M}_B$, then a homomorphism
\[
E : \mathcal{M}_B \to \mathcal{M}_A
\]
transfers states from Alice to Bob by means of its transpose:
\[
E^\# : \mathcal{M}_A^\# \to \mathcal{M}_B^\#.
\]
If Alice and Bob are both fully quantum and have Hilbert spaces
$\mathcal{H}_A$ and $\mathcal{H}_B$ of the same dimension, then
every algebra isomorphism
\[
E : \mathcal{B}(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_A)
\]
is given by a unitary operator $u : \mathcal{H}_A \to \mathcal{H}_B$
by the formula
\[
E(x) = u x u^{-1}.
\]
In more concrete terms, the automorphism group of $\mathcal{M}_n$ as
a $*$-algebra is the unitary group $U(n)$ (Exercise ??). Note that
$u$ is conventionally covariant, so that it transfers pure states
forward, from Alice to Bob.
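For the concrete case of $2 \times 2$ matrices, one can check numerically that conjugation by a unitary respects the algebra structure. The sketch below is plain Python; the unitary `u`, the test matrices `x`, `y`, `rho`, and all helper names are arbitrary illustrative choices. It verifies that $E(x) = u x u^{-1}$ preserves products and adjoints, and that expectations are unchanged when states and observables are conjugated together.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-9):
    """Entrywise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

s = 1 / math.sqrt(2)
u = [[s, s], [1j * s, -1j * s]]   # an arbitrary 2x2 unitary
u_inv = dagger(u)                 # for a unitary, u^{-1} = u^*

def E(x):
    """Conjugation x -> u x u^{-1}."""
    return matmul(matmul(u, x), u_inv)

x = [[1, 2], [3j, 4]]
y = [[0, 1j], [5, -1]]
rho = [[0.75, 0.25], [0.25, 0.25]]   # a unit-trace positive matrix

is_unitary = close(matmul(u, u_inv), [[1, 0], [0, 1]])
multiplicative = close(E(matmul(x, y)), matmul(E(x), E(y)))
star_preserving = close(E(dagger(x)), dagger(E(x)))
same_expectation = abs(trace(matmul(E(rho), E(x)))
                       - trace(matmul(rho, x))) < 1e-9
print(is_unitary, multiplicative, star_preserving, same_expectation)
```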
Note also that this motivation for unitary operators does not
support the analogy with Markov maps in Section 1.1. Rather,
unitary operators are analogous to permutations of configurations
of a classical system, since these are the reversible maps among
Markov maps. This is another way to say that we have resolved the
paradox of that section: Classical and quantum superposition do not
contradict each other because in some ways, they are not analogous.
Section ?? discusses the correct notion of quantum Markov maps,
namely quantum operations. Classical and quantum superposition
coexist for quantum operations, just as they do for states in the
operator formalism.
Another coherent operation is conditioning a state with a Boolean
random variable. If $\rho$ is a state of $\mathcal{M}$ and
$p \in \mathcal{M}^{\mathrm{bool}}$ is Boolean, then the
unnormalized conditional state is defined by
\[
\rho|_p(x) = \rho(pxp).
\]
This reduces to equation (??) when $\mathcal{M}$ is commutative, but
it cannot be exactly the same formula as before because $\rho(px)$
is not positive as a dual vector in $x$, nor even self-adjoint. It
is easy to check that if $\mathcal{M}$ is fully quantum, then the
pure state $|\psi\rangle$ conditions to the pure state
$p|\psi\rangle$. So conditioning a state is linear on pure states.
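Both claims can be illustrated for a qubit. The sketch below is plain Python with hypothetical helper names. It takes the pure state $(|0\rangle + |1\rangle)/\sqrt{2}$ and the Boolean $p = |0\rangle\langle 0|$, and identifies the functional $x \mapsto \rho(pxp)$ with the matrix $p\rho p$ and the functional $x \mapsto \rho(px)$ with the matrix $\rho p$, using cyclicity of the trace.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

def projector(v):
    """Density matrix |v><v| of a vector v."""
    return [[v[i] * v[j].conjugate() for j in range(len(v))]
            for i in range(len(v))]

s = 1 / math.sqrt(2)
psi = [s, s]                  # the pure state (|0> + |1>)/sqrt(2)
rho = projector(psi)
p = [[1, 0], [0, 0]]          # the Boolean p = |0><0|

# x -> rho(p x p) = Tr(p rho p x), so its matrix is p rho p.
conditional = matmul(matmul(p, rho), p)

# Conditioning the vector directly gives p|psi>.
p_psi = [sum(p[i][k] * psi[k] for k in range(2)) for i in range(2)]

# x -> rho(p x) = Tr(rho p x) has matrix rho p.
naive = matmul(rho, p)

pure_to_pure = close(conditional, projector(p_psi))
naive_is_hermitian = close(naive, dagger(naive))
print(pure_to_pure, naive_is_hermitian)
```

In this example the matrix $\rho p$ is not Hermitian because $\rho$ and $p$ do not commute, while $p\rho p$ is the (unnormalized) pure state of the vector $p|\psi\rangle$.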
With a bit of modification, unitary operators and
projections generate all subunitary maps between
any two Hilbert spaces (Exercise ??). These are
all of the linear operators, or coherent operations,
used in Section ??. Their construction establishes
the quantum superposition principle as a corollary
of non-commutativity. In Section ??, quantum su-
perpositions appear in another way, as a corollary of
the classical superposition principle.
Exercises
1.4. Measurements
In this section we will look more closely at measuring quantum
random variables. A random variable
$x \in \mathcal{M}^{\mathrm{sa}}$ (or more generally a classical
realm $\mathcal{A} \subseteq \mathcal{M}$ as defined below) is also
called a measurable or an observable. To measure it is to pass to
the conditional state, just as is done in classical probability.
We said that if $p$ is a Boolean random variable in an algebra
$\mathcal{M}$, then the unnormalized conditional state is
\[
\rho|_p(x) = \rho(pxp).
\]
The normalized conditional state is thus
\[
\rho_p(x) = \frac{\rho(pxp)}{\rho(p)},
\]
or in the vector-state case,
\[
|\psi_p\rangle = \frac{p|\psi\rangle}{\sqrt{\langle\psi|p|\psi\rangle}}.
\]
Conditioning on a measurement is also called “state collapse” or
“wave function collapse”, but in the context of non-commutative
probability, this is an overly dramatic term. The concept of a
conditional state is very natural in classical probability; and it
is equally natural and not all that different in quantum
probability. See Section ??.
The behavior of non-commuting random variables
...
Example 1.4.1.
If any set of Boolean variables in $\mathcal{M}$ all commute, or
indeed if any set of self-adjoint elements in $\mathcal{M}$ all
commute, then they generate a commutative, positive-definite
$*$-subalgebra $\mathcal{A} \subseteq \mathcal{M}$. We will call
such an $\mathcal{A}$ a classical realm in $\mathcal{M}$. Elements
in $\mathcal{A}$ are elements in $\mathcal{M}$, and states on
$\mathcal{M}$ restrict to states on $\mathcal{A}$. While we work
within $\mathcal{A}$, we are free to use any notion or result from
classical probability without modification. In particular, the
elements that we chose to generate $\mathcal{A}$ have a joint
distribution, and the order in which they are measured does not
matter.
If a classical realm $\mathcal{A} \subseteq \mathcal{M}$ is
finite-dimensional, then it is isomorphic to $\mathbb{C}^A$ for some
set $A$. Thus a state $\rho$ on $\mathcal{M}$ induces a probability
for each outcome $a \in A$. Moreover, $\mathcal{A}$ has a basis
$\{p_a\}$ of minimal projections indexed by the set $A$. We can then
say that $\mathcal{A}$ is a model of an $A$-valued random variable
with postconditioned states, using the same formulas as for a
single Boolean variable:
\[
P[a] = \rho(p_a)
\qquad
\rho_a(x) = \frac{\rho(p_a x p_a)}{\rho(p_a)}.
\]
We can also run the construction backwards to build $\mathcal{A}$
from its minimal projections. Say that two Booleans $p$ and $q$ are
mutually exclusive if $pq = 0$ (so that they necessarily commute).
An $A$-valued random variable is then in general defined by a set
of mutually exclusive Booleans that sum to 1:
\[
\sum_{a \in A} p_a = 1
\qquad
a \ne b \implies p_a p_b = 0.
\]
If $\mathcal{M} = \mathcal{B}(\mathcal{H})$ is fully quantum, then
this system of projections is equivalent to an orthogonal direct
sum decomposition of the Hilbert space $\mathcal{H}$:
\[
\mathcal{H} = \bigoplus_{a \in A} \mathcal{H}_a.
\]
It will also be convenient to generalize a classical realm to allow
some of the projections $p_a$ to vanish. This is equivalent to
making the realm a homomorphism $\mathcal{A} \to \mathcal{M}$ rather
than a subalgebra. Section ?? discusses a much more significant
generalization known as a POVM.
The most important case of a classical realm $\mathcal{A}$ is one
generated by a single self-adjoint element
$x \in \mathcal{M}^{\mathrm{sa}}$. In this case the structure of
$\mathcal{A}$ implies that $x$ has a spectral decomposition,
\[
x = \sum_{\lambda \in \sigma(x)} \lambda p_\lambda,
\]
where the value set of the measurement, $A = \sigma(x)$, is also
called the spectrum of $x$. The probability formula,
\[
P[x = \lambda] = \rho(p_\lambda),
\]
is then consistent with the expectation interpretation of the
state $\rho$,
\[
E[x] = \rho(x).
\]
So $\mathcal{M}^{\mathrm{sa}}$ is the space of real-valued random
variables, just as it was classically. (Note that if $\mathcal{M}$
is fully quantum, then the structure theorem for this $\mathcal{A}$
is equivalent to the spectral theorem for Hermitian matrices.)
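As a worked instance, consider the Pauli matrix $X$ on a qubit, whose spectrum is $\{+1, -1\}$ with spectral projections $p_{\pm} = (I \pm X)/2$. The sketch below is plain Python; the state `rho` is an arbitrary choice, namely $(I + 0.6X + 0.3Z)/2$, and the helper names are hypothetical. It checks that $\sum_\lambda \lambda\, P[x = \lambda]$ agrees with $\rho(x) = \operatorname{Tr}(\rho x)$.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

X = [[0, 1], [1, 0]]

# Spectral projections of X, indexed by eigenvalue.
projections = {
    +1: [[0.5, 0.5], [0.5, 0.5]],      # (I + X)/2
    -1: [[0.5, -0.5], [-0.5, 0.5]],    # (I - X)/2
}

# Reassemble x = sum over lambda of lambda * p_lambda.
rebuilt = [[sum(lam * p[i][j] for lam, p in projections.items())
            for j in range(2)] for i in range(2)]

rho = [[0.65, 0.3], [0.3, 0.35]]       # (I + 0.6 X + 0.3 Z)/2

prob = {lam: trace(matmul(rho, p)) for lam, p in projections.items()}
expectation_from_probs = sum(lam * prob[lam] for lam in prob)
expectation_from_state = trace(matmul(rho, X))
print(prob, expectation_from_probs, expectation_from_state)
```

With this choice of state the two outcomes have probabilities 0.8 and 0.2 up to rounding, and both ways of computing the expectation give 0.6.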
Another important type of classical realm $\mathcal{A}$ is a
maximal commutative $*$-subalgebra of $\mathcal{M}$. (By
definition, $\mathcal{A}$ is not contained in any larger
commutative $*$-subalgebra $\mathcal{B}$.) If
$\mathcal{M} = \mathcal{B}(\mathcal{H})$ is fully quantum, then it
is easy to show that $\mathcal{A}$ is maximal if and only if
$\mathcal{A}$ and $\mathcal{H}$ have the same dimension
(Exercise ??); indeed $\mathcal{A}$ consists of the diagonal
matrices with respect to some orthonormal basis $A$ of
$\mathcal{H}$. Each minimal projection $p_a$ of $\mathcal{A}$ has
rank 1. It follows that the conditional state $\rho_a$ does not
depend on $\rho$; it is always the state
\[
\rho_a = p_a = |a\rangle\langle a|.
\]
Although the basis $A$ need only be a line basis of $\mathcal{H}$,
it is often convenient to make it a vector basis, so that
$\mathcal{H} = \mathbb{C}^A$. The set $A$ is a configuration set, in
keeping with Section 1.1. If $n = |A| = \dim \mathcal{H}$, then we
say that $\mathcal{M}$ is an $n$-state system, even though
technically $n$ is the number of configurations rather than the
number of states. For example, a qubit can also be described as any
fully quantum 2-state system.
A maximal classical realm $\mathcal{A}$ is also called a complete
measurement. The name evokes the fact that once any configuration
$a \in A$ is measured, the conditional state is pure and determined
by $a$, so there is no more left to learn from the state of
$\mathcal{M}$. (Nonetheless, $\mathcal{M}$ has many different
complete measurements; and as we said, even a pure state is
typically still a source of perpetual randomness.) Note also that
if $\mathcal{M}$ is fully quantum and we have chosen a basis so that
$\mathcal{A}$ consists of diagonal matrices, then the diagonal entry
$\rho_{aa}$ of a state $\rho$ is just the probability of the outcome
$a \in A$. We can view $\rho$ as a classical probability
distribution on $A$, plus extra off-diagonal information.
If $\mathcal{M}$ is not fully quantum, then some of the above
analysis has to be modified. Nonetheless, it is still true that all
of the conditional states of a maximal classical realm
$\mathcal{A}$ are pure, that a set $A$ of such outcomes is called a
configuration set, and that any two configuration sets have the
same cardinality. If
\[
\mathcal{M} \cong \bigoplus_k \mathcal{M}_{n_k},
\]
then the cardinality of $A$ is the total sum of the matrix sizes,
$n = \sum_k n_k$.
Similar to a complete measurement, if ρ is a pure
state, then there is an associated minimal Boolean
p with the same matrix as ρ which answers whether
the system is in the state ρ. If p and q are two such
minimal Booleans, then Tr(pq) is both the proba-
bility that the state p will be found in the state q,
and vice-versa; it can be called the overlap between
p and q. If M is fully quantum, so that p and q
have state vectors |a〉 and |b〉, then their overlap is
$|\langle a|b\rangle|^2$. Two pure states are mutually exclusive if
and only if they have no overlap (Exercise ??).
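The overlap formula can be checked directly for a qubit. In the following plain-Python sketch (helper names are hypothetical), the minimal Booleans are the rank-1 projections onto $|0\rangle$, $|1\rangle$, and $(|0\rangle + |1\rangle)/\sqrt{2}$.

```python
import math

def projector(v):
    """Rank-1 projection |v><v| onto a unit vector v."""
    return [[v[i] * v[j].conjugate() for j in range(len(v))]
            for i in range(len(v))]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inner(a, b):
    """Hilbert space inner product <a|b>."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
zero, one, plus = [1.0, 0.0], [0.0, 1.0], [s, s]
p, q, r = projector(zero), projector(plus), projector(one)

overlap_from_trace = trace(matmul(p, q))              # Tr(pq)
overlap_from_vectors = abs(inner(zero, plus)) ** 2    # |<a|b>|^2
exclusive_product = matmul(p, r)                      # orthogonal states
print(overlap_from_trace, overlap_from_vectors, exclusive_product)
```

Here $\operatorname{Tr}(pq)$ and $|\langle a|b\rangle|^2$ agree, and the two orthogonal states have $pr = 0$, so they are mutually exclusive with zero overlap.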
Unlike real-valued random variables, there are two notions of
complex-valued and vector-valued random variables. We can let $z$
be any element of $\mathcal{M}$, which is the complexification of
$\mathcal{M}^{\mathrm{sa}}$, so that
\[
z = x + iy
\qquad
x = \frac{z + z^*}{2}
\qquad
y = \frac{z - z^*}{2i}.
\]
Then $z$ is a complex random variable in a weak sense, because $x$
and $y$ are both self-adjoint and are therefore both real random
variables. The wrinkle is that $x$ and $y$ may not commute with
each other, in which case $z$ and a state $\rho$ do not yield a
distribution on $\mathbb{C}$. If the real and imaginary parts $x$
and $y$ do commute, or equivalently if $z$ and $z^*$ commute, then
$z$ is normal. A state $\rho$ and a normal $z$ generate a classical
realm and a classical state on $\mathbb{C}$ as usual.
Likewise a vector-valued random variable in the weak sense is any
vector $\vec{v} \in \mathcal{M} \otimes V$ for some vector space
$V$. We can say that $\vec{v}$ is normal when its components commute
in any basis of $V$, in which case it generates a classical realm
and a classical state on $V$, given a state $\rho$ on
$\mathcal{M}$. An important example of the weak kind of
vector-valued random variable is the angular momentum operator
(Section ??).
Note that the set $\mathcal{M}^{\mathrm{nor}}$ of normal elements of
$\mathcal{M}$ is not closed under either addition or multiplication
(Exercise ??). The same is true of the set
$(\mathcal{M} \otimes V)^{\mathrm{nor}}$ of vector-valued
measurements, or in general the $A$-valued measurements where the
set $A$ is an abelian group. Only the real random variables,
$\mathcal{M}^{\mathrm{sa}}$, have the special property that they can
be added even if they do not commute.
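One possible pair of counterexamples for $2 \times 2$ matrices is sketched below in plain Python. The specific matrices and the helper `is_normal` are illustrative choices, not taken from the text: $X$ and $iY$ are each normal but their sum is not, and the projections onto $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$ are each normal but their product is not.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]

def is_normal(z, tol=1e-12):
    """True when z commutes with its adjoint."""
    zs = dagger(z)
    left, right = matmul(z, zs), matmul(zs, z)
    return all(abs(left[i][j] - right[i][j]) < tol
               for i in range(len(z)) for j in range(len(z)))

X = [[0, 1], [1, 0]]       # Pauli X, self-adjoint
iY = [[0, 1], [-1, 0]]     # i times Pauli Y, anti-self-adjoint
total = add(X, iY)         # the nilpotent matrix [[0, 2], [0, 0]]

p = [[1, 0], [0, 0]]           # projection onto |0>
q = [[0.5, 0.5], [0.5, 0.5]]   # projection onto (|0> + |1>)/sqrt(2)
product = matmul(p, q)

print(is_normal(X), is_normal(iY), is_normal(total))
print(is_normal(p), is_normal(q), is_normal(product))
```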
1.4.1. Exercises
1.5. Joint systems
[1] Paul A. Dirac, Principles of quantum mechanics, Ox-
ford University Press, 1930.
[2] Richard P. Feynman, Robert B. Leighton, and
Matthew Sands, The Feynman lectures on physics.
Vol. 3: quantum mechanics, Addison-Wesley, 1965.
[3] K. M. Lang, S. Nam, J. Aumentado, C. Urbina,
and John M. Martinis, Banishing quasiparticles from
Josephson-junction qubits: why and how to do it,
IEEE Trans. Appl. Superconduct. 13 (2003), no. 2,
989–993.
[4] Michael A. Nielsen and Isaac L. Chuang, Quantum
computation and quantum information, Cambridge
University Press, Cambridge, 2000.
[5] Jun John Sakurai, Modern quantum mechanics, 2nd
ed., Benjamin/Cummings, 1985.