An introduction to quantum probability, quantum mechanics, and quantum

computation

Greg Kuperberg*

UC Davis

(Dated: October 8, 2007)

Quantum mechanics is one of the most surprising

sides of modern physics. Its basic precepts require

only undergraduate or early graduate mathemat-

ics; but because quantum mechanics is surprising,

it is more difficult than these prerequisites suggest.

Moreover, the rigorous and clear rules of quantum

mechanics are sometimes confused with the more difficult

and less rigorous rules of quantum field theory.

Many working mathematicians have an excellent

intuitive grasp of two parent theories of quantum

mechanics, namely classical mechanics and proba-

bility theory. The empirical interpretations of both

of these theories, above and beyond their mathe-

matical formalism, have been a great source of ideas

in mathematics, even for many questions that have

nothing to do with physics or practical statistics. For

example, the probabilistic method of Erdős and others

[?] is a fundamental method in combinatorics

to show the existence of combinatorial objects. In

principle, the precepts of quantum mechanics could

be similarly influential; there could easily be one

or more kinds of “quantum probabilistic method”.

But in practice the precepts of quantum mechan-

ics are not very familiar to most mathematicians.

Two subdisciplines of mathematics that have assim-

ilated these precepts are mathematical physics and

operator algebras. However, much of the intention

of mathematical physics is the converse of our pur-

pose, to apply mathematics to problems in physics.

The theory of operator algebras is close to the spirit

of this article; in this theory what we call quantum

probability is often called “non-commutative proba-

bility”.

Recently quantum computation has entered as a

new reason for both mathematicians and computer

scientists to learn the precepts of quantum mechan-

ics. Just as randomized algorithms can be moder-

ately faster than deterministic algorithms for some

computational problems, quantum algorithms can

be moderately faster or sometimes much faster than

their classical and randomized alternatives. Quan-

tum algorithms can only run on a new kind of com-

puter called a quantum computer. As of this writ-

ing, convincing quantum computers do not exist.

*Electronic address: greg@math.ucdavis.edu

Nonetheless, theoretical results suggest that quan-

tum computers are possible rather than impossi-

ble. Entirely apart from technological implications,

quantum computation is a beautiful subject that

combines mathematics, physics, and computer sci-

ence.

This article is an introduction to quantum prob-

ability theory, quantum mechanics, and quan-

tum computation for the mathematically prepared

reader. Chapters ?? and ?? depend on Section 1

but not on each other, so the reader who is inter-

ested in quantum computation can go directly from

Chapter 1 to Chapter ??.

This article owes a great debt to the textbook on

quantum computation by Nielsen and Chuang [4],

and to the Feynman Lectures, Vol. III [2]. An-

other good textbook written for physics students is

by Sakurai [5].

Exercises

These exercises are meant to illustrate how empir-

ical interpretations can lead to solutions of problems

in pure mathematics.

1. The probabilistic method: The Ramsey num-

ber R(n) is defined as the least R such that if

a simple graph Γ has R vertices, then either it

or its complement must have a complete sub-

graph with n vertices. By considering random

graphs, show that

$$R(n) \ge 2^{(n-1)/2}\,\left(\frac{n!}{2}\right)^{1/n}.$$

(The proof can be described as a counting argument.

However, a solution phrased in terms

of probabilistic existence is more in the spirit

of these notes.)

2. Angular momentum: Let $S$ be a smooth surface of revolution about the $z$-axis in $\mathbb{R}^3$, and let $\vec{p}(t)$ be a geodesic arc on $S$, parameterized by length, that begins at the point $(1,0,0)$ at $t = 0$. Show that $\vec{p}(t)$ never reaches any point within $|p_y'(0)|$ of the vertical axis.

3. Kirchhoff’s laws: Suppose that a unit square is tiled by finitely many smaller squares. Show that the edge lengths are uniquely determined by the combinatorial structure of the tiling, and that they are rational. (Hint: Build the unit square out of material with unit resistivity with a battery connected to the top and bottom edges. Cut slits along the vertical edges of the tiles and affix zero-resistance wires to the horizontal edges. Each square becomes a unit resistor in an electrical network.)

1. QUANTUM PROBABILITY

The precepts of quantum mechanics are neither

a set of physical forces nor a geometric model for

physical objects. Rather, they are a generalization

of classical probability theory that modifies the ef-

fects of physical forces. If you have firmly accepted

classical probability, it is tempting to suppose that

quantum mechanics is a set of probabilistic objects,

in effect a special case of probability rather than a

generalization. But this is not true in any reasonable

sense; quantum probability violates certain inequal-

ities that hold in classical probability (Section ??).

It is also tempting to view quantum mechanics as

a deterministic dynamical system that produces

classical probabilities and is otherwise hidden. This

interpretation is not reasonable either.

In physics courses, quantum mechanics is usu-

ally defined in terms of operators acting on Hilbert

spaces. A state of a system is a vector of its Hilbert

space, the vector evolves by unitary operators, the

vector is measured by Hermitian operators, and the

measured values have probability distributions.

Although we will discuss the vector-state model,

we will emphasize the non-commutative probability

model from operator algebras. In this model, a sys-

tem can be fully quantum, or fully classical, or things

in between. The fully quantum case corresponds to

the vector-state model, but even in this case, the

general state is described by an operator rather than

a vector. The states that can be described by vectors

are called pure; the others are mixed states.

The vector-state model of quantum mechanics

was originally known as matrix mechanics and is

due to Heisenberg. The historical alternative is

Schrödinger’s wave mechanics. Wave mechanics is

best understood as a special case of matrix mechan-

ics, and we will describe it this way. The probabilis-

tic interpretation of quantum mechanics is due to

Max Born and is known as the Copenhagen inter-

pretation (Section ??).

Since classical probability is a major analogy for

us, it is reviewed in Section ??. The point is that a

classical probabilistic system (or measurable space)

is an algebra of random variables that satisfies rel-

evant axioms. One of the restrictions on the alge-

bra is commutativity: If x and y are two real- or

complex-valued random variables, then xy and yx

are the same random variable. In quantum prob-

ability, this commutative algebra is replaced by a

non-commutative algebra called a von Neumann al-

gebra. The remaining definitions stay as much the

same as possible.

We will mostly consider finite-dimensional quan-

tum systems. These are enough to show most of

the basic ideas of quantum probability, just as finite

or combinatorial probability is enough to show most

of the basic ideas of classical probability. Infinite-

dimensional quantum systems are discussed in Sec-

tion ??.

To summarize, quantum probability is the most

natural non-commutative generalization of classical

probability. In this author’s opinion, this description

does the most to demystify quantum probability and

quantum mechanics.

1.1. Quantum superpositions

We will begin by discussing part of the pure-state

model of quantum mechanics in order to show the

inadequacy of classical probability.

A pure state of a quantum mechanical system can

be described as a vector of a complex vector space

$\mathcal{H}$. If the system is finite, then we can say that the vector space is $\mathbb{C}^n$. It will be convenient to label the basis of this vector space by an arbitrary finite set $A$ rather than by the numbers from 1 to $n$; we can then denote the vector space $\mathbb{C}^A$. The general state space $\mathcal{H}$ is not just a vector space but a Hilbert space, meaning that it has a positive-definite Hermitian inner product $\langle\cdot|\cdot\rangle$. When $\mathcal{H}$ is $\mathbb{C}^n$ or $\mathbb{C}^A$, then it has the standard inner product

$$\langle\phi|\psi\rangle = \sum_{a\in A} \overline{\phi_a}\,\psi_a.$$

In quantum theory, the traditional notation is |ψ〉

(a “ket”) for a vector ψ and 〈ψ| (a “bra”) for the

corresponding dual vector

$$\langle\psi| = \psi^* = \langle\psi|\cdot\rangle.$$

This notation is due to Dirac [1] and is called “bra-

ket” notation. Recall also that a linear map from a

Hilbert space to itself is called an operator.

In finite quantum mechanics, as in classical proba-

bility, we can define a physical object by specifying a

finite set A of independent configurations. In infor-

mation theory (both quantum and classical), the ob-

ject is often called “Alice”. Classically, the set of all

normalized states of Alice is the simplex $\Delta_A$ spanned by $A$ in the vector space $\mathbb{R}^A$ (see Section ??). I.e., a general state has the form

$$\mu = \sum_{a\in A} p_a [a]$$

for probabilities $p_a \ge 0$ that sum to 1. (For unnormalized states, the sum need not be 1.) The number $p_a$ is interpreted as the probability that Alice is in state $a$. Quantumly, Alice’s set of pure states is the vector space $\mathbb{C}^A$. In other words, a state of Alice is a vector

$$|\psi\rangle = \sum_{a\in A} \alpha_a |a\rangle$$

with complex coefficients $\alpha_a$ that are called amplitudes. The square norm $|\alpha_a|^2$ is interpreted as the probability that Alice is in the configuration $|a\rangle$. The total probability is therefore the sum

$$\langle\psi|\psi\rangle = \sum_{a\in A} |\alpha_a|^2.$$

The state |ψ〉 is normalized if this sum is 1. The

phase of $\alpha_a$ (i.e., its argument or angle as a complex number) has no direct probabilistic interpretation, but it becomes important when we consider operators on $|\psi\rangle$. While the relative phase of two coordinates $\alpha_a$ and $\alpha_{a'}$ is indirectly measurable, it will

turn out that the global phase of |ψ〉 is not mea-

surable, i.e., it is not empirical. Indeed, the global

phase of |ψ〉 is absent from the operator formalism

that we will define in Section 1.3.
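As a small illustration of the rule that $|\alpha_a|^2$ is a probability, here is a minimal Python sketch. The dictionary representation and the function names (`total_probability`, `normalize`, `probabilities`) are illustrative assumptions, not notation from these notes.

```python
import math

def total_probability(psi):
    """Return <psi|psi>, the sum of |alpha_a|^2 over all configurations a."""
    return sum(abs(alpha) ** 2 for alpha in psi.values())

def normalize(psi):
    """Rescale the amplitudes so that the total probability is 1."""
    norm = math.sqrt(total_probability(psi))
    if norm == 0:
        raise ValueError("the zero vector cannot be normalized")
    return {a: alpha / norm for a, alpha in psi.items()}

def probabilities(psi):
    """Map each configuration a to its probability |alpha_a|^2."""
    return {a: abs(alpha) ** 2 for a, alpha in psi.items()}

# An unnormalized state on the (hypothetical) configuration set {"left", "right"}.
psi = normalize({"left": 3, "right": 4j})
probs = probabilities(psi)
print(probs)
```

The amplitude of "right" is purely imaginary, but only its absolute value enters the probability; the phase matters only once operators act on the state.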

The state |ψ〉 is also called a quantum superposi-

tion, an amplitude function, or a wave function. This

last name is motivated by the fact that $|\psi\rangle$ typically

satisfies a wave equation in infinite quantum me-

chanics (Example ?? and Section ??). It also pre-

dates the Copenhagen interpretation and arguably

distracts from it.

If A and B (“Alice” and “Bob”) are the configu-

ration sets of two classical systems, then an empiri-

cally allowed map from Alice’s state to Bob’s state

is given by a stochastic linear map

$$M : \mathbb{R}^A \to \mathbb{R}^B,$$

also called a Markov map. The property that M

is linear is the classical superposition principle: dis-

joint probabilities add. In addition, in order to be

stochastic, M must have positive entries (so that

probabilities remain positive) and its column sums

must be 1 (to conserve probability).
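The two conditions on a stochastic map can be checked mechanically. The following is a minimal sketch, assuming a matrix stored as a list of rows with `M[b][a]` the probability of the transition from configuration `a` to configuration `b`; the names `is_stochastic` and `apply_map` are hypothetical.

```python
def is_stochastic(M, tol=1e-12):
    """Check non-negative entries and column sums equal to 1."""
    n_cols = len(M[0])
    nonnegative = all(entry >= 0 for row in M for entry in row)
    columns_sum_to_one = all(
        abs(sum(row[a] for row in M) - 1.0) <= tol for a in range(n_cols)
    )
    return nonnegative and columns_sum_to_one

def apply_map(M, p):
    """Apply the linear map M to the state p (a list of probabilities)."""
    return [sum(entry * weight for entry, weight in zip(row, p)) for row in M]

M = [[0.9, 0.5],
     [0.1, 0.5]]
p = [0.25, 0.75]          # Alice's state
q = apply_map(M, p)       # Bob's state
print(q, sum(q))
```

Because every column of `M` sums to 1, the total probability of `q` equals that of `p`.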

In the quantum case, an empirical transition from

Alice’s vector states to Bob’s vector states is a linear

map

$$U : \mathbb{C}^A \to \mathbb{C}^B.$$

The requirement that U is linear is the quantum su-

perposition principle. It appears to contradict the

classical superposition principle, and it is thus an apparent

paradox of quantum probability. (However,

the treatment in Section 1.3 reconciles the two sides

of this paradox.) The entries of U are also called

amplitudes, just as the entries of a stochastic map

are also probabilities. Since we have posited that

$|\alpha_a|^2$ is a probability, $U$ conserves total probability if and only if

$$\|U\psi\| = \|\psi\|$$

for all $\psi \in \mathbb{C}^A$; i.e., if $U$ is a unitary embedding. If

A = B or at least |A| = |B|, then U is a unitary

operator.

It will be convenient to consider maps that pre-

serve or decrease probability. Such maps are called

extinction processes; they model random walks that can terminate, experiments that can be scratched, etc. A classical map $M$ of this kind is called substochastic. The corresponding quantum condition is

$$\|U\psi\| \le \|\psi\|$$

and such a $U$ is called subunitary.

Figure 1: An idealized two-slit experiment. (Path segments are labeled with the amplitudes i/2 and −i/2.)

One traditional, idealized setting for the quantum superposition principle is a diffraction apparatus known as the two-slit experiment. Figure 1 shows the basic idea: A laser emits photons that can travel through either of two slits in a grating and then may (or may not) reach a detector. The source has a single state (the state set $A$ has one element), while the grating has two states and there are two detectors ($B$ and $C$ each have two elements). The transitions for each photon, as it passes from $A$ to $B$ to $C$, are described by two subunitary matrices

$$U : \mathbb{C}^A \to \mathbb{C}^B \qquad V : \mathbb{C}^B \to \mathbb{C}^C.$$

We can choose the matrices to be

$$U = \begin{pmatrix} \frac{i}{2} \\ \frac{i}{2} \end{pmatrix} \qquad V = \begin{pmatrix} \frac{i}{2} & \frac{i}{2} \\ \frac{i}{2} & -\frac{i}{2} \end{pmatrix},$$

so that

$$VU = \begin{pmatrix} -\frac{1}{2} \\ 0 \end{pmatrix}.$$

The total amplitude of the photon reaching the top detector is −1/2 and the probability is 1/4; this case is called constructive interference. The total amplitude reaching the bottom detector is 0, so the photon never reaches it; this case is called destructive interference. On the other hand, if one of the slits is blocked, then we can discard one of the states in $B$, with the result that each detector is reached with probability 1/16. The classical superposition principle would dictate a probability of 1/8 for each detector with both slits open; thus it is violated.
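The arithmetic of the two-slit example can be reproduced in a few lines. This is a minimal sketch using plain Python lists for the matrices $U$ and $V$ above; the helper names `matvec` and `probabilities` are illustrative assumptions, and blocking a slit is modelled by zeroing the corresponding entry of $U$.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(entry * v for entry, v in zip(row, vector)) for row in matrix]

def probabilities(vector):
    """Square norms of the amplitudes."""
    return [abs(z) ** 2 for z in vector]

h = 0.5j                      # the amplitude i/2
U = [[h], [h]]                # source -> two slits
V = [[h, h], [h, -h]]         # slits -> (top, bottom) detectors
source = [1.0]

# Both slits open: amplitudes add before squaring.
both_open = probabilities(matvec(V, matvec(U, source)))

# One slit blocked: discard one of the two states in B.
slit1_only = probabilities(matvec(V, matvec([[h], [0]], source)))
slit2_only = probabilities(matvec(V, matvec([[0], [h]], source)))

# Classical superposition would add the single-slit probabilities.
classical = [p + q for p, q in zip(slit1_only, slit2_only)]

print(both_open, slit1_only, classical)
```

The quantum result for both slits open is (1/4, 0), while adding the single-slit probabilities gives (1/8, 1/8).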

Figure 2: An angle-dependent detector in the two-slit experiment. (Path segments are labeled with the amplitudes i/2 and ±i/2.)

A natural reaction to the violation of classical su-

perposition is to try to determine which slit the pho-

ton went through. For instance, the detector could

be sensitive to the angle that the photon comes in,

as in Figure 2. Or there could be a detector at one of

the slits that notices that the photon passed through

it. But in any such circumstance, the two paths then result in different final states (of the experiment as a whole) rather than in the same state. Thus the final state vector is

$$|\psi\rangle = \begin{pmatrix} -\frac{1}{4} \\ \pm\frac{1}{4} \end{pmatrix}$$

and its total probability is

$$\langle\psi|\psi\rangle = \|\psi\|^2 = \frac{1}{8},$$

regardless of the phases of path segments to and

from the slits. The lesson is that amplitudes of dif-

ferent trajectories of an object only add when there

is no evidence of which trajectory it took. If the

trajectory is recorded at all, the probabilities add.

If we want to see quantum superposition, it is not

enough to wittingly or unwittingly ignore such evidence.

Rather, if the two trajectories induce different

states of the universe, so that some observer

could in principle distinguish them, then they obey

classical superposition. Moreover, the effect is not

the result of interaction between photons; photons

do not interact with each other.¹ Indeed, the laser could be tuned to fire only one photon at a time.

¹ More precisely, significant photon-photon interactions require the extreme energies of cosmic rays and particle accelerators.

Of course, a two-slit experiment is only an idealization of a real experiment; however, it is very similar to many actual experiments and even routine demonstrations. Note also that diffraction experiments can portray any operator and therefore any process in quantum mechanics, just as any classical stochastic map can be modelled by balls falling through chutes. The two-slit experiment can be demonstrated with photons, electrons, or even molecules, but it really describes general probabilistic rules. See Sections 1.5 and ?? for more discussion.

Examples 1.1.1. A qubit is a two-state quantum object with configuration set {0,1}. Two of its quantum superpositions are:

$$|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}} \qquad |-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}$$

Both of these states have probability 1/2 of being in either configuration $|0\rangle$ or $|1\rangle$, but they are different states. This is demonstrated by the effect of a unitary operator $H$ called the Hadamard gate:

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

It exchanges $|0\rangle$ with $|+\rangle$ and $|1\rangle$ with $|-\rangle$.

The spin state of a spin-1/2 particle is a two-state system which is important in physics. (Electrons, protons, and neutrons are all spin-1/2 particles.) The conventional orthonormal basis is $|\uparrow\rangle$ (spin up) and $|\downarrow\rangle$ (spin down). The names of the states refer to the property of the electron spinning (according to the right-hand rule) about a vertical axis in these two states. Even though a rotated electron is still an electron, neither this configuration set nor any other is preserved by rotations. The resolution of this paradox is that rotated states appear as superpositions. For example, the spin left and spin right states are analogous to $|+\rangle$ and $|-\rangle$:

$$|\rightarrow\rangle = \frac{|\uparrow\rangle + |\downarrow\rangle}{\sqrt{2}} \qquad |\leftarrow\rangle = \frac{|\uparrow\rangle - |\downarrow\rangle}{\sqrt{2}}.$$

Although we will soon switch to a more general model of quantum probability, we can say a few words about presenting this vector space model in a basis-independent form. As we said, a quantum object can be assigned any Hilbert space $\mathcal{H}$ rather than the standard finite-dimensional vector space $\mathbb{C}^A$ for a configuration set $A$. Since states evolve by unitary

operators, we can conjugate the standard basis of

CA by any unitary operator, to conclude that any

orthonormal basis of any Hilbert space H can be

called a configuration set. Likewise, if we accept a

real-valued function f on A as a random variable,

then we can model it by a diagonal matrix $D$ whose

entries are the values of $f$. We can then conjugate

that too by a unitary operator $U$, to conclude that

any Hermitian operator $H = UDU^{-1}$ represents a

real-valued random variable. If the eigenvalues of

H are 0 and 1, so that H = P is a Hermitian pro-

jection, then this corresponds to a Boolean random

variable. These are the basic rules of vector-state

quantum probability, with the exception of the cru-

cial tensor product rule for joint states (Section 1.5).
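Two of the statements above can be checked numerically: that the Hadamard gate separates the states |+⟩ and |−⟩ even though they have the same configuration probabilities, and that conjugating a diagonal matrix by a unitary operator gives a Hermitian operator. This is a minimal sketch, assuming the Hadamard gate carries the normalization factor 1/√2 that makes it unitary; the helper names are illustrative.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(u, v, tol=1e-12):
    """Entrywise comparison of two vectors up to rounding error."""
    return all(abs(x - y) <= tol for x, y in zip(u, v))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                     # Hadamard gate
ket0, ket1 = [1.0, 0.0], [0.0, 1.0]
plus, minus = [s, s], [s, -s]

# Same configuration probabilities, different states.
probs_plus = [abs(z) ** 2 for z in plus]
probs_minus = [abs(z) ** 2 for z in minus]
H_plus, H_minus = matvec(H, plus), matvec(H, minus)

# Conjugate the diagonal random variable f(0) = 1, f(1) = -1 by H.
# H is its own inverse, so H D H^{-1} = H D H.
D = [[1.0, 0.0], [0.0, -1.0]]
X = matmul(matmul(H, D), H)
print(H_plus, H_minus, X)
```

Up to rounding, the conjugated operator is the matrix with rows (0, 1) and (1, 0): a Hermitian operator with the same eigenvalues ±1 as $D$, but diagonal in the basis |+⟩, |−⟩ rather than |0⟩, |1⟩.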

Exercises

1. Suppose that the lengths of the entries of a

complex matrix U are all fixed, but the phases

are all chosen uniformly randomly. (If you like,

you can also suppose that for any choice of the

amplitudes, U is subunitary.) Show that on

average, each entry of U|ψ〉 satisfies the clas-

sical superposition principle.

2. If U is a matrix, then the matrix

$$M_{ab} = |U_{ab}|^2$$

can be called the dephasing of $U$. A dephasing of

a unitary matrix is always doubly stochastic,

meaning that the entries are non-negative and

the rows and columns sum to 1. Find a 3×

3 doubly stochastic matrix which is not the

dephasing of any unitary matrix.

3. Show that every n×k subunitary matrix U

can be extended to an (n+k)×(n+k) unitary

matrix V:

$$V = \begin{pmatrix} U & * \\ * & * \end{pmatrix}.$$

Show that V cannot usually have order less

than n+k.

4. If $U_1, U_2, \ldots, U_n$ are unitary operators, then each entry of their product

$$U = U_n \cdots U_2 U_1$$

can be expressed as a sum of products of entries of the factors:

$$\langle a_n|U_n \cdots U_2 U_1|a_0\rangle = \sum_{a_1,\ldots,a_{n-1}} \langle a_n|U_n|a_{n-1}\rangle \cdots \langle a_2|U_2|a_1\rangle \langle a_1|U_1|a_0\rangle.$$

Such an expansion is interpreted as path summation; it is the same idea as a sum over histories in classical probability.

For example, let $n = 4$ and let each

$$U_k = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}.$$

Find the amplitudes of the 16 paths and group

them according to how they sum.

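5. In general for a spin-1/2 particle, the state

$$|\vec{v}\rangle = \alpha|\uparrow\rangle + \beta|\downarrow\rangle$$

spins in the direction

$$\vec{v} = (2\,\mathrm{Re}\,\bar\alpha\beta,\ 2\,\mathrm{Im}\,\bar\alpha\beta,\ |\alpha|^2 - |\beta|^2).$$

Check that this is a unit vector when $|\vec{v}\rangle$ is normalized, and that every unit vector in $\mathbb{R}^3$ is achieved. This formula is therefore a surjective function from the unit 3-sphere $S^3 \subset \mathbb{C}^2$ to the 2-sphere $S^2 \subset \mathbb{R}^3$. What is its usual name in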

mathematics?

1.2. A classical review

Since our intention is to generalize classical prob-

ability, we will review some of the notions of this

theory in the finite case.

A classical probabilistic system is most commonly modelled by a Boolean algebra $\Omega$ of random variables. (In the infinite case, it should be a σ-algebra; see Section ??.) That the algebra is Boolean means that it is an algebra over $\mathbb{Z}/2$, and that every element is an idempotent, $x^2 = x$. The elements of $\Omega$ are called events. They correspond to random variables that take the values 1 and 0, or equivalently true and false, or yes and no. (Multiplication is Boolean AND, while adding 1 is Boolean complementation.) A state or distribution of $\Omega$ is then a function $\rho$ from $\Omega$ to $[0,\infty]$ such that:

• $\rho(x) \ge 0$, and

• $\rho\bigl(\sum_j x_j\bigr) = \sum_j \rho(x_j)$ when $x_j x_k = 0$ for all $j \ne k$.

The value

P[x] =ρ(x)

represents the probability of the event x (the prob-

ability that x is true). So the axioms say that prob-

abilities are positive, and probabilities of disjoint

events add. The state ρ is normalized if ρ(1) = 1,

which means that the total probability is 1.

If the algebra $\Omega$ is finite, then it is isomorphic to $(\mathbb{Z}/2)^A$, the $(\mathbb{Z}/2)$-valued functions on some finite set $A$ or the algebra of subsets of $A$. The set $A$ is the set of configurations of $\Omega$.

The complex-valued random variables over $\Omega$ or $A$ form an algebra denoted $L^\infty(\Omega)$ or $\mathbb{C}^A = \ell^\infty(A)$. This algebra is generated (as an algebra over $\mathbb{C}$) by the elements of $\Omega$, with addition in $\mathbb{Z}/2$ forgotten and multiplication retained. In other words, if $xy = z$ in $\Omega$, then this is also imposed as a relation in $L^\infty(\Omega)$. (Technically, $L^\infty(\Omega)$ is only the bounded random variables and is a Banach-space completion of the algebra so generated, but these concerns are only important in the infinite case; see Section ??.) The state $\rho$ extends to a linear functional on $L^\infty(\Omega)$, so that

$$E[x] = \rho(x)$$

now represents the expected value of $x$ as a random variable. Also, $L^\infty(\Omega)$ has an involution $x \mapsto x^*$ that conjugates the coefficients and values of $x$ and that will be crucial later. The element $x^*$ is called the adjoint of $x$ and it is also written $x^\dagger$ in the physics literature.

An equivalent formulation is to write axioms for an algebra $\mathcal{M}$ which can be recognized as $L^\infty(\Omega)$ for some Boolean algebra $\Omega$. In this approach, $\mathcal{M}$ is a commutative, positive-definite $*$-algebra. By definition:

• $\mathcal{M}$ is an associative algebra over the complex numbers $\mathbb{C}$.

• $\mathcal{M}$ has an anti-linear anti-automorphism $*$:

$$(\alpha x)^* = \bar\alpha x^* \qquad (x+y)^* = x^* + y^* \qquad (xy)^* = y^* x^*$$

• $\mathcal{M}$ is positive-definite, meaning that if $x^* x = 0$, then $x = 0$.

• $\mathcal{M}$ is commutative; $xy = yx$ for all $x$ and $y$.

If $\mathcal{M}$ is finite-dimensional, then these axioms imply that $\mathcal{M}$ is isomorphic to $\mathbb{C}^A$ for a finite set $A$, so that it is indeed equivalent to the other axiom set. (If $\mathcal{M}$ is infinite-dimensional, then these axioms should be strengthened, as we will discuss in Section ??.)

Here are some other important definitions related to $\mathcal{M}$.

• An element $x \in \mathcal{M}$ is self-adjoint if $x = x^*$; it is positive, or $x \ge 0$, if $x = y^* y$ for some $y$; and it is Boolean if it is self-adjoint and if $x = x^2$.

• A state is a dual vector $\rho \in \mathcal{M}^\#$ which is positive on positive elements: $\rho(x) \ge 0$ if $x \ge 0$. The state $\rho$ is normalized if $\rho(1) = 1$. (We write $\mathcal{M}^\#$ instead of $\mathcal{M}^*$ for the dual space because $*$ is already used internally to $\mathcal{M}$.)

If you know or suppose that $\mathcal{M} \cong \mathbb{C}^A$, then it is not hard to show that the self-adjoint elements are the real-valued random variables $\mathbb{R}^A$, the positive elements are the non-negative random variables $\mathbb{R}^A_{\ge 0}$, and the Boolean variables are the 0-1-valued variables $\{0,1\}^A = (\mathbb{Z}/2)^A = \Omega$.

It is also not hard to show that the two definitions of a state are equivalent. Indeed, the set of normalized states of $\mathcal{M}$ or $\Omega$ is the simplex $\Delta_A \subset \mathbb{R}^A$ that consists of convex sums of elements of $A$. This simplex is shown for a two-state system (a randomized bit) and a three-state system (a randomized trit) in Figure ??, together with an example element in each case. To support this picture, we define $[a]$ to be the state which is definitely $a$. For example, if a bit is 1 with probability $p$, then its state is

$$\rho = (1-p)[0] + p[1].$$

The notion of assigning a state or probability distribution to a probabilistic system has two different empirical interpretations, and the distinction between them will be important.² In the frequentist interpretation, the state of an object is always a configuration $a \in A$, although you may not know which one; and a distribution $\rho$ is a summary of which configuration you witness in repeated trials. In the Bayesian interpretation, the state of an object is a probabilistic state $\rho \in \Delta_A$, which however is observer-dependent; it represents the observer’s rational belief about which configuration $a \in A$ will be witnessed, whether or not repeated trials are possible.

Frequentism and Bayesianism are mathematically equivalent. They are only different philosophically, or they may lead to different practical advice. However, quantum probability requires a degree of Bayesianism. Although frequentism will remain valid in some contexts, strict frequentism is untenable as the fundamental interpretation of quantum probability. So it is good practice to think of a randomized bit, for example, as living in an intermediate state between 0 and 1, i.e., a classical superposition.

² Actually there are several variations of both interpretations in this endless debate in statistics. We have chosen a fairly aggressive flavor of frequentism and a fairly conservative flavor of Bayesianism. This is not entirely fair, but it serves our pedagogical goals.

Finally, one fundamental operation on states, especially in the Bayesian interpretation, is the notion of a conditional state. If $p$ is a Boolean random

variable in $\mathcal{M}$ and $\rho$ is a state, then as we said, the probability of $p$ is $P[p] = \rho(p)$. If $p$ is witnessed by an observer who knows or believes the prior state $\rho$, then afterwards $\mathcal{M}$ has an updated state $\rho_p$ given by the formula

$$\rho_p(x) = \frac{\rho(px)}{\rho(p)}.$$

This is the state $\rho$ conditioned on $p$, i.e., what $\rho$ becomes given that $p$ was witnessed. The formula is not meaningful if $\rho(p) = 0$, which is to say, if $p$ is impossible. We also define the unnormalized conditional state

$$\rho|_p(x) = \rho(px), \qquad (1)$$

which is well-defined regardless of the probability of

p. This state is the empirical posterior state if we

view the measurement of p as an extinction process,

by declaring extinction if p is false.
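The two conditional states can be made concrete for a finite configuration set. This is a minimal sketch, assuming a state is stored as a dictionary from configurations to probabilities and a random variable is a Python function on configurations; the names `expect`, `conditional`, and `unnormalized_conditional` are hypothetical.

```python
from fractions import Fraction

def expect(rho, x):
    """rho(x): the expected value of the random variable x in the state rho."""
    return sum(weight * x(a) for a, weight in rho.items())

def unnormalized_conditional(rho, p):
    """The state x -> rho(p x); its total weight is the probability of p."""
    return {a: weight * p(a) for a, weight in rho.items()}

def conditional(rho, p):
    """The state x -> rho(p x) / rho(p); undefined if p is impossible."""
    prob = expect(rho, p)
    if prob == 0:
        raise ValueError("cannot condition on an impossible event")
    return {a: weight / prob
            for a, weight in unnormalized_conditional(rho, p).items()}

# A fair die, the Boolean variable "the face is even", and the face value.
die = {face: Fraction(1, 6) for face in range(1, 7)}
is_even = lambda a: 1 if a % 2 == 0 else 0
value = lambda a: a
one = lambda a: 1

posterior_mean = expect(conditional(die, is_even), value)
surviving_weight = expect(unnormalized_conditional(die, is_even), one)
print(posterior_mean, surviving_weight)
```

Exact fractions are used so that the normalized state sums to exactly 1; the unnormalized state keeps the total weight 1/2, which is the probability that the extinction process survives.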

Exercises

1.3. Algebras and states

In this section we will define quantum probability

as non-commutative probability. This is the other

end from Section 1.1; it makes quantum probability

seem as similar as possible to classical probability.

We will conclude by showing that the two descriptions

are equivalent.

We chose the previous section’s axioms for an algebra $\mathcal{M}$ of complex random variables so that they could be made quantum simply by dropping commutativity. To review, $\mathcal{M}$ is a positive-definite $*$-algebra if

• $\mathcal{M}$ is an associative algebra over the complex numbers $\mathbb{C}$.

• $\mathcal{M}$ has an anti-linear anti-automorphism $*$:

$$(\alpha x)^* = \bar\alpha x^* \qquad (x+y)^* = x^* + y^* \qquad (xy)^* = y^* x^*$$

• $\mathcal{M}$ is positive-definite, meaning that if $x^* x = 0$, then $x = 0$.

If M is finite-dimensional, then these axioms are

adequate; they are the main definition of a finite

quantum system.

We can also repeat these related definitions with-

out changes:

• An element $x \in \mathcal{M}$ is self-adjoint if $x = x^*$. Such elements form a real vector space $\mathcal{M}_{sa}$.

• An element $x \in \mathcal{M}$ is positive, or $x \ge 0$, if $x = y^* y$ for some $y$. If $\mathcal{M}$ is finite-dimensional, the positive elements form a cone $\mathcal{M}_+$.

• An element $p \in \mathcal{M}$ is Boolean if it is self-adjoint and if $p = p^2$. Such a $p$ is also called a self-adjoint projection. The Boolean elements form a set $\mathcal{M}_{bool}$.

• A dual vector $\rho \in \mathcal{M}^\#$ has an adjoint defined by $\rho^*(x) = \overline{\rho(x^*)}$, and it is self-adjoint if $\rho^* = \rho$. The set of self-adjoint dual vectors is the real vector space $\mathcal{M}^{sa}$.

• A state is a dual vector $\rho \in \mathcal{M}^\#$ which is positive on positive elements: $\rho(x) \ge 0$ if $x \ge 0$. The set of states is a dual cone $\mathcal{M}^+$.

• The state $\rho$ is normalized if $\rho(1) = 1$. The set of normalized states is the state region $\mathcal{M}^\Delta$.

These definitions yield the following elementary

inclusions:

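$$\mathcal{M}_{bool} \subset \mathcal{M}_+ \subset \mathcal{M}_{sa} \subset \mathcal{M}$$

$$\mathcal{M}^\Delta \subset \mathcal{M}^+ \subset \mathcal{M}^{sa} \subset \mathcal{M}^\#.$$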

Classically (i.e., if $\mathcal{M}$ is commutative), $\mathcal{M}_{sa}$ and $\mathcal{M}_{bool}$ are both closed under multiplication, so that $\mathcal{M}_{sa}$ is a real algebra and $\mathcal{M}_{bool}$ is a Boolean algebra. However, quantumly neither one is closed under multiplication, so that at first glance, $\mathcal{M}_{sa}$ is only a real vector space and $\mathcal{M}_{bool}$ is only a set. Actually, $\mathcal{M}_{bool}$ has somewhat more structure than that; see Exercise ??. The most important extra structure at the moment is that $\mathcal{M}$ is partially ordered with respect to positivity: $x \ge y$ if $x - y \ge 0$. The inherited partial orderings of $\mathcal{M}_{sa}$ and $\mathcal{M}_{bool}$ are both important.

If $\mathcal{M}$ is finite-dimensional, then we can classify its structure using the Artin-Wedderburn theorem, because its positive-definite structure implies that it is semisimple (Exercise ??). Since the complex numbers are algebraically closed, the theorem says that $\mathcal{M}$ is isomorphic to a direct sum of matrix algebras:

$$\mathcal{M} \cong \bigoplus_k M_{n_k}.$$

In particular, if $\mathcal{M}$ is a matrix algebra $M_n$, then it is as non-commutative as possible. We will call such an $\mathcal{M}$ and the system that it models fully quantum. In basis-independent form, a fully quantum system $\mathcal{M}$ is the algebra $B(\mathcal{H})$ of operators on a finite-dimensional Hilbert space $\mathcal{H}$. (The “B” is for “bounded”, although in this context all operators are bounded; see Section ??.)

If $\mathcal{M}$ is fully quantum, then we can use the matrix trace to convert a state $\rho$ from a dual vector on $\mathcal{M}$ to an element:

$$\rho(x) = \mathrm{Tr}(\rho x).$$

(Actually this works in general, using the sum of the traces of the matrix summands.) Then $\rho$ is positive as a dual vector if and only if it is positive as an element of $\mathcal{M}$, if and only if it is a positive semidefinite Hermitian matrix (Exercise ??). Also $\rho$ is normalized if and only if $\mathrm{Tr}(\rho) = 1$. Because such a $\rho$ is a matrix and because its diagonal entries are probabilities, physicists also call it a density matrix or (in basis-independent form) a density operator. In this terminology, “density” means probability density, as in a probability distribution. The diagonal entries of a density matrix are in fact probabilities of configurations (Section ??).

Example 1.3.1. The 2×2 matrix algebra $M_2$, or a system that it models, is a second and better definition of a qubit. The Pauli spin matrices are a convenient basis for $(M_2)_{sa}$:

$$\sigma_0 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \sigma_1 = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

$$\sigma_2 = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_3 = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

A state $\rho$ of $M_2$ is positive and normalized if and only if it is of the form

$$\rho = \frac{I + aX + bY + cZ}{2},$$

where $a$, $b$, and $c$ are three real numbers that satisfy

$$a^2 + b^2 + c^2 \le 1.$$

The state region $M_2^\Delta$ is therefore a solid geometric sphere (a ball) in the affine space of unit-trace, 2×2 Hermitian matrices. It is known as the Bloch sphere.
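The condition $a^2 + b^2 + c^2 \le 1$ can be checked against the eigenvalues of $\rho$, which for this form work out to $(1 \pm r)/2$ with $r = \sqrt{a^2 + b^2 + c^2}$. The following minimal sketch computes them from the trace and determinant of the 2×2 matrix; the function names are illustrative assumptions.

```python
import math

def bloch_state(a, b, c):
    """Return (I + aX + bY + cZ)/2 as a 2x2 matrix (list of rows)."""
    return [[(1 + c) / 2, (a - 1j * b) / 2],
            [(a + 1j * b) / 2, (1 - c) / 2]]

def eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    trace = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(trace * trace - 4 * det, 0.0))
    return ((trace - root) / 2, (trace + root) / 2)

def is_normalized_state(rho, tol=1e-12):
    """Positive (no negative eigenvalue) and of unit trace."""
    low, high = eigenvalues(rho)
    return low >= -tol and abs(low + high - 1) <= tol

inside = bloch_state(0.3, 0.4, 0.5)     # a^2 + b^2 + c^2 = 0.5
outside = bloch_state(0.8, 0.8, 0.8)    # a^2 + b^2 + c^2 = 1.92
pole = bloch_state(0, 0, 1)             # on the boundary sphere

print(eigenvalues(inside), eigenvalues(outside), eigenvalues(pole))
```

A point outside the unit ball gives a negative eigenvalue, so it is not a state; a point on the boundary gives eigenvalues 0 and 1, so the matrix has rank 1.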

As in the example of a qubit, an important difference between quantum probability and classical probability is that the state region $\mathcal{M}^\Delta$ is not a simplex (except in the commutative case). But it is always convex, because it is defined by linear equalities and inequalities. This convex structure allows classical superpositions in a quantum setting. Empirically, if we have two states $\rho_1$ and $\rho_2$ of a quantum system, and if we prepare a new state $\rho$ by choosing $\rho_1$ with probability $p$ and $\rho_2$ with probability $1-p$, then

$$\rho = p\rho_1 + (1-p)\rho_2.$$

Figure 3: A Josephson junction qubit: superconducting

aluminum on a silicon chip [3].

(This formula is consistent with the fact that all

probabilities are linear in ρ.)

But what about quantum superpositions? If $\mathcal{M}$ is

fully quantum, then they are also present in a different

guise from classical superpositions. To help separate

the terminology, classical superpositions are in gen-

eral called mixtures, while quantum superpositions

(when they are defined) are often just called super-

positions.

In general, if K is a convex set in a real vector space, then a point in K is called extremal if it is not a convex combination of two other points in K. An elementary theorem in convex geometry states that if K is compact and finite-dimensional, then every point in K is a convex linear combination of its extremal points. If K = M^Δ, then the

extremal points are those states that are not mix-

tures. These states are called pure and other states

are called mixed. By the theorem, every mixed state

is a mixture of pure states. However, in a fully quan-

tum system, the representation of a state as a mix-

ture is never unique (Exercise ??).
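A minimal numerical sketch of this non-uniqueness for a qubit, assuming NumPy: the uniform state I/2 is an equal mixture of the standard basis states, and also an equal mixture of the states (|0〉 ± |1〉)/√2. The helper `projector` and the variable names are illustrative.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_minus = (ket0 - ket1) / np.sqrt(2)

def projector(psi):
    """The rank-1 density matrix |psi><psi| of a normalized vector psi."""
    return np.outer(psi, psi.conj())

# Two different recipes for preparing a mixed state...
mix_z = 0.5 * projector(ket0) + 0.5 * projector(ket1)
mix_x = 0.5 * projector(ket_plus) + 0.5 * projector(ket_minus)

# ...which turn out to be the same state, so no measurement can tell them apart.
same_state = np.allclose(mix_z, mix_x)

# Each ingredient has rank 1; the mixture has rank 2.
rank_pure = np.linalg.matrix_rank(projector(ket_plus), tol=1e-9)
rank_mixed = np.linalg.matrix_rank(mix_x, tol=1e-9)
```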

If M = M_n is fully quantum, then a state ρ is pure if and only if it has rank 1 as a matrix (Exercise ??). It then has the form

\[ \rho = \psi \otimes \psi^* = |\psi\rangle\langle\psi| \]

for some vector ψ ∈ C^n, since it is also Hermitian. If ρ is normalized, then in addition ψ is normalized, by the relation

\[ \mathrm{Tr}\,\rho = \langle\psi|\psi\rangle. \]

In basis-independent form, if M = B(H), and if a state ρ on M is pure, then it is described by a vector |ψ〉 ∈ H. A configuration set of M is, by definition, any orthonormal basis of H. Any state |ψ〉 is a complex linear combination of the configurations, and such a linear combination can be called a quantum superposition.


To summarize, a pure state of a fully quantum M is represented by a vector in a Hilbert space, and it is a quantum superposition of any orthonormal basis of configurations. However, the transformation from a vector state |ψ〉 to the corresponding density operator ρ = |ψ〉〈ψ| is non-linear and erases the global phase of |ψ〉. Therefore empirical probabilities are a non-linear function of the vector state, and the global phase of a vector state is not directly empirical. (But relative phases are empirical, so the global phase of |ψ〉 is indirectly empirical, if |ψ〉 is used as a summand of another vector.)
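These three claims (the global phase is erased, relative phases are not, and the map is non-linear) can be seen in a few lines. This is a sketch assuming NumPy; `density` and the variable names are illustrative.

```python
import numpy as np

def density(psi):
    """Map a vector state |psi> to the operator |psi><psi|."""
    psi = np.asarray(psi, dtype=complex)
    return np.outer(psi, psi.conj())

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
psi = (ket0 + 1j * ket1) / np.sqrt(2)

# A global phase does not change the density operator.
global_phase = np.exp(0.7j)
same_state = np.allclose(density(global_phase * psi), density(psi))

# A relative phase between the summands does change it.
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)
relative_phase_matters = not np.allclose(density(plus), density(minus))

# The map psi -> |psi><psi| is not additive, hence not linear.
additive = np.allclose(density(ket0 + ket1), density(ket0) + density(ket1))
```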

Example 1.3.2. Since the state region M_2^Δ of a qubit is a geometric sphere, every boundary point is a pure state |ψ〉〈ψ|. It is not hard to check that any two opposite points form a (line) basis of the Hilbert space C^2. In the context of quantum computation, the standard basis of C^2 is called |0〉 and |1〉, and these states are assigned to the top and bottom of the Bloch sphere, as in Figure ??. Another basis is

\[ |+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \qquad |-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}. \]

These states lie at two opposite points on the equator of the Bloch sphere.

In the middle is the uniform state (also called the

maximally mixed or maximum entropy state) ρ =

I/2.
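For orientation, the positions of these named states in the parametrization ρ = (I + aX + bY + cZ)/2 of Example 1.3.1 can be worked out directly from the Pauli matrices. The coordinates (a, b, c) below follow from that parametrization, with |0〉 placed at the top:

```latex
\begin{align*}
|0\rangle\langle 0| &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \frac{I + Z}{2},
  & (a,b,c) &= (0,0,1), \\
|1\rangle\langle 1| &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \frac{I - Z}{2},
  & (a,b,c) &= (0,0,-1), \\
|{\pm}\rangle\langle{\pm}| &= \frac{1}{2}\begin{pmatrix} 1 & \pm 1 \\ \pm 1 & 1 \end{pmatrix} = \frac{I \pm X}{2},
  & (a,b,c) &= (\pm 1,0,0), \\
\frac{I}{2} &= \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
  & (a,b,c) &= (0,0,0).
\end{align*}
```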

Although probabilities are nonlinear functions of

vector states |ψ〉, there are several important operations on a quantum system M which are linear on

vector states. We can call such operations coher-

ent; they serve to justify the quantum superposition

principle in Section 1.1.

The most important coherent operation is an algebra isomorphism. Automorphisms and isomorphisms in general are the model of reversible dynamical systems and reversible physical transformations. The wrinkle is that algebra isomorphisms, and more generally algebra homomorphisms, are contravariant, meaning that they transfer states backward. If Alice and Bob have algebras M_A and M_B, then a homomorphism

\[ E : M_B \to M_A \]

transfers states from Alice to Bob by means of its transpose:

\[ E^\# : M_A^\# \to M_B^\#. \]

If Alice and Bob are both fully quantum and have Hilbert spaces H_A and H_B of the same dimension, then every algebra isomorphism

\[ E : B(H_B) \to B(H_A) \]

is given by a unitary operator u : H_A → H_B by the formula

\[ E(x) = u^{-1} x u. \]

In more concrete terms, the automorphism group of M_n as a ∗-algebra is the unitary group U(n), up to a global phase that acts trivially (Exercise ??). Note that u is conventionally covariant, so that it transfers pure states forward, from Alice to Bob.
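The contravariant and covariant pictures can be compared numerically. The sketch below assumes NumPy and takes the isomorphism in the ordering E(x) = u⁻¹xu, which is the ordering that composes correctly when u maps H_A to H_B and x acts on H_B; that ordering, the dimension, the random seed, and all names are assumptions of the example. It checks that E is a ∗-homomorphism and that evaluating Alice's state on E(x) agrees with evaluating the forwarded state uρu⁻¹ on x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# A random unitary u : H_A -> H_B, taken from the QR decomposition
# of a complex Gaussian matrix.
g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
u, _ = np.linalg.qr(g)
u_inv = u.conj().T  # for a unitary, the inverse is the adjoint

def E(x):
    """Pull an operator x on H_B back to an operator on H_A."""
    return u_inv @ x @ u

def random_operator():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

x, y = random_operator(), random_operator()

# Alice's pure state and its density matrix.
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)
rho_A = np.outer(psi, psi.conj())

# Bob's state, obtained covariantly: |psi> -> u|psi>, i.e. rho -> u rho u^{-1}.
rho_B = u @ rho_A @ u_inv

lhs = np.trace(rho_A @ E(x))  # the transposed map applied to rho_A, evaluated on x
rhs = np.trace(rho_B @ x)     # the forwarded state evaluated on x
```

The two numbers `lhs` and `rhs` agree by cyclicity of the trace.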

Note also that this motivation for unitary operators does not support the analogy with Markov maps in Section 1.1. Rather, unitary operators are analogous to permutations of configurations of a classical

system, since these are the reversible maps among

Markov maps. This is another way to say that we

have resolved the paradox of that section: Classical

and quantum superposition do not contradict each

other because in some ways, they are not analogous.

Section ?? discusses the correct notion of quantum

Markov maps, namely quantum operations. Classi-

cal and quantum superposition coexist for quantum

operations, just as they do for states in the operator

formalism.

Another coherent operation is conditioning a state with a Boolean random variable. If ρ is a state of M and p ∈ M^bool is Boolean, then the unnormalized conditional state is defined by

\[ \rho|_p(x) = \rho(pxp). \]

This reduces to equation (??) when M is commutative, but it cannot be exactly the same formula as before because ρ(px) is not positive as a dual vector in x, nor even self-adjoint. It is easy to check that if M is fully quantum, then the pure state |ψ〉 conditions to the pure state p|ψ〉. So conditioning a state is linear on pure states.
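A small qubit computation makes both points concrete. This is a sketch assuming NumPy, with the state |+〉 = (|0〉 + |1〉)/√2, the Boolean p = |0〉〈0|, and the self-adjoint element Y chosen purely for illustration. The one-sided expression ρ(pY) comes out imaginary, so x ↦ ρ(px) is not self-adjoint, while the two-sided expression ρ(pYp) is real.

```python
import numpy as np

Y = np.array([[0, -1j], [1j, 0]])
ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

rho = np.outer(ket_plus, ket_plus.conj())  # the pure state |+><+|
p = np.outer(ket0, ket0.conj())            # the Boolean "the qubit is |0>"

def state_value(rho, x):
    """Evaluate the state on x, i.e. rho(x) = Tr(rho x)."""
    return np.trace(rho @ x)

one_sided = state_value(rho, p @ Y)      # rho(px) with x = Y self-adjoint
two_sided = state_value(rho, p @ Y @ p)  # rho(pxp)

# As a matrix, the unnormalized conditional state is p rho p,
# because Tr(rho p x p) = Tr(p rho p x).
conditional = p @ rho @ p

# For a pure state it is again rank 1, with vector p|psi>.
projected = p @ ket_plus
```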

With a bit of modification, unitary operators and

projections generate all subunitary maps between

any two Hilbert spaces (Exercise ??). These are

all of the linear operators, or coherent operations,

used in Section ??. Their construction establishes

the quantum superposition principle as a corollary

of non-commutativity. In Section ??, quantum su-

perpositions appear in another way, as a corollary of

the classical superposition principle.

Exercises

1.4. Measurements

In this section we will look more closely at measuring quantum random variables. A random variable x ∈ M^sa (or more generally a classical realm 𝒜 ⊆ M as defined below) is also called a measurable or an observable. To measure it is to pass


to the conditional state, just as is done in classical

probability.

We said that if p is a Boolean random variable in an algebra M, then the unnormalized conditional state is

\[ \rho|_p(x) = \rho(pxp). \]

The normalized conditional state is thus

\[ \rho_p(x) = \frac{\rho(pxp)}{\rho(p)}, \]

or in the vector-state case,

\[ |\psi_p\rangle = \frac{p|\psi\rangle}{\sqrt{\langle\psi|p|\psi\rangle}}. \]
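The two normalized formulas can be checked against each other. The sketch below assumes NumPy and represents the conditional state by the density matrix pρp/ρ(p); the function names and the qutrit example are hypothetical.

```python
import numpy as np

def condition(rho, p):
    """Return (probability of p, normalized conditional density matrix)."""
    probability = float(np.real(np.trace(rho @ p)))
    if probability < 1e-12:
        raise ValueError("cannot condition on an event of probability zero")
    return probability, (p @ rho @ p) / probability

def condition_vector(psi, p):
    """Vector-state version: p|psi> / sqrt(<psi|p|psi>)."""
    projected = p @ psi
    return projected / np.sqrt(np.real(np.vdot(psi, projected)))

# Hypothetical example: a qutrit, with p projecting onto the first two basis vectors.
psi = np.array([1, 2j, 2], dtype=complex) / 3
rho = np.outer(psi, psi.conj())
p = np.diag([1, 1, 0]).astype(complex)

prob, rho_p = condition(rho, p)
psi_p = condition_vector(psi, p)
```

In this example `prob` is 5/9, the density matrix of `psi_p` equals `rho_p`, and measuring p again in the conditional state succeeds with probability 1.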

Conditioning on a measurement is also called “state

collapse” or “wave function collapse”, but in the

context of non-commutative probability, this is an

overly dramatic term. The concept of a conditional

state is very natural in classical probability; and it is

equally natural and not all that different in quantum

probability. See Section ??.

The behavior of non-commuting random variables

...

Example 1.4.1.

If any set of Boolean variables in M all commute, or indeed if any set of self-adjoint elements in M all commute, then they generate a commutative, positive-definite ∗-subalgebra 𝒜 ⊆ M. We will call such an 𝒜 a classical realm in M. Elements in 𝒜 are elements in M, and states on M restrict to states on 𝒜. While we work within 𝒜, we are free to use any notion or result from classical probability without modification. In particular, the elements that we chose to generate 𝒜 have a joint distribution, and the order in which they are measured does not matter.

If a classical realm 𝒜 ⊆ M is finite-dimensional, then it is isomorphic to C^A for some set A. Thus a state ρ on M induces a probability for each outcome a ∈ A. Moreover, 𝒜 has a basis {p_a} of minimal projections indexed by the set A. We can then say that 𝒜 is a model of an A-valued random variable with postconditioned states, using the same formulas as for a single Boolean variable:

\[ P[a] = \rho(p_a), \qquad \rho_a(x) = \frac{\rho(p_a x p_a)}{\rho(p_a)}. \]

We can also run the construction backwards to build 𝒜 from its minimal projections. Say that two Booleans p and q are mutually exclusive if pq = 0 (so that they necessarily commute). An A-valued random variable is then in general defined by a set of mutually exclusive Booleans that sum to 1:

\[ \sum_{a \in A} p_a = 1, \qquad a \ne b \implies p_a p_b = 0. \]

If M = B(H) is fully quantum, then this system of projections is equivalent to an orthogonal direct sum decomposition of the Hilbert space H:

\[ H = \bigoplus_{a \in A} H_a. \]
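As a sketch of how such a system of projections acts as an A-valued measurement, the code below (assuming NumPy) checks the two defining conditions and then computes P[a] = ρ(p_a) together with the conditional states. The outcome labels, the qutrit decomposition into subspaces of dimensions 2 and 1, and the function names are all hypothetical.

```python
import numpy as np

def is_measurement(projections, tol=1e-12):
    """Check that the p_a are projections, mutually exclusive, and sum to 1."""
    ps = list(projections.values())
    dim = ps[0].shape[0]
    each_is_projection = all(
        np.allclose(p, p.conj().T, atol=tol) and np.allclose(p @ p, p, atol=tol)
        for p in ps)
    exclusive = all(
        np.allclose(ps[i] @ ps[j], 0, atol=tol)
        for i in range(len(ps)) for j in range(len(ps)) if i != j)
    complete = np.allclose(sum(ps), np.eye(dim), atol=tol)
    return bool(each_is_projection and exclusive and complete)

def measure(rho, projections):
    """Return {a: (P[a], rho_a)}, skipping outcomes of probability zero."""
    results = {}
    for a, p in projections.items():
        prob = float(np.real(np.trace(rho @ p)))
        if prob > 1e-12:
            results[a] = (prob, p @ rho @ p / prob)
    return results

# Hypothetical qutrit example: H splits into a 2-dimensional and a 1-dimensional subspace.
projections = {
    "low": np.diag([1.0, 1.0, 0.0]),
    "high": np.diag([0.0, 0.0, 1.0]),
}
psi = np.array([1, 2j, 2]) / 3
rho = np.outer(psi, psi.conj())
outcomes = measure(rho, projections)
```

Because "low" projects onto a 2-dimensional subspace, its conditional state still depends on ρ; only a rank-1 projection forgets the original state entirely.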

It will also be convenient to generalize a classical realm to allow some of the projections p_a to vanish. This is equivalent to making the realm a homomorphism 𝒜 → M rather than a subalgebra. Section ?? discusses a much more significant generalization known as a POVM.

The most important case of a classical realm 𝒜 is one generated by a single self-adjoint element x ∈ M^sa. In this case the structure of 𝒜 implies that x has a spectral decomposition,

\[ x = \sum_{\lambda \in \sigma(x)} \lambda p_\lambda, \]

where the value set of the measurement, A = σ(x), is also called the spectrum of x. The probability formula,

\[ P[x = \lambda] = \rho(p_\lambda), \]

is then consistent with the expectation interpretation of the state ρ,

\[ E[x] = \rho(x). \]

So M^sa is the space of real-valued random variables, just as it was classically. (Note that if M is fully quantum, then the structure theorem for this 𝒜 is equivalent to the spectral theorem for Hermitian matrices.)
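In the fully quantum case this can be carried out with a Hermitian eigendecomposition. The sketch below assumes NumPy; grouping eigenvalues by rounding is a simplification for the example, and the matrix x and the state ρ are hypothetical. It rebuilds x from its spectral projections and confirms that the expectation computed from the distribution P[x = λ] matches ρ(x).

```python
import numpy as np

def spectral_projections(x, decimals=9):
    """Return {eigenvalue: projection onto its eigenspace} for a Hermitian matrix x."""
    eigenvalues, eigenvectors = np.linalg.eigh(x)
    projections = {}
    # Columns of `eigenvectors` are orthonormal eigenvectors; numerically equal
    # eigenvalues are merged so that each key gets the full eigenspace projection.
    for lam, v in zip(np.round(eigenvalues, decimals), eigenvectors.T):
        key = float(lam)
        projections[key] = projections.get(key, 0) + np.outer(v, v.conj())
    return projections

# Hypothetical observable with a doubly degenerate eigenvalue 1 and a simple eigenvalue -1.
x = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]], dtype=complex)
rho = np.diag([0.5, 0.3, 0.2]).astype(complex)

p = spectral_projections(x)
probabilities = {lam: float(np.real(np.trace(rho @ p_lam))) for lam, p_lam in p.items()}

reconstructed = sum(lam * p_lam for lam, p_lam in p.items())
expectation_from_state = float(np.real(np.trace(rho @ x)))
expectation_from_distribution = sum(lam * prob for lam, prob in probabilities.items())
```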

Another important type of classical realm 𝒜 is a maximal commutative ∗-subalgebra of M. (By definition, 𝒜 is not contained in any larger commutative ∗-subalgebra B.) If M = B(H) is fully quantum, then it is easy to show that 𝒜 is maximal if and only if 𝒜 and H have the same dimension (Exercise ??); indeed 𝒜 consists of the diagonal matrices with respect to some orthonormal basis A of H. Each minimal projection p_a of 𝒜 has rank 1. It follows that the conditional state ρ_a does not depend on ρ; it is always the state

\[ \rho_a = p_a = |a\rangle\langle a|. \]

Although the basis A need only be a line basis ofH,

it is often convenient to make it a vector basis, so


that H = C^A. The set A is a configuration set, in keeping with Section 1.1. If n = |A| = dim H, then

we say that M is an n-state system, even though

technically n is the number of configurations rather

than the number of states. For example, a qubit

can also be described as any fully quantum 2-state

system.

A maximal classical realm 𝒜 is also called a complete measurement. The name evokes the fact that once any configuration a ∈ A is measured, the conditional state is pure and determined by a, so there is no more left to learn from the state of M. (Nonetheless, M has many different complete measurements; and as we said, even a pure state is typically still a source of perpetual randomness.) Note also that if M is fully quantum and we have chosen a basis so that 𝒜 consists of diagonal matrices, then the diagonal entry ρ_aa of a state ρ is just the probability of the outcome a ∈ A. We can view ρ as a classical probability distribution on A, plus extra off-diagonal information.
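The following sketch, assuming NumPy, illustrates a complete measurement of a qubit in a basis other than the standard one. The choice of the basis (|0〉 ± |1〉)/√2, the two example states, and the function names are hypothetical. The outcome probabilities are the diagonal entries of ρ written in that basis, and the conditional state after a given outcome is the same for both input states.

```python
import numpy as np

# Hypothetical complete measurement: the basis (|0> + |1>)/sqrt(2), (|0> - |1>)/sqrt(2),
# stored as the columns of a unitary matrix.
basis = np.array([[1, 1],
                  [1, -1]], dtype=complex) / np.sqrt(2)

def outcome_probabilities(rho, basis):
    """Diagonal entries of rho written in the given orthonormal basis."""
    return np.real(np.diag(basis.conj().T @ rho @ basis))

def conditional_state(rho, basis, a):
    """Normalized state after outcome a: p_a rho p_a / rho(p_a)."""
    ket = basis[:, a]
    p_a = np.outer(ket, ket.conj())
    return p_a @ rho @ p_a / np.real(np.trace(rho @ p_a))

rho_1 = np.array([[0.75, 0.25j], [-0.25j, 0.25]])
rho_2 = np.array([[0.5, 0.3], [0.3, 0.5]], dtype=complex)

probs_1 = outcome_probabilities(rho_1, basis)
probs_2 = outcome_probabilities(rho_2, basis)
after_1 = conditional_state(rho_1, basis, 0)
after_2 = conditional_state(rho_2, basis, 0)
```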

If M is not fully quantum, then some of the above analysis has to be modified. Nonetheless, it is still true that all of the conditional states of a maximal classical realm 𝒜 are pure, that a set A of such outcomes is called a configuration set, and that any two configuration sets have the same cardinality. If

\[ M \cong \bigoplus_k M_{n_k}, \]

then the cardinality of A is the total sum of the matrix sizes, n = Σ_k n_k.

Similar to a complete measurement, if ρ is a pure

state, then there is an associated minimal Boolean

p with the same matrix as ρ which answers whether

the system is in the state ρ. If p and q are two such

minimal Booleans, then Tr(pq) is both the proba-

bility that the state p will be found in the state q,

and vice-versa; it can be called the overlap between

p and q. If M is fully quantum, so that p and q

have state vectors |a〉 and |b〉, then their overlap is

|〈a|b〉|². Two pure states are mutually exclusive if

and only if they have no overlap (Exercise ??).
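In the fully quantum case the overlap formula is a one-line trace computation, sketched here with p = |a〉〈a| and q = |b〉〈b| for normalized vectors |a〉 and |b〉:

```latex
\[
\mathrm{Tr}(pq)
  = \mathrm{Tr}\bigl(|a\rangle\langle a|b\rangle\langle b|\bigr)
  = \langle a|b\rangle\,\langle b|a\rangle
  = |\langle a|b\rangle|^2 .
\]
```

The expression is symmetric in a and b, which is why the same number serves as the probability in both directions. It vanishes exactly when 〈a|b〉 = 0, and in that case pq = |a〉〈a|b〉〈b| = 0 as well.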

Unlike real-valued random variables, there are two notions of complex-valued and vector-valued random variables. We can let z be any element of M, which is the complexification of M^sa, so that

\[ z = x + iy, \qquad x = \frac{z + z^*}{2}, \qquad y = \frac{z - z^*}{2i}. \]

Then z is a complex random variable in a weak sense, because x and y are both self-adjoint and are both therefore real random variables. The wrinkle is that x and y may not commute with each other, in which case z and a state ρ do not yield a distribution on C. If the real and imaginary parts x and y do commute, or equivalently if z and z^* commute, then z is normal. A state ρ and a normal z generate a classical realm and a classical state on C as usual.
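The decomposition into real and imaginary parts, and the equivalence between normality and commuting parts, can be tried on small matrices. This is a sketch assuming NumPy; the function names and the two example matrices are illustrative.

```python
import numpy as np

def real_and_imaginary_parts(z):
    """Return the self-adjoint x, y with z = x + iy."""
    z_star = z.conj().T
    return (z + z_star) / 2, (z - z_star) / 2j

def is_normal(z, tol=1e-12):
    """True if z commutes with its adjoint."""
    z_star = z.conj().T
    return bool(np.allclose(z @ z_star, z_star @ z, atol=tol))

def parts_commute(z, tol=1e-12):
    """True if the real and imaginary parts of z commute."""
    x, y = real_and_imaginary_parts(z)
    return bool(np.allclose(x @ y, y @ x, atol=tol))

# z = Z + iX: complex-valued only in the weak sense, since Z and X do not commute.
z_weak = np.array([[1, 1j], [1j, -1]], dtype=complex)

# A diagonal matrix with complex entries is normal.
z_normal = np.array([[1 + 1j, 0], [0, 2]], dtype=complex)

x, y = real_and_imaginary_parts(z_weak)
```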

Likewise a vector-valued random variable in the weak sense is any vector v⃗ ∈ M ⊗ V for some vector space V. We can say that v⃗ is normal when its components commute in any basis of V, in which case it generates a classical realm and a classical state on V, given a state ρ on M. An important example of the weak kind of vector-valued random variable is the angular momentum operator (Section ??).

Note that the set M^nor of normal elements of M is not closed under either addition or multiplication (Exercise ??). The same is true of the set (M ⊗ V)^nor of vector-valued measurements, or in general the A-valued measurements where the set A is an abelian group. Only the real random variables, M^sa, have

the special property that they can be added even if

they do not commute.
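One possible pair of counterexamples in M_2, offered as a sketch assuming NumPy and not necessarily the ones intended by the exercise: X and iY are each normal but their sum is not, and a diagonal Hermitian matrix times X is not normal. The sum of the non-commuting self-adjoint elements X and Y, by contrast, is again self-adjoint.

```python
import numpy as np

def is_normal(z, tol=1e-12):
    """True if z commutes with its adjoint."""
    z_star = z.conj().T
    return bool(np.allclose(z @ z_star, z_star @ z, atol=tol))

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
i_Y = 1j * Y                         # anti-Hermitian, hence normal
d = np.diag([1, 2]).astype(complex)  # Hermitian, hence normal

# Addition: X + iY is the nilpotent matrix [[0, 2], [0, 0]].
sum_is_normal = is_normal(X + i_Y)

# Multiplication: d X = [[0, 1], [2, 0]].
product_is_normal = is_normal(d @ X)

# Self-adjoint elements can always be added, even when they do not commute.
s = X + Y
sum_is_self_adjoint = bool(np.allclose(s, s.conj().T))
```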

1.4.1. Exercises

1.5. Joint systems

[1] Paul A. Dirac, Principles of quantum mechanics, Oxford University Press, 1930.

[2] Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman lectures on physics. Vol. 3: quantum mechanics, Addison-Wesley, 1965.

[3] K. M. Lang, S. Nam, J. Aumentado, C. Urbina, and John M. Martinis, Banishing quasiparticles from Josephson-junction qubits: why and how to do it, IEEE Trans. Appl. Superconduct. 13 (2003), no. 2, 989–993.

[4] Michael A. Nielsen and Isaac L. Chuang, Quantum computation and quantum information, Cambridge University Press, Cambridge, 2000.

[5] Jun John Sakurai, Modern quantum mechanics, 2nd ed., Benjamin/Cummings, 1985.
