I just decided to write a short (but not too short) text elaborating on the April 2011 Physics Stack Exchange question by Jane.
Are the Lagrangians in the Feynman path integrals operators? And if they're not, why doesn't it conflict with the basic fact of quantum mechanics that observables are represented by noncommuting objects, or \(q\)-numbers, to use Dirac's terminology, rather than \(c\)-numbers?
These questions and several related ones are what I want to clarify in this text, although no one can guarantee that it will succeed.
The answer to the first question is that all the variables we integrate over are \(c\)-numbers: observables are represented by numbers, the same ones as the numbers in classical physics. And the Lagrangian that appears in the exponent is a \(c\)-number-valued function of these \(c\)-number variables; it is not a genuine operator here, despite its similarities with the Hamiltonian. This fact is OK because we don't really use these quantities to directly make statements about the world, at least not about the final state. Feynman's approach to quantum mechanics never says that \(x\) in the final state is equal to a particular value.
Instead, \(x(t)\) for all intermediate values between the initial (\(i\)) and final (\(f\)) moments are dummy variables we integrate over. We shut up and integrate over them in order to calculate different quantities, the probability amplitudes, which may be easily converted to probabilities that the final state will have various properties.
So if we consider such a probability amplitude\[
{\mathcal A}_{i\to f} = \int {\mathcal D} x(t)\cdot \exp(iS[x(t)]/\hbar)
\] where the integral goes over all continuous (but not necessarily smooth – this will be our big point later) curves \(x(t)\) connecting \(x(t_i)=x_i\) and \(x(t_f)=x_f\), we are probing all conceivable histories but the result of the calculation is \({\mathcal A}_{i\to f}\) rather than \(x(t_f)\).
I decided to preserve the undergraduate reduced Planck constant \(\hbar=h/2\pi\) in this entry; all the comments may be translated to the \(\hbar=1\) units of mature physicists by simply erasing \(\hbar\) everywhere.
It may be proved that the complex probability amplitudes that result from Feynman's path integral are the same ones as those you may obtain by solving Schrödinger's equation with the initial wave function at \(t=t_i\)\[
\psi_i(x) = \delta(x - x_i).
\] The proof is usually explained in the introductions to the path integral. Feynman's integral formula for the evolution over a finite period of time may be split into shorter intervals; the integral respects the "transitivity" of the evolution operator. So it's enough to prove the equivalence for the evolution over very short intervals of time. And for the infinitesimal intervals, the equivalence may be proved rather easily.
This point is presented in the basic texts so I won't dedicate too much time to it here. However, what is not often explained is why this path integral approach doesn't contradict the commutator behind the Heisenberg uncertainty principle,\[
x(t)p(t)-p(t)x(t)=i\hbar.
\] Imagine that \(x(t),\,p(t)\) are the values of the position and momentum somewhere in the middle of the interval whose evolution we study. Because both \(x(t)\) and \(p(t)\) are represented by ordinary \(c\)-numbers in Feynman's path integral expression, isn't it inevitable that they commute with one another, i.e. \(xp=px\)? This would contradict the uncertainty principle.
The answer is that it is not guaranteed but the way in which Feynman's path integral formula avoids the trap is kind of clever and subtle; it has something to do with the short-time (and short-distance, in quantum field theory) behavior of the trajectories that contribute to the path integral. The fact that the short-time behavior is "unusual" and the fact that it may lead to conclusions we wouldn't expect in the classical reasoning may be viewed as a "demo" of tons of similar short-distance subtleties that quantum field theory, a more realistic subset of quantum theories, is literally full of.
OK, so why does Feynman's expression agree with the usual Heisenberg's nonzero commutator?
There are two types of Feynman's path integral for mechanics. One of them integrates over the phase space; one of them integrates over the configuration space. In plain mathematical English, the former is \(\int{\mathcal D}x\,{\mathcal D}p\) while the latter is just \(\int{\mathcal D}x\). The relationship between them is pretty much straightforward: we may simply integrate over \(p(t)\) first and eliminate it: such a step transforms any phase-space path integral to a configuration-space one.
When we discretize the time to "visualize" the functional integral as a finite-dimensional one (and I invite you to interpret this procedure merely as a way to present the smooth objects in Nature – who has no problems with the functional integrals etc. – to us humans who prefer finite-dimensional integrals because we're just stupid animals, especially some of us, and because many of us are lazy and want to use computers that really don't want to compute infinite-dimensional integrals "mechanically"), we reduce the trajectory \(x(t)\), defined as a function of a continuous \(t\), to a finite collection of numbers\[
x(t_i), \, x(t_i+\epsilon), \, x(t_i+2\epsilon),\,\dots\,, x(t_f).
\] We divided the interval from \(t_i\) to \(t_f\) into many short intervals whose duration is \(\epsilon\). Let me just remind you that in this treatment, the natural phase-space path integral uses momenta in the middle of the short intervals,\[
p(t_i+\frac\epsilon 2), \, p(t_i+\frac{3\epsilon}{2}), \, \dots\,, p(t_f-\frac{\epsilon}{2}).
\] This pattern is natural because the momentum \(p(t_i+\epsilon/2)\) is related to the velocity \(v(t_i+\epsilon/2)\) which may be approximated by \((x(t_i+\epsilon)-x(t_i))/\epsilon\) and the latter expression uses the values of \(x(t)\) at two nearby moments: one of them may be naturally chosen to come "before" the argument of \(v(t_i+\epsilon/2)\), the other one is symmetrically "after" that moment.
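To make the elimination of the midpoint momentum a bit more tangible, here is a minimal numerical sketch. It is my own illustration rather than a part of the standard proof, and it assumes the Euclidean (imaginary-time) version of a single time step so that the integral over \(p\) converges without any tricks. The function name `momentum_integral`, the grid, and the units \(m=\hbar=1\) are arbitrary choices made for the example. It checks that integrating \(\exp[(ip\,\Delta x-\epsilon p^2/2m)/\hbar]\) over \(p\) leaves a weight proportional to \(\exp(-m\,\Delta x^2/2\epsilon\hbar)\), i.e. the kinetic term written in terms of \(v\approx \Delta x/\epsilon\).

```python
import numpy as np

def momentum_integral(dx, eps, m=1.0, hbar=1.0):
    """One Euclidean time step, integrated over the midpoint momentum p.

    Integrand: exp((i p dx - eps p^2 / (2 m)) / hbar), summed on a uniform grid.
    """
    width = np.sqrt(m * hbar / eps)      # natural scale of p in the Gaussian
    p = np.linspace(-12 * width, 12 * width, 200001)
    integrand = np.exp((1j * p * dx - eps * p**2 / (2 * m)) / hbar)
    return np.sum(integrand) * (p[1] - p[0])

eps, m, hbar = 0.1, 1.0, 1.0
for dx in (0.0, 0.25, 0.5):
    # Normalize by the dx = 0 value so that only the dx-dependence is compared.
    ratio = (momentum_integral(dx, eps) / momentum_integral(0.0, eps)).real
    expected = np.exp(-m * dx**2 / (2 * eps * hbar))
    print(dx, ratio, expected)
```

The two printed columns should agree to many digits. The real-time case differs only by factors of \(i\) that have to be handled by the analytic continuation.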
Although the phase-space path integral could be employed to produce a more general path-integral proof of \(xp-px=i\hbar\), one that would work for complicated Lagrangians mixing positions and velocities in pretty general ways, it is probably more pedagogical to explain where the nonzero commutator comes from in the configuration-space path integral. After all, the configuration-space path integral – where we only integrate over the coordinates but not the momenta – is the version of the path integral that becomes natural and preferred in quantum field theory, the framework where Feynman's approach becomes really useful.
A reason to prefer the configuration-space path integral is that the counterparts of \(x(t)\), namely \(\phi(t,x,y,z)\), if I mention a Klein-Gordon field, are relativistically covariant (Lorentz transformations only transform the arguments \((t,x,y,z)\)) while the momenta or velocities \(\partial\phi/\partial t\) would have to pick a preferred reference frame and its preferred coordinate \(t\), thus making the Lorentz invariance less obvious. In other words, the phase-space path integrals are close to the Hamiltonian formalism which is less natural for Lorentz-invariant theories in which the Hamiltonian (energy) is just one component or projection of the energy-momentum vector.
Getting started with the calculation
Because of these disclaimers, we finally want to solve the following problem: Calculate the nonzero Heisenberg commutator from the configuration-space Feynman path integral. I will restrict the proof to the "ordinary" actions having the form\[
S=\int\dd t\,L(t),\quad L(t)=\frac{mv(t)^2}{2} - U[x(t)]
\] where the energy is split into the kinetic and potential energy, the kinetic energy has the usual form, and the potential energy only depends on \(x\). What we want to compute is the counterpart of the operator \(x(t)p(t)-p(t)x(t)\) for some moment \(t\) inside the interval whose evolution operator was considered.
The ordering matters for operators but it seems not to matter for the Feynman integration variables. So shouldn't the commutator be zero? No. We must appreciate what it means for the operator \(p(t)\) to be on the right side of \(x(t)\) in the product \(x(t)p(t)\), for example. In the operator language, it means that it acts on the ket vector \(\ket\psi\) before \(x(t)\) does. Let's only consider ket vectors.
Off-topic, soccer: Czechia is playing quarterfinals of Euro against Portugal tonight. We must show the door to Ronaldo and similar boys. Of course, we can do that. The video above is from the (similar?) Czech-Portuguese quarterfinals in 1996: we won 1-to-0 by Karel Poborský's astronautic goal (see above). At the end, we played the finals where we lost to Germany. A New York Times blog previews the match tonight; the author is so impressed by our guys that he completely forgets about the Portuguese! :-) Between 1930 and 1989, Czechoslovakia won 3 times, lost 3 times, and tied 3 times with Portugal.
This word "before" may look like a mathematical abstraction that has nothing to do with the "before" that encodes the ordering in the actual time \(t\). The ordering of a mathematician's steps in time – the mathematician is computing a matrix product – may seem to be completely independent from the chronology of events and positions \(x(t)\) that define a trajectory. However, this ordering is actually the very same one. An operator acting on the ket vector is a sort of an event in quantum mechanics.
(This is one of the ways to see that a logical arrow of time is intrinsically imprinted into the basic framework of quantum mechanics. The past and the future don't play the same role in a quantum theory for the same reason why it matters whether one operator/event is closer to a ket vector or further from it. Microscopic laws and/or Schrödinger's equation may have a time-reversal or CPT symmetry but this fact doesn't imply that the actual events and the physical relationships between them will respect the symmetry. They don't as thermodynamics makes flagrantly obvious.)
So the commutator \(x(t)p(t)-p(t)x(t)\) in quantum mechanics must really be replaced by\[
x(t+\frac\epsilon 2) p(t) - p(t) x(t-\frac\epsilon 2) = \dots
\] where the magnitude of the infinitesimal \(\epsilon\) shouldn't matter but it should be a positive infinitesimal number. Conveniently enough, we may choose the size of this \(\epsilon\) to exactly agree with the spacing between the moments into which we discretized the trajectory. In this way, the calculation will only involve a one-dimensional integral but believe me that if you tried to compute it with a different \(\epsilon\), you would obtain the same resulting commutator. The nonzero commutator only boils down to a single discontinuity. Also, I chose the (infinitesimal and therefore irrelevant) shifts of the arguments (time) in such a way that both terms of the commutator included the same \(p(t)\).
The most recent displayed formula may obviously be written as \[
\dots = p(t)\cdot [ x(t+\frac\epsilon 2) - x(t-\frac\epsilon 2) ] \approx p(t) \cdot \epsilon v(t) = \dots
\] where I approximated the difference of the two values of \(x\) by the derivative (the velocity) multiplied by the separation of the times. At any rate, if we use the configuration-space variables, the expression for \([x,p]\) is simply\[
\dots = \epsilon\cdot mv(t)^2.
\] We want to calculate what the value of this expression "is" for an infinitesimal positive \(\epsilon\) according to the Feynman path integral. The way the path integral answers such questions is that it tells us to insert this expression as an extra factor into the path integral. If the path integral itself jumps by a certain resulting multiplicative factor for every history (one that doesn't try to squeeze additional factors/events/measurements to the moment \(t\)), it means that the inserted factor should be said to be equal to the resulting factor.
Note that naively, \(m,v\) are finite so when we multiply \(mv^2\) by \(\epsilon\) and send the latter to zero, we should get zero. But we won't because Nature isn't naive.
Fine. Our next step has to be as follows: instead of considering the original path integral (which may already have some insertions, hopefully at mostly different moments), we consider a path integral with an extra insertion (the potential energy is called \(U(x)\) and not \(V(x)\), to reserve the letter \(V\) in all forms for velocities – thanks to Bill Zajc for this correction and several others)\[
\int{\mathcal D}x(t) \,\exp\int \dd t[\frac{imv^2}{2\hbar}-\frac{iU(x)}{\hbar}]\,\cdots\, \epsilon mv^2.
\] In the discretized trajectories, we may consider the integral over \(x(t-\epsilon/2)\) and \(x(t+\epsilon/2)\): assume that these are among the allowed "lattice sites" in the discretized time. The integral over the "earlier" \(x(t-\epsilon/2)\) may be kept and the integral over the later one may be replaced by \(\int \dd v(t)\) with the right factors (which cancel and may therefore be ignored because the commutator we want to get is the ratio of the path integral with an extra insertion and without an extra insertion).
What we're inserting only depends on \(v(t)\) so the path-integral version of the value of \(x(t)p(t)-p(t)x(t)\) which was converted to \(\epsilon mv(t)^2\) is simply\[
"[x,p]" = \frac{\int \dd v(t) \exp(\frac{i\epsilon mv^2}{2\hbar}) \cdot \epsilon mv(t)^2} {\int \dd v(t) \exp(\frac{i\epsilon mv^2}{2\hbar})}=\dots
\] The Gaussian is the exponential of \(i\epsilon mv^2/2\hbar\). Complex calculus allows you not to worry about the fact that it is actually a phase. You may see that if the \(i\) were absent, the width of the Gaussian would be, because \(\epsilon mv^2/\hbar\sim 1\), of order \(\Delta v\sim \sqrt{\hbar/m\epsilon}\). It's important to realize that if \(\epsilon\to 0\), the width goes like \(\epsilon^{-1/2}\) and diverges in the limit.
At any rate, you see that up to factors such as \(2,i,\hbar\), the insertion is exactly what appears as the exponent in the exponential we inherited from the path integral i.e. from the action. With the simple substitution \(V=\sqrt{m\epsilon/\hbar}\cdot v\), the integral above becomes\[
\frac{\int \dd V \,\exp(iV^2/2)\cdot \hbar V^2}{\int \dd V\, \exp (iV^2/2)}
\] Well, it's annoying that the integrand has a constant absolute value – it is a phase – but you may define it by another natural limit which adds a modest true Gaussian suppression at infinity. I don't want to justify all these things because their legitimacy and "rightness" really boils down to one's proper intuition about how physics works – it works in such a way that many things are analytic and if they seem ill-defined and may be fixed by an analytic continuation, one should definitely do it – and if you don't see this point, you're just missing some innate aptitudes for theoretical physics and I won't be able to convince you of them, anyway.
Instead, let me assume that the reader has no problem with such continuations etc. The right way to compute the right ratio – which is mathematically the same task as a task with ordinary real Gaussians – is to define another variable \(W\) such that \(iV^2 = -W^2\) i.e. \(V^2 = iW^2\) so that we get a nice Gaussian. The ratio \(\dd W/\dd V\) from the substitution cancels in the ratio we calculate so the \(xp-px\) commutator we wanted to compute finally boils down to\[
\frac{\int \dd W \,\exp(-W^2/2)\cdot i\hbar \cdot W^2}{\int \dd W\, \exp (-W^2/2)}.
\] The integrals go from \(-\infty\) to \(+\infty\). But up to the universal factor \(i\hbar\), this is nothing else than the expectation value of \(W^2\) for a normal distribution with a unit standard deviation: this is nothing else than the \(\exp(-W^2/2)\) distribution. But this expectation value is one – feel free to calculate the integral by your favorite tools. So we have just calculated in the Feynmanian way that\[
xp-px = i\hbar.
\] It may be useful to return a little bit and see how it was possible for us to obtain such a "paradoxical" result. The reason was that despite the overall factor of \(\epsilon\) in the formula \(\Delta x(t) \sim \epsilon v(t)\), we got a nonzero result because the commutator boiled down to the expectation value of \(v(t)^2\) and \(v(t)\) was normal-distributed with a width that became infinite as the spacing of time, \(\epsilon\), was sent to zero.
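If you don't want to trust the analytic continuation blindly, a quick numerical sanity check is possible. The sketch below is just my illustration under stated assumptions: it adds a small damping \(\exp(-\eta V^2/2)\) to the pure phase (the "modest true Gaussian suppression" mentioned above), evaluates the ratio of the two integrals on a grid, and compares it with the closed form \(\hbar/(\eta-i)\) that follows from the usual Gaussian formula \(\langle V^2\rangle = 1/a\) for the weight \(\exp(-aV^2/2)\). The name `commutator_from_gaussian`, the cutoff `v_max`, and the grid size are hypothetical choices.

```python
import numpy as np

def commutator_from_gaussian(eta, hbar=1.0, v_max=80.0, n=400001):
    """Ratio of regulated integrals: <hbar V^2> with weight exp((i - eta) V^2 / 2)."""
    V = np.linspace(-v_max, v_max, n)
    weight = np.exp((1j - eta) * V**2 / 2)
    # The grid spacing cancels in the ratio, so plain sums suffice.
    return np.sum(weight * hbar * V**2) / np.sum(weight)

for eta in (0.1, 0.03, 0.01):
    numeric = commutator_from_gaussian(eta)
    exact = 1.0 / (eta - 1j)   # closed form of the regulated ratio for hbar = 1
    print(eta, numeric, exact)
```

As \(\eta\) is sent to zero, the printed values approach \(i\) in the \(\hbar=1\) units, which is the \(i\hbar\) on the right-hand side of the commutator.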
Paths must be unsmooth
This "infinite width" of the distribution for the velocity (as it implicitly appears in the path integral) is really the point. This is the place that stores the weapons that make the surprising result (the nonzero commutator even though we superficially deal with \(c\)-numbers all the time) possible and true. I must repeat this sentence once again because it's important:
To deny that the non-differentiable trajectories (or, in quantum field theory, spacetime configuration/histories of the fields) are paramount contributors to the path integral of any consistent quantum theory means to deny the uncertainty principle! Although quantum mechanics has superficially nothing to do with the requirement that important trajectories in the path integral description must be non-differentiable, the uncertainty principle is actually the same thing!
How non-differentiable are the trajectories? We've seen that the typical value of \(v(t)^2\) was scaling like \(1/\epsilon\) i.e. \(v(t)\) was proportional to \(1/\sqrt{\epsilon}\). You may translate it to \(\Delta x\), the change of \(x\) over the unit of time into which we divide the trajectory. We get \(\Delta x\sim \epsilon v\sim \epsilon / \sqrt{\epsilon}\sim \sqrt{\epsilon}\). What does it mean?
Well, if the typical distance you move after time \(\epsilon\) scales like \(\sqrt{\epsilon}\), it's nothing else than the Brownian motion! So when it comes to the power law that determines the dependence of the velocities (and position changes) on the period of time, the typical trajectories contributing to the Feynman path integral resemble the Brownian motion. They look like random walks! This shouldn't be shocking even at the "linguistic level" because the Feynman path integral does integrate over random walks because quantum mechanics says that particles walk in random ways.
The unsmoothness of these random walks is actually another way to formulate the uncertainty principle. The precise commutators of the observables are encoded in the precise shape of the "infinitely wide" distributions for the velocities etc.
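The scaling may be seen in a small simulation. This is a sketch under an explicit assumption: I use the Euclidean weight \(\exp(-m\,\Delta x^2/2\epsilon\hbar)\) for a single step, which is a genuine probability distribution, instead of the oscillating real-time phase. The function name `step_statistics`, the sample size, and the units \(m=\hbar=1\) are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, hbar, n_steps = 1.0, 1.0, 200_000

def step_statistics(eps):
    """Sample steps with weight exp(-m dx^2 / (2 eps hbar)).

    Returns (eps * m * <v^2>, rms of dx) estimated from n_steps samples.
    """
    dx = rng.normal(0.0, np.sqrt(hbar * eps / m), size=n_steps)
    v = dx / eps
    return eps * m * np.mean(v**2), np.sqrt(np.mean(dx**2))

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    insertion, rms_dx = step_statistics(eps)
    print(f"eps={eps:g}  eps*m*<v^2>={insertion:.3f}  "
          f"rms dx/sqrt(eps)={rms_dx / np.sqrt(eps):.3f}")
```

Both printed combinations stay close to one for every \(\epsilon\): the inserted factor \(\epsilon m v^2\) doesn't go to zero, and the typical step is of order \(\sqrt{\epsilon}\), just like for a random walk.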
I must mention that in quantum field theory in \(d\) dimensions – we have done quantum mechanics so far which is quantum field theory in \(d=1\) (time is the only spacetime variable on which the degrees of freedom depend) – the power laws will be different. If we still use \(\epsilon\) for the lattice spacing, the action will be discretized to boxes of volume \(\epsilon^d\). This tiny factor will multiply the Lagrangian density at each lattice site so the velocities (derivatives of fields...) of typical trajectories will scale like \(1/\sqrt{\epsilon^d}\).
Note that for \(\epsilon\to 0\), these velocities diverge even more quickly than they did in \(d=1\). The short distance fluctuations of the histories in \(d\gt 1\) quantum field theories are not just those of the random walk we found in quantum mechanics; they are even more violently oscillating. This may be heuristically interpreted as the reason why quantum field theories in ever higher numbers of spacetime dimensions suffer from increasingly severe short-distance problems. You may say that it is one of the ways to see why these theories ultimately become non-renormalizable and ill-defined in the ultraviolet.
Not only old Englishmen could have built Stonehenge. Škoda has built this Citihenge, named after Citigo, our version of Volkswagen Up!, out of old cars. The vicinity of the Tower Bridge is immediately prettier than before. :-D
Implications for spin foams and discreteness of time
We have emphasized – or at least I have emphasized – that the divergent values of derivatives in the typical histories were needed for the path integral to agree with the nonzero commutators. This is actually a simple way to see that all would-be path integral theories that want to make the time discrete – e.g. they want to have a built-in \(\epsilon=t_{\rm Planck}\) which is constant in the quantum gravity realm which means that it cannot be sent to zero – inevitably violate the uncertainty principle, a basic postulate of quantum mechanics.
If the degrees of freedom were discrete in this way, e.g. if the time were divided to Planckian intervals, all observables would have finite-width distributions and you couldn't get the finite, nonzero commutators. In such a theory, there would be no observables that are linked to functions of the dummy variables (we path-integrate over) and that refuse to commute with each other. This is another simple way to exclude all theories of the "spin foam" kind (a path-integral incarnation of loop quantum gravity although the equivalence obviously can't hold because LQG still tries to pretend that some commutators are nonzero). The people who study this garbage don't understand the basic stuff about path integrals because they would otherwise know that divergent standard deviations of the velocities are needed to get nonzero commutators from the path integral. Again, don't mess with the path integral.
And that's the memo.
Bonus: why it's OK that these paths have an infinite action
Jan Reimers made a good point in the comments. Textbooks (correctly!) say that the action computed from a particular random-walk-like trajectory mentioned above is infinite. It is indeed infinite. If \(v^2\) has an expectation value going like \(1/ \epsilon\to\infty\), the integral of such a kinetic term \(mv^2/2\) over time is bound to diverge, too.
But that's how the things are. These paths dominate the path integral, anyway. It's because there are many of them. If you consider differentiable trajectories, you may get a smaller action, namely a finite one, but you will integrate over a smaller volume of trajectories in the infinite-dimensional space of paths and this suppression by the "excessively small volume" in the space of trajectories is more (well, in some counting equally) important than (or as) the exponential suppression due to the divergent action.
One may imagine that all the relevant un-smooth paths are fluctuations away from a classical, smooth one whose action is finite. Quantum mechanics allows one to deviate from such smooth paths and it actually allows enough so that the typical "allowed" paths are non-differentiable.
It's useful to do some maths. Expand the path \(x(t)\) into some standing waves (Fourier modes), with terms \(a_k\cdot \sin (\pi k t/\Delta t)\) plus some linear term to obey the right condition at the initial and final moments. Now, how do the coefficients \(a_k\) of the typical allowed trajectories scale with \(k\)? If you rewrite the kinetic part of the action which is proportional to \(\int v^2\dd t\), you will get terms such as \(k^2|a_k|^2\). The extra factor of \(k^2\) came from the need to differentiate \(x(t)\) to obtain \(v(t)\); and this got squared because we had \(v^2\).
The path integral contains the factor of \(\exp(-S_E/\hbar)\): let us switch to the Euclidean space so that I don't have to apologize for the imaginary unit again. Because \(S_E\) is a sum over \(k\), essentially, we get factors in the path integral of the form\[
\exp(-C\cdot k^2 |a_k|^2).
\] You may see that the distribution of each coefficient \(a_k\) is essentially independent of others and \(k^2|a_k|^2\) is of order one, independently of \(k\), which means that \(|a_k|\) scales like \(1/k\) for large \(k\). If you have a function with random Fourier coefficients scaling in this way and translate it to \(x(t)\), a function of a continuous time \(t\), you will get a continuous but non-differentiable function of the same random-walk type discussed above.
On the other hand, the action for such a typical trajectory is infinite because it's the sum over \(k\) of \(k^2|a_k|^2\), up to some overall constants and other details, and because each term is of order one, independently of \(k\), and because you have infinitely many terms of this kind (infinitely many Fourier modes), the action becomes infinite. But that's not a problem. Most of the infinity comes from "very large" or "infinite" values of \(k\), i.e. very quickly oscillating Fourier modes, and those have a very small impact on the low-frequency observations that can be made with large and clumsy "classical" probes.
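Here is a short numerical sketch of this counting. It is an illustration under assumptions of mine: the coefficients are drawn independently from the weight \(\exp(-k^2a_k^2/2)\), i.e. I set \(C=1/2\), and all overall constants in the action are dropped. The variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
k = np.arange(1, 4001)
# Independent Gaussian modes with standard deviation 1/k (assumed C = 1/2).
a = rng.normal(0.0, 1.0 / k)

action_terms = k**2 * a**2                # each term is of order one for any k
action_partial = np.cumsum(action_terms)  # grows roughly linearly with the cutoff
size_partial = np.cumsum(a**2)            # converges because sum of 1/k^2 is finite

for K in (500, 1000, 2000, 4000):
    print(K, action_partial[K - 1], size_partial[K - 1])
```

The second column keeps growing roughly in proportion to the cutoff \(K\), which is the divergent action, while the third column settles down quickly: the high-\(k\) modes make the action infinite but they barely change the overall size of \(x(t)\).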
If you only have a classical probe, you're back to the classical intuition because \(a_k\) modes with too high values of \(k\) become invisible while their contribution to the action becomes "universal": every smooth classical action allows pretty much the same un-smooth deviations from it so having the family of nearby un-smooth paths essentially adds a universal factor to the path integral only (if you only compute low-frequency processes etc.). Of course, whenever the quantum fluctuations become so large and important that you can't consistently separate them from the "classical smooth parts" of \(x(t)\), the classical limit and the classical intuition become invalid with all the implications.
Are the Lagrangians in the Feynman path integrals operators? And if they're not, why doesn't it conflict with the basic fact of quantum mechanics that observables are represented by noncommuting objects, or \(q\)-numbers, to use Dirac's terminology, rather than \(c\)-numbers?
These questions and several related ones are what I want to be clarified by this text although no one can guarantee that it will.
The answer to the first question is that all the variables we integrate over are \(c\)-numbers: observables are represented by numbers, the same ones as the numbers in classical physics. And the Lagrangian that appears in the exponent is a \(c\)-number-valued functions of these \(c\)-number variables; it is not a genuine operator here, despite its similarities with the Hamiltonian. This fact is OK because we don't really use these quantities to directly make statements about the world, at least not about the final state. Feynman's approach to quantum mechanics never says that \(x\) in the final state is equal to a particular value.
Instead, \(x(t)\) for all intermediate values between the initial (\(i\)) and final (\(f\)) moments are dummy variables we integrate over. We shut up and integrate over them in order to calculate different quantities, the probability amplitudes, which may be easily converted to probabilities that the final state will have various properties.
So if we consider such a probability amplitude\[
{\mathcal A}_{i\to f} = \int {\mathcal D} x(t)\cdot \exp(iS[x(t)]/\hbar)
\] where the integral goes over all continuous (but not necessarily smooth – this will be our big point later) curves \(x(t)\) connecting \(x(t_i)=x_i\) and \(x(t_f)=x_f\), we are probing all conceivable histories but the result of the calculation is \({\mathcal A}_{i\to f}\) rather than \(x(t_f)\).
I decided to preserve the undergraduate reduced Planck constant \(\hbar=h/2\pi\) in this entry; all the comments may be translated to the \(\hbar=1\) units of mature physicists by simply erasing \(\hbar\) everywhere.
It may be proved that the complex probability amplitudes that result from Feynman's path integral are the same ones as those you may obtain by solving Schrödinger's equation with the initial wave function at \(t=t_i\)\[
\psi_i(x) = \delta(x - x_i).
\] The proof is usually explained in the introductions to the path integral. Feynman's integral formula for the evolution in a finite period of time may be separated to smaller moments; the integral respects the "transitivity" of the evolution operator. So it's enough to prove the equivalence for the evolution over very short intervals of time. And for the infinitesimal intervals, the equivalence may be proved rather easily.
This point is presented in the basic texts so I won't dedicate it too much time here. However, what is not often explained is why this path integral approach doesn't contradict the commutator behind the Heisenberg uncertainty principle,\[
x(t)p(t)-p(t)x(t)=i\hbar.
\] Imagine that \(x(t),\,p(t)\) are the values of the position and momentum somewhere in the middle of the interval whose evolution we study. Because both \(x(t)\) and \(p(t)\) are represented by ordinary \(c\)-numbers in Feynman's path integral expression, isn't it inevitable that they commute with one another, i.e. \(xp=px\)? This would contradict the uncertainty principle.
The answer is that it is not guaranteed but the way in which Feynman's path integral formula avoids the trap is kind of clever and subtle; it has something to do with the short-time (and short-distance, in quantum field theory) behavior of the trajectories that contribute to the path integral. The fact that the short-time behavior is "unusual" and the fact that it may lead to conclusions we wouldn't expect in the classical reasoning may be viewed as a "demo" of tons of similar short-distance subtleties that quantum field theory, a more realistic subset of quantum theories, is literally full of.
OK, so why does Feynman's expression agree with the usual Heisenberg's nonzero commutator?
There are two types of Feynman's path integral for mechanics. One of them integrates over the phase space; one of them integrates over the configuration space. In plain mathematical English, the former is \(\int{\mathcal D}x\,{\mathcal D}p\) while the latter is just \(\int{\mathcal D}x\). The relationship between them is pretty much straightforward: we may simply integrate over \(p(t)\) first and eliminate it: such a step transforms any phase-space path integral to a configuration-space one.
When we discretize the time to "visualize" the functional integral as a finite-dimensional one (and I invite you to interpret this procedure merely as a way to present the smooth objects in Nature – who has no problems with the functional integrals etc. – to us humans who prefer finite-dimensional integrals because we're just stupid animals, especially some of us, and because many of us are lazy and want to use computers that really don't want to compute infinite-dimensional integrals "mechanically"), we describe the trajectory \(x(t)\), defined as a function of a continuous \(t\), to a finite collection of numbers\[
x(t_i), \, x(t_i+\epsilon), \, x(t_i+2\epsilon),\,\dots\,, x(t_f).
\] We divided the interval from \(t_i\) to \(t_f\) to many short intervals whose duration is \(\epsilon\). Let me just remind you that in this treatment, the natural phase-space path integral uses momenta in the middle of the short intervals,\[
p(t_i+\frac\epsilon 2), \, p(t_i+\frac{3\epsilon}{2}), \, \dots ,\,, p(t_f-\frac{\epsilon}{2}).
\] This pattern is natural because the momentum \(p(t_i+\epsilon/2)\) is related to the velocity \(v(t_i+\epsilon/2)\) which may be approximated by \((x(t_i+\epsilon)-x(t_i))/\epsilon\) and the latter expression uses the values of \(x(t)\) at two nearby moments: one of them may be naturally chosen to come "before" the argument of \(v(t_i+\epsilon/2)\), the other one is symmetrically "after" that moment.
Although the phase-space path integral could be employed to produce a more general path-integral proof of \(xp-px=i\hbar\), one that would work for complicated Lagrangians mixing positions and velocities in pretty general ways, it is probably more pedagogical to explain where the nonzero commutator comes from in the configuration-space path integral. After all, the configuration-space path integral – where we only integrate over the coordinates but not the momenta – is the version of the path integral that becomes natural and preferred in quantum field theory, the framework where Feynman's approach becomes really useful.
A reason to prefer the configuration-space path integral is that the counterparts of \(x(t)\), namely \(\phi(t,x,y,z)\) if I mention a Klein-Gordon field, are relativistically covariant (Lorentz transformations only transform the arguments \((t,x,y,z)\)) while the momenta or velocities \(\partial\phi/\partial t\) force us to pick a preferred reference frame and its preferred coordinate \(t\), thus making the Lorentz invariance less obvious. In other words, the phase-space path integrals are close to the Hamiltonian formalism which is less natural for Lorentz-invariant theories in which the Hamiltonian (energy) is just one component or projection of the energy-momentum vector.
Getting started with the calculation
Because of these disclaimers, we finally want to solve the following problem: Calculate the nonzero Heisenberg commutator from the configuration-space Feynman path integral. I will restrict the proof to the "ordinary" actions having the form\[
S=\int\dd t\,L(t),\quad L(t)=\frac{mv(t)^2}{2} - U[x(t)]
\] where the energy is separated into the kinetic and potential energy, the kinetic energy has the usual form, and the potential energy only depends on \(x\). What we want to compute is the counterpart of the operator \(x(t)p(t)-p(t)x(t)\) for some moment \(t\) inside the interval whose evolution operator was considered.
The ordering matters for operators but it seems not to matter for the Feynman integration variables. So shouldn't the commutator be zero? No. We must appreciate what it means for the operator \(p(t)\) to be on the right side of \(x(t)\) in the product \(x(t)p(t)\), for example. In the operator language, it means that it acts on the ket vector \(\ket\psi\) before \(x(t)\) does. Let's only consider ket vectors.
This word "before" may look like a mathematical abstraction that has nothing to do with the "before" that encodes the ordering in the actual time \(t\). The ordering of a mathematician's steps in time – the mathematician is computing a matrix product – may seem to be completely independent of the chronology of events and positions \(x(t)\) that define a trajectory. However, this ordering is actually the very same one. An operator acting on the ket vector is a sort of an event in quantum mechanics.
(This is one of the ways to see that a logical arrow of time is intrinsically imprinted into the basic framework of quantum mechanics. The past and the future don't play the same role in a quantum theory for the same reason why it matters whether one operator/event is closer to a ket vector or further from it. Microscopic laws and/or Schrödinger's equation may have a time-reversal or CPT symmetry but this fact doesn't imply that the actual events and the physical relationships between them will respect the symmetry. They don't as thermodynamics makes flagrantly obvious.)
So the commutator \(x(t)p(t)-p(t)x(t)\) in quantum mechanics must really be replaced by\[
x(t+\frac\epsilon 2) p(t) - p(t) x(t-\frac\epsilon 2) = \dots
\] where the magnitude of the infinitesimal \(\epsilon\) shouldn't matter but it should be a positive infinitesimal number. Conveniently enough, we may choose the size of this \(\epsilon\) to exactly agree with the spacing between the moments into which we discretized the trajectory. In this way, the calculation will only involve a one-dimensional integral but believe me that if you tried to compute it with a different \(\epsilon\), you would obtain the same resulting commutator. The nonzero commutator only boils down to a single discontinuity. Also, I chose the (infinitesimal and therefore irrelevant) shifts of the arguments (time) in such a way that both terms of the commutator included the same \(p(t)\).
The most recent displayed formula may obviously be written as \[
\dots = p(t)\cdot [ x(t+\frac\epsilon 2) - x(t-\frac\epsilon 2) ] \approx p(t) \cdot \epsilon v(t) = \dots
\] where I approximated the difference of the two values of \(x\) by the derivative (the velocity) multiplied by the separation of the times. At any rate, if we use the configuration-space variables, the expression for \([x,p]\) is simply\[
\dots = \epsilon\cdot mv(t)^2.
\] We want to calculate what the value of this expression "is" for an infinitesimal positive \(\epsilon\) according to the Feynman path integral. The way the path integral answers such questions is that it tells us to insert this expression as an extra factor into the path integral. If the path integral itself changes by a certain resulting multiplicative factor for every history (one that doesn't try to squeeze additional factors/events/measurements to the moment \(t\)), it means that the inserted factor should be said to be equal to the resulting factor.
Note that naively, \(m\) and \(v\) are finite so when we multiply \(mv^2\) by \(\epsilon\) and send the latter to zero, we should get zero. But we won't because Nature isn't naive.
Fine. Our next step has to be as follows: instead of considering the original path integral (which may already have some insertions, hopefully at mostly different moments), we consider a path integral with an extra insertion (the potential energy is called \(U(x)\) and not \(V(x)\), to reserve the letter \(V\) in all forms for velocities – thanks to Bill Zajc for this correction and several others)\[
\int{\mathcal D}x(t) \,\exp\int \dd t[\frac{imv^2}{2\hbar}-\frac{iU(x)}{\hbar}]\,\cdots\, \epsilon mv^2.
\] In the discretized trajectories, we may consider the integral over \(x(t-\epsilon/2)\) and \(x(t+\epsilon/2)\): assume that these are among the allowed "lattice sites" in the discretized time. The integral over the "earlier" \(x(t-\epsilon/2)\) may be kept and the integral over the later one may be replaced by \(\int \dd v(t)\) with the right factors (which cancel and may therefore be ignored because the commutator we want to get is the ratio of the path integral with an extra insertion and without an extra insertion).
What we're inserting only depends on \(v(t)\) so the path-integral version of the value of \(x(t)p(t)-p(t)x(t)\) which was converted to \(\epsilon mv(t)^2\) is simply\[
"[x,p]" = \frac{\int \dd v(t) \exp(\frac{i\epsilon mv^2}{2\hbar}) \cdot \epsilon mv(t)^2} {\int \dd v(t) \exp(\frac{i\epsilon mv^2}{2\hbar})}=\dots
\] The Gaussian is the exponential of \(i\epsilon mv^2/2\hbar\). Complex calculus allows you not to worry about the fact that it is actually a phase. You may see that if the \(i\) were absent, the width of the Gaussian would be, because \(\epsilon mv^2/\hbar\sim 1\), of order \(\Delta v\sim \sqrt{\hbar/(m\epsilon)}\). It's important to realize that if \(\epsilon\to 0\), the width goes like \(\epsilon^{-1/2}\) and diverges in the limit.
At any rate, you see that up to factors such as \(2,i,\hbar\), the insertion is exactly what appears as the exponent in the exponential we inherited from the path integral i.e. from the action. With the simple substitution \(V=\sqrt{m\epsilon/\hbar}\cdot v\), the integral above becomes\[
\frac{\int \dd V \,\exp(iV^2/2)\cdot \hbar V^2}{\int \dd V\, \exp (iV^2/2)}
\] Well, it's annoying that the integrand has a constant absolute value – it is a phase – but you may define it by another natural limit which adds a modest true Gaussian suppression at infinity. I don't want to justify all these things because their legitimacy and "rightness" really boils down to one's proper intuition about how physics works – it works in such a way that many things are analytic and if they seem ill-defined and may be fixed by an analytic continuation, one should definitely do it – and if you don't see this point, you're just missing some innate aptitudes for theoretical physics and I won't be able to convince you of them, anyway.
Instead, let me assume that the reader has no problem with such continuations etc. The right way to compute the right ratio – which is mathematically the same task as a task with ordinary real Gaussians – is to define another variable \(W\) such that \(iV^2 = -W^2\) i.e. \(V^2 = iW^2\) so that we get a nice Gaussian. The ratio \(\dd W/\dd V\) from the substitution cancels in the ratio we calculate so the \(xp-px\) commutator we wanted to compute finally boils down to\[
\frac{\int \dd W \,\exp(-W^2/2)\cdot i\hbar \cdot W^2}{\int \dd W\, \exp (-W^2/2)}.
\] The integrals go from \(-\infty\) to \(+\infty\). But up to the universal factor \(i\hbar\), this is nothing but the expectation value of \(W^2\) for a normal distribution with a unit standard deviation, which is exactly the \(\exp(-W^2/2)\) distribution. And this expectation value is one – feel free to calculate the integral by your favorite tools. So we have just calculated in the Feynmanian way that\[
xp-px = i\hbar.
\] It may be useful to return a little bit and see how it was possible for us to obtain such a "paradoxical" result. The reason was that despite the overall factor of \(\epsilon\) in the formula \(\Delta x(t) \sim \epsilon v(t)\), we got a nonzero result because the commutator boiled down to the expectation value of \(v(t)^2\) and \(v(t)\) was normal-distributed with a width that became infinite as the spacing of time, \(\epsilon\), was sent to zero.
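The ratio of oscillating integrals may also be checked numerically. The following is only a sketch under stated assumptions: it uses the dimensionless variable \(V\), adds a small damping `eta` as one possible form of the "modest true Gaussian suppression at infinity", and the names `regulated_ratio`, `eta`, `v_max`, and `n` are hypothetical and not part of the original derivation.

```python
import numpy as np

def regulated_ratio(eta, v_max=100.0, n=400_001):
    """Ratio of int V^2 exp((i - eta) V^2 / 2) dV to int exp((i - eta) V^2 / 2) dV.

    eta > 0 is an assumed small damping that makes both integrals converge;
    the grid spacing cancels in the ratio, so plain sums suffice.
    """
    V = np.linspace(-v_max, v_max, n)
    weight = np.exp((1j - eta) * V**2 / 2)
    return np.sum(V**2 * weight) / np.sum(weight)

eta = 0.01
ratio = regulated_ratio(eta)
exact = 1.0 / (eta - 1j)   # closed form of the damped Gaussian ratio
print(ratio, exact)
```

As `eta` is lowered, the ratio approaches \(i\), so multiplying by \(\hbar\) reproduces \(i\hbar\). A smaller `eta` needs a wider grid (a larger `v_max`) for the damping to take effect before the grid ends.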
Paths must be unsmooth
This "infinite width" of the distribution for the velocity (as it implicitly appears in the path integral) is really the point. This is the place that stores the weapons that make the surprising result (the nonzero commutator even though we superficially deal with \(c\)-numbers all the time) possible and true. I must repeat this sentence once again because it's important:
The reason why Feynman's path integral agrees with the nonzero commutators i.e. with the uncertainty principle is that the standard deviation of the velocity goes to infinity as we approach the continuum limit \(\epsilon=\Delta t \to 0\).
If we were thinking that we're only integrating over differentiable, smooth trajectories, we would still be able to derive that the commutator has to vanish. It is extremely important for the path integral to get a contribution from non-differentiable trajectories in which \(v(t)\) is effectively divergent. In fact, "almost all" trajectories that contribute to the path integral are non-differentiable in this sense: the differentiable trajectories form a "subset of measure zero" and may actually be ignored!
To deny that the non-differentiable trajectories (or, in quantum field theory, spacetime configuration/histories of the fields) are paramount contributors to the path integral of any consistent quantum theory means to deny the uncertainty principle! Although quantum mechanics has superficially nothing to do with the requirement that important trajectories in the path integral description must be non-differentiable, the uncertainty principle is actually the same thing!
How non-differentiable are the trajectories? We've seen that the typical value of \(v(t)^2\) was scaling like \(1/\epsilon\) i.e. \(v(t)\) was proportional to \(1/\sqrt{\epsilon}\). You may translate it to \(\Delta x\), the change of \(x\) over the unit of time into which we divide the trajectory. We get \(\Delta x\sim \epsilon v\sim \epsilon / \sqrt{\epsilon}\sim \sqrt{\epsilon}\). What does it mean?
Well, if the typical distance you move after time \(\epsilon\) scales like \(\sqrt{\epsilon}\), it's nothing else than the Brownian motion! So when it comes to the power law that determines the dependence of the velocities (and position changes) on the period of time, the typical trajectories contributing to the Feynman path integral resemble the Brownian motion. They look like random walks! This shouldn't be shocking even at the "linguistic level" because the Feynman path integral does integrate over random walks because quantum mechanics says that particles walk in random ways.
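The scaling may be illustrated by a small simulation. This is a sketch under an explicit assumption: it uses the Euclidean (Wick-rotated) free-particle weight \(\exp(-m\,\Delta x^2/2\hbar\epsilon)\) per time step, so that \(\Delta x\) is an honest real Gaussian variable; the units \(\hbar=m=1\), the sample size, and the variable names are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
hbar, m = 1.0, 1.0          # units chosen for illustration
n_samples = 200_000

results = {}
for eps in (1e-1, 1e-2, 1e-3):
    # Euclidean weight exp(-m dx^2 / (2 hbar eps)): dx is Gaussian, variance hbar*eps/m
    dx = rng.normal(0.0, np.sqrt(hbar * eps / m), size=n_samples)
    v = dx / eps
    results[eps] = {
        "rms_dx": np.sqrt(np.mean(dx**2)),      # scales like sqrt(eps)
        "rms_v": np.sqrt(np.mean(v**2)),        # scales like 1/sqrt(eps)
        "eps_m_v2": eps * m * np.mean(v**2),    # stays near hbar
    }

for eps, r in results.items():
    print(f"eps={eps:g}  rms_dx={r['rms_dx']:.4f}  "
          f"rms_v={r['rms_v']:.2f}  eps*m*<v^2>={r['eps_m_v2']:.3f}")
```

The step size shrinks like \(\sqrt{\epsilon}\), the velocity grows like \(1/\sqrt{\epsilon}\), and the combination \(\epsilon m\langle v^2\rangle\) stays close to \(\hbar\) for every \(\epsilon\), which is the Euclidean shadow of the commutator computed above.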
The unsmoothness of these random walks is actually another way to formulate the uncertainty principle. The precise commutators of the observables are encoded in the precise shape of the "infinitely wide" distributions for the velocities etc.
I must mention that in quantum field theory in \(d\) dimensions – we have done quantum mechanics so far which is quantum field theory in \(d=1\) (time is the only spacetime variable on which the degrees of freedom depend) – the power laws will be different. If we still use \(\epsilon\) for the lattice spacing, the action will be discretized to boxes of volume \(\epsilon^d\). This tiny factor will multiply the Lagrangian density at each lattice site so the velocities (derivatives of fields...) of typical trajectories will scale like \(1/\sqrt{\epsilon^d}\).
Note that for \(\epsilon\to 0\), these velocities diverge even more quickly than they did in \(d=1\). The short distance fluctuations of the histories in \(d\gt 1\) quantum field theories are not just those of the random walk we found in quantum mechanics; they are even more violently oscillating. This may be heuristically interpreted as the reason why quantum field theories in ever higher numbers of spacetime dimensions suffer from increasingly severe short-distance problems. You may say that it is one of the ways to see why these theories ultimately become non-renormalizable and ill-defined in the ultraviolet.
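A rough version of this power counting, assuming a free Euclidean lattice action and treating the derivative at each site as a single Gaussian variable (a heuristic simplification introduced here, not a careful lattice computation), reads\[
S_E \approx \sum_{\rm sites} \epsilon^d\cdot \frac{(\partial\phi)^2}{2}, \qquad \exp(-S_E/\hbar) \,\,\Rightarrow\,\, \langle (\partial\phi)^2 \rangle \sim \frac{\hbar}{\epsilon^d}, \qquad |\partial\phi| \sim \frac{\sqrt{\hbar}}{\sqrt{\epsilon^d}}.
\] For \(d=1\) with \(\phi\to \sqrt{m}\,x\), this reduces to \(\langle v^2\rangle\sim \hbar/(m\epsilon)\), the random-walk scaling found in quantum mechanics.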
Implications for spin foams and discreteness of time
We have emphasized – or at least I have emphasized – that the divergent values of derivatives in the typical histories were needed for the path integral to agree with the nonzero commutators. This is actually a simple way to see that all would-be path integral theories that want to make the time discrete – e.g. they want to have a built-in \(\epsilon=t_{\rm Planck}\) which is constant in the quantum gravity realm which means that it cannot be sent to zero – inevitably violate the uncertainty principle, a basic postulate of quantum mechanics.
If the degrees of freedom were discrete in this way, e.g. if the time were divided into Planckian intervals, all observables would have finite-width distributions and you couldn't get the finite, nonzero commutators. In such a theory, there would be no observables that are linked to functions of the dummy variables (we path-integrate over) and that refuse to commute with each other. This is another simple way to exclude all theories of the "spin foam" kind (a path-integral incarnation of loop quantum gravity although the equivalence obviously can't hold because LQG still tries to pretend that some commutators are nonzero). The people who study this garbage don't understand the basic stuff about path integrals because they would otherwise know that divergent standard deviations of the velocities are needed to get nonzero commutators from the path integral. Again, don't mess with the path integral.
And that's the memo.
Bonus: why it's OK that these paths have an infinite action
Jan Reimers made a good point in the comments. Textbooks (correctly!) say that the action computed from a particular random-walk-like trajectory mentioned above is infinite. It is indeed infinite. If \(v^2\) has an expectation value going like \(1/ \epsilon\to\infty\), the integral of such a kinetic term \(mv^2/2\) over time is bound to diverge, too.
But that's how the things are. These paths dominate the path integral, anyway. It's because there are many of them. If you consider differentiable trajectories, you may get a smaller action, namely a finite one, but you will integrate over a smaller volume of trajectories in the infinite-dimensional space of paths and this suppression by the "excessively small volume" in the space of trajectories is more (well, in some counting equally) important than (or as) the exponential suppression due to the divergent action.
One may imagine that all the relevant un-smooth paths are fluctuations away from a classical, smooth one whose action is finite. Quantum mechanics allows one to deviate from such smooth paths and it actually allows enough so that the typical "allowed" paths are non-differentiable.
It's useful to do some maths. Expand the path \(x(t)\) into some standing waves (Fourier modes), with terms \(a_k\cdot \sin (\pi k t/\Delta t)\) plus some linear term to obey the right condition at the initial and final moments. Now, how do the coefficients \(a_k\) of the typical allowed trajectories scale with \(k\)? If you rewrite the kinetic part of the action which is proportional to \(\int v^2\dd t\), you will get terms such as \(k^2|a_k|^2\). The extra factor of \(k^2\) came from the need to differentiate \(x(t)\) to obtain \(v(t)\); and this got squared because we had \(v^2\).
The path integral contains the factor of \(\exp(-S_E/\hbar)\): let us switch to the Euclidean space so that I don't have to apologize for the imaginary unit again. Because \(S_E\) is a sum over \(k\), essentially, we get factors in the path integral of the form\[
\exp(-C\cdot k^2 |a_k|^2).
\] You may see that the distribution of each coefficient \(a_k\) is essentially independent of others and \(k^2|a_k|^2\) is of order one, independently of \(k\), which means that \(|a_k|\) scales like \(1/k\) for large \(k\). If you have a function with random Fourier coefficients scaling in this way and translate it to \(x(t)\), a function of a continuous time \(t\), you will get a continuous but non-differentiable function of the same random-walk type discussed above.
On the other hand, the action for such a typical trajectory is infinite because it's the sum over \(k\) of \(k^2|a_k|^2\), up to some overall constants and other details, and because each term is of order one, independently of \(k\), and because you have infinitely many terms of this kind (infinitely many Fourier modes), the action becomes infinite. But that's not a problem. Most of the infinity comes from "very large" or "infinite" values of \(k\), i.e. very quickly oscillating Fourier modes, and those have a very small impact on the low-frequency observations that can be made with large and clumsy "classical" probes.
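Both statements, the \(1/k\) scaling of the coefficients and the divergence of the action, may be seen in a short numerical sketch. The normalization is an assumption made for illustration: the constant \(C\) is set to \(1/2\), so each \(a_k\) is a real Gaussian with standard deviation \(1/k\), and all overall constants in the action are dropped.

```python
import numpy as np

rng = np.random.default_rng(1)
n_modes = 100_000
k = np.arange(1, n_modes + 1, dtype=float)

# Assumed normalization: weight exp(-k^2 a_k^2 / 2), so a_k is Gaussian with std 1/k.
a = rng.normal(0.0, 1.0, size=n_modes) / k

terms = k**2 * a**2                  # each term is of order one on average
partial_action = np.cumsum(terms)    # grows roughly linearly with the mode cutoff

for cutoff in (100, 1_000, 10_000, 100_000):
    print(cutoff, partial_action[cutoff - 1] / cutoff)
```

The partial sums divided by the cutoff stay of order one, i.e. the kinetic action grows linearly with the number of Fourier modes kept and diverges when the cutoff is removed, even though every individual coefficient \(a_k\) is tiny at large \(k\).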
If you only have a classical probe, you're back to the classical intuition because \(a_k\) modes with too high values of \(k\) become invisible while their contribution to the action becomes "universal": every smooth classical action allows pretty much the same un-smooth deviations from it so having the family of nearby un-smooth paths essentially adds a universal factor to the path integral only (if you only compute low-frequency processes etc.). Of course, whenever the quantum fluctuations become so large and important that you can't consistently separate them from the "classical smooth parts" of \(x(t)\), the classical limit and the classical intuition become invalid with all the implications.
Why Feynman's path integral doesn't contradict the uncertainty principle, reviewed by DAL on June 21, 2012