...more precisely screwing string theory...
The 5,250+ TRF blog entries discuss various topics, mostly scientific ones, including minor advances. However, there isn't any text on this website that would talk about matrix string theory (independently found 2 months later by a herald who inaugurated the new Dutch king and an ex-co-author of mine along with two twins).
If you search for the closest topic, you will find one article about Matrix theory published a year ago and a supplement about membranes in Matrix theory that was added a week later.
But now we want to talk about matrix string theory. It's a version of Matrix theory. Much like Matrix theory – or M(atrix) Theory – describes M-theory in 11 dimensions (which has no strings), matrix string theory describes type IIA or heterotic \(E_8\times E_8\) string theory in \(d=10\). So it's a stringy version of Matrix theory; or string theory formulated in a matrix form.
The discovery of matrix string theory was important for several reasons. First, it was an important confirmation of the ability of the Matrix theory concept to define the dynamics of string/M-theory in many situations; and it was the first time when we had a complete, non-perturbative definition of a string theory.
What do I mean by this comment? Before Matrix theory, all calculations in string theory would be organized as Taylor expansions in \(g_s\), the string coupling. All amplitudes would be written as \(A_0 + A_1 g_s + A_2 g_s^2+\dots\), and so on. However, not every function may be expanded in this way and the general amplitudes in quantum field theory or string theory can't be. For example, \(\exp(-C/g_s^2)\) has a Taylor expansion around \(g_s=0\) whose terms all vanish (because all derivatives of this function at \(g_s=0\) vanish) even though the function itself is non-vanishing for \(g_s\neq 0\).
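To make the point about \(\exp(-C/g_s^2)\) tangible, here is a minimal numerical sketch. The function name `nonperturbative`, the value \(C=1\), and the sample couplings are my illustrative choices, not anything taken from the papers. It checks that the function divided by any fixed power of \(g_s\) still drops towards zero as \(g_s\to 0\), which is the numerical shadow of the vanishing Taylor coefficients.

```python
import math

def nonperturbative(g, C=1.0):
    """exp(-C/g^2), extended by its limiting value 0 at g = 0."""
    return 0.0 if g == 0 else math.exp(-C / g**2)

couplings = (0.5, 0.2, 0.1)  # illustrative values of g_s, decreasing

# For each fixed power n, the ratio f(g)/g^n keeps shrinking as g decreases,
# so no term A_n * g^n of a power series can capture the function.
ratios = {n: [nonperturbative(g) / g**n for g in couplings] for n in (1, 5, 10)}

for n, values in ratios.items():
    print(n, values)
```

The function is nevertheless strictly positive for every nonzero coupling, so a perturbative expansion alone simply cannot see it.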
In this sense, a complete definition was absent. One could have even believed that the existence or consistency of string theory was just a perturbative illusion. Matrix string theory was the first "constructive proof" that string theory is well-defined even non-perturbatively. In the type IIA case, one had a definition for any \(g_s\). In the \(g_s\to\infty\) limit, one could easily show that the theory reduces to Matrix theory, the matrix model for M-theory; in the \(g_s\to 0\) limit, one could prove – and this is the main achievement of the matrix string theory founding papers – that the dynamics reproduces the states and interactions of type IIA string theory as we had known them from the perturbative approaches.
Formal and informal derivations of the matrix string Lagrangian
Matrix theory is formulated in terms of the following Hamiltonian\[
H = P^- = \frac{N}{2} {\rm Tr}\zav{ \Pi_i^2 - [X_i,X_j]^2 +{\rm fermionic} }
\] which is interpreted as a light-cone component \(P^- = (P^0-P^{10})/\sqrt{2}\) of the spacetime energy-momentum vector. Well, the original Matrix theory paper by BFSS (Banks, Fischler, Shenker, Susskind) talked about the "infinite momentum frame" and various "highly boosted limits". But one could easily go to the limit and rewrite the quantities in the light-cone gauge. I was always baffled how a paper by Lenny could have become well-known just because it made this self-evident point. My papers (written before Susskind's) always took the light-cone gauge for granted, as an obvious fact, and I am confident that everyone who followed the Green-Schwarz machinery from the early 1980s (these physicists preferred to calculate things in the light-cone gauge at that time) had to immediately see that the more natural and more correct way to interpret the BFSS model was the light-cone gauge and not just some half-baked "infinite momentum frame".
But let me avoid these discussions. I will assume that the reader has no problem with null combinations of spacelike and timelike components of the energy-momentum vector and realizes that they are often natural combinations to consider.
The Hamiltonian above also contains fermionic, Yukawa-like terms of the form \({\rm Tr}(\theta\gamma_i [X_i,\theta])\) needed for supersymmetry (and various related crucial cancellations) and all the fields are \(N\times N\) matrices chosen for the matrix model to respect the \(U(N)\) gauge symmetry; yes, all physically allowed states must be invariant under the whole \(U(N)\) group.
In the previous articles, I tried to explain why this quantum mechanical model whose fields are "large matrices", generalizations of the usual non-relativistic operators \(X_i,P_i\), contains multi-graviton states, their superpartners, and large membranes: it has all the objects it needs to agree with the physical spectrum of M-theory in 11 dimensions.
Now, we want to compactify M-theory on a circle. M-theory on \(S^1\times \RR^{10}\) has been known to be equivalent to type IIA string theory in 10 dimensions (from the very first paper by Witten that introduced M-theory: the equivalence of the low-energy limits had been known for 10 years before Witten's paper). What do we have to do with the matrix model to see all the physics of type IIA string theory?
There was some confusion about this question in the original BFSS paper on Matrix theory. The authors tended to believe that their exact Hamiltonian contains "the whole Hilbert space" of string/M-theory in all of its backgrounds. However, it wasn't the case. The moduli are modes with \(P^-=0\) and they correspond to excitations of the \(U(0)\) matrix model. The BFSS matrix model has no degrees of freedom for \(N=0\) so there is no way to change the moduli. Consequently, the model may only describe one particular superselection sector – the states of string/M-theory that respect the asymptotic form of the spacetime that looks like one in 11-dimensional M-theory (with one light-like direction compactified on a "long" circle).
To see type IIA string theory, i.e. the states in a different superselection sector of string/M-theory, we need to construct a different matrix model. What is it?
At the end of 1997, Ashoke Sen and especially Nathan Seiberg proposed a straightforward way to derive the BFSS matrix model and its compactifications from a limiting procedure combined with some widely believed dualities in string/M-theory. It's a clever (and superior) derivation that allows us to derive matrix models that are gauge theories; as well as matrix models that aren't just "ordinary" gauge theories but their novel UV completions such as the \((2,0)\) theory in \(d=6\) and little string theory.
However, if we want to find a matrix model for a compactification of M-theory on \(T^k\) and the dimension \(k\) of the torus isn't greater than three, it's enough to use the formal "gauge theory assuming" derivation I used at the beginning of 1997. How does it work?
One develops (your humble correspondent developed) a more general procedure to "orbifold a matrix model". The compactification on a circle is an orbifold by the group isomorphic to \(\ZZ\) composed of translations by \(2\pi R n\) in the direction of the circular dimension. To find the matrix description of the orbifold, we need to enhance \(N\) sufficiently and constrain the matrices of this "enhanced BFSS model" in a way that says that "the matrices transformed by elements of the orbifold group are gauge conjugations of the original ones".
This may sound complicated but the example of the compactification, an important one, makes it rather clear what I mean. The BFSS model has matrices with elements such as \(X^i_{mn}\) where \(m,n=1,2,\dots N\) are the gauge indices. We need the set of values of these indices to be infinitely greater. So we replace these matrix degrees of freedom by \(X^i_{mn}(\sigma,\sigma')\) where \(\sigma\in(0,2\pi)\) with periodic boundary conditions (a circular set of possible values of this "index") is a continuous counterpart of the index \(m\) and similarly for \(\sigma'\) and \(n\).
Now the group \(\ZZ\) of the translations in the direction \(X^9\) has a generator, a translation by \(2\pi R_{9}\), and we identify it with the conjugation by \(\exp(i\sigma)\), a gauge transformation matrix that only acts on the continuous \(\sigma\) indices. Because the translation doesn't physically act on the bosons \(X^1\dots X^8\) and their momenta \(\Pi^i\), the condition "physical transformation equals gauge transformation" says that these matrices are simply functions of one \(\sigma\) because they impose \(\sigma=\sigma'\), or demand \(\delta(\sigma-\sigma')\) in the kernel, along the way. Similarly, \(X^9\) has an extra \(\delta'(\sigma-\sigma')\) term on the right hand side so this matrix gets promoted to the covariant derivative \(D_\sigma\). Again, what used to be the degrees of freedom in \(X^9(\sigma)\) get reinterpreted as the component \(A_\sigma\) of a gauge field.
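The condition "translation equals conjugation by \(\exp(i\sigma)\)" may be checked in a toy setting. The sketch below is only an illustration under my own assumptions: it trades the continuous index \(\sigma\) for a truncated set of Fourier modes \(e^{ik\sigma}\), keeps only the background (derivative) part of \(X^9\) as \(2\pi R_9\) times the mode number, and uses a made-up value of `R9`. Multiplication by \(\exp(i\sigma)\) becomes a shift of the mode number, and the conjugated matrix differs from the original by \(2\pi R_9\) everywhere except at the edge where the truncation bites.

```python
import math

R9 = 0.7                      # hypothetical radius, arbitrary units
modes = list(range(-3, 4))    # truncated Fourier modes k of exp(i k sigma)
size = len(modes)

def matmul(a, b):
    """Product of two size x size matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transpose(a):
    return [[a[j][i] for j in range(size)] for i in range(size)]

# Background part of X^9 in the Fourier basis: 2*pi*R9 times the mode number.
x9 = [[2 * math.pi * R9 * modes[i] if i == j else 0.0 for j in range(size)]
      for i in range(size)]

# Multiplication by exp(i sigma) raises the mode number by one: e_k -> e_{k+1}.
shift = [[1.0 if i == j + 1 else 0.0 for j in range(size)] for i in range(size)]

# Conjugation by the shift (its transpose plays the role of the inverse
# away from the truncation edge).
conjugated = matmul(transpose(shift), matmul(x9, shift))

# Diagonal change for all modes except the last one, which is cut off.
differences = [conjugated[i][i] - x9[i][i] for i in range(size - 1)]
print(differences)
```

In the untruncated, infinite-dimensional setting there is no edge and the relation holds exactly; the sign and direction conventions of the shift are a choice made for this sketch.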
It may sound incomprehensible or difficult or abstract but I don't find it constructive to spend too much time with that. When you do these operations properly, you will find out that the matrix model for type IIA string theory is a 1+1-dimensional gauge theory with the same group \(U(N)\) as the BFSS model compactified on \(S^1\times\RR\) where the \(S^1\) part of the infinite cylinder arises from the \(\sigma\) "continuous index" we had to add. This 1+1-dimensional gauge theory has a dimensionful parameter \(g_{YM}^2\). The formal procedure "physical transformation defining the orbifold equals gauge transformation of the matrices" even tells us how the coupling \(g_{YM}^2\) depends on the length of the circle \(2\pi R_9\) in the compactification of M-theory. Together with some analyses of the interactions in the resulting matrix model, we may derive that \(R_9/l_{Pl,11}\sim g_s^{2/3}\).
But let's not be too acausal. So far, we have derived the matrix model for type IIA string theory. It looks like the integral of the BFSS Hamiltonian over the circle \(\sigma\) except that the component \(X^9\) of the bosonic fields is replaced by the covariant derivative \(D_9\) involving the 1+1-dimensional gauge field. The original BFSS matrix model may be viewed as the compactification of the 10-dimensional (non-renormalizable) supersymmetric gauge theory to 0+1 dimensions. When we're compactifying the dimensions of the M-theory we want to describe by a matrix model, we must decompactify the spatial dimensions that were dimensionally reduced in the BFSS matrix model to start with. For type IIA string theory in ten dimensions, we must decompactify one (add the single "continuous index" \(\sigma\)). This operation is the opposite of dimensional reduction and because in chemistry, the opposite of reduction is oxidation, this procedure to construct higher-dimensional versions of the BFSS model to describe lower-dimensional vacua of M-theory is sometimes jokingly called the dimensional oxidation. ;-)
Minimizing the energy
Just to be sure: we have "derived" that type IIA string theory in ten dimensions at any coupling is completely equivalent to the maximally supersymmetric \(U(N)\) gauge theory in 1+1 dimensions whose "world volume" has one infinite timelike dimension and one circular, compact spacelike dimension. To get rid of the effects of the compactification of the light-like dimension, we need to take the large \(N\) limit.
In some sense, this is a very modest generalization or variation of the original BFSS claim. I became totally certain that this matrix model is the right one. This certainty is probably necessary for one to be sufficiently motivated to study its physics a bit more closely. So I started with that.
If the 1+1-dimensional gauge theory is the full type IIA string theory, including its D-branes, type IIA supergravity at low energies, black holes, and many other things, it should contain what type IIA string theory is known to contain. For example, it must contain the strings. They must also be able to split and join.
Diagonal in a basis that may change
A general Hamiltonian defines the energy in a quantum mechanical model. All states may be written as superpositions of energy eigenstates. However, some states are more interesting than others: the low-energy eigenstates of the Hamiltonian. Because energy tends to dissipate, physical systems generally like to "drop" to their low-lying states. That's why the low-lying states, starting from the ground state (lowest-eigenvalue eigenstate of the Hamiltonian), are the most important ones.
In other words, the first step in trying to understand the physics of a Hamiltonian in a quantum mechanical theory is to try to help Nature to minimize the energy. How do we do it with the matrix model for matrix string theory?
Let's consider the bosons only; the fermions add additional degrees of freedom, terms in the zero-point energy (that mostly cancel some bosonic terms that would destroy a consistent spacetime interpretation of the physics if they remained uncancelled), and other details. If you assume that fermions play this peaceful, calming, generalizing role, you may say that the important physics is already contained in the bosons.
How do we minimize the energy carried by the bosonic parts of the Hamiltonian? The matrix string Hamiltonian contains \(\int \dd \sigma\,{\rm Tr}(\Pi_i^2)\) times a coefficient. Clearly, this is minimized if the momenta \(\Pi_i(\sigma)\) are zero. More realistically, these matrices may be approximately diagonal and the diagonal entries \(\Pi^i_{nn}(\sigma)\) will behave as the degrees of freedom \(\pi_i(\sigma)\) defined on a Green-Schwarz string. Soon we will see what happens with the extra \(n\) etc.
The off-diagonal entries of \(\Pi^i\) as well as the same entries of \(X^i\) behave like W-bosons of a sort, massive degrees of freedom, and at low energies, the wave function is almost required to be proportional to the ground-state wave function as a function of these off-diagonal entries.
More interestingly, we want to minimize the term \({\rm Tr}\zav{-[X_i,X_j]^2}\) in the energy, too. The minus sign has to be there because for each \(i,j\), the commutator is anti-Hermitian so its square is negative semidefinite, not positive semidefinite. How do we minimize it? Clearly, it will be smallest, namely zero, if the eight matrices \(X^i\) commute with each other. (Quantum mechanically, the wave function will be concentrated near the points on the configuration space where they commute with each other.)
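A quick sanity check of the sign and of the minimum, as a sketch only: the helper names are mine, the overall normalization is dropped, the sum runs over pairs \(i<j\), and I use tiny \(2\times 2\) matrices instead of the eight large ones. The potential is positive for non-commuting Hermitian matrices and exactly zero for commuting ones.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def potential(xs):
    """V = -sum over pairs i<j of Tr([X_i, X_j]^2), normalization dropped."""
    total = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            c = commutator(xs[i], xs[j])
            total += -trace(matmul(c, c)).real
    return total

# Two non-commuting Hermitian matrices (Pauli sigma_x and sigma_z) ...
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
# ... and two commuting, diagonal ones.
diag_a = [[1, 0], [0, 2]]
diag_b = [[3, 0], [0, -1]]

print(potential([sigma_x, sigma_z]))  # positive
print(potential([diag_a, diag_b]))    # zero
```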
If they commute with each other, it means that we can simultaneously diagonalize them. In other words, we can write\[
X^i(\sigma) = U(\sigma) X^i_{\rm diag}(\sigma) U^{-1}(\sigma).
\] The matrix \(U\) may be assumed to be unitary because Hermitian matrices are diagonalized in an orthonormal basis. The matrix with the "diag" subscript on the right hand side is diagonal. But an important detail is that \(U(\sigma)\) must be allowed to be arbitrary because the energy minimization tells us nothing about the basis in which all the \(X^i\) matrices are diagonal.
And that makes a difference because \(U(\sigma)\) doesn't have to be periodic with the period of \(2\pi\). Only the total field \(X^i(\sigma)\) of the gauge theory has to be periodic. However, the transformation \(U(\sigma)\) to the basis in which \(X^i(\sigma)\) is diagonal may undergo a nontrivial monodromy if we change \(\sigma\) by \(2\pi\). The matrix \(X^i_{\rm diag}(0)\), for example, was constrained by our rules to be diagonal but the matrix \(U(0)\) that (via conjugation) brings a given \(X^i(\sigma)\) to the diagonal form is "almost unique" but not quite. First, one may add some \(N\) phases on the diagonal of \(U\).
Second, and this is more important here, the matrix \(U\) may be multiplied by a permutation matrix! If a matrix is diagonal in a certain basis, it is diagonal in a permutation of this basis, too! So we must consider more general matrices \(U(\sigma)\) that are continuous functions of \(\sigma\) but that obey\[
U(\sigma+2\pi) = U(\sigma) P
\] where \(P\) is a permutation matrix. In combination with some continuous but also aperiodic diagonal matrices \(X^i_{\rm diag}\), such a unitary matrix may still produce an energy-minimizing, periodic field \(X^{i}(\sigma)\). This is the key subtlety not to be overlooked if you want to understand physics of matrix string theory.
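The following sketch shows the monodromy at work for \(N=3\). Everything specific in it is an assumption made for illustration: the rotation used as \(U(0)\), the cyclic permutation \(P\), and the eigenvalues \(1,2,3\). It only compares the two ends \(\sigma=0\) and \(\sigma=2\pi\): the diagonal matrix comes back with its entries permuted, \(U\) comes back multiplied by \(P\), and yet the full field \(X\) is the same at both ends.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# U(0): an arbitrary real orthogonal (hence unitary) matrix.
angle = 0.4
u_0 = [[math.cos(angle), -math.sin(angle), 0.0],
       [math.sin(angle),  math.cos(angle), 0.0],
       [0.0, 0.0, 1.0]]

# P: a cyclic permutation of the three basis vectors.
perm = [[0, 0, 1],
        [1, 0, 0],
        [0, 1, 0]]

# Diagonal eigenvalue matrix at sigma = 0.
d_0 = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

# After one trip around the circle: eigenvalues permuted, U multiplied by P.
d_2pi = matmul(transpose(perm), matmul(d_0, perm))
u_2pi = matmul(u_0, perm)

x_0 = matmul(u_0, matmul(d_0, transpose(u_0)))
x_2pi = matmul(u_2pi, matmul(d_2pi, transpose(u_2pi)))

max_difference = max(abs(x_0[i][j] - x_2pi[i][j])
                     for i in range(3) for j in range(3))
print([d_2pi[i][i] for i in range(3)], max_difference)
```

The diagonal matrix and \(U\) are separately aperiodic but the gauge-theory field built from them is periodic, which is all that the theory requires.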
What is this fact good for?
It's easy to see how the \(U(N)\) matrix model, the two-dimensional gauge theory, contains \(N\) "short strings". The degrees of freedom of each such short string are carried by the diagonal entries of \(X^i(\sigma)\). There are \(N\) such entries along the diagonal. However, we also need "long strings"; the length of the \(\sigma\) coordinate space has been known to be proportional to the light-cone momentum \(P^+\) to everyone who was familiar with the light-cone gauge string theory.
This \(P^+\) is quantized, equal to \(N/R\), because the null coordinate \(X^-\) is compactified on a circle of radius \(R\) (we want to send \(R\to\infty\) to get rid of this semi-unphysical compactification which also forces us to send \(N\to\infty\) to keep \(P^+\) fixed). And we know how to find strings with \(P^+=1/R\) i.e. with the \(N=1\) unit of the light-like longitudinal momentum.
However, the permutation business tells us how to find the "long strings" with \(P^+=N/R\) for any positive integer \(N\). You pick an eigenvalue of \(X^i\) along the diagonal; trace it as you continuously change \(\sigma\) from \(0\) to \(2\pi\); and when you reach \(\sigma=2\pi\), this eigenvalue doesn't connect to the original one at \(\sigma=0\). Instead, it will connect to a different one and only if you increase \(\sigma\) by \(2\pi N\), you may return to the original function because \(N\) basis vectors participate in a cycle of the permutation (used in the boundary conditions for \(U(\sigma)\)).
(The "long strings" were also called "screwing strings" by your humble correspondent because the monodromy bringing the eigenvalue to a new level every time you get around the circle looks like a screw. I didn't know what the verb "screw" had meant informally. But this informal meaning of "screwing" is one of the reasons why the incorrect name "matrix string theory" became more frequently used than the technically correct name "screwing string theory". Incidentally, note that "matrices" and "nuts [waiting for screws]" are translated by the same Czech word, "matice".)
Because every permutation may be decomposed into a product of circular cycles, we see that every low-energy state in matrix string theory is composed of several strings with arbitrary values of \(P^+=N/R\). The permutation defines a "sector" of matrix string theory. The decomposition into the sector is just an artifact of the low-energy approximation; there is no sharp "barrier" between the sectors as they're continuously connected on the configuration space of the 1+1-dimensional gauge theory.
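The bookkeeping is easy to automate. In this sketch, the function name `cycle_lengths`, the sample permutation, and the value of `R` are hypothetical choices of mine. A permutation is decomposed into cycles, each cycle of length \(n\) is read as one long string with \(P^+=n/R\), and the momenta add up to the total \(N/R\).

```python
def cycle_lengths(perm):
    """Lengths of the cycles of a permutation given as a list: i -> perm[i]."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths

R = 1.0                      # hypothetical light-like radius, arbitrary units
perm = [1, 2, 0, 4, 3, 5]    # cycles (0 1 2)(3 4)(5) acting on N = 6 eigenvalues

# One long string per cycle, carrying P^+ = (cycle length) / R.
strings_p_plus = [n / R for n in cycle_lengths(perm)]
print(strings_p_plus, sum(strings_p_plus))
```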
One may also derive the origin of some other subtle conditions. For example, the bosonic/fermionic states of the long strings obey the right statistics because the permutations that interchange the whole long strings are elements of the \(U(N)\) gauge group that must keep all physical states invariant. However, one may also derive the \(L_0=\tilde L_0\) condition for each separate string as the gauge invariance under the generator of the \(\ZZ_k\) cyclic group that defines the cyclical permutations associated with a given string. Well, this is really equivalent to \(L_0-\tilde L_0 \in k\ZZ\) but for large values of \(k\), all values except for \(L_0-\tilde L_0=0\) will correspond to string states of a high energy and will not belong to the low-energy spectrum.
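Here is a small sketch of the \(\ZZ_k\) projection. I am assuming that the generator of \(\ZZ_k\) acts on a state of a length-\(k\) long string by the phase \(\exp[2\pi i(L_0-\tilde L_0)/k]\); the function name and the range of integers scanned are illustrative. Only the values of \(L_0-\tilde L_0\) that are multiples of \(k\) survive.

```python
import cmath

def is_invariant(level_mismatch, k):
    """True if the assumed Z_k phase exp(2*pi*i*(L0 - L0tilde)/k) equals one."""
    phase = cmath.exp(2j * cmath.pi * level_mismatch / k)
    return abs(phase - 1) < 1e-9

k = 4  # length of the cycle, i.e. of the long string
allowed = [m for m in range(-8, 9) if is_invariant(m, k)]
print(allowed)
```

For large \(k\), the nonzero survivors sit far from zero, which is the sense in which only \(L_0=\tilde L_0\) remains at low energies.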
Merging and splitting strings: jumping in between the permutation sectors
I have already said that in the low-energy limit, it looks like the Hilbert space is composed of sectors labeled by permutations in \(S_N\subset U(N)\). Each cycle that such a permutation is composed of corresponds to one "long string" – an ordinary type IIA string – present in the configuration.
At the same time, matrix string theory allows you to continuously switch between different "sectors". This corresponds to changing the permutation or, equivalently, the decomposition of the total longitudinal momentum \(P^+\) to the individual strings.
The most elementary operation changing a permutation is the composition of this permutation with an extra transposition (of two pieces of the string; or two eigenvalues). The low-energy approximation of the gauge theory's (matrix model's) Hamiltonian will involve the list of the allowed sectors and the free Hamiltonian for the individual strings that match the free type IIA string theory. However, the gauge theory isn't quite free so there will also be corrections and those may change the sector (the permutation). Those that only add one transposition will be the leading ones and they will correspond to nothing else than the usual splitting or merging of strings, a three-closed-string vertex.
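That a single transposition either splits one string into two or merges two into one is a purely combinatorial fact and may be checked directly. In this sketch, the helper names and the sample permutation are mine, and composing with the transposition is implemented as swapping the images of the two chosen elements.

```python
def cycle_lengths(perm):
    """Sorted lengths of the cycles of a permutation given as i -> perm[i]."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return sorted(lengths)

def compose_with_transposition(perm, a, b):
    """Return the permutation i -> perm[tau(i)] where tau swaps a and b."""
    result = list(perm)
    result[a], result[b] = result[b], result[a]
    return result

one_long_string = [1, 2, 3, 0]                                 # a single 4-cycle
split = compose_with_transposition(one_long_string, 0, 2)      # same cycle: splits
merged = compose_with_transposition(split, 0, 2)               # different cycles: merge

print(cycle_lengths(one_long_string), cycle_lengths(split), cycle_lengths(merged))
```

The number of cycles changes by exactly one in each step, up or down, which is the combinatorial skeleton of the three-string vertex.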
We know that the gauge theory is supersymmetric so the interactions will have to preserve the same supersymmetry. DVV showed that the form of the leading splitting/merging interaction is essentially unique. But even without knowing its form, I could have derived – by a trick based on the assumption that the large \(N\) limit is universal and independent of \(R\), the light-like radius – how the coefficient of the three-string vertex depends on the radius \(R_9\) of the coordinate we compactified to get the matrix model of type IIA string theory out of the BFSS model for M-theory. (There are two radii compactified here which are often labeled as \(R_9\) and \(R_{11}\). People who don't understand the logic of matrix string theory may confuse them. The exchange of these two radii that is effectively used in the construction was also called the 9/11 flip and be sure that it was before my PhD defense on 9/11/2001.)
The DVV description of the permutations
In March 1997, DVV who were much more familiar with the standard machinery of two-dimensional conformal field theories described the free-string limit of the gauge theory by a concise term: the symmetric orbifold CFT. It means a CFT – a linear (not non-linear, in this case) sigma model on \(\RR^{8N}/S_N\) where \(S_N\) is the permutation group exchanging the \(N\) copies of the 8-dimensional transverse space.
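In an orbifold by \(S_N\), the twisted sectors are labeled by conjugacy classes of \(S_N\), i.e. by cycle types, which are the partitions of \(N\); this is a standard fact about orbifolds that I am adding here, not a quote from DVV. So the number of distinct ways to split \(P^+=N/R\) among indistinguishable long strings is the partition number \(p(N)\). A minimal sketch with a function name of my choosing:

```python
def count_partitions(n):
    """Number of partitions of n, i.e. of cycle types of permutations in S_n."""
    ways = [1] + [0] * n           # ways[t] = partitions of t using parts seen so far
    for part in range(1, n + 1):   # allow cycles (strings) of length `part`
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

# Number of twisted sectors of the S_N orbifold for small N.
sector_counts = {n: count_partitions(n) for n in range(1, 6)}
print(sector_counts)
```

The count grows quickly with \(N\), which is one way to see how rich the multi-string Hilbert space becomes in the large \(N\) limit.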
They also wrote down the explicit form of the three-string interaction vertex (leading interaction) emerging in this limit in terms of spin fields and twist fields, fixed a mistake in my not quite correct derivation of the level-matching \(L_0=\tilde L_0\) condition, and added some comments about the appearance of the D0-branes (short strings with the electric field etc.).
Higher-order terms in the Hamiltonian
The transposition of two eigenvalues is just the simplest among the extra permutations that may change the sector. In reality, the matrix model for string theory predicts all the complicated permutations (cycles with 3 elements or any number of elements), too. One may guess a natural Ansatz for what these terms look like at any order in \(g_s\). We wrote these formulae with Dijkgraaf in a paper showing that the matrix string Hamiltonian is corrected at every order and how (these extra high-order terms produce contact-term interactions that are needed for the consistency of the light-cone gauge string theory but they may be largely circumvented in the usual covariant calculations based on moduli spaces of Riemann surfaces). This particular paper remained almost unknown, one of the numerous testimonies of the fact that in the 21st century, the interest in technical things such as "filling the gaps in the only non-perturbative definition of type IIA string theory we have" was dropping to zero. In 2003, people were already much more excited about philosophical gibberish such as the anthropic lack of principle and fabricated "technical evidence" that it applies in string theory.
I won't proof-read this text because I am afraid that its technical character will shrink its readership close to an infinitesimal number that can't justify the extra work needed for proofreading.
The 5,250+ TRF blog entries discuss various topics, mostly scientific ones, including minor advances. However, there isn't any text on this website that would talk about matrix string theory (inpendently found 2 months later by a herald who inaugurated the new Dutch king and an ex-co-author of mine along with two twins).
If you search for the closest topic, you will find one article about Matrix theory published a year ago and a supplement about membranes in Matrix theory that was added a week later.
But now we want to talk about matrix string theory. It's a version of Matrix theory. Much like Matrix theory – or M(atrix) Theory – describes M-theory in 11 dimensions (which has no strings), matrix string theory describes type IIA or heterotic \(E_8\times E_8\) string theory in \(d=10\). So it's a stringy version of Matrix theory; or string theory formulated in a matrix form.
The discovery of matrix string theory was important for several reasons. First, it was an important confirmation of the ability of the Matrix theory concept to define the dynamics of string/M-theory in many situations; and it was the first time when we had a complete, non-perturbative definition of a string theory.
What do I mean by this comment? Before Matrix theory, all calculations in string theory would be organized as Taylor expansions in \(g_s\), the string coupling. All amplitudes would be written as \(A_0 + A_1 g_s + A_2 g_s^2\dots\), and so on. However, not every function may be expanded in this way and the general amplitudes in quantum field theory or string theory can't. For example, \(\exp(-C/g_s^2)\) has a Taylor expansion whose terms vanish (because all higher-order derivatives of this function at \(g_s=0\) vanish) even though the function was non-vanishing.
In this sense, a complete definition was absent. One could have even believed that the existence or consistency of string theory was just a perturbative illusion. Matrix string theory was the first "constructive proof" that string theory is well-defined even non-perturbatively. In the type IIA case, one had a definition for any \(g_s\). In the \(g_s\to\infty\) limit, one could easily show that the theory reduces to Matrix theory, the matrix model for M-theory; in the \(g_s\to 0\) limit, one could prove – and this is the main achievement of the matrix string theory founding papers – that the dynamics reproduces the states and interactions of type IIA string theory as we had known them from the perturbative approaches.
Formal and informal derivations of the matrix string Lagrangian
Matrix theory is formulated in terms of the following Hamiltonian\[
H = P^- = \frac{N}{2} {\rm Tr}\zav{ \Pi_i^2 - [X_i,X_j]^2 +{\rm fermionic} }
\] which is interpreted as a light-cone component \(P^- = (P^0-P^{10})/\sqrt{2}\) of the spacetime energy-momentum vector. Well, the original Matrix theory paper by BFSS (Banks, Fischler, Shenker, Susskind) talked about the "infinite momentum frame" and various "highly boosted limits". But one could easily go to the limit and rewrite the quantities in the light-cone gauge. I was always baffled how a paper by Lenny could have become well-known just because it made this self-evident point. My papers (written before Susskind) always took the light-cone gauge as an obvious fact, for granted, and I am confident that everyone who followed the Green-Schwarz machinery from the early 1980s (these physicists preferred to calculate things in the light-cone gauge at that time) had to immediately see that the more natural and more right way to interpret the BFSS model was the light-cone gauge and not just some half-baked "infinite momentum frame".
But let me avoid these discussions. I will assume that the reader has no problem with null combinations of spacelike and timelike components of the energy-momentum vector and realizes that they are often natural combinations to consider.
The Hamiltonian above also contains fermionic, Yukawa-like terms of the form \({\rm Tr}(\theta\gamma_i [X_i,\theta])\) needed for supersymmetry (and various related crucial cancellations) and all the fields are \(N\times N\) matrices chosen for the matrix model to respect the \(U(N)\) gauge symmetry; yes, all physically allowed states must be invariant under the whole \(U(N)\) group.
In the previous articles, I tried to explain why this quantum mechanical model whose fields are "large matrices", generalizations of the usual non-relativistic operators \(X_i,P_i\), contains multi-graviton states, their superpartners, and large membranes: it has all the objects it needs to agree with the physical spectrum of M-theory in 11 dimensions.
Now, we want to compactify M-theory on a circle. M-theory on \(S^1\times \RR^{10}\) has been known to be equivalent to type IIA string theory in 10 dimensions (from the very first paper by Witten that introduced M-theory: the equivalence of the low-energy limits had been known for 10 years before that Witten's paper). What do we have to do with the matrix model to see all the physics of type IIA string theory?
There was some confusion about this question in the original BFSS paper on Matrix theory. The authors tended to believe that their exact Hamiltonian contains "the whole Hilbert space" of string/M-theory in all of its backgrounds. However, it wasn't the case. The moduli are modes with \(P^-=0\) and they correspond to excitations of the \(U(0)\) matrix model. The BFSS matrix model has no degrees of freedom for \(N=0\) so there are no ways to change the moduli. Consequently, the model may only describe one particular superselection sectors – the states of string/M-theory that respect the asymptotic form of the spacetime that looks like one in 11-dimensional M-theory (with one light-like direction compactified on a "long" circle).
To see type IIA string theory, i.e. the states in a different superselection sector of string/M-theory, we need to construct a different matrix model. What is it?
At the end of 1997, Ashoke Sen and especially Nathan Seiberg proposed a straightforward way to derive the BFSS matrix model and its compactifications from a limiting procedure combined with some widely believed dualities in string/M-theory. It's a clever (and superior) derivation that allows us to derive matrix models that are gauge theories; as well as matrix models that aren't just "ordinary" gauge theories but their novel UV completions such as the \((2,0)\) theory in \(d=6\) and little string theory.
However, if we want to find a matrix model for a compactification of M-theory on \(T^k\) and the dimension \(k\) of the torus isn't greater than three, it's enough to use the formal "gauge theory assuming" derivation I used at the beginning of 1997. How does it work?
One develops (your humble correspondent developed) a more general procedure to "orbifold a matrix model". The compactification on a circle is an orbifold by the group isomorphic to \(\ZZ\) composed of translations by \(2\pi R n\) in the direction of the circular dimension. To find the matrix description of the orbifold, we need to enhance \(N\) sufficiently and constrain the matrices of this "enhanced BFSS model" in a way that says that "the matrices transformed by elements of the orbifold group are gauge conjugations of the original ones".
This may sound complicated but the example of the compactification, an important one, makes it rather clear what I mean. The BFSS model has matrices with elements such as \(X^i_{mn}\) where \(m,n=1,2,\dots N\) are the gauge indices. We need the set of values of these indices to be infinitely greater. So we replace these matrix degrees of freedom by \(X^i_{mn}(\sigma,\sigma')\) where \(\sigma\in(0,2\pi)\) with periodic boundary conditions (a circular set of possible values of this "index") is a continuous counterpart of the index \(m\) and similarly for \(\sigma'\) and \(n\).
Now the group \(\ZZ\) of the translations in the direction \(X^9\) has a generator, a translation by \(2\pi R_{9}\), and we identify it with the conjugation by \(\exp(i\sigma)\), a gauge transformation matrix that only acts on the continuous \(\sigma\) indices. Because the translation doesn't physically act on the bosons \(X^1\dots X^8\) and their momenta \(\Pi^i\), the condition "physical transformation equals gauge transformation" says that these matrices are simply functions of one \(\sigma\) because they impose \(\sigma=\sigma'\), or demand \(\delta(\sigma-\sigma')\) in the kernel, along the way. Similarly, \(X^9\) has an extra \(\delta'(\sigma-\sigma')\) term on the right hand side so this matrix gets promoted to the covariant derivative \(D_\sigma\). Again, what used to be the degrees of freedom in \(X^9(\sigma)\) get reinterpreted as the component \(A_\sigma\) of a gauge field.
It may sound incomprehensible or difficult or abstract but I don't find it constructive to spend too much time on that. When you do these operations properly, you will find out that the matrix model for type IIA string theory is a 1+1-dimensional gauge theory, with the same gauge group \(U(N)\) as the BFSS model, defined on \(S^1\times\RR\) where the \(S^1\) part of the infinite cylinder arises from the \(\sigma\) "continuous index" we had to add. This 1+1-dimensional gauge theory has a dimensionful parameter \(g_{YM}^2\). The formal procedure "physical transformation defining the orbifold equals gauge transformation of the matrices" even tells us how the coupling \(g_{YM}^2\) depends on the length of the circle \(2\pi R_9\) in the compactification of M-theory. Together with some analyses of the interactions in the resulting matrix model, we may derive that \(R_9/l_{Pl,11}\sim g_s^{2/3}\) (equivalently, \(g_s\sim (R_9/l_{Pl,11})^{3/2}\), the usual relation between the radius of the M-theory circle and the type IIA coupling).
But let's not be too acausal. So far, we have derived the matrix model for type IIA string theory. It looks like the integral of the BFSS Hamiltonian over the circle \(\sigma\) except that the component \(X^9\) of the bosonic fields is replaced by the covariant derivative \(D_9\) involving the 1+1-dimensional gauge field. The original BFSS matrix model may be viewed as the compactification of the 10-dimensional (non-renormalizable) supersymmetric gauge theory to 0+1 dimensions. When we're compactifying the dimensions of the M-theory we want to describe by a matrix model, we must decompactify the spatial dimensions that were dimensionally reduced in the BFSS matrix model to start with. For type IIA string theory in ten dimensions, we must decompactify one (add the single "continuous index" \(\sigma\)). This operation is the opposite of dimensional reduction and because in chemistry, the opposite of reduction is oxidation, this procedure to construct higher-dimensional versions of the BFSS model to describe lower-dimensional vacua of M-theory is sometimes jokingly called the dimensional oxidation. ;-)
Minimizing the energy
Just to be sure: we have "derived" that type IIA string theory in ten dimensions at any coupling is completely equivalent to the maximally supersymmetric \(U(N)\) gauge theory in 1+1 dimensions whose "world volume" has one infinite timelike dimension and one circular, compact spacelike dimension. To get rid of the effects of the compactification of the light-like dimension, we need to take the large \(N\) limit.
In some sense, this is a very modest generalization or variation of the original BFSS claim. I became totally certain that this matrix model is the right one. This certainty is probably necessary for one to be sufficiently motivated to study its physics a bit more closely. So I started with that.
If the 1+1-dimensional gauge theory is the full type IIA string theory, including its D-branes, type IIA supergravity at low energies, black holes, and many other things, it should contain what type IIA string theory is known to contain. For example, it must contain the strings. They must also be able to split and join.
Diagonal in a basis that may change
A general Hamiltonian defines the energy in a quantum mechanical model. All states may be written as superpositions of energy eigenstates. However, some states are more interesting than others: the low-energy eigenstates of the Hamiltonian. Because energy tends to dissipate, physical systems generally like to "drop" to their low-lying states. That's why the low-lying states, starting from the ground state (lowest-eigenvalue eigenstate of the Hamiltonian), are the most important ones.
In other words, the first step in trying to understand the physics of a Hamiltonian in a quantum mechanical theory is to try to help Nature to minimize the energy. How do we do it with the matrix model for matrix string theory?
Let's consider the bosons only; the fermions add additional degrees of freedom, terms in the zero-point energy (that mostly cancel some bosonic terms that would destroy a consistent spacetime interpretation of the physics if they remained uncancelled), and other details. If you assume that fermions play this peaceful, calming, generalizing role, you may say that the important physics is already contained in the bosons.
How do we minimize the energy carried by the bosonic parts of the Hamiltonian? The matrix string Hamiltonian contains \(\int \dd \sigma\,{\rm Tr}(\Pi_i^2)\) times a coefficient. Clearly, this is minimized if the momenta \(\Pi_i(\sigma)\) are zero. More realistically, these matrices may be approximately diagonal and the diagonal entries \(\Pi^i_{nn}(\sigma)\) will behave as the degrees of freedom \(\pi_i(\sigma)\) defined on a Green-Schwarz string. Soon we will see what happens with the extra \(n\) etc.
The off-diagonal entries of \(\Pi^i\) as well as the same entries of \(X^i\) behave like W-bosons of a sort, massive degrees of freedom, and at low energies, the wave function is almost required to be proportional to the ground-state wave function as a function of these off-diagonal entries.
More interestingly, we want to minimize the term \({\rm Tr}\zav{-[X_i,X_j]^2}\) in the energy, too. The minus sign has to be there because for each \(i,j\), the commutator is anti-Hermitian so its square is negative semidefinite, not positive semidefinite. How do we minimize it? Clearly, the term is minimized (it vanishes) if the eight matrices \(X^i\) commute with each other. (Quantum mechanically, the wave function will be concentrated near the points on the configuration space where they commute with each other.)
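The sign and the minimum of this term can be checked numerically in a toy case. The following is a minimal sketch in plain Python; the \(2\times 2\) Pauli and diagonal matrices and the helper names (`matmul`, `commutator`, `potential`) are illustrative choices, not anything taken from the matrix model itself.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def potential(x, y):
    """Return Tr(-[x, y]^2), which is non-negative for Hermitian x, y."""
    c = commutator(x, y)
    c_squared = matmul(c, c)
    trace = sum(c_squared[i][i] for i in range(len(c)))
    return -(trace.real)


# Two non-commuting Hermitian matrices (Pauli matrices): positive energy.
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

# Two commuting (diagonal) Hermitian matrices: the term vanishes.
diag_a = [[1, 0], [0, -1]]
diag_b = [[3, 0], [0, 5]]

print(potential(sigma_x, sigma_y))  # positive
print(potential(diag_a, diag_b))    # zero
```

The commutator of the two Pauli matrices is \(2i\sigma_z\), whose square is \(-4\) times the unit matrix, so the minus sign in front is exactly what makes the energy positive.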
If they commute with each other, it means that we can simultaneously diagonalize them. In other words, we can write\[
X^i(\sigma) = U(\sigma) X^i_{\rm diag}(\sigma) U^{-1}(\sigma).
\] The matrix \(U\) may be assumed to be unitary because Hermitian matrices are diagonalized in an orthonormal basis. The matrix with the "diag" subscript on the right hand side is diagonal. But an important detail is that \(U(\sigma)\) must be allowed to be arbitrary because the energy minimization tells us nothing about the basis in which all the \(X^i\) matrices are diagonal.
And that makes a difference because \(U(\sigma)\) doesn't have to be periodic with the period of \(2\pi\). Only the total field \(X^i(\sigma)\) of the gauge theory has to be periodic. However, the transformation \(U(\sigma)\) to the basis in which \(X^i(\sigma)\) is diagonal may undergo a nontrivial monodromy if we change \(\sigma\) by \(2\pi\). The matrix \(X^i_{\rm diag}(0)\), for example, was constrained by our rules to be diagonal, and the matrix \(U(0)\) that (via conjugation) brings a given \(X^i(0)\) to the diagonal form is "almost unique" but not quite. First, one may add some \(N\) phases on the diagonal of \(U\).
Second, and this is more important here, the matrix \(U\) may be multiplied by a permutation matrix! If a matrix is diagonal in a certain basis, it is diagonal in a permutation of this basis, too! So we must consider more general matrices \(U(\sigma)\) that are continuous functions of \(\sigma\) but that obey\[
U(\sigma+2\pi) = U(\sigma) P
\] where \(P\) is a permutation matrix. In combination with some continuous but also aperiodic diagonal matrices \(X^i_{\rm diag}\), such a unitary matrix may still produce an energy-minimizing, periodic field \(X^{i}(\sigma)\). This is the key subtlety not to be overlooked if you want to understand physics of matrix string theory.
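To spell out what the boundary condition implies for the diagonal matrices, one may substitute it into the diagonalization formula (this is just a short check using the two displayed equations above):\[
X^i(\sigma+2\pi) = U(\sigma)\, P\, X^i_{\rm diag}(\sigma+2\pi)\, P^{-1}\, U^{-1}(\sigma).
\] The periodicity \(X^i(\sigma+2\pi)=X^i(\sigma)\) therefore holds if and only if\[
X^i_{\rm diag}(\sigma+2\pi) = P^{-1} X^i_{\rm diag}(\sigma) P,
\] i.e. the diagonal entries return to themselves only up to the permutation encoded in \(P\).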
What is this fact good for?
It's easy to see how the \(U(N)\) matrix model, the two-dimensional gauge theory, contains \(N\) "short strings". The degrees of freedom of each such short string are carried by the diagonal entries of \(X^i(\sigma)\). There are \(N\) such entries along the diagonal. However, we also need "long strings"; the length of the \(\sigma\) coordinate space has been known to be proportional to the light-cone momentum \(P^+\) to everyone who was familiar with the light-cone gauge string theory.
This \(P^+\) is quantized, equal to \(N/R\), because the null coordinate \(X^-\) is compactified on a circle of radius \(R\) (we want to send \(R\to\infty\) to get rid of this semi-unphysical compactification which also forces us to send \(N\to\infty\) to keep \(P^+\) fixed). And we know how to find strings with \(P^+=1/R\) i.e. with the \(N=1\) unit of the light-like longitudinal momentum.
However, the permutation business tells us how to find the "long strings" with \(P^+=N/R\) for any positive integer \(N\). You pick an eigenvalue of \(X^i\) along the diagonal; trace it as you continuously change \(\sigma\) from \(0\) to \(2\pi\); and when you reach \(\sigma=2\pi\), this eigenvalue doesn't connect to the original one at \(\sigma=0\). Instead, it will connect to a different one and only after you increase \(\sigma\) by \(2\pi N\) do you return to the original eigenvalue because \(N\) basis vectors participate in a cycle of the permutation (used in the boundary conditions for \(U(\sigma)\)).
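The "screwing" of eigenvalues can be illustrated by a toy example. This is a minimal sketch under stated assumptions: one transverse coordinate, a cycle of length `K = 3`, and a made-up smooth profile (`long_string_profile`) with period \(2\pi K\); none of these specifics come from the model itself.

```python
import math

K = 3  # length of the cycle, i.e. the number of eigenvalues in one long string


def long_string_profile(s):
    """Hypothetical shape of one long string; periodic in s with period 2*pi*K."""
    return math.cos(s / K) + 0.5 * math.sin(2 * s / K)


def diagonal_entries(sigma):
    """The K diagonal entries of X_diag at position sigma.

    Entry n is the long string sampled on its n-th turn around the circle.
    """
    return [long_string_profile(sigma + 2 * math.pi * n) for n in range(K)]


sigma = 0.7
before = diagonal_entries(sigma)
after = diagonal_entries(sigma + 2 * math.pi)

# After one trip around the circle, entry n has moved to where entry n+1 was:
# the individual entries are not periodic, but the set of entries is.
for n in range(K):
    print(n, after[n], before[(n + 1) % K])
```

Each entry comes back to itself only after \(K\) trips around the \(\sigma\) circle, which is the statement that the \(K\) eigenvalues are pieces of a single string that is \(K\) times longer.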
(The "long strings" were also called "screwing strings" by your humble correspondent because the monodromy bringing the eigenvalue to a new level every time you get around the circle looks like a screw. I didn't know what the verb "screw" had meant informally. But this informal meaning of "screwing" is one of the reasons why the incorrect name "matrix string theory" became more frequently used than the technically correct name "screwing string theory". Incidentally, note that "matrices" and "nuts [waiting for screws]" are translated by the same Czech word, "matice".)
Because every permutation may be decomposed into a product of circular cycles, we see that every low-energy state in matrix string theory is composed of several strings with arbitrary values of \(P^+=N/R\). The permutation defines a "sector" of matrix string theory. The decomposition into sectors is just an artifact of the low-energy approximation; there is no sharp "barrier" between the sectors as they're continuously connected on the configuration space of the 1+1-dimensional gauge theory.
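The map from a permutation to the list of strings it describes is easy to make explicit. Here is a minimal sketch; the function name `string_momenta` and the encoding of a permutation as a list (`perm[i]` is the image of `i`) are illustrative choices.

```python
def string_momenta(perm):
    """Decompose a permutation into cycles and return the cycle lengths.

    Each cycle of length k is one long string carrying P^+ = k/R,
    so the returned numbers are the values of P^+ in units of 1/R.
    """
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


# A sector of the U(6) model: cycles (0 1 2), (3 4) and (5).
sector = [1, 2, 0, 4, 3, 5]
print(string_momenta(sector))  # [3, 2, 1]
```

The lengths always add up to the size of the matrices, which is the conservation of the total \(P^+=N/R\).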
One may also derive the origin of some other subtle conditions. For example, the bosonic/fermionic states of the long strings obey the right statistics because the permutations that interchange the whole long strings are elements of the \(U(N)\) gauge group that must keep all physical states invariant. However, one may also derive the \(L_0=\tilde L_0\) condition for each separate string as the gauge invariance under the generator of the \(\ZZ_k\) cyclic group that defines the cyclical permutations associated with a given string. Well, this is really equivalent to \(L_0-\tilde L_0 \in k\ZZ\) but for large values of \(k\), all values except for \(L_0-\tilde L_0=0\) will correspond to string states of a high energy and will not belong to the low-energy spectrum.
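One way to see where the condition comes from, sketched under the assumption that a long string of length \(k\) is described by a rescaled coordinate \(\hat\sigma=\sigma/k\) with period \(2\pi\): the generator of \(\ZZ_k\) acts as \(\sigma\to\sigma+2\pi\), i.e. \(\hat\sigma\to\hat\sigma+2\pi/k\), and translations of \(\hat\sigma\) are generated by \(L_0-\tilde L_0\). The gauge invariance of a physical state \(|\psi\rangle\) then demands\[
\exp\left[\frac{2\pi i}{k}\left(L_0-\tilde L_0\right)\right]|\psi\rangle = |\psi\rangle \quad\Rightarrow\quad L_0-\tilde L_0\in k\ZZ.
\]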
Merging and splitting strings: jumping in between the permutation sectors
I have already said that in the low-energy limit, it looks like the Hilbert space is composed of sectors labeled by permutations in \(S_N\subset U(N)\). Each cycle that such a permutation is composed of corresponds to one "long string" – an ordinary type IIA string – present in the configuration.
At the same time, matrix string theory allows you to continuously switch between different "sectors". This corresponds to changing the permutation or, equivalently, the decomposition of the total longitudinal momentum \(P^+\) to the individual strings.
The most elementary operation changing a permutation is the composition of this permutation with an extra transposition (of two pieces of the string; or two eigenvalues). The low-energy approximation of the gauge theory's (matrix model's) Hamiltonian will involve the list of the allowed sectors and the free Hamiltonian for the individual strings that match the free type IIA string theory. However, the gauge theory isn't quite free so there will also be corrections and those may change the sector (the permutation). Those that only add one transposition will be the leading ones and they will correspond to nothing else than the usual splitting or merging of strings, a three-closed-string vertex.
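The claim that one extra transposition splits or merges strings is a purely combinatorial fact about permutations and can be checked directly. This is a minimal sketch; the helper names (`cycle_lengths`, `apply_transposition`) and the list encoding of permutations are illustrative.

```python
def cycle_lengths(perm):
    """Sorted lengths of the cycles of a permutation given as a list."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


def apply_transposition(perm, a, b):
    """Compose perm with the transposition (a b) by swapping two images."""
    new_perm = list(perm)
    new_perm[a], new_perm[b] = new_perm[b], new_perm[a]
    return new_perm


one_string = [1, 2, 3, 0]  # a single long string with P^+ = 4/R

# Transposing two eigenvalues of the same string splits it in two.
two_strings = apply_transposition(one_string, 0, 2)
print(cycle_lengths(two_strings))  # [2, 2]

# Transposing eigenvalues of two different strings merges them.
merged = apply_transposition(two_strings, 0, 2)
print(cycle_lengths(merged))  # [4]
```

Whether the transposition splits or merges depends only on whether the two eigenvalues belong to the same cycle, so the same elementary interaction accounts for both directions of the three-string vertex.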
We know that the gauge theory is supersymmetric so the interactions will have to preserve the same supersymmetry. DVV showed that the form of the splitting/merging leading interaction is essentially unique. But even without knowing its form, I could have derived – via a trick relying on the assumption that the large \(N\) limit is universal and independent of \(R\), the light-like radius – how the coefficient of the three-string vertex depends on the radius \(R_9\) of the coordinate we compactified to get the matrix model of type IIA string theory out of the BFSS model for M-theory. (There are two radii compactified here which are often labeled as \(R_9\) and \(R_{11}\). People who don't understand the logic of matrix string theory may confuse them. The exchange of these two radii that is effectively used in the construction was also called the 9/11 flip and be sure that it was before my PhD defense on 9/11/2001.)
The DVV description of the permutations
In March 1997, DVV who were much more familiar with the standard machinery of two-dimensional conformal field theories described the free-string limit of the gauge theory by a concise term: the symmetric orbifold CFT. It means a CFT – a linear (not non-linear, in this case) sigma model on \(\RR^{8N}/S_N\) where \(S_N\) is the permutation group exchanging the \(N\) copies of the 8-dimensional transverse space.
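In orbifold language, the twisted sectors of \(\RR^{8N}/S_N\) are labeled by conjugacy classes of \(S_N\), i.e. by partitions of \(N\) into cycle lengths, which is the same bookkeeping as the decomposition of \(P^+\) among the individual strings. A minimal sketch that counts these sectors (the function name `count_sectors` is an illustrative choice):

```python
def count_sectors(n):
    """Number of partitions of n, i.e. conjugacy classes of S_n.

    Each partition n = k_1 + k_2 + ... is one twisted sector, describing
    strings with P^+ = k_1/R, k_2/R, ...
    """
    ways = [1] + [0] * n  # ways[t] = partitions of t using the parts seen so far
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


for n in (1, 2, 3, 4, 5, 10):
    print(n, count_sectors(n))
```

For \(N=4\), for example, the five sectors are \(4\), \(3+1\), \(2+2\), \(2+1+1\) and \(1+1+1+1\).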
They also wrote down the explicit form of the three-string interaction vertex (leading interaction) emerging in this limit in terms of spin fields and twist fields, fixed a mistake in my not quite correct derivation of the level-matching \(L_0=\tilde L_0\) condition, and added some comments about the appearance of the D0-branes (short strings with the electric field etc.).
Higher-order terms in the Hamiltonian
The transposition of two eigenvalues is just the simplest among the extra permutations that may change the sector. In reality, the matrix model for string theory predicts all the complicated permutations (cycles with 3 elements or any number of elements), too. One may guess a natural Ansatz for what these terms look like at any order in \(g_s\). We wrote these formulae with Dijkgraaf – a paper showing that the matrix string Hamiltonian is corrected at every order and how (these extra high-order terms produce contact-term interactions that are needed for the consistency of the light-cone gauge string theory but they may be largely circumvented in the usual covariant calculations based on moduli spaces of Riemann surfaces). This particular paper remained almost unknown, one of the numerous testimonies of the fact that in the 21st century, the interest in technical things such as "filling the gaps in the only non-perturbative definition of type IIA string theory we have" was dropping to zero. In 2003, people were already much more excited with philosophical gibberish such as the anthropic lack of principle and fabricated "technical evidence" that it applies in string theory.
I won't proof-read this text because I am afraid that its technical character will shrink its readership close to an infinitesimal number that can't justify the extra work needed for proofreading.
Ways to discover matrix string theory
Reviewed by DAL on May 18, 2013