
Short questions often require long answers and proofs

Several debaters, as well as complexity theorist Boaz Barak, religiously worship their belief that it must be the case that \(P\neq NP\) and that the question whether the proposition holds is extremely deep because \(P=NP\) would revolutionize the whole world.

Most of their would-be arguments are examples of irrational hype, fabricated justifications of the limited progress in a field, and groupthink. I will primarily focus on a single major wrong thesis they promote, namely the idea that a mechanical, polynomially fast, or efficient algorithm to solve a problem specified by a short prescription must be short, too.

So let me begin.

\(P\) and \(NP\)

\(NP\) is the class of problems whose solution, if you're told one, may be verified to be correct by \(O(C\cdot N^\ell)\) operations or fewer (the number of operations is effectively the time!) for some \(C,\ell\) if \(N\), an integer specifying the number of objects in the problem or its size, is asymptotically large.




\(P\) is the class of problems whose solution may be found, and not just verified, after \(O(C\cdot N^\ell)\) operations. Verifying a solution can't ever be harder than finding one so we see that \(P\subseteq NP\) but we don't know whether \(P=NP\) or \(P\) is a proper subset.

There are lots of examples of \(NP\) problems that don't seem to be in \(P\) – but, perhaps surprisingly, they could be. The provably "most difficult problems in \(NP\)", the so-called \(NP\)-complete problems, are those whose fast solution could be translated into a fast solution to any \(NP\) problem. So \(NP\)-complete problems sit on the opposite (hard) side of the \(NP\) set from the \(P\) problems, if this "polarization" can be done at all.

Examples of \(NP\)-complete problems include the clique problem, i.e. the decision whether a combinatorial graph contains a complete subgraph – say on one-half of the nodes – in which every pair of nodes is connected; the travelling salesman problem (find the shortest closed route that visits all nodes/cities in a graph); and so on.
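To make the verify-versus-find asymmetry concrete, here is a minimal Python sketch (the function names and the toy graph are my own illustration, not anything from the literature): verifying a proposed clique only requires checking the pairs inside the certificate, while the naive search has to wade through exponentially many candidate subsets.

```python
from itertools import combinations

def is_clique(graph, nodes):
    # Verification: check every pair inside the certificate -- O(k^2) lookups.
    return all(v in graph[u] for u, v in combinations(nodes, 2))

def find_clique(graph, k):
    # Naive search: try every k-element subset of the vertices (exponentially many).
    for candidate in combinations(graph, k):
        if is_clique(graph, candidate):
            return candidate
    return None

# Toy graph as adjacency sets: a triangle {1,2,3} with a pendant vertex 4.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(is_clique(G, (1, 2, 3)))   # True  -- cheap to verify
print(find_clique(G, 3))         # (1, 2, 3) -- found only by brute force here
```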




I think that a physicist would find the computation of the permanent (the determinant-like sum of products over permutations in a matrix, but without the minus signs) to be a more natural representative of the not-in-\(P\), difficult class because it's more "explicit" – it's given by a compact formula. However, it's not in \(NP\) – there's no known way to quickly verify the resulting value of the permanent, either: an example of the fact that \(NP\) is a constraining, narrow-minded class. On the other hand, it's known how a fast calculation of the permanent may be used to solve \(NP\) problems; the permanent is known to be \(NP\)-hard (a larger class than \(NP\)).
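For concreteness, here is a minimal sketch of the defining formula (my own illustration in Python, brute-forcing the \(n!\) permutations, nothing like the faster Ryser-type methods):

```python
from itertools import permutations
from math import prod

def permanent(M):
    # Sum over all n! permutations of products of entries -- exactly the
    # determinant's formula, only without the alternating signs.
    n = len(M)
    return sum(prod(M[i][s[i]] for i in range(n)) for s in permutations(range(n)))

M = [[1, 2],
     [3, 4]]
print(permanent(M))   # 1*4 + 2*3 = 10; the determinant would be 1*4 - 2*3 = -2
```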

Computer scientists' most favorite difficult \(NP\)-complete problem is probably 3SAT or SAT – to decide whether a proposition constructed from the logical operations AND, OR, NOT and binary variables \(x_i\) (obeying a certain constrained format, if you add the digit "3" or another one at the beginning) is satisfiable, i.e. equal to TRUE for at least one choice of the truth values of the variables \(x_i\), but faster than going through all the \(2^N\) possible choices of the truth values of the variables. This very preference favoring 3SAT already shows some "unnatural bias" – they're focusing on things that are easier to write on the existing computers (including the preference for binary computers instead of e.g. ternary ones that we could be using equally well), which is not really a good, invariant criterion for the mathematical depth.
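The brute-force approach that any fast SAT algorithm would have to beat is trivial to write down; here is a minimal Python sketch (the clause encoding with signed integers is just a common illustrative convention, not anything canonical):

```python
from itertools import product

def brute_force_sat(clauses, n_vars):
    # A clause is a tuple of literals: +i stands for x_i, -i for NOT x_i.
    # Try all 2^N truth assignments and return the first satisfying one.
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            return bits
    return None   # unsatisfiable

# (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2 OR x3): a tiny 3SAT instance
print(brute_force_sat([(1, -2, 3), (-1, 2, 3)], 3))
```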

A reason why all these – effectively equivalent (equally hard, convertible to each other with minimal losses) – problems probably don't have a polynomially fast resolution (why they're not in \(P\)) is that at least for some of them, you would expect that if a fast solution existed, it would already have been found. So you use the same "mathematical induction" – if New York skyscrapers haven't been demolished by an airplane in 100,000 previous days, chances are that it won't happen tomorrow, either; instead, there may be a law that this will never happen and this conjecture has passed a somewhat nontrivial test. Such an argument is risky but it has at least some logic. Or at least, people would say it had before 9/11.

However, many complexity theorists prefer completely different arguments – or modifications of the valid yet inconclusive argument above – which aren't rational.

The key reason why \(P=NP\) wouldn't be far-reaching even if it were true is that the algorithm to quickly solve 3SAT could be "fast enough" to obey the polynomial criterion but still "slow enough" to be completely impractical. The number of operations could go like\[
10^{100}\times N\quad {\rm or}\quad N^{100}
\] or some other function. The first Ansatz above is linear – very slowly growing with \(N\) – but it has a huge coefficient. The second Ansatz has a modest coefficient, namely one, but the exponent is large. Already for \(N=2\) or \(N=10\), respectively, the two expressions above give you at least a googol and no computer will ever perform this many operations.
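If you want to see the numbers, here is a tiny check with Python's exact integers (purely illustrative):

```python
googol = 10**100

for N in (2, 10, 100):
    print(N, 10**100 * N >= googol, N**100 >= googol)
# N = 2  : the first Ansatz already exceeds a googol, the second does not yet
# N = 10 : the second Ansatz reaches exactly 10^100, i.e. a googol
```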

One could conjecture that mathematics isn't this malicious and both the exponent and the prefactor are likely to be of order one for all problems; a physicist would call this argument "an argument from naturalness". However, the experience with many highly natural problems in mathematics suggests that this application of "naturalness" fails much more often than many people might expect.

Let me emphasize that even if \(P=NP\) and a fast algorithm to solve the 3SAT problem or even to compute the permanent only takes \(N^4\) operations, a natural polynomial scaling, there is no contradiction. The discovery of this fast algorithm would make a significant number of calculations of the "clique" or "traveling salesman" type vastly more efficient but the rest of the world would stay the same. You couldn't automatize the search for proofs or deep discoveries in maths or physics, music, and so on because these activities haven't been reduced to 3SAT and not even to the harder permanent without "big losses".

But in the rest of the text, I want to provide you with some evidence and context showing that even if \(P=NP\) holds, the resulting algorithms are still very likely to be impractically slow in practice.

Forums: short questions, long answers

If you have ever answered questions on forums such as Physics Stack Exchange, you must know that a very short question often requires an extremely long answer. Well, questions such as "explain string theory to me" may require hundreds or thousands of pages and people usually don't try. But even if the question is realistic, a long answer is simply needed.

But we may talk about a more rigorous incarnation of the same question: the length of the proofs needed to prove a theorem. Half a year ago, John Baez wrote about insanely long proofs and about how some of them may dramatically shrink if we're allowed to assume the consistency of an axiomatic system etc.

However, I don't want to play these Gödelian games because in most cases, you end up talking about propositions you wouldn't be interested in to start with – propositions that are not always human-readable and that are only interesting because of the length of the required proofs. On the other hand, even if we talk about very practical, important math problems, we often end up with the need to write down very long proofs and very long books on the classification of things.

Before I mention a few examples, it is important to stress that they're examples of proofs and classifications that we have already found. There may exist many interesting theorems with even longer proofs, perhaps much longer proofs, and we haven't found them yet – partly because it becomes harder and harder to search for longer and longer proofs. So it's totally plausible that extremely long proofs of interesting theorems are almost omnipresent in maths. In fact, it seems sensible to assume that the average length of the proofs that mathematicians and others are finding or mastering is an increasing function of time.

I am talking about proofs but the same comment may apply to speedy algorithms to solve certain problems. These algorithms may still be extremely complicated and the fact that we haven't found them yet doesn't really prove that they don't exist. This caution is promoted in the Wikipedia article on \(P\) vs \(NP\). While it offers us the would-be impressive slogan by Scott Aaronson,
If \(P = NP\), then the world would be a profoundly different place than we usually assume it to be. There would be no special value in "creative leaps," no fundamental gap between solving a problem and recognizing the solution once it's found. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss...
(I've already explained that this fairy-tale suffers from many serious illnesses: one of them is that it completely overlooks the fact that the main ingenious thing about these famous men is that they can find – and not just follow – some de facto step-by-step procedures to achieve something), it also quotes two researchers as examples of those who believe that it's wrong to be biased and that the experts in that field should study the possibility that \(P=NP\) comparably eagerly and try to search for proofs of it:
The main argument in favor of \(P \neq NP\) is the total lack of fundamental progress in the area of exhaustive search. This is, in my opinion, a very weak argument. The space of algorithms is very large and we are only at the beginning of its exploration. [...] The resolution of Fermat's Last Theorem also shows that very simple questions may be settled only by very deep theories.
—Moshe Y. Vardi, Rice University

Being attached to a speculation is not a good guide to research planning. One should always try both directions of every problem. Prejudice has caused famous mathematicians to fail to solve famous problems whose solution was opposite to their expectations, even though they had developed all the methods required.
—Anil Nerode, Cornell University
Very wise. Aaronson really represents the unlimited prejudices, the retroactive rationalization and hyping of these prejudices, and even the bullying of those who show that the opposite possibility is also compatible with everything we know and who are working to add some progress in that direction. Aaronson's approach to \(P=NP\) is a similarly skewed approach to science as climate alarmism or Lysenkoism – after all, Aaronson has openly come out of the closet as a climate alarmist himself, too.

I need to emphasize that in physics, we also claim that various marvelous hypothetical engines – like the perpetual-motion machines – don't exist. But in physics, we actually have a much stronger case for such no-go theorems. We can really prove the energy conservation from the time-translational symmetry of Nature, via Noether's theorem. We may prove the second law of thermodynamics in the form of the H-theorem and its variations. These results only depend on assumptions that have been heavily tested in millions of experiments, assumptions that we really consider the most rock-solid foundations of all of natural science. Aaronson's prejudices (and not only his, of course) are rooted in no comparably solid foundations.

Fine, now the examples of the long proofs.

Fermat's Last Theorem

I discussed some history of the proof last month. The statement of the theorem is extremely easy to formulate. Almost all the well-known mathematicians of the last centuries spent hundreds of hours per capita in attempts to prove the conjecture. At most, they succeeded for some individual exponents \(n\).

Suddenly, in the 1990s, Andrew Wiles succeeded. Many people had already switched to the lazy philosophy saying that "if it hasn't been found for several centuries, it can't be found". They were proved wrong. Wiles' proof was rather long and, even more importantly, it depended on highly advanced, hierarchically abstract concepts that "seemingly" have nothing to do with a simple statement about integers, their powers, and their sums.



In this random episode, Pat and Mat try to bring a painting through the door and hang it in the living room. They spend quite some time with it – more than 8 minutes on the video – and try many methods, which still doesn't mean that they have proven that no better solution exists.

In some formal language, Wiles' proof would still require at least thousands of characters. A brute force search for proofs would require at least something like \(O(256^{1,000})\) operations, if you understand me, but even if \(P=NP\) were true and allowed us to search for proofs polynomially quickly, the exponent could still be something like the 8 in \(1,000^8\), which would make it impossible to find the proof in practice. The hypothetical proof of \(P=NP\) would still be very far from providing us with proofs of every theorem or a fast solution to any problem.

And of course, all the comments about Wolfgang Amadeus Mozart, Bobby Fischer, or Carl Gauss are completely silly and unrelated to the technical question above, except by verbal tricks. After all, these and other ingenious men had superior skills due to some kind of a brute force that others didn't receive from God. It may be great to believe that something divine, qualitatively different, intrinsically creative is behind these men's superiority but it ain't so. Almost everyone gets some memory, CPU, and creativity when he or she is born but some people just get much more of it and it – along with some training, hard work, and lucky circumstances – allows them to do things that others apparently can't do or at least don't do.

Four Color Theorem

The four-color theorem says a very simple thing: areas on a map may always be painted with four colors so that no two adjacent areas share the same color.



This is how you may paint the U.S. map with four colors.

Sometimes three colors are simply not enough. Divide a disk into three 120° wedges. They need to have three distinct colors. An extra area surrounding most of the disk touches all three wedges, so you need a fourth color.
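In graph language, the three wedges plus the surrounding area form the complete graph \(K_4\), and a brute-force check confirms the counting; the following is a minimal sketch (my own toy code, not any part of the actual four-color proof):

```python
from itertools import product

def colorable(edges, n_nodes, k):
    # Try every assignment of k colors to the nodes; accept it if no edge
    # connects two nodes of the same color.
    for colors in product(range(k), repeat=n_nodes):
        if all(colors[u] != colors[v] for u, v in edges):
            return True
    return False

# Nodes 0, 1, 2 are the three 120-degree wedges, node 3 the surrounding area.
# Every pair of regions touches, so the adjacency graph is K4.
K4 = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
print(colorable(K4, 4, 3))   # False -- three colors are not enough
print(colorable(K4, 4, 4))   # True  -- four colors suffice
```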

And five colors are safely enough; this was proved in the 19th century and the proof is elementary.

However, four colors are enough, too. But there is almost no "wiggle room" left. Consequently, the existence of a four-colored painting of the map often looks like good luck. Equivalently, the known proof is very complicated.

The first proof was constructed with the help of computers. They needed to verify thousands of possible "submaps". In 1996, the number of submaps that require a computer-enhanced verification was reduced to 633 but it's still large. It's been argued that no human can really check the proof by himself.

Boaz Barak tries to argue that 633 may perhaps be a bit larger than one but it will decrease to a number of order one. Well, it doesn't have to. It may be that 633 or 450 will be proved to be the minimum number of submaps that need to be verified in a similar proof. It's still an extremely large number.

Moreover, we're talking just about the four-color theorem, a kindergarten kid's playing with pastels. It seems pretty much obvious to me that more structured problems will require a much larger number of individual cases that have to be checked by brute force. For example, there may exist an algorithm that searches for length-\(N\) proofs of a theorem in polynomial time, \(C\times N^\ell\). Such an algorithm may still require an astronomically large memory for the algorithm itself and/or an astronomically large (either smaller or larger or the same) memory for the intermediate data. The fact that we don't know of such an algorithm today doesn't mean that it doesn't exist.

On the other hand, it's still plausible that the four-color theorem also has a different, elementary proof. Two years ago, a chap sent me something that looked rather promising and I spent many days with it. In the end, there was nothing that looked like a proof. Today, I would probably be able to see that "it can't possibly be a seed of the proof" much more quickly – simply by seeing that too elementary a proof can't possibly know about the "wisdom" or "good luck" that are needed to four-color some difficult enough maps.

Classification of finite groups, largest sporadic groups

A group is a set with an associative multiplication\[
(fg)h = f(gh), \quad f,g,h\in G
\] with a unit element and an inverse for every element. A group may be a finite set. How many finite groups are there? Of course, infinitely many; e.g. the cyclic groups \(\ZZ_n\) already form an infinite family. Moreover, you may always construct Cartesian product groups and other constructions. Can you list all the mathematically possible groups in terms of some well-understood building blocks such as \(\ZZ_n\), \(S_n\), \(D_n\), \(A_n\), and their direct or semidirect products and/or a few extra operations?
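Checking these axioms for a small multiplication table is itself a mechanical task; here is a minimal brute-force sketch (my own illustration, applied to the cyclic group \(\ZZ_4\)):

```python
def is_group(elements, mul):
    # Closure
    if any(mul(a, b) not in elements for a in elements for b in elements):
        return False
    # Associativity: (ab)c == a(bc) for all triples
    if any(mul(mul(a, b), c) != mul(a, mul(b, c))
           for a in elements for b in elements for c in elements):
        return False
    # A two-sided unit element
    units = [e for e in elements
             if all(mul(e, a) == a == mul(a, e) for a in elements)]
    if not units:
        return False
    e = units[0]
    # An inverse for every element
    return all(any(mul(a, b) == e for b in elements) for a in elements)

# Z_4: the integers 0..3 with addition modulo 4
print(is_group(set(range(4)), lambda a, b: (a + b) % 4))   # True
```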

We have known the answer to be yes for only about two decades.

The achievement – the classification – is written as the union of hundreds of mathematical papers by about a hundred authors. Thousands and thousands of pages are needed to solve the simply formulated problem above. Aside from the easy-to-understand groups above – some of which come in understandable infinite families – the classification forces you to discover 26 truly unusual examples of groups, the sporadic simple groups. (Sometimes the Tits group, albeit a bit simpler, is counted as the 27th sporadic group.)

The largest sporadic group is the monster group. Its number of elements is about \(8\times 10^{53}\), almost a million trillion trillion trillion trillions. It's a very particular, fundamental, and important integer in maths that arises "out of nothing". The dimension 248 of the largest exceptional Lie group, \(E_8\), is analogous except that the number of elements of the monster group is vastly larger.
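The exact order is usually quoted through its prime factorization; assuming the standard factorization from the literature, a few lines of Python reproduce the \(8\times 10^{53}\) figure:

```python
from math import prod

# Prime factorization of the order of the monster group, as usually quoted.
factors = {2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3,
           17: 1, 19: 1, 23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1}

order = prod(p**e for p, e in factors.items())
print(len(str(order)))   # 54 digits
print(f"{order:.2e}")    # about 8.08e+53
```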

A layman could also intuitively guess that objects described by very large integers such as this one are contrived, unimportant in practice, and so on. But if you read some articles about the monstrous moonshine, you will see that just the opposite is true. The largest sporadic group is, in some sense, the most elementary sporadic group. An algebra of vertex operators that respects this huge finite symmetry carries an explanation for the coefficients in the expansion of the \(j\)-function, a very natural function (basically unique if holomorphic) mapping the fundamental domain to a sphere. The AdS/CFT dual of the conformal field theory with the monster group symmetry is the "simplest" gravitational theory you may think of, the pure gravity in the 3-dimensional AdS space, at least at the minimal curvature radius.

The latter relationship suggests some kind of a duality. If the CFT has some group with many elements, its dual is simple and has a few fields. There is some complementarity between these two things and the product could be something like \(10^{54}\), if I describe the relationship in a "moral" way that isn't quite accurate.

When we approach some really natural structures in maths – especially the deep maths that has links to the foundations of quantum gravity etc. – we easily and naturally get prefactors that may be as large as \(8\times 10^{53}\) or other large numbers. Of course, if an algorithm gets slowed down by this factor, it doesn't matter that it's still "polynomial": it will be impractically slow and useless in practice.

And those things surely do happen in maths often. Even though our current knowledge is almost certainly biased towards knowing the shorter proofs and methods and towards being ignorant of the longer ones (because the latter are harder to find), we already know that there are many important proofs and algorithms and classifications that are extremely long and hierarchically complex.

The string-theoretical landscape

String theory is the richest theory that mankind knows and one may enter its realm through many entrances. The evidence strongly suggests that it's a completely unique structure with no deformations or siblings. However, it may have many solutions.

In the class of semi-realistic stabilized flux vacua, people have tried to estimate the number of solutions to string theory's equations. The "stabilized" adjective implies that there are no moduli left; no continuous parameters can be freely adjusted. With these extra conditions, the number of vacuum-like solutions has to be finite or countable and if one imposes some additional semi-realistic criteria, he ends up with a finite number. But the number is often quoted as \(10^{500}\) although this estimate is far from a rigorous enough calculation.

At any rate, it seems very likely that powers of a googol are a good estimate of the number of string-theoretical vacua obeying certain very natural (and sort of realistic) constraints that make them very promising as descriptions of the real Universe around us. Whether there exists a principle that picks one of them or some of them according to some non-anthropic criteria remains unknown.

But even if we ignore these vacuum-selection and anthropic questions, it seems true that string theory produces a high number of solutions. If you accept that string theory describes the laws of physics in the most general sense, you may ask whether the laws of physics allow some particular phenomena or creatures. Describe the creature in some way. To find out whether they appear anywhere, you may be ultimately forced to search through those \(10^{500}\) different universes.

The \(P\neq NP\) complexity theorists usually talk about a few dozen of their pet \(NP\)-complete problems, not about the problems for "mature scientists" such as the searches for some stringy vacua with some properties. So they think that the traveling salesman and equivalent problems are "almost everything" that is worth thinking about. Well, it surely ain't so. But in the "traveling salesman" type of problem, they know what \(N\) is.

What is \(N\) in the case of the problem "Find a stringy vacuum with some properties"? Needless to say, really fundamental physics problems usually don't come with instructions such as "use the label \(N\) for this and that". In some sense, we are dealing with a single string theory and the finite number of solutions is just an order-of-one extra complication. In some counting, we have \(N=1\). Nevertheless, it makes the search \(10^{500}\) or so times more time-consuming, at least if you have to rely on the brute force method (which isn't clear). It's an example of the huge prefactor that Boaz Barak believes not to occur in practice. It surely does occur – it occurs in the case of many fundamental math problems such as those above as well as in the case of the most natural and fundamental problems in physics, those asking about the basic properties of string theory.

More generally, I want to say that there are lots of "subfields in maths" that may be completely mastered – in the sense that you may turn a previously seemingly creative activity into a mechanical process – but the number of concepts and algorithms that you sometimes need to learn to make these things mechanical is often very large. Think about all the tricks and rules you need to know to integrate large classes of functions or anything else. Think about the huge amount of code hiding in Mathematica which allows you to solve so many things. Mathematica and Wolfram Alpha are examples of projects in which people actually made significant progress in their efforts to "automatize" many things. While I find it unlikely that Wolfram or others will ever find a gadget that "automatizes everything" – e.g. the tasks "find the shortest proof of a valid proposition" or "find the fastest algorithm that solves something" etc. – I don't really have a proof (and not even legitimate, strong enough evidence), and it's just wrong to promote big ideologies that try to "frame" everything so that it's compatible with the predetermined conclusions even though the truth may be different and the alternatives might be "framed" as well, if you tried a little bit.
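As one concrete example of such mechanization, symbolic integration – once a showcase of human ingenuity – is nowadays routinely handled by large rule-based systems; here is a minimal sketch with the open-source SymPy library (used only because it is freely scriptable here, not because it rivals Mathematica):

```python
import sympy as sp

x = sp.symbols('x')

# Integrals that once required clever tricks are now produced mechanically
# by a large body of coded rules and algorithms.
print(sp.integrate(sp.exp(-x**2), (x, -sp.oo, sp.oo)))   # sqrt(pi)
print(sp.integrate(x**2 * sp.sin(x), x))   # -x**2*cos(x) + 2*x*sin(x) + 2*cos(x)
```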

I feel that people such as Boaz Barak are extremely narrow-minded in their focus on 3SAT and several other, comparably down-to-Earth, problems; extremely biased against the existence of more sophisticated algorithms and proofs than they know now (partly because they don't really want to learn something difficult and new and they don't want to be shown to have been unable to find out something remarkable that they should have found if they were ingenious enough); and prone to spending way too much time on rationalizations of the limited progress in the search for much better algorithms to solve various things.

Moreover, their experience with the really nice and important problems that have some "beef" is limited, otherwise they couldn't possibly claim that large prefactors and very long minimal proofs or algorithms can't appear in maths. They surely can and they often do. At least in this sense, "naturalness" (in the sense of the preference for order-of-one coefficients in answers to questions) fails in maths: the number of concepts and the amount of time you need to solve a short problem formulated as a length-\(N\) string is often much larger than \(N\) or a small multiple of its modest power. As Barbie correctly said, math class is tough. But it should still be taught at schools.

Mathematics is a complicated network of things that are easy, things that are hard, things that look hard but are easy, things that look easy but are hard, and the hardness has very many aspects. The idea that all problems deserving to be a part of maths may be divided into easy and hard ones in a binary way is a childishly naive concept. Very simple questions often require someone to discover – and others to master – very complicated bodies of knowledge to understand the answers. And very rigid and in this sense simple laws – like, in the most extreme case, the laws of string theory – are often able to produce an amazingly rich set of phenomena, implications, and relationships, i.e. pulsating and intelligent worlds with lots of fun. Complexity theorists who effectively assume that solutions and straightforward algorithms have to be as short and easy-to-learn as the questions themselves are effectively assuming that "what comes out had to come in". But this just isn't true in maths and science. It's the whole point of physics, for example, that we can explain diverse things with a limited canon of laws – but these in principle simple laws have many consequences and create many patterns that have to be understood if you want to master the implications of the laws.

The idea that "proofs and solving algorithms are about as short and as deep as the formulation of the problem" may be described as the "garbage in, garbage out (GIGO)" paradigm and this is what the research may easily look like if you insist on similar prejudices.