The universe (which others call the Library) …

Jorge Luis Borges

Borges’s famous “Library of Babel” contains every possible 410-page book, with 40 lines to the page, 80 characters to the line, and 25 characters to choose from. William Goldbloom Bloch has written a fascinating study of its “unimaginable mathematics,” in which we are told, among many other things, that it contains 25^{1,312,000} books (one for every way of filling 410 × 40 × 80 = 1,312,000 character positions). To put this in perspective (if we can call it that), Bloch also informs us that stuffing the known universe with nothing but books would require only 10^{84} books. Perhaps we can put that into further perspective by considering that Queneau’s hundred thousand billion (10^{14}) poems, printed one to a page, would fill 2.4 × 10^{11} 410-page books. That, at least, is a *possible* arrangement of some of the 3.28 × 10^{80} particles that our real universe consists of.
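The arithmetic behind these figures is easy to check. What follows is only a rough sketch in Python, not anything taken from Bloch's book: the variable names are my own, and the Queneau figure assumes one sonnet per page, which is what the 2.4 × 10^{11} estimate seems to imply.

```python
import math

# Dimensions of a book as given in the text above.
PAGES, LINES_PER_PAGE, CHARS_PER_LINE, ALPHABET = 410, 40, 80, 25

# Every book has the same number of character positions.
chars_per_book = PAGES * LINES_PER_PAGE * CHARS_PER_LINE


def decimal_digits(base: int, exponent: int) -> int:
    """Number of decimal digits in base**exponent, estimated via logarithms.

    Good enough for the magnitudes here; it is not meant for exact
    powers of ten, where floating-point rounding could matter.
    """
    return math.floor(exponent * math.log10(base)) + 1


# The library holds 25**chars_per_book books. The number is far too long
# to print comfortably, so we only count its digits.
library_digits = decimal_digits(ALPHABET, chars_per_book)

# Assumption (mine): one Queneau sonnet per page.
QUENEAU_POEMS = 10**14
queneau_books = QUENEAU_POEMS / PAGES

print(f"characters per book: {chars_per_book:,}")
print(f"25^{chars_per_book:,} has {library_digits:,} decimal digits")
print(f"Queneau's poems fill about {queneau_books:.1e} books")
```

On these assumptions, the number of books in the Library has about 1.8 million digits, while 10^{84}, the number of books that would fill the known universe, has only 85.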

Borges’s library, by contrast, is impossibly large. I agree with Bloch that it is in some sense “unimaginable” and that the wonder is that it is nonetheless *quantifiable.* We can put numbers on it but we simply cannot make sense of it. We can’t get our minds around it. While the library contains all the great works of literature that ever have been and ever will be written, it also contains a version of each with every imaginable combination of misprints. There are books with pages and pages of mostly As and others with mostly Bs. Borges tells us that there is no discernible order to the way the books have been arranged, which means that the odds that a book picked at random off the shelf contains the text of, say, *Hamlet* are astronomically low. The vast majority of the books in this library contain nonsense. In that sense, the library, which Borges calls “the universe,” is absurd.

In his “intermittently philosophical dictionary,” Quine has proposed a simple way to understand this absurdity, a way to get our minds around its unthinkability, a way to see that Borges’s universe is not, properly speaking, a library at all and that what it contains are not, properly speaking, books. (To anticipate a later post, let’s say that they could not, properly speaking, be *written.*) He begins by reminding us what we’re dealing with:

The collection is finite. The entire and ultimate truth about everything is printed in full in that library, after all, insofar as it can be put in words at all. The limited size of each volume is no restriction, for there is always another volume that takes up the tale — any tale, true or false where any other volume leaves off. In seeking the truth we have no way of knowing which volume to pick up nor which to follow it with, but it is all right there.

Quine (1989), p. 224

The fact that the size of each volume is both arbitrary and unimportant suggests a way of reducing the number of books. Instead of using every combination of 25 characters, we could write all the books in Morse code, i.e., in sequences of dots and dashes. We now have 2^{1,312,000} rather than 25^{1,312,000} books. This will give us less information per page and therefore less information in each book. But, as Quine reminds us, “since for each cliff-hanging volume there is still every conceivable sequel on some shelf or other,” the library would still contain everything ever written by human hands (along with much, much more nonsense never seen by human eyes). We can go further.

There will be a great many books whose first or last halves are identical. So, if we split all the books in half, discard all but one copy of each now-identical half, and then allow ourselves to serialize the halves when necessary to produce 410-page (and longer) works, no information is lost. And it is just as easy (i.e., it is impossible) to find what you’re looking for in this much smaller library (2^{656,000} books).

Let us press on: the library could of course simply contain all possible *pages* of 3200 characters of Morse code (there are just 2^{3200} such possible pages). But we can do better. Remembering Queneau’s sonnets, where each line is printed on a separate slip of paper, we can also imagine a library of all possible *lines* of 80 characters (only 2^{80} lines), or even, as Quine now suggests, strips of seventeen characters. That gives us a mere 2^{17} or 131,072 strips. By combining them any which way we can produce everything that Borges’s library contained. And, still, it will be as easy to produce *Hamlet* by these random combinations as it would be to find a reasonably legible copy of it in the chaos of the universal library.
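To get a feel for how quickly the library shrinks, here is a small sketch that tabulates each stage of the reduction by the number of decimal digits in its volume count. The labels and the helper `decimal_digits` are my own illustration; the unit sizes are the ones given in the text.

```python
import math


def decimal_digits(base: int, exponent: int) -> int:
    """Number of decimal digits in base**exponent, estimated via logarithms."""
    return math.floor(exponent * math.log10(base)) + 1


# (label, size of the alphabet, characters per unit), following the text.
stages = [
    ("Borges's books, 25 characters", 25, 1_312_000),
    ("books in dots and dashes", 2, 1_312_000),
    ("half books", 2, 656_000),
    ("pages", 2, 3_200),
    ("lines", 2, 80),
    ("strips", 2, 17),
    ("single symbols", 2, 1),
]

# Map each stage to the digit count of alphabet**length.
sizes = {label: decimal_digits(alphabet, length)
         for label, alphabet, length in stages}

for label, alphabet, length in stages:
    print(f"{label:32s} {alphabet}^{length:<9,} ~ {sizes[label]:>9,} digits")

# The last two stages are small enough to write out exactly.
strip_count = 2**17
print(f"strips of seventeen characters: {strip_count:,}")
print(f"volumes in the final library: {2**1}")
```

The digit count falls from roughly 1.8 million to 964 once we are down to pages, to 25 for lines, and to 6 for strips, before bottoming out at the two one-symbol volumes.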

Quine now puts a button on the thought experiment:

The ultimate absurdity is now staring us in the face: a universal library of two volumes, one containing a single dot and the other a dash. Persistent repetition and alternation of the two is sufficient, we well know, for spelling out any and every truth. The miracle of the finite but universal library is a mere inflation of the miracle of binary notation: everything worth saying, and everything else as well, can be said with two characters. It is a letdown befitting the Wizard of Oz, but it has been a boon to computers.

Quine (1989), p. 225
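Quine's point about binary notation can be made concrete with a few lines of Python. This is only a sketch under my own assumptions: it is not real Morse code (which also needs gaps between letters) but plain 8-bit binary with a dot standing for 0 and a dash for 1, and the function names are invented for the occasion.

```python
def to_dots_and_dashes(text: str) -> str:
    """Spell a text with two characters: '.' for a 0 bit, '-' for a 1 bit."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return bits.replace("0", ".").replace("1", "-")


def from_dots_and_dashes(signal: str) -> str:
    """Read the dots and dashes back into the original text."""
    bits = signal.replace(".", "0").replace("-", "1")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def cut_into_strips(signal: str, width: int = 17) -> list[str]:
    """Cut the signal into strips of at most `width` characters."""
    return [signal[i:i + width] for i in range(0, len(signal), width)]


line = "To be, or not to be"
signal = to_dots_and_dashes(line)
strips = cut_into_strips(signal)

for strip in strips:
    print(strip)

# Laying the strips end to end, in the right order, gives the line back.
print(from_dots_and_dashes("".join(strips)))
```

Every strip printed here is already on the shelf of the seventeen-character library, and every character in it is one of the two volumes of the ultimate one. What the library does not supply is the order in which to take them down.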

Perhaps you can see where this is going? Perhaps you briefly saw a Library of Tokens flash before your eyes? We’ll get there. For now, I merely want to point out how truly *artificial* the Library is. It cannot occur in nature. It is what happens when you put no *natural* constraints on a model and then let the possibilities multiply, if not endlessly, then at least perfectly, imagining the instantiation of every arbitrary combination of already arbitrary signs. It is not a natural language model and its books are not displays of intelligence.

See also: “Robot Writes” and “A Hundred Thousand Billion Bots”