Ontology

“The notion of ontological commitment belongs to the theory of reference.”
Willard Van Orman Quine

Here’s a worry I’ve been expressing lately when talking about AI. When I was in school, certainly when I was at university, I understood the difference between “books” and “periodicals” as well as the difference between “fiction” and “non-fiction”. I recall being taught these things (or perhaps just reminded of them) by librarians who were also trying to teach me the Dewey decimal system. As I say, by the time I was an undergraduate I understood that a library was a collection of materials, grouped into broad classes. Referencing was one place where the difference between these classes made a difference. You cited a book in one way and a journal article in another because finding a book and finding a journal article in a library were subtly different processes.

That is, a “reference” was literally a way of pointing to a thing that existed in the library at a particular coordinate. A reference that got those coordinates wrong might still have named the source correctly but made it harder to find. Knowing how to use a database, however, allowed you to find a badly referenced source simply by hypothesizing errors in particular “fields”: maybe the writer got the date wrong, or the title, or was citing a chapter rather than a whole book? But the writer could, in principle, also just “make up” a reference. One could construct something that looked like a source in the scholarly literature, a treatise or an article, and it might simply not exist. I’m not quite going where you think with this, but, yes, one of the main early complaints about language models was that they would “hallucinate” their references: they would construct plausible-looking references that had, if you will, no referents.
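To make that field-by-field search concrete, here is a minimal sketch in Python. It assumes a toy two-record catalogue and a made-up helper, `find_referent`, neither of which corresponds to any real database's interface; the point is only to show what it means to hypothesize that one field of a reference is wrong, and what it looks like when a reference has no referent at all.

```python
from itertools import combinations

# Toy catalogue for illustration only; a real database has far more
# records and fields (volume, pages, subject terms, and so on).
CATALOGUE = [
    {"author": "Quine", "year": 1953,
     "title": "From a Logical Point of View", "kind": "book"},
    {"author": "Quine", "year": 1948,
     "title": "On What There Is", "kind": "article"},
]

FIELDS = ("author", "year", "title", "kind")


def find_referent(reference, catalogue=CATALOGUE, max_wrong=1):
    """Return (record, suspect_fields) pairs for a possibly faulty reference.

    Exact matches are tried first. If there are none, each field in turn
    is treated as "suspect" (ignored), on the hypothesis that the writer
    got that one field wrong. An empty list means nothing in the
    catalogue fits, even allowing for max_wrong mistaken fields.
    """
    for n_wrong in range(max_wrong + 1):
        hits = []
        for suspect in combinations(FIELDS, n_wrong):
            trusted = [f for f in FIELDS if f not in suspect]
            for record in catalogue:
                if all(record[f] == reference.get(f) for f in trusted):
                    hits.append((record, suspect))
        if hits:
            return hits
    return []


# A real book cited with the wrong date: recoverable by doubting "year".
wrong_date = {"author": "Quine", "year": 1935,
              "title": "From a Logical Point of View", "kind": "book"}

# A plausible-looking reference invented for this example: no referent.
made_up = {"author": "Quine", "year": 1961,
           "title": "Reference Without Referents", "kind": "article"}

print(find_referent(wrong_date))
print(find_referent(made_up))
```

The first search succeeds only because the searcher knows which fields exist and can decide which one to doubt; the second comes back empty however the fields are relaxed.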

Here’s the real worry I have. Now that our databases (like Scopus and JSTOR) are offering AI functions, our students (and we ourselves, yes, perhaps even we librarians) will increasingly interact with them, not by querying terms in data fields (author, date, title, volume, subject terms, etc.) but simply by asking them about facts and ideas in natural language. And they’ll give us fully referenced natural-language answers in return. Already today, I find that students, who do much of their reading on screens, are less clear about the difference between books and papers, treatises and articles, as kinds of text. In the future, we may come to think that the facts, both of the world and of our scholarship, are “given” to us “immediately”. Our understanding of the ontology of the library will erode. Our view of what constitutes our knowledge will grow clouded. Our grip on reality will weaken.

Inframethodology

A weblog devoted to the underlying craft of research
