Statistical analyses of literature: let’s see what happens

I got some pushback to the link on what heretical things statistics can tell us about fiction, and I’ve read pushback like it before: the objections tend to say that great literature can’t be reduced to statistics; big data will never replicate the reading experience; a novel is more than the sum of the words chosen. That sort of thing. All of which is likely true, but the more interesting question is, “What kinds of things is nobody doing in the study of fiction?” (Or words, or sentences, or writers’ oeuvres.) Lots and lots of people, including me, closely study individual works and connect them to a smallish body of other works and ideas.

Over centuries, if not longer, thousands, if not millions, of people have engaged in this practice. Not very many people have attempted to systematically examine thousands, if not millions, of works simultaneously. That kind of large-scale examination may tell us something the usual methods haven’t, so it’s worth exploring that domain. And exploring that domain doesn’t close off the more usual paths via close reading.

In other words, don’t think that an argument along the lines of “x is interesting” means “we should always and only do x.”

At the moment, we also appear to be at the very start of the field. Maybe it’ll become extremely important and maybe it won’t. The potential is there. People have (arguably) been doing some form of close reading and analysis, even if the practice didn’t use those specific words, for millennia. Certainly for centuries. So I’d be pretty surprised to see statistical analyses produce whatever good material they’re likely to produce in just a decade or two.

Part of what art and analysis should do is be novel. Another part is “be interesting.” We’re looking for the intersection of those two zones.
