- Washington Free Beacon - https://freebeacon.com -

Big Data Comes to the Bookstore

This is one of those books about literature clearly not intended for people who have much experience with the subject. The target audience is, I gather, not so much people who read Faulkner or the dozens of other authors whose works are analyzed, quantified, probed, and all-around violated here as those who like the idea of having read Faulkner and, more important, like to imagine a world in which they have clever things to say about him. They certainly aren't the sorts of people who can be trusted to have heard of even very famous writers or books, as the unrelenting journalistic hum of "Charlotte's Web author E.B. White" and "Popular crime writer P.D. James" makes clear.

Ben Blatt, a former Slate columnist who once wrote a book about his and another author's "quest to go on the mathematically optimal baseball road trip," has spent goodness knows how many thousands of hours plugging Dickens and Kafka and Michael Crichton into computer programs with the goal of "parsing whether there is a noticeable difference between the best books and the rest."

You would think that there is. But an almost pathological unwillingness to offer any opinions about the merits or lack thereof of any given book is the present volume's most striking feature. Books, or rather novels—the whole world of belles lettres and the classics of history and biography are beyond his purview—seem to exist for Blatt only as data to be mined, not as aesthetic objects concerning whose merits there is anything except a vaguely defined consensus. When he produces a list of "Consensus Great Books" that includes two novels by Ayn Rand, you know where you are.

Blatt is also addicted to qualifications that few will think necessary. The authors of fan fiction are, he reminds us, "sometimes strong writers," but "on average not at the level of bestsellers or the award levels of the literary world." What kind of rhetorical work is "on average" doing there? A novel's rating on Goodreads.com, essential to many of the book's two dozen or so charts and graphs, "is not," he admits with a thoughtful pause that the reader can supply, "a perfect metric." "It's of course a poor criterion to judge a book on nothing but its adverb rate." You don't say.

There is a sense in which Blatt has taken away a great deal from his researches. There are chapters here on adverbs, both of the frequently derided -ly variety and all the other ones, and the prevalence of masculine versus feminine pronouns, the preponderance of clichés, the differences between American and British English, the evolution of cover art, and the varieties of beginning and ending sentences. Expertise of a kind he certainly does not lack. There is probably no one living who knows more than he does about the rate of adverb usage in Steinbeck, much less anyone who has thought to plot it over the course of the author of The Red Pony's long career and against the reviewers' consensus. More striking, though, is the astonishing number of things Blatt doesn't know. For example, that pace his assertion that doing word-for-word analysis of Shakespeare and the Bible "would have been unfathomable" to scholars of the early '60s who lacked the benefit of iPads and free Project Gutenberg e-texts, Dominican friars under the authority of Hugh of Saint-Cher completed an exhaustive verbal index of the Vulgate round about 1230, and concordances to Shakespeare have been a cottage industry since Andrew Becket put together the first one at the end of the 18th century.

It would be interesting to learn how many of the books under analysis Blatt has himself read. Making it through even a few pages of As I Lay Dying would go a long way towards clearing up the apparent mystery of why this collection of monologues delivered by poor Southerners contains fewer -ly adverbs than, say, Absalom, Absalom!. What would you say is a minimum requirement for something to appear in a list of novels arranged according to number of exclamation points per 100,000 words? That it be a novel and contain a minimum of 100,000 words—unlike The Chimes, The Cricket on the Hearth, and A Christmas Carol, all listed here by Blatt as statistically outlying examples of exclamation compulsion rather than short sentimental novellas written at a time when that particular piece of punctuation was more common—is, I humbly submit, a good place to start. Never mind the Tolkien fanboys: millions of the world's children know that there is no character in The Hobbit called "Mrs. Bilbo Baggins," just as any number of graduate students on whose shelves Finnegans Wake is gathering dust with Walter Benjamin could attest that it is a book in which one is not likely to find many -ly adverbs or indeed many words in recognizable English. "Sure, Joseph Conrad wrote three times as much about men than [sic] women, but he wrote more than 100 years ago." True. He also wrote about the sea and the colonial ivory trade.

The result of all these accumulated howlers and canyon-sized lapses in judgment is to shred whatever limited vestiges of interest in or even patience for Blatt's project the reader might have had. No doubt one could employ statistical analysis in the hope of discovering why, for example, the word "suddenly" occurs far more often in The Lord of the Rings than it does in Pride and Prejudice. One could also note, on the basis of nothing more than having seen the respective film versions, that one is a sword and sorcery epic in which goblins and wizards and maidens fair are always popping up out of nowhere, while the other is a comedy of manners whose action scenes tend not to rise above the "Her ladyship was highly incensed" level of tension. Does it really tell us anything that novelists ranging from Douglas Adams to Don DeLillo to Ernest Hemingway "lean" on such words as "said," or that "yes" is common in both Mark Twain and Agatha Christie?

Not everything here is worthless. It is interesting to note that Tom Wolfe's favorite (or, as Blatt calls them, "cinnamon") words are "fucking," "haw," and "goddamn" and that New York-based authors of erotica are far more likely to write "subway" and "butthole" than their colleagues in Texas, in whose works references to "ranked" and "Sergeant" predominate. The staggering fact that more than one billion words of Twilight fan fiction have been written, let alone that they comprise only the sum total of Stephenie Meyer-derived prose on a single website, one with numerous competitors, is something I feel I will be quoting for the rest of my life.

Still, this was a frustrating read. Never, I think, has a purported piece of "literary criticism" been so disconnected from literature and non-suggestive of all the things that might, and very frequently do, induce people to read. Does Blatt like books, or even words? After 271 pages, including 50-some pages of notes largely consisting of lists of novels written by Dickens, Pynchon, and Nicholas Sparks, it is an open question. He leaves us begging for something—anything—suggestive of a reader's temperament: an impression, a feeling, a faintly discriminatory or mildly appreciative hint. Instead, we are forced to make do with his observation that Jane Austen's use of the word "very," in "her celebrated book Emma," is "off the charts."