Emily Brewster: Coming up on Word Matters: when dictionaries drop words. I'm Emily Brewster and Word Matters is produced by Merriam-Webster in collaboration with New England Public Media. On each episode, Merriam-Webster editors Ammon Shea, Peter Sokolowski, and I explore some aspect of the English language, from the dictionary's vantage point.

Emily Brewster: We've discussed how words come to be entered in our dictionaries before, but today we're going to talk about removing words from dictionaries—which words get dropped and why. Peter will start us off.

Peter Sokolowski: One of the most frequently asked questions of dictionary editors is, do you ever drop a word? Is a word removed from the dictionary? Because we almost always answer the opposite question, How does a word get into the dictionary? And in my experience I usually add that information to any remarks I'm giving because I know that question is coming, but there's almost always someone who wants to know, how does a word get taken out of a dictionary? And that's kind of an interesting and slightly complicated question. It's a good question.

Emily Brewster: It is. And Peter, I feel like the way you're couching it suggests that we're sort of bashful and shy about it.

Peter Sokolowski: Just a piece of dictionary publicity has always been, "Oh, here are the new words for this new edition," or this new publication or this new release online. And so we find ourselves talking about new words much more frequently than this other topic. And so we're less kind of ready for that answer.

Emily Brewster: Right, but the fact is that we do have to sometimes take words out of our dictionaries.

Peter Sokolowski: Sure.

Emily Brewster: Although, the requirement that we take them out is no longer what it was. So time was when we were working solely on print publications, the space is always at a premium as we have talked about before—you have to keep a dictionary portable and affordable. And so you can only take out the Colleges and Universities at the back so many times. You can only make the margins so narrow. You can only make the font so small, the paper so thin. And so space constraints have driven drops from dictionaries, but I would say that when I actually go and look at what words have been removed from the dictionary, space is really not the driving factor behind removing words from our dictionaries. It just really isn't.

Emily Brewster: There are other ways to make space in a dictionary. You can shorten an example. From the actual words that I have seen that have been removed from our dictionaries, there are other forces at work. So for example-

Peter Sokolowski: Is it money?

Emily Brewster: Yes, they're paying us to remove these words. An example is often we are covering a term in a different way. So the term color film used to be defined in our dictionaries and it no longer is because color film is now understood to be what we call self-explanatory. You don't need a distinct entry for color film because the entry for color and the entry for film address the meaning of that term adequately.

Emily Brewster: Sometimes we'll take out a really technical word. It's just so technical that it no longer really qualifies for entry. So for example, the word hepatectomize was removed. It means "to excise the liver of." So we took out hepatectomize, but we have left in both hepatectomy and hepatectomized, because evidence of those two words is sufficient and the hepatectomize verb itself is no longer considered necessary for entry. That is not about space. Somebody made a careful consideration there and said, "You know what? We don't need hepatectomize but let's keep hepatectomy and hepatectomized."

Peter Sokolowski: Was that removed from the Collegiate Dictionary?

Emily Brewster: It was removed from the Unabridged, from Webster's Third...

Peter Sokolowski: Yeah.

Emily Brewster: From the dictionary that was first published in 1961.

Peter Sokolowski: Yeah.

Emily Brewster: It was removed relatively recently.

Peter Sokolowski: Okay. And recast, as you said—

Emily Brewster: Yep.

Peter Sokolowski: ... in these other entries. One example I have is from the 10th Collegiate, a word that was removed and corresponds a little bit to what you were saying. But also I think there's a piece of this, where these tend to be compound words. They tend to be pieces of words that might be able to be explained at their own entries. Ballistocardiograph was removed from the 10th. And my suspicion is that it was very technical and it had become obsolete. So basically what had been noticed by the editors at that time still pre-digital, but they had seen a drop off in the citational evidence and probably for a couple of cycles. They probably said this was an important word in the 1960s, for example, but by the 10th, 1993, they had decided they could do without this one.

Emily Brewster: There was a period of time when the criteria for entry were a little looser than they are now, when we were really in early 20th century lexicography, it was like, "Whoa, we'll just put all the words in. That's a word, we'll throw it in."

Peter Sokolowski: Yes.

Emily Brewster: And the evidence for some of these terms was quite limited.

Ammon Shea: One of the things that I think is fascinating, though, is that very often words are taken out with a machete rather than a scalpel. And I think that the biggest case of that was when we went from the 1934 edition of the Unabridged to the 1961 edition of the Unabridged. There were some kind of broad categories that were excluded. Wasn't it that any word which was not yet in common usage before 1755 was excluded? And that was a kind of arbitrary distinction because it was the date that Samuel Johnson's first edition was published.

Peter Sokolowski: You're exactly right. There were huge swaths of vocabulary that were taken out and it was determined that archaic or obsolete language was heavily covered in the [1934] Second Unabridged. And it was determined they would only keep the ones that were in very common literature, such as Shakespeare, for example, and those would be labeled archaic or obsolete, according to the 1755 date. If the word was entered without any acknowledgement of its archaic nature by Samuel Johnson, then it was considered to have been in continuous use since 1755. If Johnson either didn't enter it or himself acknowledged that it was old fashioned or archaic then we would put a label on it.

Ammon Shea: Which makes a lot more sense, but it's still really highly arbitrary because my impression was that they kept in all the words that were in Shakespeare, which is a very, very peculiar value judgment, in my opinion. I do think that it is somewhat arbitrary to say that because Shakespeare used that we're going to keep it, because it's giving too much significance to any one writer, I think.

Peter Sokolowski: What we see is that it shows the limitations of being completely attached to the written record, and that's why 1755 is of course arbitrary, and yet it was the best census of the language, if you will, in one place, Johnson's dictionary going back as far as we could, if there were a dictionary that big from earlier, for example, the French Academy had done a dictionary in the late 1600s. And that was also a very comprehensive literary list of words, but we really didn't have that in English. And so Johnson just becomes this fulcrum point in the history of the language.

Emily Brewster: Well, and in defense of keeping the terms from Shakespeare, the justification is that Shakespeare is still widely read and studied. And so in a dictionary that is written for college level students and students above that level and for the general reader who might very well be reading Shakespeare, if they're going to be reading anything that was published before the 19th century, Shakespeare is very likely to be who they might be reading. So that's the justification for putting it in there. I think that also, there's just a very human desire to have some systems in place that guide the head word choices when you're making a dictionary.

Ammon Shea: And that makes sense, of course, I just get aggrieved with what I think of as this kind of self-perpetuating system of because it's Shakespeare we put it in. And so then we know it because it's in the dictionary. And so we say, "Oh well, it's Shakespeare. So everybody knows Shakespeare." We're just reinforcing this thing, which is not to say that he is anything less than sublime, but it is to say that we have overlooked numerous other voices because of this constant ratification of him.

Peter Sokolowski: Sure.

Emily Brewster: Absolutely.

Peter Sokolowski: And bringing him up leads to another category of what Ammon was talking about, which is a categorical removal of words from the dictionary. And going back to Emily, what you said, which is absolutely true, in the beginning of the 20th century, the idea was everything but the kitchen sink. Just keep adding words, and the dictionary got bigger and bigger and bigger. That's absolutely true. And sometimes you wonder at the criteria, I'm not sure that there really were specific criteria as there later would become in the way that we understand it. A couple of the categories were foreign terms that were just thrown in the dictionary from probably menus and travel guides or whatever. And these obsolete, archaic terms, almost all of which were removed when this dictionary, Webster's Second was updated to Webster's Third, but there's another category, which is proper nouns because there were an enormous number of proper nouns in what had been considered an encyclopedic dictionary and what was really supposed to be the home library.

Peter Sokolowski: The Big [inaudible 00:08:47] which was the idea of Webster's Second [inaudible 00:08:49]. And the book at its largest, I think was 18 pounds by the end. And it just kept getting bigger and bigger. And among the categories of words that were in there were lists of gods of Greek and Roman mythology and the characters of Shakespeare. And so that brings us to a sort of nuanced rule. How do you decide what you keep? So for example, in Webster's Second, you might have an entry for Hamlet and for Gertrude and for Romeo and for Juliet, but in Webster's Third, you would only have an entry for Romeo because it can be used generically to mean "a young male lover." It's not in there because he's a character in Shakespeare, he's in there because he's a word in English.

Emily Brewster: That's right. Although the Shakespearean meaning is also covered—

Peter Sokolowski: Of course.

Emily Brewster: ... at the entry, because—

Peter Sokolowski: Because you've got the rationale and then the rest is there. And that's true for Galahad, for example. And it is actually true, I think if you look up Hamlet, it says "an indecisive young man." And that means that there had been found enough evidence of its generic use. However, you won't find many of the other characters or the other gods, for example. And that was just a kind of a wholesale way of making space in that case, for other words.

Emily Brewster: Right. It's a distinction that we still hew to, and that is the idea between lexical and encyclopedic information. And so we consider in modern lexicography our job is to provide lexical information, not encyclopedic information. So we are providing information about words as they are used as words, not about characters or about plot devices.

Peter Sokolowski: It's a fascinating subject that I'm sure we'll talk about again, but it can be a thin line. And of course, as Ammon knows and can tell us more about, encyclopedias and dictionaries were really almost the same thing for hundreds of years, until they finally kind of landed where they did, in different places. Of course, encyclopedic means different things in different parts of the entry, because there are encyclopedic details within the definition that are essential in order to distinguish one breed of dog from another. For example, if you say that this has longer fur or shorter legs, that's technically encyclopedic and not lexical, but it's essential to the meaning.

Ammon Shea: In the early encyclopedias like John Harris's Lexicon Technicum and Ephraim Chambers' Cyclopaedia, a lot of them were called a "dictionary of arts and sciences." Their actual title was "dictionary," but they weren't functioning in the same lexicographic vein that we are, but there was certainly a lot of overlap between the two.

You're listening to Word Matters. I'm Emily Brewster. More on dropping words ahead.

We continue our conversation about taking words out of dictionaries.

Ammon Shea: One of the other areas that we haven't touched on here is when a word gets removed because it's just a mistake. And we've of course talked about the famous ghost word Dord, which cropped up in the 1934 Unabridged Dictionary and was just put in there by accident.

Emily Brewster: Yep. A consultant on the project had written an entry for capital D or lowercase d and defined it as "density." This was to mean that capital D and lowercase d were both used as abbreviations for the word density, but it was misread as being capital D-O-R-D as a word, Dord, and it was mistakenly entered as such and then years later was finally removed.

Peter Sokolowski: And there are other examples. And I know that plantsman, meaning "a gardener" and crossbowman, meaning "one who wields a crossbow," were both removed from the Collegiate Dictionary for the reason that, on the one hand they are compounds that are self-explanatory. And on the other hand, it was determined that these words are not frequent enough in the language anymore to keep in the dictionary.

Ammon Shea: I disagree with that, by the way. I wish to raise an objection as an occasional card-carrying member of the gardening community: plantsman is quite common, as is plantswoman, but is not self-explanatory. And I think that as a dictionary, one of my greatest gripes with Merriam-Webster as an institution is that we too often view things as self-explanatory. And I think very few things really are as self-explanatory as we want them to be. And in gardening, it doesn't just refer to a gardener, it's somebody who's particularly skilled, somebody who has usually a fair amount of experience and knowledge in the field. It's not just somebody who's putting some mulch around your tree.

Peter Sokolowski: No, that decision was made, I'm going to say 10 years ago for the print edition, but I see that we have updated our online dictionary with a definition that corresponds very much to what you just said, a person skilled with plants, an expert gardener or horticulturist. This is a case of a word that was removed, but with better information and new citations, it came back.

Ammon Shea: It's really just with gardeners complaining. That's what it is.

Peter Sokolowski: There are others. Vitamin G was removed from the 11th.

Emily Brewster: What?

Peter Sokolowski: Yeah. Vitamin G, I believe that was because riboflavin is the term that most people use now. And there were a couple of others. Frutescent was removed. Another example that we have to admit to is the word snollygoster...

Emily Brewster: Yes.

Peter Sokolowski: ... which was removed from the dictionary and then returned to the dictionary.

Emily Brewster: Yes. Snollygoster is a word that in hindsight should not have been removed. It had fallen into obscurity, but it is still such a colorful and evocative term. And Bill O’Reilly using it on his show, on Fox News, brought the word back into prominence and we had to put it back in.

Peter Sokolowski: Yeah. He would use words like this without explaining what they meant. He would use Pecksniffian and snollygoster. And they were kind of fun words. And we would see in our data at whatever that was—9:00 PM Eastern—we would see lookups that corresponded exactly with his program. And so we knew where this interest came from. And with that last group of words, clearly those were decisions made when print was the only consideration. And clearly these are interesting words that have a history and they should be in the dictionary.

Emily Brewster: Yeah, but most of the time, the words that get dropped are less interesting really.

Peter Sokolowski: Yes. Much less.

Emily Brewster: So for example, excentrical spelled E-X-C-E-N-T-R-I-C-A-L, was formerly entered as an archaic variant of eccentric. It has since been removed, but eccentrical spelled with E-C-C-E-N-T-R-I-C-A-L, also archaic, is still included. We also used to have an entry for rubber chicken circuit. We did. We did, but now we still have an entry for rubber-chicken and rubber chicken circuit is covered at the entry as a phrase.

Peter Sokolowski: Okay. What does that mean?

Emily Brewster: We define rubber-chicken as "of, relating to, or being a series of social gatherings, such as fundraising dinners at which speeches are given."

Peter Sokolowski: But that definition was not for the circuit. Right?

Emily Brewster: "Rubber chicken circuit" is an example at that entry.

Peter Sokolowski: Oh okay. There's another word that was removed from the Collegiate, and I'm pretty sure it was removed between the 10th and the 11th Edition. And it was tattle-tale gray. And that was a marketing line going back I think to the 1960s, kind of the Mad Men era, for laundry detergent, like ring around the collar is what tattle-tale gray was. And it was so commonly used in the culture that it was added to the dictionary and ultimately removed. And that's the kind of thing that gets dropped.

Emily Brewster: But if you go to merriam-webster.com and you look up tattle-tale gray, you will find an entry there—

Peter Sokolowski: Yeah.

Emily Brewster: ... because the online dictionary is a different animal than the print Collegiate Dictionary, or than the print Unabridged Dictionary.

Peter Sokolowski: Right.

Emily Brewster: The merriam-webster.com dictionary includes entries from the Unabridged Dictionary. And they are marked as being from the Unabridged Dictionary, so there is an entry for tattle-tale gray at merriam-webster.com, but not in Merriam-Webster's Collegiate Dictionary.

Emily Brewster: There was a controversy a number of years ago about a dictionary that is not published by Merriam-Webster. In 2015, there was a campaign calling on Oxford University Press to put words back in a dictionary that they had published years earlier: the 2007 Oxford Junior Dictionary was noted to have a series of words related to nature removed from it and replaced with words that have to do with our modern times. So according to Robert Macfarlane, who is a writer who was involved in this campaign to get these words put back in, the words that had been deleted included such words as acorn, adder, ash, beech, bluebell, buttercup, catkin, conker, cowslip, mistletoe, nectar, newt, otter, pasture and willow. And the words that had been put in included words such as blog, broadband, bullet-point, celebrity, chatroom, committee, cut-and-paste, MP3 player and voice-mail.

Peter Sokolowski: And this is a dictionary for children?

Emily Brewster: That's right. This is a student dictionary. This is the Oxford Junior Dictionary. It was an interesting response to very carefully made decisions made by lexicographers about what words they were going to cover in their dictionaries. The public did not like the fact that these words related to the natural world had been removed, and that words related to the unnatural world had been put in, but Oxford's response was that this was based on their data. They were looking at corpora, and at what words the students this dictionary was written for would be encountering in their reading.

Peter Sokolowski: And that's an interesting perspective. I mean, I believe them, of course. We've talked about corpus-based research in the past. From our experience in a dictionary for people learning English that we use a corpus, of course, everything we do today has some basis in data and corpus-based searches. In some ways a corpus is dumb. And what I mean by that is the frequency with which you use household language, words like elbow or fingernail or sink or drain— those are words that don't show up in the New York Times all that often.

Peter Sokolowski: And so it's important to balance real life and real experiences, especially with parts of the body, things that aren't newsy or frequently injured by sports stars. And I remember talking with Steve Perrault, our director of defining: he made a very careful study of household words that aren't newsy words, that aren't likely to show up in the big corpus searches. And I just think that this might be a case where maybe that kind of consideration wasn't made as well. I mean, I'm not going to second guess. I'm sure they were right about the literature that they were looking at.

Emily Brewster: What you're talking about is the Learner's Dictionary. He was not deciding whether a word like fingernail or elbow was going to actually be in that dictionary, but we were highlighting words that were especially important for learners to know.

Peter Sokolowski: Right.

Emily Brewster: And so that was informed by the evidence in the corpora that we were using, but it was also informed by a native speaker's sense—

Peter Sokolowski: Yes. Exactly.

Emily Brewster: ... of how important a word is. And so they were highlighted in a particular way in that dictionary.

Peter Sokolowski: Yeah.

Ammon Shea: What I think also draws our attention, and this is one of my favorite topics. There is no such thing as "the dictionary," there are many dictionaries, and there are of course, many types of dictionaries. And in particular with Oxford, the Oxford English Dictionary is unquestionably the most well known. And one of the things that is at least in lexicographic circles well known about the OED is that they really don't take words out because they have the luxury of just growing ever larger and larger. It is a very, very large dictionary. That is what happens when you don't take things out. The last print edition was 21,730 pages. And the estimate from people I've spoken to working on the Third Edition is that if and when they finish the Third Edition and print it, it will be twice the size.

Peter Sokolowski: Mm.

Ammon Shea: So we're talking about a book that will end up being 40 odd thousand pages long. That's one of the things that happens when you don't ever take anything out. Every once in a while, they seemingly remove a word, but actually all that they're doing is they're moving it to a different place. So they'll decide that a word is perhaps a variant of another head word and they put it there.

Emily Brewster: Yes.

Ammon Shea: So I think there is a certain expectation among people that use Oxford dictionaries, that things can stay like this, but not all of Oxford's dictionaries operate on these principles or have the luxury of just growing bigger and bigger. So you do have to take things out in certain circumstances. And while I share a lot of the same concerns and perhaps skepticism regarding corpora that Peter has, Oxford has been working with corpora for a very, very long time. And I think that they would probably [inaudible 00:22:02] it very, very carefully in terms of how they represent certain areas.

Peter Sokolowski: And also dictionaries are a business. And I know that there's pressure, for example, for school dictionaries, to really represent the books that the students are going to read in schools. And so if the sales staff for a dictionary publisher can say to a bookstore chain, for example, that "we have read the most frequently assigned books for the fifth grade and the vocabulary from those books is in our dictionary," that's an important sales point. I don't say sales point as an insult. I say it as a professional reality that they really are respecting the specific vocabulary of a specific subset of English, which is the literature that is assigned to students in a certain place at a certain time. That's I think the big misunderstanding here, is that this was clearly a dictionary that was made for a specific purpose. And I certainly have lots of respect for that. There are lots of dictionaries made for specific purposes that are not just a general vocabulary.

Emily Brewster: One of the truly beautiful things that came out of this controversy is a book that Robert Macfarlane in collaboration with an artist named Jackie Morris put together. They have a book called The Lost Words: A Book of Spells that is a beautifully illustrated collection of the words that were dropped from the Oxford Junior Dictionary. And it is a truly beautiful book, but the Oxford Junior Dictionary, as you were saying, Peter, meets the needs of the teachers and their students who are the audience for this particular dictionary. And lexicography is a tricky business-

Peter Sokolowski: Yes.

Emily Brewster: ... you got to make the judgment calls that you have to make.

Ammon Shea: In the early 20th century, one of the big selling points of dictionaries, this is the thing that no longer happens. In advertising copy for dictionaries, it often used to be that they would advertise the weight of the dictionary and the more it weighed, the better it was viewed to be. And you would see people talking about, "This dictionary weighs 34 pounds." "This dictionary weighs 16 pounds." "This dictionary weighs 40 pounds." And that's really a vestige of a bygone era. Nobody is saying, "Our dictionary is bigger and heavier." So it is a business consideration. And a lot of that is going to be bound by space constraints and weight constraints too.

Peter Sokolowski: Oh yeah. And cost. It's a really difficult puzzle to make a bigger dictionary small. And it's one that I've grappled with a number of times, and it can seem like an easy exercise. It's actually really difficult to do it consistently and well.

Ammon Shea: I think the main takeaway here that we want people to know is that you cannot remove a word from the dictionary by giving us money, so you should stop asking.

Emily Brewster: That's the truth.

