kottke.org. home of fine hypertext products since 1998.


kottke.org posts about language

The distribution of letters in English words

David Taylor analyzed a corpus of English words to see where each letter of the alphabet fell and graphed the results.

Letter Distribution

No surprise that “q” and “j” are found mostly at the beginnings of words and “y” and “d” at the ends. More interesting are the few letters with more even distribution throughout words, like “l”, “r”, and even “o”. Note that this analysis is based on a corpus of words in use, not on a dictionary:

I used a corpus rather than a dictionary so that the visualization would be weighted towards true usage. In other words, the most common word in English, “the” influences the graphs far more than, for example, “theocratic”.
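The corpus weighting Taylor describes can be sketched in a few lines of Python. This is only an illustration, not his actual code: the function name, the bin count, and the tiny made-up word counts are all assumptions. Each word contributes to a letter's position histogram in proportion to how often the word is used, so "the" swamps "theocratic".

```python
from collections import defaultdict

def letter_position_histogram(word_counts, bins=5):
    """Return {letter: [share of occurrences per position bin]}.

    word_counts maps each word to how often it appears in the corpus,
    so common words influence the result more than rare ones.
    """
    hist = defaultdict(lambda: [0.0] * bins)
    for word, count in word_counts.items():
        letters = [c for c in word.lower() if c.isalpha()]
        n = len(letters)
        for i, ch in enumerate(letters):
            # Map position i in a word of length n to a bin in 0..bins-1.
            # Integer arithmetic keeps the last letter inside the last bin.
            b = i * bins // n
            hist[ch][b] += count
    # Normalise each letter's row so the bins sum to 1.
    return {ch: [v / sum(row) for v in row] for ch, row in hist.items()}

# Hypothetical counts, for illustration only.
corpus = {"the": 1000, "theocratic": 1, "quiet": 20, "day": 50}
dist = letter_position_histogram(corpus, bins=3)
```

With these toy counts, nearly all of the weight for "t" lands in the first bin because "the" dominates, which is the effect the corpus approach is meant to capture.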

Taylor explained his methodology in a second, geekier post. (via @tedgioia)


This incredible State Word Map explains everything about America

State Word Map

No, not that one. Or this one. Or any of these. This one.

State Word Map


Translating Music Into American Sign Language

Amber Galloway Gallego is an ASL interpreter who has developed new techniques and expressions to translate popular music into a richer experience for deaf and hard-of-hearing people than simply translating the lyrics.

If you frequent music festivals and concerts, you might see her (or an interpreter like her) grooving to the music, mirroring the emotions and physicality of the artists onstage, interpreting their imaginative lyrics for concert-goers who rely on visual accommodations. She’s interpreted for more than 400 artists at this point, and has a special knack for interpreting hip-hop acts.

Part of the challenge here, particularly with fast-moving rap or hip hop, is to combine or abbreviate signs in order to keep time with the lyrics. See Shelby Mitchusson performing Eminem’s Lose Yourself or Gallego doing a faster song of his, Rap God:


A dialect coach demonstrates 12 different accents

Sammi Grant is a dialect coach and voiceover artist for television and theater. In this video, she demonstrates her expertise in speaking English with several different accents, including Irish, Scottish, German, the American midwestern accent, and the Transatlantic accent, an accent invented to sound both American and British simultaneously.

No, really. That’s not a real accent. It’s a now-abandoned affectation from the period that saw the rise of matinee idols and Hitchcock’s blonde bombshells. Talk like that today and be the butt of jokes (see Frasier). But in the ’30s and ’40s, there are almost no films in which the characters don’t speak with this faux-British elocution, a hybrid of Britain’s Received Pronunciation and standard American English as it exists today. It’s called Mid-Atlantic English (not to be confused with local accents of the Eastern seaboard), a name that describes a birthplace halfway between Britain and America. Learned in aristocratic finishing schools or taught for use in theater to the Bergmans and Hepburns who were carefully groomed in the studio system, it was class for the masses, doled out through motion pictures.

This short video has some more examples of the Transatlantic (or Mid-Atlantic) accent:


How MLK’s I Have a Dream speech was composed

Martin Luther King’s I Have a Dream speech is one of the greatest examples of American oratory. In this video, Evan Puschak looks at how King’s speech was constructed and delivered, examining King’s references to Lincoln’s Gettysburg Address & Shakespeare, his use of lyrical techniques like alliteration and anaphora (the repetition of a word or phrase at the beginning of successive clauses), and the mixture of plain and poetic language throughout the speech. Spoiler: King was a rhetorical genius and there’s a lot going on in that speech.

(Fair warning: Trump comes in abruptly at the 6:00 mark. I get the point he’s trying to make with the contrast, but I wish Puschak would have done without it.)


Auto-Generated Maps of Fantasy Worlds

Uncharted Atlas

Martin O’Leary is a research scientist who studies glaciers, but in his spare time, he built Uncharted Atlas, a program that auto-generates maps of fantasy lands (like from Game of Thrones or LOTR) and posts them to a Twitter account. The explanation of how the terrain is generated is quite interesting and includes embedded map generators that you can play around with (i.e. prepare to lose about 20 minutes to this).

There are loads of articles on the internet which describe terrain generation, and they almost all use some variation on a fractal noise approach, either directly (by adding layers of noise functions), or indirectly (e.g. through midpoint displacement). These methods produce lots of fine detail, but the large-scale structure always looks a bit off. Features are attached in random ways, with no thought to the processes which form landscapes. I wanted to try something a little bit different.

There are a few different stages to the generator. First we build up a height-map of the terrain, and do things like routing water flow over the surface. Then we can render the ‘physical’ portion of the map. Finally we can place cities and ‘regions’ on the map, and place their labels.

And here’s how the languages for the place names are generated; each map has its own generated language so all of the place names are consistent with each other and different from those regions shown on other maps.

I wanted to produce something which was a step above the usual alphabetic soup of generated placenames, and which was capable of producing recognisably distinct languages. The initial idea was that different regions of each map would have different languages, but I abandoned this because it was too hard to make it clear that this was what was going on, while still having the languages themselves be interesting.

The problem is to generate something like what the constructed languages (conlang) community call a ‘naming language’. This is a light sketch of a language, focusing purely on the parts which are necessary to produce names. So there’s little to no grammar, but a good sense of what the language sounds like, and how it’s written.
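To get a feel for what a "naming language" might look like in code, here is a minimal Python sketch. It is not O’Leary’s generator: the phoneme pools, the syllable patterns, and every function name are assumptions made up for illustration. The idea it shows is the one described above: pick a small sound inventory and one syllable shape per language, then build every place name from those, so names on the same map sound related.

```python
import random

def make_language(rng):
    """Pick a small, random sound inventory and one syllable pattern."""
    return {
        "C": rng.sample("ptkbdgmnslrvzh", 6),  # consonants this language uses
        "V": rng.sample("aeiou", 3),           # vowels this language uses
        "pattern": rng.choice(["CV", "CVC", "VC"]),
    }

def make_syllable(lang, rng):
    # Fill each slot of the pattern (C or V) from the matching inventory.
    return "".join(rng.choice(lang[slot]) for slot in lang["pattern"])

def make_name(lang, rng, min_syllables=2, max_syllables=3):
    count = rng.randint(min_syllables, max_syllables)
    return "".join(make_syllable(lang, rng) for _ in range(count)).capitalize()

rng = random.Random(42)  # fixed seed so a map's names are reproducible
lang = make_language(rng)
names = [make_name(lang, rng) for _ in range(5)]
```

Calling make_language again with a different seed gives a different inventory and pattern, which is roughly why names from two generated maps would not be mistaken for each other.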


Honor and Respect: how to address President Obama and Donald Trump

Robert Hickey is the deputy director of The Protocol School of Washington, which provides etiquette and protocol training. In his book Honor & Respect, he covers the “correct written and oral forms of address for everyone from local officials to foreign heads of state”. For The President of the United States, the proper forms of address are:

Letter salutation: Dear Mr. President:
Complimentary close: Most respectfully,
Announced: The President of the United States
Introduction: Mr. President, may I present …
Conversation: Mr. President

And contrary to how many media outlets refer to former US Presidents, they should not be referred to as “President” (e.g. “President Bush”):

“While it is common practice in the media and elsewhere to address and identify former presidents as ‘President (Name),’ this is a mistake,” said Hickey. “Serving as President of the United States does not grant one the personal rank of ‘President’ for life. The office of President is a one-person-at-a-time role that a specific individual holds and then hands off to the next person.”

“Courtesies, honors, and special forms of address are symbols of the power of the office. They belong to the office and to the citizens, not former office holders.”

Hickey recommends “The Honorable” as an official title (e.g. “The Honorable Jimmy Carter”) and “Mr./Ms.” for conversation or salutation (e.g. “Mr. Clinton”).

While Donald Trump was officially sworn in as the President on Friday, this site will continue to refer to Trump as “Trump” or “Donald Trump”[1] and not as “President Trump”. Again and again, almost to a pathological degree, Trump has demonstrated, in word and deed, that he has not earned and does not deserve our respect and the title of his office. It’s a small protest by a small “media outlet”, perhaps petty, but as long as the First Amendment still applies, I will publish what I like on my own damn website.

And since I am all for the “one-person-at-a-time” rule, this site will also continue to refer to Barack Obama as “President Obama”. He’s earned it many times over.

  1. Or even “Fuckface Von Clownstick”. We don’t stand on ceremony here. But I won’t call him just “Donald”…that would be disrespectful to greater Donalds like Sutherland, Glover, and Duck.


AI Hemingway’s The Snows of Kilimanjaro

In the NY Times Magazine, Gideon Lewis-Kraus reports on Google’s improving artificial intelligence efforts. The Google Brain team (no, seriously that’s what the team is called) spent almost a year overhauling Google’s translate service, resulting in a startling improvement in the service.

The new incarnation, to the pleasant surprise of Google’s own engineers, had been completed in only nine months. The A.I. system had demonstrated overnight improvements roughly equal to the total gains the old one had accrued over its entire lifetime.

Just after the switchover, Japanese professor Jun Rekimoto noticed the improvement. He took a passage from Ernest Hemingway’s The Snows of Kilimanjaro, translated it into Japanese, and fed it back into Google Translate to get English back out. Here’s how Hemingway wrote it:

Kilimanjaro is a snow-covered mountain 19,710 feet high, and is said to be the highest mountain in Africa. Its western summit is called the Masai “Ngaje Ngai,” the House of God. Close to the western summit there is the dried and frozen carcass of a leopard. No one has explained what the leopard was seeking at that altitude.

And here’s the AI-powered translation:

Kilimanjaro is a mountain of 19,710 feet covered with snow and is said to be the highest mountain in Africa. The summit of the west is called “Ngaje Ngai” in Masai, the house of God. Near the top of the west there is a dry and frozen dead body of leopard. No one has ever explained what leopard wanted at that altitude.

Not bad, especially when you compare it to what the old version of Translate would have produced:

Kilimanjaro is 19,710 feet of the mountain covered with snow, and it is said that the highest mountain in Africa. Top of the west, “Ngaje Ngai” in the Maasai language, has been referred to as the house of God. The top close to the west, there is a dry, frozen carcass of a leopard. Whether the leopard had what the demand at that altitude, there is no that nobody explained.


The Great American Word Mapper

US Map Language

It’s that time of year again. No, not Christmas or Hanukkah. As the year winds down, it’s an opportunity for Americans to investigate how differently they use words in different parts of the country. In December 2013, for example, people lost their damn minds over the NY Times’ dialect quiz. This year, you can play around with The Great American Word Mapper which uses Twitter data from 2014 to plot geographic usage patterns.

For instance, you can see where people use “supper” vs. “dinner” (see above). The map indicates mixed usage where I grew up, which checks out…we mostly said “supper” but “dinner” was not uncommon, particularly as I got older. Other results are less useful…the Twitter-based “soda” vs. “coke” vs. “pop” doesn’t tell you as much as directly asking people what they call soft drinks.

US Map Language

The swearing maps are always fun (see also the United States of Swearing)…I wonder why “shit” is so relatively popular in the South?

Some other interesting searches: “moma” (alternate spelling of “momma” in the South with a small pocket of usage around NYC for MoMA), “city” doesn’t give the result you might expect, the distribution of “n***er” vs “n***a” suggests they are two different words with two different meanings, and in trying to find a search that would isolate just urban areas, the best I could come up with was “kanye” (or maybe “cocktails” or “traffic”). And harsh, map! Geez. (via @fromedome)


Timeline maps of the most popular baby names in each state, 1910-2014

This is cool and a little mesmerizing: animated US maps showing the most popular baby name in each state from 1910 to 2014 for boys and girls. There are three separate visualizations. The first just shows the most popular baby name in each state. Watch as one dominant name takes over for another in just a couple of years…the Mary to Lisa to Jennifer transition in the 60s and 70s is like watching an epidemic spread. Celebrity names pop up and disappear, like Betty (after Betty Boop and Betty Grable?) and Shirley (after Shirley Temple) in the 30s. The boys’ names change a lot less until you start getting into the Brandons, Austins, and Tylers of the 90s.

The next visualization shows the most particularly popular name for each state, e.g. Brandy was the most Louisianan name for female newborns in 1975. And the third visualization shows each name plotted in the averaged geographical location of births, so you can see, for example, the northward migration of Amanda during the 80s.

P.S. Guess what the most popular boy’s name in the state of my birth was the year I was born? And the most particularly popular boy’s name in the state I moved to just a year later? Jason. I am basic af.

Update: From Flowing Data, some graphs of the most unisex names in US history. (thx, paul)


Actors’ movie accents, rated

Erik Singer is a dialect coach who works with actors to perfect different accents and dialects. In this video, he quickly analyzes the performances of 32 actors based on their use of accents. Pretty fascinating to watch. He singles out Philip Seymour Hoffman’s portrayal of Truman Capote as an exemplary use of the proper accent. High marks also go to Kate Winslet doing a Polish accent, Idris Elba’s South African accent while portraying Nelson Mandela (and his Bal’more accent in The Wire), and Cate Blanchett playing Katharine Hepburn in The Aviator.

Nicolas Cage in Con Air and Tom Cruise in Far and Away? Well, let’s just say they couldn’t pahk the cah in Hahvahd Yahd.

Update: Actress Sarah Jones takes a slightly different approach to speaking in different accents. Instead of aiming for a particular generalized dialect, she picks out a particular person to impersonate.

Let’s say you want to sound like a Trinidadian woman, as Ms. Jones does in her show. She recommends you watch YouTube clips of speakers at council meetings in Trinidad until you find the person you most want to sound like. If you can meet your subject in person, it will help make your goal much easier to reach.

“I ask them to speak something very slowly three times in a row and then I have them say it at normal speed the way they’d say it three times in a row,” she said. “I have them say it the way they’d say it in school as compared to how they’d say it to a friend.”

Be sure to play the embedded audio clips of Jones speaking as her different characters. And you can watch her in action in this TED Talk:


Isis Hair Salon

Carrie Banks has owned and operated Isis Hair Salon in LA for more than 20 years. But because of recent events in the Middle East and jokers on social media, the name has become a liability. Banks has even had difficulty finding someone to replace the sign outside the salon in the event of a name change.


Harry Potter and the Translator’s Nightmare

Due to its popularity, the Harry Potter series of books has been translated into dozens of different languages from around the world. Given the books’ setting in Britain, heavy use of wordplay & allusions, and all of the invented words, translating the books accurately and faithfully was difficult.

Translators weren’t given a head-start: they had to wait until the English editions came out to begin the difficult and lengthy task of adapting the books. Working day and night, translators were racing against intense deadlines. Harry Potter and the Order of the Phoenix, the longest book in the series at 870 pages for the US edition, was originally published on June 21, 2003. Its first official translation appeared in Vietnamese on July 21, 2003. Not long after, the Serbian edition was released in early September 2003.

Also not mentioned in the video is all of the foreshadowing Rowling uses in the first few books in the series that pays off in later books. Since the translators probably didn’t know the plot details of the later books, some of that foreshadowing might have been edited out, downplayed, or misinterpreted.


Stutterer

Stutterer by Benjamin Cleary won the 2016 Oscar for Best Live Action Short Film and is now available to view online for free courtesy of the New Yorker.

It’s a thirteen-minute movie about a young London typographer named Greenwood (Matthew Needham). Greenwood stutters, to the extent that verbal conversation is difficult. When he tries to resolve an issue with a service representative over the phone, he can’t get the words out; the operator, gruff and impatient, hangs up. (For surliness, she rivals the operator in the old Yaz song.) When a woman approaches Greenwood on the street, he uses sign language to avoid talking. But in his thoughts, which we hear, he does not stutter.

Great little film…my heart broke three separate times watching it.


The NY Times and the truth of profanity

When the story about Donald Trump bragging about sexually assaulting women broke, the NY Times took the unusual step of publishing exactly what the presidential candidate said.

In the three-minute recording, which was obtained by The Washington Post, Mr. Trump recounts to the television personality Billy Bush of “Access Hollywood” how he once pursued a married woman and “moved on her like a bitch, but I couldn’t get there,” expressing regret that they did not have sex. But he brags of a special status with women: Because he was “a star,” he says, he could “grab them by the pussy” whenever he wanted.

“You can do anything,” Mr. Trump says.

He also said he was compulsively drawn to kissing beautiful women “like a magnet” (“I don’t even wait”) and talked about plotting to seduce the married woman by taking her furniture shopping. Mr. Trump, who was 59 at the time he made the remarks, went on to disparage the woman, whom he did not name, saying, “I did try and fuck her. She was married,” and saying, “She’s now got the big phony tits and everything.”

It was unusual because of the Times’ policy of not printing profanity, even if the profane words themselves are newsworthy. In this case, the editors felt they had no choice but to print the actual words spoken by Trump.

In a piece published earlier the same day the Trump story broke, Blake Eskin, who has been tracking the Times’ non-use of profanity at Fit to Print, highlights the racial and classist implications of the policy.

As I’ve noticed over the years while documenting how the Times writes around profanity, a lot of the expletives the Times avoids come up around race: in stories about hip-hop, professional sports, and police shootings. (I’m getting all the data into a spreadsheet so I can back up this assertion.)

The Times seems compelled both to tell readers that people curse in these contexts and to frown upon it.

It’s as if profanity is like a sack dance or a bat flip, a classless flourish that the archetypal Times reader, who is presumably white, can take vicarious pleasure in without having to perform it himself.

You cannot tell authentic stories about people who are systematically discriminated against in our society without using their actual words and the actual words spoken against them related to that discrimination. Full stop.

Update: Eskin wrote a follow-up about his data analysis of the NY Times profanity avoidance for Quartz.

The “other” category includes faux-folksy formulations such as “a word more pungent than ‘slop,’” and “a stronger version of the phrase ‘gol darn,’” as well as the straightforward, “He swore.” When I began the Fit to Print project, I could enjoy the cleverness of some of these contortions. But after reading through hundreds of examples over several years, expletive avoidance no longer strikes me as an interesting puzzle for a writer to solve. The policy just seems prissy, arbitrary, and delusional.

The more I think about the Times’ policy, the more absurd it becomes. There seems to be a relatively simple solution: if the profanity does work in the service of journalism (particularly if the entire article is about the profanity in question), print it. If it doesn’t, don’t. I mean, are Times editors afraid their reporters will start handing in articles with ledes like “Well, this fucking election is finally winding down, thank Christ.”?


Monty Python’s Argument Sketch performed by two vintage speech synthesizers

In a sketch from Monty Python called Argument Clinic, a character played by Michael Palin goes to a facility and pays to have an argument with John Cleese. That argument was recreated for this video using a pair of old school speech synthesizers. Palin’s part is played by the DECTalk Express (aka the voice of Stephen Hawking) and Cleese’s part is played by the Intex Talker. As you can probably hear, the Talker predates the DECTalk by a few years and is more difficult to understand. (via @303)


The internet of trees: how trees talk to each other underground

In June, ecologist Suzanne Simard gave a talk at TED about her 30 years of research into how trees talk to each other. Underneath the forest floor, there is a communications network on which trees (even those from different species) trade carbon with each other, send warnings, and exchange messages. Simard described one of her first experiments (from the transcript):

I pulled on my white paper suit, I put on my respirator, and then I put the plastic bags over my trees. I got my giant syringes, and I injected the bags with my tracer isotope carbon dioxide gases, first the birch. I injected carbon-14, the radioactive gas, into the bag of birch. And then for fir, I injected the stable isotope carbon-13 carbon dioxide gas. I used two isotopes, because I was wondering whether there was two-way communication going on between these species.

The idea was to use the isotopes to track whether the trees were trading carbon when some of them were shaded and less able to make their own energy.

The evidence was clear. The C-13 and C-14 was showing me that paper birch and Douglas fir were in a lively two-way conversation. It turns out at that time of the year, in the summer, that birch was sending more carbon to fir than fir was sending back to birch, especially when the fir was shaded. And then in later experiments, we found the opposite, that fir was sending more carbon to birch than birch was sending to fir, and this was because the fir was still growing while the birch was leafless. So it turns out the two species were interdependent, like yin and yang.

Fascinating. German forester Peter Wohlleben came out with a book this week called The Hidden Life of Trees: What They Feel, How They Communicate (Simard contributed a note to the book). From a Guardian review:

Trees have friends, feel loneliness, scream with pain and communicate underground via the “woodwide web”. Some act as parents and good neighbours. Others do more than just throw shade: they’re brutal bullies to rival species. The young ones take risks with their drinking and leaf-dropping then remember the hard lessons from their mistakes. It’s a hard-knock life.

The Monthly, Maclean’s, and Scribd all have excerpts of Wohlleben’s book if you’re interested.

Update: See also this episode of Radiolab, From Tree to Shining Tree. (via @Chan_ing)


The Auction Chant

Did you ever wonder what the auctioneer is saying when they’re trying to get people to bid on things? They talk so quickly it’s hard to tell sometimes, so Barry Baker of Ohio Real Estate Auctions slows down the action to describe exactly what auctioneers are saying.

See also one of my favorite silly things from recent months: Auctioneer beats.


A Valley Girl contest from 1982

This Valley Girl contest that aired on Real People[1] in 1982 is quite the time capsule of Reagan-era America. BAG YOUR FACE!!!! I had totally (LIKE, TOTALLY!!) forgotten about that super-80s insult. Is the Valley Girl thing the reason we, like, all say “like” and uptalk all the time now?

  1. Along with the Dukes of Hazzard, Real People and That’s Incredible were my favorite shows in the early 80s. I probably watched this episode on TV when it first aired.


The Adjective Word Order We All Follow Without Realizing It

From Mark Forsyth’s The Elements of Eloquence, a reminder of the rules of adjective order that fluent English speakers follow without quite knowing why.

…adjectives in English absolutely have to be in this order: opinion-size-age-shape-colour-origin-material-purpose Noun. So you can have a lovely little old rectangular green French silver whittling knife. But if you mess with that word order in the slightest you’ll sound like a maniac. It’s an odd thing that every English speaker uses that list, but almost none of us could write it out.
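Forsyth’s ordering is easy to express as a sort key. The sketch below is a toy in Python, with a hand-made lexicon that only covers the words in his example; the names ORDER, LEXICON, and order_adjectives are assumptions for illustration, and real adjectives are often ambiguous between categories.

```python
# Category order as quoted from The Elements of Eloquence.
ORDER = ["opinion", "size", "age", "shape", "colour",
         "origin", "material", "purpose"]

# Hypothetical mini-lexicon: just enough to cover the example phrase.
LEXICON = {
    "lovely": "opinion",
    "little": "size",
    "old": "age",
    "rectangular": "shape",
    "green": "colour",
    "French": "origin",
    "silver": "material",
    "whittling": "purpose",
}

def order_adjectives(adjectives):
    """Sort adjectives by the rank of their category in ORDER."""
    rank = {category: i for i, category in enumerate(ORDER)}
    return sorted(adjectives, key=lambda adj: rank[LEXICON[adj]])

scrambled = ["green", "whittling", "old", "lovely",
             "silver", "French", "little", "rectangular"]
phrase = " ".join(order_adjectives(scrambled) + ["knife"])
# phrase == "lovely little old rectangular green French silver whittling knife"
```

Swapping ORDER for the Cambridge list would only mean changing that one list and tagging each word with the matching category.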

The Cambridge Dictionary lists a slightly different order: opinion, size, physical quality, shape, age, colour, origin, material, type, purpose. A poem by Alexandra Teague explores the topic in a creative way:

That summer, she had a student who was obsessed
with the order of adjectives. A soldier in the South
Vietnamese army, he had been taken prisoner when

Saigon fell. He wanted to know why the order
could not be altered. The sweltering city streets shook
with rockets and helicopters. The city sweltering

streets.

Did anyone learn this in school? I sure didn’t. How do we all know then? My daughter’s kindergarten teacher had a great phrase she used when things got a bit tricky as her students learned to read: “the English language is a rascal”. (via @MattAndersonBBC)

Update: Language Log’s post on adjective order is worth reading. (thx, stephen & margaret)


The Kingdom of Speech

Kingdom Of Speech

In his new book The Kingdom of Speech, Tom Wolfe argues that speech and not evolution is responsible for the many achievements of humans. Wolfe, the author of The Right Stuff and The Electric Kool-Aid Acid Test, went on NPR the other day to talk about the book. This comment about Darwin’s view of speech stuck out (emphasis mine):

He could not figure out what it was. He assumed, because of his theory, that everything evolved from animals. And didn’t even include it in his theory, language, until he decided that it came from our imitation of the cries of birds. And I think it’s misleading to say that human beings evolved from animals; actually, nobody knows whether they did or not. There are very few physical signs, aside from the general resemblance of apes and humans. The big evolution, if you want to call it that, is that this one species, Homo sapiens, came up with this ingenious trick, which is language.

It’s one thing to say that speech did not evolve from the utterances of previous animals and was instead invented by humans, but it’s quite another to assert that humans did not evolve from animals at all.[1] Gonna be fun to sit back and watch the controversy roil on this one. (via @JossFong who said “lazy saturday, just listening to @NPR when ….. WHAT”)

  1. Q: Where does Tom Wolfe get his water?

    A: From a “Well, actually…”


Boston Dynamics tests new swearing robot

In addition to robots that run fast, can’t be knocked over, launch themselves 30 feet into the air, and climb up walls, Boston Dynamics also makes robots who move like people. Now, imagine if that robot swore like a longshoreman while going about its duties. This made me laugh super hard. (via @nickkokonas)


Christoph Niemann, Words

Christoph Niemann, Words

Ace illustrator Christoph Niemann has a new book coming out called Words, an illustrated compilation of 300+ sight words:

What can you do with a word? Read it, spell it, say it, picture it, understand it, make a sentence with it, tell a story with it, share it with a friend. Everything starts with a love of words! More than 300 words inspired by Dr. Edward Fry’s list of sight words are paired with striking and playful illustrations by internationally renowned designer and artist Christoph Niemann to deepen understanding, to enrich, and to enlighten those learning to read and write English, whether they be children or adults.


How happy is Twitter?

Using a 5000-word dictionary of words rated on their happiness, the Hedonometer measures the average happiness on Twitter.

Happy Twitter

Christmas is always the happiest day of the year (“merry”, “happy”, and “joy” are all pretty positive) while shootings and terrorist attacks are Twitter’s saddest events. The recent mass shooting in Orlando seems to be the least happy Twitter has been over the past 7+ years.
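The basic dictionary approach is simple enough to sketch. The Python below is a rough illustration, not the Hedonometer’s actual code: the SCORES values are invented for the example, and the function name and tokenizer are assumptions. It averages the ratings of every rated word in a text, counting repeats, and ignores words that are not in the dictionary.

```python
import re
from collections import Counter

# Made-up ratings for illustration; higher means happier.
SCORES = {"merry": 8.0, "happy": 8.5, "joy": 8.0,
          "shooting": 2.0, "attack": 2.0}

def average_happiness(text, scores=SCORES):
    """Frequency-weighted mean score of the rated words in text.

    Returns None when the text contains no rated words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[w] * n for w, n in counts.items()) / total

christmas = average_happiness("Merry Christmas and a happy new year")
```

Because each word is scored in isolation, an approach like this cannot tell "happy" from "not happy", which is worth keeping in mind when the results look odd.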

The Hedonometer also analyzes the overall happiness of movies based on their scripts. The happiest movie is Sex and the City while the saddest is Omega Man (followed by The Bourne Ultimatum). Somehow, the fourth happiest movie is Lost in Translation, which might be reason for some overall skepticism about the project’s sensitivity to context.

The happiness over time of individual movie scripts has been analyzed by the Hedonometer too. Pulp Fiction’s happiest moment is when Vincent and Mia go to Jackrabbit Slim’s and the low point is “Bring out the Gimp”.

Happy Pulp Fiction

The system has analyzed books as well…the low point of the entire Harry Potter series seems to be the event at the end of The Half-Blood Prince.

Update: Grain of salt and all that, but the shootings of Alton Sterling, Philando Castile, and the Dallas police officers have pushed the happiness quotient on Twitter lower again so that the two least happy days have both occurred in the past month. There’s been a general feeling that 2016 has been a bad year, like George RR Martin is writing it. I wish the data were available for a closer analysis, but if you look at the chart, you can see that Twitter’s overall happiness starts to rise around the end of 2012 but starts to fall again right around the beginning of 2016…the effect is quite clear, even just from eyeballing it.


A collection of maps of the languages and ethnic groups of Asia

Language Map China

Language Map India

Language Map Indochina

Tim Merrill is using Pinterest to collect maps showing where ethnic groups live and what languages are spoken in Asia.


This and only this is a sandwich

In today’s post on “What is barbecue?” I skipped past “is a hot dog a sandwich?” so quickly that I forgot to answer the question. So in the same spirit in which someone can boldly declare that only smoked, slow-cooked pork is barbecue, here is my minimal definition of a sandwich:

A sandwich is any solid or semi-solid filling between two or more slices of bread. Not a roll, not a wrap, not a leaf of lettuce: sliced bread. What is inside matters far less than the container.

Consequently:

  • A hot dog is not a sandwich.
  • A burrito is not a sandwich.
  • A wrap is not a sandwich.
  • A cheeseburger on a roll is not a sandwich. Sliced bread only.
  • A lobster roll is not a sandwich.
  • A hoagie is not a sandwich.
  • An ice cream sandwich is not a sandwich.
  • A hot turkey sandwich is not a sandwich.
  • An open-faced sandwich is not a sandwich.
  • If you make a sandwich using one end of the bread and one proper slice, it’s kind of a sandwich still, but not really. See also folding over a single slice of bread for a half-sandwich.
  • If you make a sandwich using both ends of the bread, it is no longer a sandwich at all.
  • A peanut butter or grilled cheese sandwich is a sandwich.
  • A mayonnaise, butter, or ketchup sandwich is probably a sandwich - I’m not sure whether those fillings are solid enough - just not a very good one.
  • A sandwich made with crackers instead of bread is not a sandwich, but an imitation of a sandwich.
  • A sandwich made with crackers between two slices of bread is a sandwich, but not a very good one.
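
For the programmers in the audience, the definition and the rulings above can be roughly sketched as a small Python predicate. This is only an illustration, not an official ruling: the `Food` class, its field names, and the "yes" / "kind of" / "no" verdicts are all made up here, and treating "both ends of the bread" as two or more heel pieces is my reading of the list.

```python
from dataclasses import dataclass


@dataclass
class Food:
    # All fields are hypothetical, invented for this sketch.
    container: str   # e.g. "sliced bread", "roll", "wrap", "crackers"
    pieces: int      # how many pieces of the container hold the filling
    filling: str     # "solid", "semi-solid", or "liquid"
    heels: int = 0   # how many of those pieces are end pieces of the loaf


def is_sandwich(food: Food) -> str:
    """Return "yes", "kind of", or "no" under the minimal definition."""
    # Sliced bread only: no rolls, wraps, lettuce, or crackers.
    if food.container != "sliced bread":
        return "no"
    # Two or more slices, so open-faced does not count.
    if food.pieces < 2:
        return "no"
    # The filling has to be solid or semi-solid.
    if food.filling not in ("solid", "semi-solid"):
        return "no"
    # Both ends of the loaf: no longer a sandwich at all.
    if food.heels >= 2:
        return "no"
    # One end and one proper slice: kind of, but not really.
    if food.heels == 1:
        return "kind of"
    return "yes"


grilled_cheese = Food("sliced bread", 2, "semi-solid")
hot_dog = Food("roll", 1, "solid")
print(is_sandwich(grilled_cheese), is_sandwich(hot_dog))
```

The sketch deliberately leaves out the judgment calls, such as whether mayonnaise is solid enough or what to make of a folded single slice.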

Alternatively, “sandwich” is a family-resemblance concept and we can’t appeal to definitional consistency to get away from the fact that language is a complex organism and its rules don’t always make perfect sense.

(PS: I do not speak for Jason or Kottke.org on this matter, please do not argue with him about sandwiches)

Update (from Jason): Boy, you leave Tim to his own devices for a few hours and he establishes the official kottke.org stance on sandwiches. [That new emoji of the yellow smiley face grabbing its chin and looking skeptical that you might not have on Android IDK I’m Apple Man] I was just talking to my kids the other day about this important issue and Ollie, who is almost 9, told me that both hamburgers and hot dogs are sandwiches because “the meat is sandwiched in between the bread; it’s right there in the word”. When Ollie and Minna take over the family business in 2027, they can revisit this, but for now, Tim’s definition stands.

Update: Tim’s definition has been weakened further. In talking with Stephen Colbert, Justice Ruth Bader Ginsburg asserted that a hot dog is a sandwich.


What is barbecue?

At Eater, Chris Fuhrmeister hits on another topic near to my amateur linguist heart: policing the word “barbecue”:

When it comes to American barbecue - I certainly won’t attempt to set ground rules for other barbecue cultures across the globe - there are absolute rights and wrongs. Sure, there’s some room for interpretation, but good-intentioned “barbecue” lovers across this country are blaspheming day in and day out. Before declaring what barbecue isn’t, it’s best to define what it is: pork that’s slow-cooked with smoke.

This is controversial, because “barbecue” is also used to mean:

  • n. other slow-cooked smoked meats, e.g., beef
  • v. the act of cooking or eating such meats,
  • v. grilling anything outdoors,
  • n. an outdoor grill
  • a. a type or flavor of sauce, potato chips, and other foods
  • and so forth.

It’s also odd because, as Fuhrmeister notes, it’s an American controversy, and Americans tend to play faster and looser with food words than people elsewhere. Cognac has to be from Cognac, champagne from Champagne, and so on. Americans have lots of different regional words and practices when it comes to food (soda vs pop, sub vs hoagie, etc.), and we’re definitely competitive when it comes to where and how food is made best, but we’re generally pretty pluralist about definitions. Which is probably why “barbecue” has metastasized to mean so many different but related things.

I tried to come up with a shortlist of honest-to-goodness American food word debates.

  1. What is barbecue?
  2. Is a hot dog a sandwich?
  3. Is Chicago-style pizza really pizza?
  4. Is it donut or doughnut?
  5. Is a wrap a burrito?
  6. Why do we say “chai tea” when “chai” means “tea”?

From here you start to get into all the ways Americans abuse imported food words, which is a much longer list. British English also has a debated distinction between cake and biscuit that I don’t fully understand. Some of us like “is a patty melt a hamburger?,” because the ontology of hamburger is pretty complex stuff. But this is enough to get started.

Donut/doughnut is a straight-up style dispute, and doesn’t have anything to do with definitions. “Are hot dogs sandwiches?” is almost too much about definitions - there’s no history, no implied values, no real stakes. Chicago vs NYC pizza is a regional value rivalry posing as a definitional one: press people, and they’ll say, “yeah, what they make is pizza, it’s just not as good as ours.”

Barbecue is the debate that has everything. It’s a regional rivalry with value attached to it, that’s making definitional claims. And there are so many possible distinctions! Texas and Carolina partisans might unite to reject using “barbecue” to mean “cookout,” but fall apart again over the merits of beef vs pork. You can even vote on it; the voting will decide nothing. It is an infinite jewel.


X marks gender-neutral

“Mx.” (pronounced “mix” or “mux”) is a gender-neutral honorific. It’s used by people who don’t want to be identified by gender, whether their gender identity isn’t well-represented by the older forms, or they just don’t want to offer that information or assume it when addressing someone else. “Mx.” was added to Merriam-Webster’s unabridged dictionary in April, has begun to be used on official forms in the UK (the Royal Bank of Scotland has been an early adopter), and appeared in two recent stories in the New York Times, once as a preferred honorific for a Barnard College student who doesn’t identify as male or female, and once in a story about “Mx.” itself.

Linguistic experts say it is harder to change usage habits of words uttered frequently in speech, such as “she” and “he.” But a realignment in honorifics may be more quickly achieved because courtesy titles are less often spoken than written, like in the completion and mailing of government, health care and financial documents, as well as in newspapers and other media publications.

This second story, quoting Oxford University Press’s Katherine C. Martin, also notes that some of the earliest uses of “Mx.” were in the 1980s, “when some people engaged in nascent forms of digital communication and did not know one another’s gender.”

Likewise, “Latinx” aims to be more comprehensive and more inclusive than the older terms Latino and Latina. “The ‘x’ makes Latino, a masculine identifier, gender-neutral,” writes Raquel Reichard. “It also moves beyond Latin@ - which has been used in the past to include both masculine and feminine identities - to encompass genders outside of that limiting man-woman binary.”

This lights up my amateur linguist brain in all sorts of ways, but here’s one of them: the telescoping (maybe kaleidoscoping?) between usage, in all its messiness, and forms, in their desire for clear standards and finite options.

You can break that down further into usage within a community or group versus usage outside that community, and the formal protocols a publication like a newspaper or dictionary might follow versus paperwork or a database run by a business or government office. They all interplay with each other, and linguistic change happens or doesn’t happen through all of them.

And I guess the last thought is about how digital culture, by expanding and transforming the kinds of communities, identities, forms, and publications that are possible, can accelerate those changes or hold them back.

This tweet by NBC News is a good example: the tweet uses “Latinx” (and “Hispanic”), while the linked story, like the name of the news vertical and Twitter account, overwhelmingly uses “Latino,” in both the body and the headline.

Or take Planned Parenthood. Many of the health provider’s affiliates have updated their intake forms and other paperwork and communication. The new language is more gender-neutral, gender-inclusive, and more specific, separating anatomy, sexual activity, and gender identity. The national office is working on a new style guide to help other affiliates make their own changes.

Language about certain kinds of birth control has changed as well. “Male condoms” and “female condoms” are now referred to as internal and external condoms at Planned Parenthood of New York City.

“The language we’re using today reflects the fact that gender is a spectrum and not a simple system, a binary system of male and female,” says [PPNYC’s Lauren] Porsch. “We really talk about having sexual and reproductive health services: women who have penises, men who have vaginas, and there are people with all different types of anatomy that may not identify with a binary gender at all,” she says.

Again, while the changes eventually get reflected in Planned Parenthood’s intake forms and other official language, they were implemented early in digital and social media - specifically, in response to users on Tumblr.

“The Tumblr audience is smart. They understand feminism. They understand that sex ed isn’t one-size-fits-all - even though that’s what they were taught in school,” says Perugini. “And they know that words matter. They didn’t see themselves reflected in the language we were using on our social media pages or our website, and they let us know.”

This is happening. It’s happening in progressive, diverse, digital communities first. And for all their fractiousness, and the inherent difficulty in dealing with areas as complex and personal as identity, gender, and sexuality, it does feel like some standards are emerging. These are words worth watching. If you work with digital technology and people (and yeah, that’s almost everyone), I hope you’re paying attention.


Today on the grammar rodeo: that vs. which

New Yorker copy editor Mary Norris explains when the magazine uses “which” and when it uses “that”, a distinction I confess I had little knowledge of until just now.1 A cheeky example of the difference by E.B. White:

The New Yorker is a magazine, which likes “that.”
The New Yorker is the magazine that likes “which.”

(via df)

  1. This is why, when anyone asks me what I do for a living, the answer is never “writer”. Writing for me is a brute-force operation; I’ll use whatever is necessary to make it sound like I’m talking with you in person. (Wait, is a semicolon appropriate there? Should I have used “as though” instead of “like”? Who gives a shit!) I use too many commas (but often not between multiple adjectives in front of nouns), too many “I”s, too many “that”s (OMG, the thats), too many weirdo pacing mechanisms like ellipses, dashes, & parentheses, mix tenses, overuse the passive voice, and place unquoted periods and commas outside quotation marks like the Brits, although I was doing it before I learned they did because it just seemed to make sense. So, anyway, hi, I’m not a writer…who writes a lot.


In a game of big words, brevity triumphs

If you love reading about unusual Moneyball-esque strategies in sports, check out how a player on the Nigerian team won the Scrabble World Championship last year.

So while Scrabblers still fancy bingos, they increasingly hold off on other high-scoring moves, such as six-letter words, or seven-letter terms that only use six tiles from the rack. Instead, by spelling four- or five-letter words, a player can keep their most useful tiles - like E-D or I-N-G - for the next round, a strategy called rack management. The Nigerians rehearse it during dayslong scrimmage sessions.

Also, thanks to a design quirk, the board is oddly generous to short words. Most of the bonus squares are just four or five letters apart.

“The geometry of the Scrabble board rewards five-letter words,” said Mr. Mackay, who lost to Mr. Jighere in the world championship final, during which the Nigerian nabbed a triple word score with the antiquated adjective KATTI, meaning “spiteful.” “It’s a smart tactic.”