Zipf’s Law

https://commons.wikimedia.org/wiki/File:Zipf_30wiki_en_labels.png
Image: Wikimedia Commons

In many languages, the most frequently used word occurs roughly twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.

In American English text, the word “the” occurs most frequently, accounting for nearly 7% of all word occurrences. The second most frequent word, “of,” accounts for slightly over 3.5% of words, and so on.
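Put as a rule of thumb, the word at rank r should turn up about 1/r as often as the top word. Here is a minimal Python sketch for checking that against any text you have on hand. The function name zipf_table and the simple letters-and-apostrophes tokenizer are illustrative assumptions, not part of any standard tool, and real text will only approximate the idealized prediction.

```python
import re
from collections import Counter


def zipf_table(text, top=5):
    """Return (rank, word, observed_count, zipf_prediction) for the top words.

    The prediction assumes the idealized form count(rank) = count(1) / rank.
    """
    # Crude tokenizer (an assumption): runs of ASCII letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top)
    if not ranked:
        return []
    top_count = ranked[0][1]
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird saw the fox"
    for rank, word, count, predicted in zipf_table(sample, top=3):
        print(rank, word, count, round(predicted, 2))
```

Comparing the third and fourth columns shows how closely a given text follows the pattern; short samples like the one here will usually fit poorly.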

This pattern obtains even in constructed languages such as Esperanto. It’s named for American linguist George Kingsley Zipf, who popularized it.

12/24/2021 UPDATE: Apart from languages, the law is observed in measurements of the citations of scientific papers, web hits, copies of books sold, telephone calls, the magnitude of earthquakes, the diameter of moon craters, the intensity of solar flares, the intensity of wars, and the populations of cities. See this paper. (Thanks, Snehal.)

Unwinese

English comedian Stanley Unwin invented his own language, “Basic Engly Twenty Fido,” a playfully twisted version of English that he said had been inspired by his mother, who once told him that she had “falolloped over” and “grazed her kneeclabbers.”

After that, he said, he became “a masterlode of the verbally thrips oratory.” Asked his opinion of Elvis Presley, he said, “Well, from across the herring-pole where harth the people has produced some waspwaist and swivel-hippy, I must say the rhythm contrapole sideways with the head and tippy tricky half fine on the strings.”

The Small Faces asked him to narrate the story of Happiness Stan on their 1968 album Ogdens’ Nut Gone Flake. He starts, “Are you all sitty comforty bolt two square on your botty? Then I’ll begin. Like all real-life experience story this also begins once upon a polly-ti-to. Now after little lapse of time Stan became deep hungry in his tumload. After all he struggly trickly half several mileode, and anyone would suffer under this.”

This might recall Lewis Carroll’s “Jabberwocky” or such fictional languages as Nadsat in Anthony Burgess’ A Clockwork Orange. The difference is that, for the most part, Unwin wasn’t preparing his utterances in advance but improvising them on the spot.

In 2002 he was laid to rest beside his wife, Frances, under an epitaph that read “Reunitey in the heavenly-bode – Deep Joy!” And his family arranged a thanksgiving service with a valediction in his own style: “Goodly byelode loyal peeploders! Now all gatherymost to amuse it and have a tilty elbow or a nice cuffle-oteedee — oh yes!”

In a Word

https://commons.wikimedia.org/wiki/File:John_Warwick_Smith_-_Lake_Windermere_from_Calgarth_with_Belle_Isle_-_Google_Art_Project.jpg

hydronym
n. the name of a river, lake, sea, or any other body of water

A bizarre exchange from E.S. Turner’s 2012 What the Butler Saw, a social history of servants in English society:

Vain young gentlemen had a way of summoning their valets to answer questions to which they well knew the answer. [Beau] Brummell, when asked by a bore which of the Lakes he liked best, rang for Robinson. ‘Which of the lakes do I admire most, Robinson?’ he asked; and was informed, ‘Windermere, sir.’ ‘Ah, yes, Windermere, so it is. Thank you, Robinson.’

Palindromic Substrings

What’s unusual about this passage from Great Expectations?

It is not much to the purpose whether a gate in that garden wall which I had scrambled up to peep over on the last occasion was, on that last occasion, open or shut. Enough that I saw no gate then, and that I saw one now. As it stood open, and as I knew that Estella had let the visitors out,– for she had returned with the keys in her hand,– I strolled into the garden, and strolled all over it.

It contains a string of 15 letters that reads the same forward and backward:

It is not much to the purpose whether a gate in that garden wall which I had scrambled up to peep over on the last occasion was, on that last occasion, open or shut. Enough that I saw no gate then, and thaT I SAW ONE NOW. AS IT stood open, and as I knew that Estella had let the visitors out,– for she had returned with the keys in her hand,– I strolled into the garden, and strolled all over it.

Reader Eric Harshbarger has been searching for such strings in literary texts. Here are his finds, and here’s a nifty tool he made that will find the longest palindromic substring in a given passage.
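For readers who want to try this themselves, here is a rough Python sketch of one way to hunt for such strings. It is not Eric’s tool: the function name longest_letter_palindrome and the expand-around-center approach are assumptions chosen for brevity. It ignores case, spaces, and punctuation, and returns both the bare letters and the matching slice of the original passage.

```python
def longest_letter_palindrome(passage):
    """Return (letters, original_span) for the longest palindromic run of letters.

    Case, spaces, and punctuation are ignored when comparing.
    """
    # Keep only letters, remembering where each one sat in the passage.
    positions = [i for i, ch in enumerate(passage) if ch.isalpha()]
    letters = [passage[i].lower() for i in positions]
    n = len(letters)
    if n == 0:
        return "", ""

    best_lo, best_hi = 0, 0  # inclusive bounds into `letters`

    def expand(lo, hi):
        # Grow outward from a center while the two ends still match.
        while lo >= 0 and hi < n and letters[lo] == letters[hi]:
            lo -= 1
            hi += 1
        return lo + 1, hi - 1

    for center in range(n):
        # Try an odd-length center (one letter) and an even-length one (a pair).
        for lo, hi in (expand(center, center), expand(center, center + 1)):
            if hi - lo > best_hi - best_lo:
                best_lo, best_hi = lo, hi

    span = passage[positions[best_lo]:positions[best_hi] + 1]
    return "".join(letters[best_lo:best_hi + 1]), span


if __name__ == "__main__":
    excerpt = "Enough that I saw no gate then, and that I saw one now. As it stood open"
    print(longest_letter_palindrome(excerpt))
```

This takes time proportional to the square of the passage length in the worst case, which is fine for a paragraph; a whole novel would call for something faster.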

(Thanks, Eric.)

12/11/2021 UPDATE: Eric wonders what’s the longest sensible text one might construct that doesn’t contain any such substrings (an example: “We view uncopyrightable material on Wednesdays”). Add your ideas here.
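One way to test a candidate, assuming “such substrings” means any palindromic run of two or more letters: every palindrome that long has a doubled letter or an x-y-x triple at its center, so it is enough to look for those two patterns. The sketch below is in Python, and the function name first_palindromic_substring is made up for illustration.

```python
def first_palindromic_substring(text):
    """Return the first 2- or 3-letter palindrome among the letters, or None.

    Assumes single letters do not count and that case, spaces, and
    punctuation are ignored.
    """
    letters = [ch.lower() for ch in text if ch.isalpha()]
    n = len(letters)
    for i in range(n - 1):
        # A doubled letter such as "oo" is the center of an even palindrome.
        if letters[i] == letters[i + 1]:
            return "".join(letters[i:i + 2])
        # A triple such as "hth" is the center of an odd palindrome.
        if i + 2 < n and letters[i] == letters[i + 2]:
            return "".join(letters[i:i + 3])
    return None


if __name__ == "__main__":
    print(first_palindromic_substring("We view uncopyrightable material on Wednesdays"))
    print(first_palindromic_substring("Enough that I saw"))
```

A result of None means the text passes; anything else is the first offending run.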

A Perfect Alphabet

https://commons.wikimedia.org/wiki/File:1518_Thomas_More_Utopia_(Alphabet_November_edition)_(Biblioteca_nacional_de_Portugal).jpg

Thomas More’s Utopia gives us not only a description of that imaginary land but the actual alphabet used there: More’s friend Peter Giles wrote an addendum to the book that presents the letters and gives an example of the Utopian writing system:

Vtopos ha Boccas peu la chama polta chamaan.
Bargol he maglomi Baccan soma gymnosophaon.
Agrama gymnosophon labarembacha bodamilomin.
Voluala barchin heman la lauoluola dramme pagloni.

This is translated into Latin as

Utopus me dux ex non insula fecit insulam.
Una ego terrarum omnium absque philosophia
Civitatem philosophicam expressi mortalibus
Libenter impartio mea, non gravatim accipio meliora.

And in English this becomes

The commander Utopus made me into an island out of a non-island.
I alone of all nations, without philosophy,
Have portrayed for mortals the philosophical city.
Freely I impart my benefits; not unwillingly I accept whatever is better.

Working backward, this makes it possible to divine the meaning of a few Utopian words: boccas is commander, chama is island, voluala is willingly, and gymnosophaon is philosophy. You couldn’t really talk about much beyond perfect societies, but maybe that’s the point.

Progress

George Bernard Shaw argued passionately for the reform of English spelling, which he found bewildering and inconsistent. When opponents objected that imposing changes would be too disruptive, he suggested that we might alter or delete just one letter per year, to give the reading public time to adapt. In 1971 writer M.J. Shields sent a letter to the Economist imagining the consequences:

For example, in Year 1, that useless letter ‘c’ would be dropped to be replased by either ‘k’ or ‘s’, and likewise ‘x’ would no longer be part of the alphabet. The only kase in which ‘c’ would be retained would be in the ‘ch’ formation, which will be dealt with later. Year 2 might well reform ‘w’ spelling, so that ‘which’ and ‘one’ would take the same konsonant, wile Year 3 might well abolish ‘y’, replasing it with ‘i’, and Iear 4 might fiks the ‘g/j’ anomali wonse and for all.

Jeneralli, then, the improvement would kontinue iear bai iear, with Iear 5 doing awai with useless double konsonants, and iears 6-12 or so modifaiing the vowlz and the rimeining voist and unvoist konsonants. Bai ier 15 or sou, it wud fainali be posible tu meik ius ov thi ridandant letez ‘c’, ‘y’ and ‘x’ — bai now jast a memori in the maindz ov ould doderez — tu riplais ‘ch’, ‘sh’ and ‘th’ rispektivli.

Fainali, xen, aafte sam 20 iers of orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld. Haweve, sins xe Wely, xe Airiy, and xe Skots du not spik Ingliy, xei wud hev to hev a speling siutd tu xer oun lengwij. Xei kud, haweve, orlweiz lern Ingliy az a sekond lengwij at skuul!

Iorz feixfuli,

M. J. Yilz

Blissymbols

https://commons.wikimedia.org/wiki/File:Bliss_cinema.png

Semiotician Charles K. Bliss was born in Czernowitz, a city in Austria-Hungary whose many nationalities “hated each other, mainly because they spoke and thought in different languages.” So Bliss invented a new language to encourage communication between speakers of different languages. His “Blissymbols” were ideographic, meaning they conveyed ideas or concepts, and so were not beholden to any spoken language.

For example, the sentence above reads “I want to go to the cinema”:

  • The symbol for “person” is attended by the number 1, indicating the first person.
  • The heart indicates a feeling, modified by a serpentine line indicating “fire,” topped by a caret, indicating that it’s a verb in this sentence.
  • The symbol for “leg” also gets a caret, as it’s to be interpreted as a verb here.
  • The symbol for “house” is modified by the symbol for “film,” and the arrow indicates movement.

The language never fulfilled its potential as a bridge among cultures, but it became popular in the 1970s in teaching disabled people to communicate, and an organization known as Blissymbolics Communication International oversees its applications around the world.

(Thanks, Zach.)

In a Word

bafflegab
n. official or professional jargon which confuses more than it clarifies; gobbledegook

This is such a useful word that its coiner actually received an award. Milton A. Smith, assistant general counsel for the U.S. Chamber of Commerce, invented it to describe one of the incomprehensible price orders published by the federal Office of Price Stabilization. His comment, published in the Chamber’s weekly publication Washington Report in January 1952, was lauded in an editorial in the Bellingham [Wash.] Herald, which sponsored a plaque.

Smith said he’d considered several words to describe the OPS order’s combination of “incomprehensibility, ambiguity, verbosity, and complexity.” He’d rejected legalfusion, legalprate, gabalia, and burobabble.

At the award presentation, he was asked to define his word briefly. He answered, “Multiloquence characterized by consummate interfusion of circumlocution or periphrasis, inscrutability, and other familiar manifestations of abstruse expatiation commonly utilized for promulgations implementing Procrustean determinations by governmental bodies.”