Monday, November 13, 2017

Can you measure which pangrams are natural?

A pangram is a sentence that contains every letter of the alphabet.

The classic is:

The quick brown fox jumps over the lazy dog.

(NOTE: I had 'jumped' but a reader pointed out that there was no s, and that 'jumps' is the correct word.)

which is only 35 letters.

I could give a pointer to lists of such, but you can do that yourself.

My concern is:

a) Are there any pangrams that have actually been uttered NOT in the context of 'here is a pangram'?

b) Are there any that really could be?

That is: which pangrams are natural? I know this is an ill-defined question.

Here are some candidates for natural pangrams:

1) Pack my box with five dozen liquor jugs

2) Amazingly few discotheques provide jukeboxes

3) Watch Jeopardy! Alex Trebek's fun TV quiz game

4) Cwm fjord bank glyphs vext quiz
(Okay, maybe that one is not natural as it uses archaic words. It means
'Carved symbols in a mountain hollow on the bank of an inlet irritated an
eccentric person.' Could come up in real life. NOT. It uses every letter
exactly once.)
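
Before worrying about naturalness, it is easy to check mechanically that the candidates above really are pangrams, and that the last one uses every letter exactly once. Here is a minimal sketch in Python. The function names are my own, and it only considers the 26 unaccented English letters.

```python
import string

ALPHABET = set(string.ascii_lowercase)

def letters(sentence):
    """Return the sentence's letters a-z, lowercased, in order."""
    return [ch for ch in sentence.lower() if ch in ALPHABET]

def is_pangram(sentence):
    """True if every letter a-z occurs at least once."""
    return set(letters(sentence)) == ALPHABET

def is_perfect_pangram(sentence):
    """True if every letter a-z occurs exactly once."""
    found = letters(sentence)
    return len(found) == 26 and set(found) == ALPHABET

candidates = [
    "Pack my box with five dozen liquor jugs",
    "Amazingly few discotheques provide jukeboxes",
    "Watch Jeopardy! Alex Trebek's fun TV quiz game",
    "Cwm fjord bank glyphs vext quiz",
]

for sentence in candidates:
    # Columns: letter count, pangram?, perfect pangram?, sentence
    print(len(letters(sentence)), is_pangram(sentence),
          is_perfect_pangram(sentence), sentence)
```

This only tests the pangram property. It says nothing about whether a sentence is natural, which is the hard part.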

How can you measure how natural they are?

For the Jeopardy one I've shown it to people and asked them

"What is unusual about this new slogan for the show Jeopardy?"

and nobody gets it. More important: they believe it is the new slogan.

So I leave to the reader:

I) Are there other NATURAL pangrams?

II) How would you test naturalness of such?

Pinning down 'natural' is hard. I did a guest post in 2004, before I was an official co-blogger, about when a problem (a set for us) is natural, for example the set of all regular expressions with squaring (see here).

1. just as you suggest in your linked blog post that natural hard problems are ones that are studied before they're determined to be hard, we might want to search through some large set of (english) text and filter out the pangrams and take the shortest ones. say search through shakespeare, or tweets or some collection of english literature.

of course there *are* natural pangrams, but how short can they be?
if we consider, say, biblical hebrew we see that deuteronomy 4:34 and zephaniah 3:8 are pangrams (the latter containing even final forms) so they exist and are natural. but they're significantly longer than the shortest modern hebrew pangrams by a factor of like 3. perhaps if they were taken from a larger body of texts they would be shorter.

1. AH, excellent point.
What we really want are SHORT NATURAL pangrams.

There could be a theorem about a tradeoff between short and natural :-)
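
The corpus search suggested in this thread (scan a large body of text, keep the sentences that happen to be pangrams, and report the shortest) could be sketched as follows. This is only an illustration under assumptions: the function names are made up, the corpus is passed in as one string, and the sentence splitter is deliberately naive. It breaks after '.', '!' or '?', so it would cut the Jeopardy pangram in two.

```python
import re
import string

ALPHABET = set(string.ascii_lowercase)

def letter_count(text):
    """Number of letters a-z in the text, ignoring case."""
    return sum(1 for ch in text.lower() if ch in ALPHABET)

def is_pangram(text):
    """True if every letter a-z occurs at least once."""
    return ALPHABET <= set(text.lower())

def shortest_pangram_sentences(corpus, limit=5):
    """Return up to `limit` pangram sentences, fewest letters first."""
    # Naive split: break on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", corpus)
    pangrams = [s.strip() for s in sentences if is_pangram(s)]
    return sorted(pangrams, key=letter_count)[:limit]

sample = (
    "It was a dull day. The quick brown fox jumps over the lazy dog. "
    "Nothing else happened. Pack my box with five dozen liquor jugs!"
)
for sentence in shortest_pangram_sentences(sample):
    print(letter_count(sentence), sentence)
```

On a real corpus (Shakespeare, tweets) one would swap in a proper sentence tokenizer. One could also search over windows of consecutive words rather than whole sentences.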

2. How about defining their naturalness score as the likelihood from some language model trained on a large English corpus? Maximizing this product would also encourage brevity, as probabilities are all < 1.
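
A toy version of this scoring idea, to make the brevity effect concrete. This is a sketch only: it uses a word-level unigram model with add-one smoothing, which ignores word order entirely, so it stands in for "some language model" without being a serious one. The class name and the tiny training text are invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

class UnigramModel:
    """Word-level unigram model with add-one smoothing (illustrative only)."""

    def __init__(self, corpus):
        self.counts = Counter(tokenize(corpus))
        self.total = sum(self.counts.values())
        # The extra 1 reserves probability mass for unseen words.
        self.vocab = len(self.counts) + 1

    def log_prob(self, sentence):
        """Sum of smoothed log probabilities of the sentence's words."""
        return sum(
            math.log((self.counts[w] + 1) / (self.total + self.vocab))
            for w in tokenize(sentence)
        )

model = UnigramModel("the dog jumps over the fox. the fox is quick.")
for s in ["the fox", "the fox jumps", "cwm vext"]:
    print(round(model.log_prob(s), 3), s)
```

Every word contributes a negative log probability here, so adding words can only lower the score, and unseen words such as 'cwm' are penalized most. A model trained on a large English corpus and sensitive to word order would be needed for the score to say anything about real naturalness.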

3. > The quick brown fox jumped over the lazy dog

This is a common misrendering of the pangram. It has to be "jumps" not "jumped", else the sentence has no 's'.

1. Thanks.
Fixed

It is a common mistake: I googled

"The quick brown fox jumped over the lazy dog" -jumps

and got approx 77,000 hits

4. I think you might be interested in https://twitter.com/pangramtweets if you have not seen it. There is an article from 2014 about some of the shortest pangrams it has discovered at http://languagelog.ldc.upenn.edu/nll/?p=14975

5. Typo: Title misspells pangram as panagram

1. Thanks/fixed.
I caught that error in the main text but didn't look
at the title.

6. "Ella Minnow Pea" is a fun little book about, among other things, pangrams whose plot culminates with the accidental utterance of a pangram.

7. I always like "Mr. Jock, TV quiz PhD, bags few lynx" :)

8. Here's a new tool to find natural pangrams https://wordsmith.org/pangram/