Guardian Crossword Trivia

If, like me, you are somewhat obsessed by the Guardian's excellent daily cryptic crosswords, you may find some of the assorted trivia here of interest. The statistics here are based on the daily crosswords and prize crosswords from mid-1999 to February 2010. (The graphs on this page are generated with the excellent Flot JavaScript graphing library, apart from the final bar chart, which is generated with Google Charts, and the "Wordle", which comes from wordle.net.)

Update (2013): My script that generated these statistics broke in early 2010 when the Guardian revamped their crossword site, and unfortunately I haven't (so far!) got around to updating it. I hope some of this is still of interest.

The "difficulty" of the past two Guardian daily crosswords

Defining the difficulty of a crossword automatically would be an extremely difficult problem in general, but one dimension of this which we can easily measure is how unusual the vocabulary in the answers is.

To take this approach we need some way to score each word which might appear in the crossword. One possible way of doing this would be to count how many times each word appears in Project Gutenberg (or some other large text corpus), but this doesn't really reflect the difficulty of words in crosswords. For example, because of its useful checking letters, the word "OKAPI" is one of the most frequently used in the Guardian crossword (see below for the top 25) but would occur relatively rarely in most texts. So, for the graphs below I'm measuring the easiness of a word by how often it has appeared in the Guardian crossword previously. The distribution of words according to this score is quite interesting; about 35% of the answers in the Guardian crossword hadn't been clued in the previous 10 years.
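As a rough illustration of this kind of scoring (a sketch only, not the script that produced this page), the following Python counts how many times each answer had already appeared before the puzzle it occurs in. The function name `easiness_scores` and the sample puzzles are invented for the example.

```python
from collections import Counter


def easiness_scores(puzzles):
    """Score each answer by its number of earlier appearances.

    `puzzles` is a list of answer lists in publication order (an assumed
    input format). Returns one list of (answer, previous_appearances)
    pairs per puzzle.
    """
    seen = Counter()
    scored = []
    for answers in puzzles:
        # Score against the history *before* this puzzle is added to it.
        scored.append([(answer, seen[answer]) for answer in answers])
        seen.update(answers)
    return scored


# Hypothetical data: three tiny "puzzles" in publication order.
history = [
    ["EXTRA", "OKAPI"],
    ["OKAPI", "STYE"],
    ["OKAPI", "EXTRA", "ZEUGMA"],
]
scores = easiness_scores(history)
print(scores[-1])  # [('OKAPI', 2), ('EXTRA', 1), ('ZEUGMA', 0)]
```

An answer with a score of 0 is one that has never been clued in the period covered by the data, which is the category the 35% figure above refers to.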

The graphs below show on the Y-axis the proportion of words in this particular crossword, in all crosswords by that setter, or in all Guardian crosswords. The X-axis shows the "easiness" of the words as defined above, that is, the number of occurrences in the Guardian crossword overall. In other words, flatter graphs represent puzzles with easier vocabulary for the experienced crossword solver.
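To make the axes concrete, here is a small Python sketch of how such a curve could be computed from a list of easiness scores. It assumes each point is the fraction of answers with exactly that score (a non-cumulative reading, which is my assumption for the example), and `proportion_by_score` and the sample scores are hypothetical.

```python
from collections import Counter


def proportion_by_score(scores):
    """Map each easiness score (X) to the fraction of answers with it (Y)."""
    if not scores:
        return {}
    counts = Counter(scores)
    total = len(scores)
    # Sort by score so the result can be plotted left to right.
    return {score: counts[score] / total for score in sorted(counts)}


# Hypothetical easiness scores for the eight answers of one small puzzle.
puzzle_scores = [0, 0, 1, 3, 3, 3, 12, 20]
curve = proportion_by_score(puzzle_scores)
print(curve)  # {0: 0.25, 1: 0.125, 3: 0.375, 12: 0.125, 20: 0.125}
```

The same function could be applied to all the answers by one setter, or to every answer in the data set, to get the comparison curves.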

The next two graphs are updated each day shortly after 3am and 9am. If it's at least a day since the publication of the daily crossword then you will be able to hover your mouse over the yellow graph to show the words which make up the data for that point.

Guardian Crossword 24926 by Rover on 05 February 2010

[Answers corresponding to these data points will not be available until a day after publication.]

Guardian Crossword 24925 by Orlando on 04 February 2010

[Hover your mouse over a point on the yellow graph to see the answers.]

Compare the "difficulty" of setters

You can compare the "difficulty" curves (as explained above) of the top 20 Guardian crossword setters using the checkboxes below. By this metric, the setter who used the most surprising vocabulary was Bunthorne (the late Bob Smithies), while the least surprising vocabulary is used by Chifonie.

Araucaria
Rufus
Paul
Gordius
Shed
Bunthorne
Chifonie
Orlando
Rover
Taupi
Pasquale
Quantum
Logodaedalus
Janus
Brummie
Enigmatist
Brendan
Audreus
Crispa
Mercury
Auster

The most often clued words or phrases in the Guardian crossword

The following are the most clued words in the Guardian daily and prize crosswords. This is a marvellous list, I think - the popularity of the words seems to be largely a function of:

I have been very careful on this page to present only data which, I hope, the setters or the editor wouldn't be upset to see posted here, so I have omitted the clues for these words. However, I will say the list of clues for the top word (EXTRA) is remarkable because of the incredible variation and inventiveness of the clues. As someone who struggles rather to set a single good clue for a given word, I'm incredibly impressed by this...

Number of Occurrences    Answer
38                       EXTRA
25                       STUD
25                       STYE
24                       ISLE
24                       ANON
24                       ECHO
23                       ESTATE
23                       STUN
23                       BLUE
23                       OUNCE
22                       ISSUE
22                       REIGN
21                       UNIT
21                       ERROR
21                       EDGE
21                       ETERNAL
21                       USED
20                       ADDRESS
20                       RATIO
20                       NIECE
20                       ACHE
20                       SCAR
20                       ARCH
20                       ERATO
20                       IRIS
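A list like this is straightforward to produce once the answers have been collected. The following Python is a minimal sketch, assuming the answers are available as a flat list of strings; `most_clued` and the sample data are made up for illustration and are not the original script.

```python
from collections import Counter


def most_clued(answers, n=25):
    """Return the n most frequent answers as (answer, count) pairs."""
    # Normalise case and stray whitespace so "Extra" and "EXTRA " match.
    counts = Counter(answer.strip().upper() for answer in answers)
    return counts.most_common(n)


# Hypothetical sample, far smaller than the real data set.
sample = ["EXTRA", "stud", "Extra", "STYE", "EXTRA ", "STUD"]
top = most_clued(sample, n=2)
print(top)  # [('EXTRA', 3), ('STUD', 2)]
```

`Counter.most_common` keeps answers with equal counts in the order they were first seen, so ties come out in order of first appearance.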

The following image is a Wordle visualization of the most frequently clued words, generated using data between 1995 and 2009:

Number of crosswords set by the most frequent setters

The bar graph below shows how many crosswords have been set by each of the 20 most prolific setters, both in the daily and prize crosswords. (If we limit this to just daily crosswords then Rufus is the top setter, since Araucaria sets a much larger proportion of the prize crosswords than anyone else.)
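The difference between the overall and the daily-only ranking can be sketched in Python as follows. The input format (pairs of setter and puzzle kind), the function name `puzzles_per_setter` and the counts in the sample are all invented for the example; they are not the real figures.

```python
from collections import Counter


def puzzles_per_setter(puzzles):
    """Count puzzles per setter, overall and for daily puzzles only.

    `puzzles` is an iterable of (setter, kind) pairs, where kind is
    "daily" or "prize" (an assumed representation).
    """
    overall = Counter()
    daily = Counter()
    for setter, kind in puzzles:
        overall[setter] += 1
        if kind == "daily":
            daily[setter] += 1
    return overall, daily


# Made-up sample, not real data.
sample = [
    ("Araucaria", "prize"),
    ("Araucaria", "prize"),
    ("Araucaria", "daily"),
    ("Rufus", "daily"),
    ("Rufus", "daily"),
]
overall, daily = puzzles_per_setter(sample)
print(overall.most_common(1))  # [('Araucaria', 3)]
print(daily.most_common(1))    # [('Rufus', 2)]
```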


You can email me (Mark Longair) at: [My email address in a graphic: the part before the at sign is mark followed by hyphen and random and the bit afterwards is mythic hyphen beasts dot com]