The Most Common Words in Latin and Ancient Greek

The Dickinson College Commentaries Core Vocabulary lists represent the thousand most common words in Latin and the 500 most common words in ancient Greek. They were originally composed in 2012–13 by a team at Dickinson College led by Christopher Francese. The Chinese translations of the definitions and grammatical terms were created by members of the DCO editorial board in June 2015.

Sources of frequency data for the Latin core list

1. L. Delatte, Et. Evrard, S. Govaerts and J. Denooz, Dictionnaire fréquentiel et index inverse de la langue latine (Liège: Laboratoire d'Analyse Statistique des Langues Anciennes, 1981). The "LASLA" list is available in .pdf form here.

2. Paul B. Diederich, “The Frequency of Latin Words and Their Endings” (Dissertation, University of Chicago, 1939), as digitized by Carolus Raeticus in 2011.

English definitions and vowel quantities were adapted from various sources, including Gonzales Lodge, The Vocabulary of High School Latin (New York, 1922), and the Oxford Latin Dictionary. The frequency rankings are derived from LASLA, and do not take Diederich's counts into consideration.

Sources of frequency data for the Greek core list

1. Word frequency data based on a subset of the comprehensive Thesaurus Linguae Graecae database, kindly provided by Maria Pantelia of TLG. The subset on which the frequency data was based included all texts in the database up to AD 200, for a total of 20.003 million words. Of this total, the period AD 100–200 accounts for about half, 10.235 million. The point of the chronological limit of AD 200 was to minimize any possible distortions that would be caused by the large amount of later Christian and Byzantine Greek in the TLG, texts that are not typically read by most students of ancient Greek. I would like to express my warm thanks to Prof. Pantelia for providing this valuable data.

2. Word frequency data based on the corpus of Greek authors at Perseus under PhiloLogic, which at the time our list was developed (summer 2012) contained approximately 5 million words. This frequency data was kindly provided by Helma Dik of the University of Chicago. I would like to express my heartfelt thanks to Prof. Dik for providing this valuable data, and for her very helpful advice about the list and related matters.

Definitions were adapted from various sources, including W. Major, “It’s Not the Size, It’s the Frequency: The Value of Using a Core Vocabulary in Beginning and Intermediate Greek,” CPL Online no. 4 (2008): 1–24; Liddell and Scott's Intermediate Greek Lexicon; Logeion; and W.R. Harper and J. Wallace, Xenophon’s Anabasis: Seven Books (New York: American Book Co., 1893).

Credits

For both lists many judgment calls had to be made about which words to include and how to list them. This work was carried out by Chris Francese in 2012–13, with valuable help from the following people. Wilfred Major of Louisiana State University was of enormous help at an early stage of the development of the Greek list, especially in analyzing the TLG frequency data and spotting innumerable pitfalls in it. Eric Casey of Sweet Briar College proofread the Greek list at a later stage and is responsible for a great many improvements. Meghan Reedy and Marc Mastrangelo, both of Dickinson College, improved the Latin and the Greek lists by helping to decide which words to include and how to list them. Dickinson students Alice Ettling, James Martin, Meredith Wilson, and Derek Frymark edited and proofread the lists in the summer of 2012, creating the semantic groupings and part-of-speech lists. They also helped compare the TLG and PhiloLogic Greek data, and digitized the LASLA Latin frequency data. Derek Frymark and Dickinson student Qingyu Wang created the searchable database versions in Drupal in the summer of 2013 with help from web developer Ryan Burke. Alex Lee, a graduate student at the University of Chicago, improved both lists with his careful proofreading. The Greek principal parts are based on the lists of Evan Hayes and Stephen Nimis in their edition of Lucian's True Story, though I also consulted the TLG itself to determine which principal parts were actually in common use (more details on this can be found here). The Portuguese translation was made in 2014 by Caio Camargo. The Polish version was made by Statek Feaków and the group of classical teachers called Ship of Phaeacians. The Chinese version was made at Shanghai Normal University in June of 2015 by the editorial board of Dickinson Classics Online. Special thanks are due to Michele Ferrero of Beijing Foreign Studies University, for his help with the Chinese version of the Latin list (maneo - respicio).

Terms of Use

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Please drop me an email to let me know how you use the list: francese@dickinson.edu.

Christopher Francese, September 25, 2015