Welcome to the Austronesian Basic Vocabulary Database.
This database contains 233,518 lexical items from 1,179 languages spoken throughout the Pacific region. Most of these languages belong to the Austronesian language family, which is the largest family in the world, containing between around 1,000 and 1,200 languages.
Each language in our database has around 210 words associated with it. These words correspond to basic items of vocabulary, such as simple verbs like 'to walk' or 'to fly', the names of body parts like 'hand' or 'mouth', colors like 'red', numbers (1, 2, 3, 4), and kinship and person terms such as 'mother', 'father' and 'person'. The full list is here.
We have a new paper out in Science: Language Phylogenies Reveal Pulses and Pauses in Pacific Settlement.
Warning: Please do not cite or publish anything using this database without the authors' permission.
We would welcome assistance from professional linguists to correct any mistakes, improve the cognacy judgments, and add further languages.
We only have between 98.3% and 117.9% of all of the Austronesian languages represented in this database, and would love to expand this number. If you have data for an Austronesian, Trans New Guinea / Papuan / Indo-Pacific or Australian language or dialect that is not here, please help us out.
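The coverage range quoted above appears to follow from dividing the 1,179 languages in the database by the two estimates of the family's size (1,200 and 1,000). The short Python sketch below reproduces that arithmetic. It assumes this is how the figures were derived, and the names in it (`coverage_percent`, `LANGUAGES_IN_DATABASE`, and so on) are hypothetical and used only for illustration.

```python
# Figures quoted on this page.
LANGUAGES_IN_DATABASE = 1179
# Estimated size of the Austronesian family (lower and upper bound).
FAMILY_SIZE_LOW = 1000
FAMILY_SIZE_HIGH = 1200


def coverage_percent(languages_held: int, family_size: int) -> float:
    """Return languages_held as a percentage of family_size."""
    return languages_held * 100 / family_size


# The larger family estimate gives the lower coverage bound, and vice versa.
lowest = coverage_percent(LANGUAGES_IN_DATABASE, FAMILY_SIZE_HIGH)
highest = coverage_percent(LANGUAGES_IN_DATABASE, FAMILY_SIZE_LOW)
print(f"coverage between {lowest:.2f}% and {highest:.2f}%")
```

Under this reading the upper bound exceeds 100% because the database total counts every language and dialect it holds, including the non-Austronesian ones, so the range is best treated as a rough indication rather than a precise measure of coverage.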
To cite this database, please reference the following paper. You should also include the date that you obtained the data, as we are constantly improving and correcting the information contained here.
Greenhill, S.J., Blust, R., & Gray, R.D. (2008). The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4:271-283.
Commercial use of these data is forbidden without the express permission of the authors.
Contact us by email with any queries, comments or suggestions: Simon Greenhill
Added Lelepa from Sébastien Lacrampe. Thanks to Sébastien Lacrampe
Added Uma (Kantewu dialect) from Martens, Michael. Thanks to Michael P. Martens
Added Kara East. Thanks to Matthew Dryer
Added Andio from Busenitz, Robert. 2013. Andio word list and sentences. Thanks to Robert Busenitz and David Mead
Added Moor from Kamholz. Thanks to David Kamholz