Welcome to the Austronesian Basic Vocabulary Database.
This database contains 261,027 lexical items from 1,339 languages spoken throughout the Pacific region. Most of these languages belong to the Austronesian language family, which is the largest family in the world, containing between 1,000 and 1,200 languages.
Each language in our database has around 210 words associated with it. These words correspond to basic items of vocabulary, such as simple verbs like 'to walk' or 'to fly', the names of body parts like 'hand' or 'mouth', colors like 'red', numbers (1, 2, 3, 4), and kinship terms such as 'mother', 'father' and 'person'. The full list is here.
We have a new paper out in Science: Language Phylogenies Reveal Pulses and Pauses in Pacific Settlement.
Warning: Please do not cite or publish anything using this database without the authors' permission.
We would welcome assistance from professional linguists to correct any mistakes, improve the cognacy judgments, and add further languages.
Not every Austronesian language is represented in this database yet, and we would love to expand its coverage. If you have data for an Austronesian, Trans New Guinea / Papuan / Indo-Pacific or Australian language or dialect that is not here, please help us out.
To cite this database, please reference the following paper. You should also include the date that you obtained the data, as we are constantly improving and correcting the information contained here.
Greenhill, S.J., Blust, R., & Gray, R.D. (2008). The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4:271-283.
Commercial use of these data is forbidden without the express permission of the authors.
Contact us by email with any queries, comments or suggestions: Simon Greenhill
Added Ujir (Schapper) from Antoinette Schapper.
Added Batuley from Daigle 2015.
Added Wogeo from Malcolm Ross, fieldnotes supplemented by Anderson & Exter (2005).
Added Tami from Malcolm Ross, fieldnotes, supplemented by Bamler (1990).
Added Mumeng (Patep dialect) from Malcolm Ross, fieldnotes & Adams (n.d.).