Welcome to the Austronesian Basic Vocabulary Database.
This database contains 284,626 lexical items from 1,467 languages spoken throughout the Pacific region. Most of these languages belong to the Austronesian language family, which is the largest family in the world, containing between 1,000 and 1,200 languages.
Each language in our database has around 210 words associated with it. These words correspond to basic items of vocabulary, such as simple verbs like 'to walk' or 'to fly', the names of body parts like 'hand' or 'mouth', colors like 'red', numbers (1, 2, 3, 4) and kinship terms such as 'mother' and 'father', along with words like 'person'. The full list is here.
We have a new paper out in Science: Language Phylogenies Reveal Pulses and Pauses in Pacific Settlement.
Warning: Please do not cite or publish anything using this database without the authors' permission.
We would welcome assistance from professional linguists to correct any mistakes, improve the cognacy judgments and add further languages.
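The 1,467 entries in this database are equivalent to between 122.3% and 146.7% of the estimated 1,000 to 1,200 Austronesian languages, but this count also includes dialects, reconstructed proto-languages and non-Austronesian languages, so many Austronesian languages are still missing and we would love to expand our coverage. If you have data for an Austronesian, Trans New Guinea / Papuan / Indo-Pacific or Australian language or dialect that is not here, please help us out.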
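The percentage range quoted above appears to be the 1,467 database entries divided by the upper and lower estimates of the family's size (1,200 and 1,000 languages). The following is a minimal Python sketch of that arithmetic, assuming half-up rounding to one decimal place. The function name `coverage_percent` is hypothetical and is not part of the database.

```python
from decimal import Decimal, ROUND_HALF_UP


def coverage_percent(entries: int, family_size: int) -> Decimal:
    """Return `entries` as a percentage of `family_size`, rounded half-up to 0.1.

    Hypothetical helper for illustration only; it is an assumption about how
    the quoted figures were derived, not the database's own code.
    """
    raw = Decimal(entries) * 100 / Decimal(family_size)
    return raw.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


entries = 1467                            # languages listed in the database
low = coverage_percent(entries, 1200)     # against the upper size estimate
high = coverage_percent(entries, 1000)    # against the lower size estimate
print(f"{low}% to {high}%")               # prints "122.3% to 146.7%"
```

Both values exceed 100% because the entry count is not limited to distinct Austronesian languages, so the ratio is not a true measure of coverage of the family.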
To cite this database, please reference the following paper. You should also include the date that you obtained the data, as we are constantly improving and correcting the information contained here.
Greenhill, S.J., Blust, R., & Gray, R.D. (2008). The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4:271-283.
Commercial use of these data is forbidden without the express permission of the authors.
Contact us by email with any queries, comments or suggestions: Simon Greenhill
Added Mindoro Tagalog from a native Tagalog speaker. Thanks to Dario Marticio
Added Proto-Sama-Bajaw from Pallesen (1985). Thanks to Andrew Hsiu
Added Proto-Manobo from Richard E. Elkins (1974). Thanks to Andrew Hsiu
Added Proto-Subanen from Jason Lobel (2013). Thanks to Andrew Hsiu
Added Isinay (Dupax) from Sarah Eve Perlawan. Thanks to Andrew Hsiu