Welcome to PHOIBLE

PHOIBLE is a repository of cross-linguistic phonological inventory data, which have been extracted from source documents and tertiary databases and compiled into a single searchable convenience sample. Release 2.0 from 2019 includes 3020 inventories that contain 3183 segment types found in 2186 distinct languages.

A bibliographic record is provided for each source document; note that some languages in PHOIBLE have multiple entries based on distinct sources that disagree about the number and/or identity of that language’s phonemes.
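As a rough illustration of what multiple entries per language look like in practice, the sketch below groups inventories by language and reports the segments that the sources do not agree on. The records, the field layout and the function names are hypothetical, invented for this example; they are not PHOIBLE's actual data or schema.

```python
from collections import defaultdict

# Hypothetical records: (inventory_id, language, source, phonemes).
# These values are made up for illustration and are not taken from PHOIBLE.
INVENTORIES = [
    (1, "Language A", "Source X", {"p", "t", "k", "a", "i", "u"}),
    (2, "Language A", "Source Y", {"p", "t", "k", "a", "i", "u", "e"}),
    (3, "Language B", "Source X", {"m", "n", "a"}),
]


def group_by_language(inventories):
    """Collect all inventories (one per source document) under their language."""
    grouped = defaultdict(list)
    for inv_id, language, source, phonemes in inventories:
        grouped[language].append((inv_id, source, phonemes))
    return dict(grouped)


def disagreements(inventories):
    """For languages with several inventories, list segments not shared by all of them."""
    report = {}
    for language, entries in group_by_language(inventories).items():
        if len(entries) < 2:
            continue  # a single source cannot disagree with itself
        sets = [phonemes for _, _, phonemes in entries]
        shared = set.intersection(*sets)
        everything = set.union(*sets)
        report[language] = sorted(everything - shared)
    return report
```

Here `disagreements(INVENTORIES)` flags only "Language A", because its two hypothetical sources differ on whether the vowel "e" belongs to the inventory.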

Two principles guide the development of PHOIBLE, though it has proved challenging both theoretically and technologically to abide by them:

  1. Be faithful to the language description in the source document (now often called ‘doculect’, for reasons indicated above)
  2. Encode all character data in a consistent representation in Unicode IPA

In addition to phoneme inventories, PHOIBLE includes distinctive feature data for every phoneme in every language. The feature system used was created by the PHOIBLE developers to be descriptively adequate cross-linguistically. In other words, if two phonemes differ in their graphemic representation, then they necessarily differ in their featural representation as well (regardless of whether those two phonemes coexist in any known doculect). The feature system is loosely based on the feature system in Hayes 2009 with some additions drawn from Moisik & Esling 2011.

However, the final feature system goes beyond both of these sources, and is potentially subject to change as new languages are added in subsequent editions of PHOIBLE.

The data set also includes additional genealogical and geographical information about each language from Glottolog.

The PHOIBLE project also integrates a theoretical model of distinctive features, using an extended phonological feature set based on the International Phonetic Alphabet (International Phonetic Association 2005) and on Hayes 2009. This is accomplished by mapping each IPA segment to a set of features (Moran 2012). In this way, the IPA serves as a pivot for interoperability across all resources in PHOIBLE, because their contents are encoded in Unicode IPA.
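The segment-to-feature mapping described above can be pictured as a lookup table keyed by Unicode IPA strings. The following is a minimal sketch under stated assumptions: the four feature names and their values are a small illustrative subset chosen for this example, not PHOIBLE's actual feature system, and the helper functions are hypothetical. It also checks the property that segments written differently never share an identical feature bundle.

```python
from itertools import combinations

# Hypothetical, heavily reduced feature table. The feature names and values
# are illustrative assumptions, not PHOIBLE's actual feature system.
FEATURES = {
    "p": {"consonantal": "+", "voice": "-", "labial": "+", "nasal": "-"},
    "b": {"consonantal": "+", "voice": "+", "labial": "+", "nasal": "-"},
    "m": {"consonantal": "+", "voice": "+", "labial": "+", "nasal": "+"},
    "a": {"consonantal": "-", "voice": "+", "labial": "-", "nasal": "-"},
}


def feature_vector(segment, table=FEATURES):
    """Return the feature bundle of an IPA segment as a sorted, hashable tuple."""
    return tuple(sorted(table[segment].items()))


def collisions(table=FEATURES):
    """List pairs of distinct segments that share an identical feature bundle."""
    return [
        (s1, s2)
        for s1, s2 in combinations(sorted(table), 2)
        if feature_vector(s1, table) == feature_vector(s2, table)
    ]


def distinguishing_features(s1, s2, table=FEATURES):
    """Return the names of the features on which two segments differ."""
    return sorted(f for f in table[s1] if table[s1][f] != table[s2].get(f))
```

In this toy table `collisions()` returns an empty list, and `distinguishing_features("p", "b")` shows that the two segments differ only in voicing. Because the keys are plain IPA strings, any resource encoded in Unicode IPA could be joined against such a table.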

For a detailed description of PHOIBLE, see Moran 2012. For examples of some of the research we are doing with PHOIBLE, see Moran et al. 2012, Cysouw et al. 2012, McCloy et al. 2013, and Moran & Blasi, Cross-linguistic comparison of complexity measures in phonological systems, forthcoming.

How to use PHOIBLE

Users can browse or search PHOIBLE's inventories by clicking on the tabs "Inventories", "Languages" or "Segments" above. Data can be downloaded by clicking the download button. If you use PHOIBLE in your research, please cite appropriately, following our recommended citation format.

How to cite PHOIBLE

If you are citing the database as a whole, or making use of the phonological distinctive feature systems in PHOIBLE, please cite as follows:

Moran, Steven & McCloy, Daniel (eds.) 2019. PHOIBLE 2.0.
Jena: Max Planck Institute for the Science of Human History.
(Available online at http://phoible.org, Accessed on 2019-05-21.)

If you are citing phoneme inventory data for a particular language or languages, please use the name of the language as the title, and include the original data source as an element within PHOIBLE:

UCLA Phonological Segment Inventory Database. 2019. Lelemi sound inventory (UPSID).
In: Moran, Steven & McCloy, Daniel (eds.) PHOIBLE 2.0.
Jena: Max Planck Institute for the Science of Human History.
(Available online at http://phoible.org/inventories/view/441, Accessed on 2019-05-21.)