Contributor UPSID: UCLA Phonological Segment Inventory Database

In the early 1980s, Ian Maddieson developed the UCLA Phonological Segment Inventory Database (UPSID), a computer-accessible database of contrastive segment inventories (Maddieson 1984). The initial sample of 317 languages drew on the work of the Stanford Phonology Archive (Crothers et al. 1979), but the compilers of the two databases did not always agree on the phonemic status and phonetic description of segments, so some entries were revised in UPSID (Maddieson 1984, p. 6). Maddieson and Precoda (1990) expanded the sample from 317 to 451 languages; both datasets are based on a quota sampling technique that aims to include one language from each small language family. UPSID inventories contain no descriptions of tone. The UPSID-451 data used in PHOIBLE Online were extracted from a DOS software package. Each segment description, originally given in an ASCII encoding (e.g. XW9:), was mapped to Unicode IPA, and each inventory was assigned an ISO 639-3 language code. For details, see Moran 2012, ch. 4; the UPSID-to-Unicode mappings are given in Moran 2012, Appendix F.

The UPSID folder contains data from the UCLA Phonological Segment Inventory Database:

Maddieson, I., & Precoda, K. (1990). Updating UPSID. UCLA Working Papers in Phonetics, 74, 104–111.

The contents of these data and the pipeline used to extract them are described in chapter 4 of:

Moran, Steven. (2012). Phonetics Information Base and Lexicon. PhD thesis, University of Washington. Online: https://digital.lib.washington.edu/researchworks/handle/1773/22452.

The data, taken from the original ASCII dump, are available in several files in this directory. These inventories contain only phonemes, with no information on allophones or linguistic tone.

We have converted the segment symbols in the raw data to Unicode IPA in line with the PHOIBLE conventions, as described in the UPSID_IPA_correspondences.tsv file.
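
As a rough illustration of how a correspondence table of this kind can be applied, the following Python sketch loads a two-column TSV into a dictionary and converts a list of segment codes, reporting any code that has no entry. The column names (upsid, ipa), the function names, and the sample rows are all assumptions made for this example: they are not taken from UPSID_IPA_correspondences.tsv, and the sample rows are not real UPSID-to-IPA correspondences.

```python
import csv
import io


def load_mapping(tsv_file, source_col="upsid", target_col="ipa"):
    """Read a correspondence table into a dict of source code -> IPA string.

    The default column names are assumptions; adjust them to the real header.
    """
    # QUOTE_NONE keeps quote characters in the data as literal text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[source_col]: row[target_col] for row in reader}


def convert_inventory(segments, mapping):
    """Map each segment code to IPA and collect the codes with no match."""
    converted, unmapped = [], []
    for code in segments:
        if code in mapping:
            converted.append(mapping[code])
        else:
            unmapped.append(code)
    return converted, unmapped


# Placeholder table: these rows are invented for illustration only and
# are NOT real UPSID-to-IPA correspondences.
SAMPLE_TSV = "upsid\tipa\nAAA\ta\nBBB\tb\n"

mapping = load_mapping(io.StringIO(SAMPLE_TSV))
converted, unmapped = convert_inventory(["AAA", "BBB", "ZZZ"], mapping)
print(converted)
print(unmapped)
```

To try this on an actual file, pass an open file handle instead of the in-memory sample, for instance one opened with encoding="utf-8" and newline="". Returning unmapped codes separately, rather than dropping them silently, makes gaps in the table easy to spot.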

Note that Henning Reetz has put online a simple user interface to the UPSID data, which can be used for browsing and quick queries.