Contributor SPA: Stanford Phonology Archive

The Stanford Phonology Archive (SPA) was the first computerized database of phonological segment inventories. It was inspired by Joseph Greenberg's research on universals and by his personal archive of data, drawn from his notebooks and his memory (Crothers et al. 1979, i-ii). The inventories in PHOIBLE Online come from the Handbook of Phonological Data From a Sample of the World's Languages, compiled and edited by Crothers et al. (1979) and kindly provided to the Phonetics Lab (University of Washington) by Marilyn M. Vihman. The inventories in SPA include descriptions of phonemes and allophones, along with comments on phonological contexts, for 197 languages. The inventory descriptions were digitized, and each phoneme was mapped from its original written description, e.g. d-pharyngealized, to a Unicode IPA representation. Each inventory was also assigned an ISO 639-3 language identifier. Details are given in Moran 2012, chapter 4, and the SPA-to-Unicode IPA mappings are given in Moran 2012, appendix E.

Inventory Language Segments Vowels Consonants Tones
Sinhalese (SPA 178) Sinhala 53 16 37 0
Tigre (SPA 130) Tigre 54 7 47 0
Lahu (SPA 23) Lahu 40 9 26 5
Ticuna (SPA 115) Ticuna 38 11 18 9
Angas (SPA 125) Ngas 44 11 30 3
Chipewyan (SPA 63) Chipewyan 61 18 39 4
Garo (SPA 24) Garo 26 5 21 0
Ewe (SPA 142) Ewe 45 14 28 3
Kanuri (SPA 150) Central Kanuri 35 6 25 4
Maung (SPA 48) Mawng 24 5 19 0
Basque (SPA 179) Basque 28 5 23 0
Korean (SPA 1) Korean 40 18 22 0
Jivaro (SPA 113) Shuar 24 8 16 0
Hupa (SPA 65) Hupa 42 7 35 0
Mundari (SPA 10) Mundari 37 5 32 0
Telugu (SPA 8) Telugu 68 21 47 0
Kaliai (SPA 40) Kaliai 28 10 18 0
Amharic (SPA 131) Amharic 59 7 52 0
Iraqw (SPA 129) Iraqw 48 12 33 3
Island Carib (SPA 107) Island Carib 29 12 17 0
Nez Perce (SPA 81) Nez Perce 36 10 26 0
Even (SPA 192) Even 36 18 18 0
Zuni (SPA 91) Zuni 54 10 44 0
Logbara (SPA 151) Lugbara 42 13 26 3
Campa (SPA 110) Asháninka 21 4 17 0
Digueno (SPA 90) Tipai 34 9 25 0
Yakut (SPA 190) Sakha 49 16 33 0
Tarascan (SPA 87) Purepecha 32 6 26 0
Chacobo (SPA 121) Chácobo 21 4 17 0
Otomi (SPA 97) Mezquital Otomi 58 13 42 3
Maltese (SPA 134) Maltese 55 10 45 0
Salish (SPA 69) Straits Salish 43 7 36 0
Paez (SPA 102) Páez 39 10 29 0
Kunjen (SPA 46) Kunjen 28 5 23 0
Yukaghir (SPA 193) Northern Yukaghir 32 12 20 0
Atayal (SPA 30) Atayal 27 8 19 0
Katcha (SPA 136) Katcha 32 7 23 2
English (SPA 160) English 40 13 27 0
Tiwa (SPA 92) Southern Tiwa 41 12 26 3
Nama (SPA 155) Nama 33 8 19 6
Nyangumata (SPA 45) Nyangumarta 27 7 20 0
Wichita (SPA 74) Wichita 29 8 21 0
Persian (SPA 172) Western Farsi 30 6 24 0
Carib (SPA 116) Galibi Carib 28 12 16 0
Sa'ban (SPA 34) Sa'ban 46 8 38 0
Malay (SPA 32) Standard Malay 27 6 21 0
Kharia (SPA 11) Kharia 41 10 31 0
Hawaiian (SPA 43) Hawaiian 19 10 9 0
Icelandic (SPA 158) Icelandic 39 16 23 0
Totonac (SPA 84) Papantla Totonac 26 6 20 0
Finnish (SPA 180) Finnish 42 17 25 0
Sundanese (SPA 35) Sundanese 26 7 19 0
Ojibwa (SPA 72) Eastern Ojibwa 27 11 16 0
Awiya (SPA 128) Awngi 39 7 28 4
Chuvash (SPA 189) Chuvash 39 9 30 0
Chontal (SPA 85) Tabasco Chontal 37 12 25 0
Russian (SPA 166) Russian 38 5 33 0
Tewa (SPA 93) Rio Grande Tewa 57 19 34 4
Tzeltal (SPA 86) Tzeltal 30 5 25 0
Apinaye (SPA 122) Apinayé 49 35 14 0
Yuchi (SPA 78) Yuchi 64 26 38 0
Japanese (SPA 197) Japanese 40 11 27 2
Komi (SPA 182) Komi-Zyrian 36 7 29 0
Moxo (SPA 111) Ignaciano 25 4 21 0
Navaho (SPA 66) Navajo 54 16 36 2
Alawa (SPA 50) Alawa 28 5 23 0
Seneca (SPA 76) Seneca 27 16 11 0
Cambodian (SPA 14) Central Khmer 42 21 21 0
Spanish (SPA 164) Spanish 25 5 20 0
Zulu (SPA 147) Zulu 43 5 35 3
Hungarian (SPA 183) Hungarian 65 15 50 0
Kunimaipa (SPA 58) Kunimaipa 20 5 15 0
Chamorro (SPA 38) Chamorro 36 6 30 0
Sedang (SPA 13) Sedang 53 14 39 0
Gadsup (SPA 56) Gadsup 19 6 9 4
Somali (SPA 127) Somali 52 20 30 2
Pima (SPA 96) Tohono O'odham 29 10 19 0
Swahili (SPA 145) Swahili 36 5 31 0
Gbeya (SPA 148) Gbaya-Bossangoa 45 12 31 2
Karen (SPA 25) S'gaw Karen 39 9 27 3
Yao (SPA 20) Iu Mien 49 8 35 6
Luo (SPA 153) Luo (Kenya and Tanzania) 35 9 23 3
Cheremis (SPA 181) Eastern Mari 34 9 25 0
Ket (SPA 2) Ket 32 14 18 0
Dagbani (SPA 139) Dagbani 37 11 24 2
Goajiro (SPA 109) Wayuu 46 18 28 0
Ga (SPA 141) Ga 47 12 32 3
Azerbaijani (SPA 187) North Azerbaijani 34 9 25 0
Burmese (SPA 22) Burmese 50 13 34 3
Aleut (SPA 61) Aleut 36 6 30 0
Zoque (SPA 83) Copainalá Zoque 32 6 26 0
Oneida (SPA 77) Oneida 22 12 10 0
Kashimiri (SPA 174) Kashmiri 56 29 27 0
Nootka (SPA 70) Nuu-chah-nulth 47 10 37 0
Haida (SPA 67) Northern Haida 50 3 47 0
Nasioi (SPA 60) Naasioi 18 10 8 0
Javanese (SPA 36) Javanese 30 8 22 0
Irish Gaelic (SPA 156) Irish 68 24 44 0
Lak (SPA 3) Lak 69 9 60 0
Luiseno (SPA 94) Luiseno 34 11 23 0
Hopi (SPA 95) Hopi 49 18 31 0
Alabama (SPA 79) Alabama 21 6 15 0
Modern Greek (SPA 170) Modern Greek 26 5 21 0
Inuit (SPA 62) Kalaallisut 38 6 32 0
Chasta Costa (SPA 64) Tolowa-Chetco 44 15 29 0
Lakkia (SPA 26) Lakkia 59 22 31 6
Amahuaca (SPA 120) Amahuaca 22 8 14 0
Moroccan Arabic (SPA 133) Moroccan Arabic 78 4 74 0
Maori (SPA 42) Maori 20 10 10 0
Aymara (SPA 105) Jaqaru 42 6 36 0
Mandarin Chinese (SPA 16) Mandarin Chinese 43 12 27 4
Luvale (SPA 146) Luvale 36 10 22 4
Squamish (SPA 68) Squamish 37 7 30 0
Khalkha (SPA 191) Halh Mongolian 35 14 21 0
Cantonese (SPA 19) Yue Chinese 32 5 22 5
Batak (SPA 33) Batak Toba 21 5 16 0
Mazateco (SPA 99) Chiquihuitlán Mazatec 38 8 26 4
Malagasy (SPA 31) Plateau Malagasy 36 5 31 0
Wapishana (SPA 108) Wapishana 29 12 17 0
Kota (SPA 9) Kota (India) 34 11 23 0
Kpelle (SPA 137) Liberia Kpelle 49 14 32 3
Telefol (SPA 55) Telefol 27 8 17 2
Chukchi (SPA 195) Chukchi 22 7 15 0
Ostyak (SPA 184) Kazym-Berezover-Suryskarer Khanty 32 13 19 0
German (SPA 161) German 39 16 23 0
Modern Hebrew (SPA 135) Modern Hebrew 30 6 24 0
Dakota (SPA 75) Lakota 36 9 27 0
Cayapa (SPA 101) Cha'palaa 28 4 24 0
Margi (SPA 126) Marghi Central 44 6 36 2
Breton (SPA 157) Breton 42 17 25 0
Beembe (SPA 144) Beembe 41 20 19 2
Adzera (SPA 41) Adzera 25 7 18 0
Amuesha (SPA 112) Yanesha' 29 6 23 0
Albanian (SPA 169) Northern Tosk Albanian 35 7 28 0
Western Desert (SPA 44) Antakarinya 23 6 17 0
Iai (SPA 39) Iaai 52 19 33 0
Mixtec (SPA 100) San Miguel El Grande Mixtec 32 10 19 3
Hakka (SPA 18) Hakka Chinese 31 6 21 4
Delaware (SPA 73) Unami 39 17 22 0
Sentani (SPA 52) Sentani 17 7 10 0
Igbo (SPA 143) Igbo 65 16 46 3
Rumanian (SPA 165) Romanian 31 9 22 0
Portuguese (SPA 163) Portuguese 38 15 23 0
Turkish (SPA 186) Turkish 40 16 24 0
Selepet (SPA 59) Selepet 21 6 15 0
Khasi (SPA 12) Khasi 28 9 19 0
Bulgarian (SPA 167) Bulgarian 49 12 37 0
Akan (SPA 140) Akan 40 15 22 3
Kirghiz (SPA 188) Kirghiz 36 16 20 0
Quechua (SPA 104) South Bolivian Quechua 37 5 32 0
Tagalog (SPA 37) Tagalog 28 10 18 0
Wolof (SPA 138) Wolof 40 14 26 0
Maranungku (SPA 47) Maranunggu 24 5 19 0
Pashto (SPA 173) Central Pashto 38 7 31 0
Karok (SPA 88) Karok 41 11 27 3
Shilha (SPA 123) Tarifiyt-Beni-Iznasen-Eastern Middle Atlas Berber 49 4 45 0
Maasai (SPA 154) Masai 43 18 21 4
Itonama (SPA 103) Itonama 25 6 19 0
Asmat (SPA 54) Tamnim Citak 19 5 14 0
Bengali (SPA 177) Bengali 53 17 36 0
Mazahua (SPA 98) Central Mazahua 63 15 45 3
Punjabi (SPA 175) Eastern Panjabi 73 20 50 3
Pomo (SPA 89) Southeastern Pomo 38 11 27 0
Gilyak (SPA 194) Nivkh 44 10 34 0
Tunica (SPA 80) Tunica 25 7 18 0
Maidu (SPA 82) Northeast Maidu 24 6 18 0
Georgian (SPA 5) Georgian 35 6 29 0
Cham (SPA 29) Western Cham 35 9 24 2
Lithuanian (SPA 168) Lithuanian 47 11 36 0
Ocaina (SPA 117) Ocaina 38 9 27 2
Norwegian (SPA 159) Norwegian Bokmål 48 19 27 2
Wik-Munkan (SPA 51) Wik-Mungkan 23 10 13 0
Siriono (SPA 118) Sirionó 28 10 18 0
Armenian (SPA 171) Eastern Armenian 37 7 30 0
Wu (SPA 17) Wu Chinese 41 9 29 3
Hausa (SPA 124) Hausa 45 10 32 3
French (SPA 162) French 40 17 23 0
Songhai (SPA 149) Koyraboro Senni Songhai 48 15 33 0
Vietnamese (SPA 15) Vietnamese 39 12 21 6
Burushaski (SPA 6) Burushaski 53 12 38 3
Guarani (SPA 119) Paraguayan Guaraní 36 12 24 0
Kwakiutl (SPA 71) Kwak'wala 55 9 46 0
Washkuk (SPA 53) Kwoma 31 7 24 0
Yurak (SPA 185) Tundra Nenets 34 8 26 0
Araucanian (SPA 106) Mapudungun 26 6 20 0
Ainu (SPA 196) Hokkaido Ainu 17 5 12 0
Hindi-Urdu (SPA 176) Hindi 94 23 71 0
Egyptian Arabic (SPA 132) Egyptian Arabic 67 9 58 0
Kabardian (SPA 4) Kabardian 56 7 49 0
Barasano (SPA 114) Waimaha 25 12 11 2
Kurukh (SPA 7) Kurukh 68 22 46 0
Nunggubuyu (SPA 49) Wubuy 24 5 19 0
Yay (SPA 28) Bouyei 38 8 24 6
Dafla (SPA 21) Nyishi-Hill Miri 27 9 16 2
Auyana (SPA 57) Awiyaana 20 6 11 3
Mahas-Fiyadikka (SPA 152) Nobiin 45 10 33 2
Thai (SPA 27) Thai 45 18 21 6
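
Each row above follows the pattern "Inventory (SPA number) Language Segments Vowels Consonants Tones". The following is a minimal sketch of how such rows could be parsed and sanity-checked. The function names are illustrative, and the check that Segments equals Vowels + Consonants + Tones is an assumption: it holds for the sample rows copied here and has not been verified against every row.

```python
import re

# One flattened row:
# "<Inventory> (SPA <n>) <Language> <Segments> <Vowels> <Consonants> <Tones>"
# The language name may itself contain spaces or parentheses.
ROW = re.compile(
    r"^(?P<inventory>.+?) \(SPA (?P<spa>\d+)\) (?P<language>.+) "
    r"(?P<segments>\d+) (?P<vowels>\d+) (?P<consonants>\d+) (?P<tones>\d+)$"
)

def parse_row(line):
    """Split one table row into named fields, converting counts to int."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    row = match.groupdict()
    for key in ("spa", "segments", "vowels", "consonants", "tones"):
        row[key] = int(row[key])
    return row

def counts_add_up(row):
    """Assumed invariant: segments = vowels + consonants + tones."""
    return row["segments"] == row["vowels"] + row["consonants"] + row["tones"]

# Three rows copied from the table above.
sample = [
    "Sinhalese (SPA 178) Sinhala 53 16 37 0",
    "Lahu (SPA 23) Lahu 40 9 26 5",
    "Luo (SPA 153) Luo (Kenya and Tanzania) 35 9 23 3",
]
rows = [parse_row(line) for line in sample]
for row in rows:
    print(row["inventory"], "|", row["language"], "|", counts_add_up(row))
```

The non-greedy match on the inventory name stops at the first "(SPA n)", so language names that contain their own parentheses, such as "Luo (Kenya and Tanzania)", are kept whole.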

The SPA folder contains data from the Stanford Phonology Archive:

Crothers, J. H., Lorentz, J. P., Sherman, D. A., & Vihman, M. M. (1979). Handbook of phonological data from a sample of the world’s languages: A report of the Stanford Phonology Archive. Palo Alto, CA: Department of Linguistics, Stanford University.

The contents of these data and the pipeline used to extract them are described in chapter 4 of:

Moran, Steven. (2012). Phonetics Information Base and Lexicon. PhD thesis, University of Washington. Online: https://digital.lib.washington.edu/researchworks/handle/1773/22452.

The data are available in phoible long format in SPA_Phones.tsv. Each inventory gives the language name used in the source together with its phonemes, allophones and tones. The inventories file also contains additional information in the form of footnotes, which are explained in detail in the original source (Crothers et al., 1979).
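
As a rough illustration of working with a long-format file, the sketch below groups rows into one phoneme list per inventory. The column names (InventoryID, LanguageName, Phoneme, Allophones) and the sample rows are made up for the example; check the header of SPA_Phones.tsv for the actual names before adapting it.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in an ASSUMED long format: one phoneme per line.
# The real SPA_Phones.tsv may use different column names and values.
SAMPLE_TSV = (
    "InventoryID\tLanguageName\tPhoneme\tAllophones\n"
    "X1\tExampleLang\tp\tp pʰ\n"
    "X1\tExampleLang\ta\ta\n"
    "X2\tOtherLang\tk\tk\n"
)

def load_inventories(handle):
    """Group long-format rows into {inventory id: [phoneme, ...]}."""
    inventories = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        inventories[row["InventoryID"]].append(row["Phoneme"])
    return dict(inventories)

inventories = load_inventories(io.StringIO(SAMPLE_TSV))
for inventory_id, phonemes in inventories.items():
    print(inventory_id, len(phonemes), phonemes)
```

To read the file itself rather than the inline sample, the same function can be given a handle opened with `open("SPA_Phones.tsv", encoding="utf-8", newline="")`.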

We have converted the symbols in the raw data to Unicode IPA in line with the phoible conventions. The correspondences are listed in the SPA_IPA_correspondences.tsv file.
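
A correspondence file of this kind can be applied as a simple lookup table. The sketch below is only an illustration: the column names (SPA_description, IPA) are hypothetical, and the single mapping shown, from the written description d-pharyngealized to dˤ, is the conventional IPA rendering rather than a row copied from SPA_IPA_correspondences.tsv.

```python
import csv
import io

# Illustrative correspondence table; the column names are assumptions and
# the single row is an example, not a copy of the real file.
SAMPLE_MAP = (
    "SPA_description\tIPA\n"
    "d-pharyngealized\td\u02e4\n"  # d + MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
)

def load_mapping(handle):
    """Read a two-column TSV into {written description: IPA string}."""
    return {
        row["SPA_description"]: row["IPA"]
        for row in csv.DictReader(handle, delimiter="\t")
    }

def to_ipa(description, mapping):
    """Look up one description; fail loudly instead of guessing a symbol."""
    if description not in mapping:
        raise KeyError(f"no IPA correspondence for {description!r}")
    return mapping[description]

mapping = load_mapping(io.StringIO(SAMPLE_MAP))
print(to_ipa("d-pharyngealized", mapping))
```

Raising an error for unmapped descriptions, rather than passing them through unchanged, makes gaps in the correspondence table visible during conversion.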

Note that the ISO 639-3 codes in the SPA source may be out of date relative to the current ISO 639-3 standard. For more information, see: https://iso639-3.sil.org/.

For up-to-date language codes for each inventory, we maintain a phoible index here: InventoryID-LanguageCodes.tsv.
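
One way to use such an index is to prefer its code for each inventory and fall back to the code recorded in the source only when the index has no entry. The sketch below assumes hypothetical column names (InventoryID, LanguageCode) and made-up inventory IDs; the actual layout of InventoryID-LanguageCodes.tsv may differ.

```python
import csv
import io

# Made-up index rows; the column names and IDs are assumptions.
# "kor" and "sin" are the ISO 639-3 codes for Korean and Sinhala.
SAMPLE_INDEX = (
    "InventoryID\tLanguageCode\n"
    "X1\tkor\n"
    "X2\tsin\n"
)

def load_index(handle):
    """Read the index into {inventory id: current language code}."""
    return {
        row["InventoryID"]: row["LanguageCode"]
        for row in csv.DictReader(handle, delimiter="\t")
    }

def current_code(inventory_id, source_code, index):
    """Prefer the maintained index; fall back to the code from the source."""
    return index.get(inventory_id, source_code)

index = load_index(io.StringIO(SAMPLE_INDEX))
print(current_code("X1", "old", index))  # code taken from the index
print(current_code("X9", "old", index))  # no index entry: source code kept
```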