aspell: Supported
1
1 B.1 Supported
1 =============
1
1 Aspell 0.60 should be able to support the following languages:
1
1 Code  Language Name  Script  Dictionary Available  Gettext Translation
1
1 aa Afar Latin - -
1 af Afrikaans Latin 0.50 -
1 ak Akan Latin Maybe -
1 am Amharic Ethiopic 0.60 -
1 ar Arabic Arabic 0.60 -
1 as Assamese Bengali - -
1 av Avar Cyrillic - -
1 ay Aymara Latin - -
1 az Azerbaijani Cyrillic, Latin 0.60 -
1
1 ba Bashkir Cyrillic - -
1 be Belarusian Cyrillic 0.50 Incomplete
1 bg Bulgarian Cyrillic 0.50 -
1 bh Bihari Devanagari - -
1 bm Bambara Latin - -
1 bn Bengali Bengali 0.60 -
1 bo Tibetan Tibetan - -
1 br Breton Latin 0.50 -
1 bs Bosnian Latin Maybe -
1
1 ca Catalan / Valencian Latin 0.50 Yes
1 ce Chechen Cyrillic - -
1 co Corsican Latin Maybe -
1 cop Coptic Greek Maybe -
1 cs Czech Latin 0.50 Yes
1 csb Kashubian Latin 0.60 -
1 cv Chuvash Cyrillic - -
1 cy Welsh Latin 0.50 -
1
1 da Danish Latin 0.50 Incomplete
1 de German Latin 0.50 Yes
1 dyu Dyula - Maybe -
1
1 ee Ewe Latin - -
1 el Greek Greek 0.50 -
1 en English Latin 0.50 Yes
1 eo Esperanto Latin 0.50 -
1 es Spanish Latin 0.50 Incomplete
1 et Estonian Latin 0.60 -
1 eu Basque Latin Maybe -
1
1 fa Persian Arabic 0.60 -
1 ff Fulah Latin Maybe -
1 fi Finnish Latin 0.60 -
1 fj Fijian Latin Maybe -
1 fo Faroese Latin 0.50 -
1 fr French Latin 0.50 Yes
1 fur Friulian Latin Maybe -
1 fy Frisian Latin 0.60 -
1
1 ga Irish Latin 0.50 Yes
1 gd Scottish Gaelic Latin 0.50 -
1 gl Gallegan Latin 0.50 -
1 gn Guarani Latin Maybe -
1 gu Gujarati Gujarati 0.60 -
1 gv Manx Gaelic Latin 0.50 -
1
1 ha Hausa Latin Maybe -
1 he Hebrew Hebrew 0.60 -
1 hi Hindi Devanagari 0.60 -
1 hil Hiligaynon Latin 0.50 -
1 ho Hiri Motu Latin - -
1 hr Croatian Latin 0.50 -
1 hsb Upper Sorbian Latin 0.60 -
1 ht Haitian Creole Latin Maybe -
1 hu Hungarian Latin 0.60 -
1 hy Armenian Armenian 0.60 -
1 hz Herero Latin - -
1
1 ia Interlingua (IALA) Latin 0.50 -
1 id Indonesian Arabic, Latin 0.50 -
1 ig Igbo Latin Maybe -
1 ii Sichuan Yi Yi - -
1 io Ido Latin - -
1 is Icelandic Latin 0.50 -
1 it Italian Latin 0.50 Yes
1
1 jv Javanese Javanese, Latin Maybe -
1
1 ka Georgian Georgian - -
1 kg Kongo Latin Maybe -
1 ki Kikuyu / Gikuyu Latin - -
1 kj Kwanyama Latin - -
1 kk Kazakh Cyrillic - -
1 km Khmer Khmer Maybe -
1 kn Kannada Kannada Planned -
1 kr Kanuri Latin - -
1 ks Kashmiri Arabic, Devanagari - -
1 ku Kurdish Arabic, Cyrillic, Latin 0.50 -
1 kv Komi Cyrillic - -
1 ky Kirghiz Arabic, Cyrillic, Latin Maybe -
1
1 la Latin Latin 0.60 -
1 lb Luxembourgish Latin Maybe -
1 lg Ganda Latin Maybe -
1 li Limburgian Latin Maybe -
1 ln Lingala Latin Maybe -
1 lt Lithuanian Latin 0.60 -
1 lu Luba-Katanga Latin - -
1 lv Latvian Latin 0.60 -
1
1 mg Malagasy Latin 0.50 -
1 mi Maori Latin 0.50 -
1 mk Macedonian Cyrillic 0.50 -
1 ml Malayalam Latin, Malayalam 0.60 -
1 mn Mongolian Cyrillic, Mongolian 0.60 Incomplete
1 mo Moldavian Cyrillic - -
1 mos Mossi - Maybe -
1 mr Marathi Devanagari 0.60 -
1 ms Malay Arabic, Latin 0.50 -
1 mt Maltese Latin 0.50 -
1 my Burmese Myanmar - -
1
1 nb Norwegian Bokmal Latin 0.50 -
1 nd North Ndebele Latin Maybe -
1 nds Low Saxon Latin 0.60 -
1 ne Nepali Devanagari Maybe -
1 ng Ndonga Latin Maybe -
1 nl Dutch Latin 0.50 Yes
1 nn Norwegian Nynorsk Latin 0.50 -
1 nr South Ndebele Latin Maybe -
1 nso Northern Sotho Latin Maybe -
1 nv Navajo Latin Maybe -
1 ny Nyanja Latin 0.50 -
1
1 oc Occitan / Provencal Latin Maybe -
1 om Oromo Ethiopic, Latin - -
1 or Oriya Oriya 0.60 -
1 os Ossetic Cyrillic - -
1
1 pa Punjabi Gurmukhi 0.60 -
1 pl Polish Latin 0.50 -
1 ps Pushto Arabic - -
1 pt Portuguese Latin 0.50 Incomplete
1
1 qu Quechua Latin 0.60 -
1
1 rn Rundi Latin Maybe -
1 ro Romanian Latin 0.50 Incomplete
1 ru Russian Cyrillic 0.50 Yes
1 rw Kinyarwanda Latin 0.50 -
1
1 sc Sardinian Latin 0.50 -
1 sd Sindhi Arabic - -
1 sg Sango Latin Maybe -
1 si Sinhalese Sinhala - -
1 sk Slovak Latin 0.50 Yes
1 sl Slovenian Latin 0.50 Yes
1 sm Samoan Latin Maybe -
1 sn Shona Latin Maybe -
1 so Somali Latin Maybe -
1 sq Albanian Latin Maybe -
1 sr Serbian Cyrillic, Latin 0.60 Incomplete
1 ss Swati Latin Maybe -
1 st Southern Sotho Latin Maybe -
1 su Sundanese Latin Maybe -
1 sv Swedish Latin 0.50 Incomplete
1 sw Swahili Latin 0.50 -
1
1 ta Tamil Tamil 0.60 -
1 te Telugu Telugu 0.60 -
1 tet Tetum Latin 0.50 -
1 tg Tajik Arabic, Cyrillic, Latin Maybe Incomplete
1 ti Tigrinya Ethiopic Maybe -
1 tk Turkmen Arabic, Cyrillic, Latin 0.50 -
1 tl Tagalog Latin, Tagalog 0.50 -
1 tn Tswana Latin 0.50 -
1 to Tonga Latin Maybe -
1 tr Turkish Arabic, Latin 0.50 -
1 ts Tsonga Latin Maybe -
1 tt Tatar Cyrillic - -
1 tw Twi Latin - -
1 ty Tahitian Latin Maybe -
1
1 ug Uighur Arabic, Cyrillic, Latin - -
1 uk Ukrainian Cyrillic 0.50 Yes
1 ur Urdu Arabic Maybe -
1 uz Uzbek Cyrillic, Latin 0.60 -
1
1 ve Venda Latin Maybe -
1 vi Vietnamese Latin 0.60 Yes
1
1 wa Walloon Latin 0.50 Incomplete
1 wo Wolof Latin Maybe -
1
1 xh Xhosa Latin Maybe -
1
1 yi Yiddish Hebrew 0.60 -
1 yo Yoruba Latin Maybe -
1
1 za Zhuang Latin - -
1 zu Zulu Latin 0.50 -
1
1 Dictionaries marked as "0.50" are available for Aspell 0.50. Ones
1 marked as "0.60" are available for Aspell 0.60 only. Ones marked as
1 "Planned" should eventually be available. Ones marked as "Maybe" might
1 be available in the future. See Planned Dictionaries for more
1 information.
1
1 B.1.1 Notes on Latin Languages
1 ------------------------------
1
1 Any word that can be written using one of the Latin ISO-8859 character
1 sets (ISO-8859-1,2,3,4,9,10,13,14,15,16) can be written, in decomposed
1 form, using the ASCII characters plus the following 23 additional letters:
1
1 U+00C6 LATIN CAPITAL LETTER AE
1 U+00D0 LATIN CAPITAL LETTER ETH
1 U+00D8 LATIN CAPITAL LETTER O WITH STROKE
1 U+00DE LATIN CAPITAL LETTER THORN
1 U+00FE LATIN SMALL LETTER THORN
1 U+00DF LATIN SMALL LETTER SHARP S
1 U+00E6 LATIN SMALL LETTER AE
1 U+00F0 LATIN SMALL LETTER ETH
1 U+00F8 LATIN SMALL LETTER O WITH STROKE
1 U+0110 LATIN CAPITAL LETTER D WITH STROKE
1 U+0111 LATIN SMALL LETTER D WITH STROKE
1 U+0126 LATIN CAPITAL LETTER H WITH STROKE
1 U+0127 LATIN SMALL LETTER H WITH STROKE
1 U+0131 LATIN SMALL LETTER DOTLESS I
1 U+0138 LATIN SMALL LETTER KRA
1 U+0141 LATIN CAPITAL LETTER L WITH STROKE
1 U+0142 LATIN SMALL LETTER L WITH STROKE
1 U+014A LATIN CAPITAL LETTER ENG
1 U+014B LATIN SMALL LETTER ENG
1 U+0152 LATIN CAPITAL LIGATURE OE
1 U+0153 LATIN SMALL LIGATURE OE
1 U+0166 LATIN CAPITAL LETTER T WITH STROKE
1 U+0167 LATIN SMALL LETTER T WITH STROKE
1
1 and the 14 modifiers:
1
1 U+0300 COMBINING GRAVE ACCENT
1 U+0301 COMBINING ACUTE ACCENT
1 U+0302 COMBINING CIRCUMFLEX ACCENT
1 U+0303 COMBINING TILDE
1 U+0304 COMBINING MACRON
1 U+0306 COMBINING BREVE
1 U+0307 COMBINING DOT ABOVE
1 U+0308 COMBINING DIAERESIS
1 U+030A COMBINING RING ABOVE
1 U+030B COMBINING DOUBLE ACUTE ACCENT
1 U+030C COMBINING CARON
1 U+0326 COMBINING COMMA BELOW
1 U+0327 COMBINING CEDILLA
1 U+0328 COMBINING OGONEK
1
1 This gives a total of 37 additional Unicode code points.
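
The decomposition described above can be illustrated with Python's standard
unicodedata module. This is only a sketch of the general idea using Unicode
canonical decomposition (NFD); it is not Aspell's own code, and the helper
name decompose is hypothetical. It shows that an accented letter splits into
an ASCII base letter plus a combining modifier, while a letter such as
U+00F8 (o with stroke) has no decomposition and must therefore be kept as
one of the additional letters.

```python
import unicodedata


def decompose(word):
    """Return the code points of `word` after canonical decomposition (NFD)."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", word)]


# e with acute accent -> ASCII "e" followed by COMBINING ACUTE ACCENT
print(decompose("\u00e9"))   # ['U+0065', 'U+0301']

# o with double acute -> ASCII "o" followed by COMBINING DOUBLE ACUTE ACCENT
print(decompose("\u0151"))   # ['U+006F', 'U+030B']

# o with stroke has no canonical decomposition, so it stays a single letter
print(decompose("\u00f8"))   # ['U+00F8']
```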
1
1 All ISO-8859 character sets leave the characters 0x00 - 0x1F and 0x80 -
1 0x9F unmapped, as they are generally used as control characters. Of
1 those, 0x01 - 0x0F, 0x11 - 0x1F, and 0x80 - 0x9F may be mapped to
1 anything in Aspell. This is a total of 62 characters (15 + 15 + 32) which
1 can be remapped in any ISO-8859 character set. Thus, by remapping 37 of
1 the 62 characters to the previously specified Unicode code points, any
1 modified ISO-8859 character set can be used for any Latin language
1 covered by ISO-8859. Of course, decomposing every single accented
1 character wastes a lot of space, so only characters that cannot be
1 represented in precomposed form should be broken up. By using this
1 trick it is possible to store foreign words in their correctly accented
1 form in the dictionary even if the precomposed character is not in the
1 current character set.
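
The slot arithmetic in the paragraph above can be checked with a short
Python sketch. The assignment order used here (first free byte to first
listed code point) is purely an assumption for illustration, and the names
FREE_SLOTS, build_remap, and REMAP are hypothetical; the actual assignment
is whatever a given Aspell character-set definition specifies.

```python
# The 23 additional letters and 14 combining modifiers listed above
# (LATIN SMALL LETTER THORN is U+00FE).
EXTRA_LETTERS = [
    0x00C6, 0x00D0, 0x00D8, 0x00DE, 0x00FE, 0x00DF, 0x00E6, 0x00F0,
    0x00F8, 0x0110, 0x0111, 0x0126, 0x0127, 0x0131, 0x0138, 0x0141,
    0x0142, 0x014A, 0x014B, 0x0152, 0x0153, 0x0166, 0x0167,
]
MODIFIERS = [
    0x0300, 0x0301, 0x0302, 0x0303, 0x0304, 0x0306, 0x0307, 0x0308,
    0x030A, 0x030B, 0x030C, 0x0326, 0x0327, 0x0328,
]

# Byte values that may be remapped: 0x01-0x0F, 0x11-0x1F, and 0x80-0x9F.
FREE_SLOTS = [*range(0x01, 0x10), *range(0x11, 0x20), *range(0x80, 0xA0)]


def build_remap(slots, code_points):
    """Pair free byte values with code points, in order (illustrative only)."""
    if len(code_points) > len(slots):
        raise ValueError("not enough free slots")
    return dict(zip(slots, code_points))


REMAP = build_remap(FREE_SLOTS, EXTRA_LETTERS + MODIFIERS)

# Free slots, slots used, slots still free.
print(len(FREE_SLOTS), len(REMAP), len(FREE_SLOTS) - len(REMAP))  # 62 37 25
```

After the 37 code points are placed, 25 remappable byte values remain free.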
1
1 Any letter in the Unicode range U+0000 - U+0249, U+1E00 - U+1EFF
1 (Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B,
1 and Latin Extended Additional) can be represented using around 175
1 basic letters and 25 modifiers, about 200 symbols in total. This is fewer
1 than 210 and can thus fit in an Aspell 8-bit character set. Since this
1 Unicode range covers any possible Latin language, this special character
1 set can be used to represent any word written using the Latin script if
1 so desired.
1
1 B.1.2 Syllabic
1 --------------
1
1 Syllabic languages use a separate symbol for each syllable of the
1 language. Even though most of them have more than 210 distinct
1 symbols, Aspell can still support them by breaking the symbols up.
1
1 B.1.2.1 The Ethiopic Syllabary
1 ..............................
1
1 Even though the Ethiopic script has more than 210 distinct characters,
1 Aspell can still handle it. The idea is to split each character into
1 two parts, a consonant part and a vowel part. This encoding of the
1 syllabary is far more useful to Aspell than storing the characters in
1 UTF-8 or UTF-16. In fact, the existing suggestion strategy of Aspell will
1 work well with this encoding without any additional modifications.
1 However, additional improvements may be possible by taking advantage of
1 the consonant-vowel structure of this encoding.
1
1 In fact, the split consonant-vowel representation may prove to be so
1 useful that it may be beneficial to encode other syllabaries in this
1 fashion, even if they have fewer than 210 symbols.
1
1 The code to break up a syllabary into its consonant and vowel parts is
1 part of the Unicode normalization process.
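
As a rough illustration of the consonant-vowel split, the following Python
sketch relies on the layout of the Unicode Ethiopic block, where the basic
syllables starting at U+1200 are arranged in rows of eight code points per
consonant, one column per vowel order. This is an assumption-laden sketch
and not Aspell's actual encoding: the function names are hypothetical, only
the range U+1200 - U+1357 is handled, and some positions in that range are
unassigned in Unicode.

```python
ETHIOPIC_BASE = 0x1200   # first syllable in the Unicode Ethiopic block
ETHIOPIC_LAST = 0x1357   # last code point handled by this sketch (assumption)
ORDERS_PER_ROW = 8       # each consonant row holds eight vowel-order slots


def split_syllable(ch):
    """Split one Ethiopic syllable into (consonant_row, vowel_order)."""
    cp = ord(ch)
    if not ETHIOPIC_BASE <= cp <= ETHIOPIC_LAST:
        raise ValueError(f"U+{cp:04X} is outside the range handled here")
    return divmod(cp - ETHIOPIC_BASE, ORDERS_PER_ROW)


def join_syllable(consonant_row, vowel_order):
    """Inverse of split_syllable: rebuild the single code point."""
    if not 0 <= vowel_order < ORDERS_PER_ROW:
        raise ValueError("vowel order must be in 0..7")
    cp = ETHIOPIC_BASE + consonant_row * ORDERS_PER_ROW + vowel_order
    if cp > ETHIOPIC_LAST:
        raise ValueError("consonant row is outside the range handled here")
    return chr(cp)


# U+1208 (LA) and U+1209 (LU) share a consonant row and differ in vowel order.
print(split_syllable("\u1208"))  # (1, 0)
print(split_syllable("\u1209"))  # (1, 1)
```

With a split like this, a misspelling that gets only the vowel wrong differs
from the correct word in a single component rather than in a whole symbol,
which is the kind of structure the suggestion strategy can exploit.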
1
1 B.1.2.2 The Yi Syllabary
1 ........................
1
1 The Yi syllabary is very large, with 819 distinct symbols. However, like
1 Ethiopic, it should be possible to support this script by breaking it
1 up.
1
1 B.1.2.3 The Ojibwe Syllabary
1 ............................
1
1 Because this syllabary has only 120 distinct symbols, Aspell can actually
1 support it as is. However, as previously mentioned, it may be beneficial
1 to break it up into the consonant-vowel representation anyway.
1