Version: 2.0.0

Languages Supported

LLMWhisperer supports more than 300 languages for printed text and 12 languages for handwritten text in its Form, Table, and High Quality modes.

Quick Overview

Different LLMWhisperer modes offer varying levels of language support:

| Mode | Multilingual Support | Supported Languages | Best For |
|---|---|---|---|
| Form | ✅ Yes | 300+ printed, 12 handwritten | Structured documents with forms |
| Table | ✅ Yes | 300+ printed, 12 handwritten | Documents with tables and data |
| High Quality | ✅ Yes | 300+ printed, 12 handwritten | Complex layouts, maximum accuracy |
| Native Text | ❌ Limited | English-focused | Digital PDFs with selectable text |
| Low Cost | ❌ Limited | English-focused | Simple scanned documents |

Language Support Overview

Important

Native Text and Low Cost modes are English-focused and offer only limited support for other languages. For multilingual documents, use Form, Table, or High Quality mode.
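
When a pipeline has to pick a mode per document, the overview table can be encoded as a small lookup. The following is a minimal sketch only: the mode identifiers (`form`, `table`, `high_quality`, `native_text`, `low_cost`) and the helper name `modes_for` are illustrative assumptions, not confirmed API values.

```python
# Illustrative mode identifiers; the exact strings accepted by the API
# are an assumption and should be checked against the API reference.
MULTILINGUAL_MODES = ("form", "table", "high_quality")
ENGLISH_FOCUSED_MODES = ("native_text", "low_cost")


def modes_for(needs_multilingual: bool) -> tuple[str, ...]:
    """Return the modes suitable for a document, per the overview table."""
    if needs_multilingual:
        # Only Form, Table, and High Quality are listed as multilingual.
        return MULTILINGUAL_MODES
    # English-focused documents can use any of the five modes.
    return MULTILINGUAL_MODES + ENGLISH_FOCUSED_MODES
```

Calling `modes_for(True)` returns only the three multilingual modes, so a non-English document is never routed to Native Text or Low Cost by accident.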


Form Mode

The Form mode is optimized for structured documents with forms and fields.

Printed Text Support

Form mode supports an extensive list of languages for printed text extraction:

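| Language | Code (optional) |
|---|---|
| Abaza | abq |
| Abkhazian | ab |
| Achinese | ace |
| Acoli | ach |
| Adangme | ada |
| Adyghe | ady |
| Afar | aa |
| Afrikaans | af |
| Akan | ak |
| Albanian | sq |
| Algonquin | alq |
| Angika (Devanagari) | anp |
| Arabic | ar |
| Asturian | ast |
| Asu (Tanzania) | asa |
| Avaric | av |
| Awadhi-Hindi (Devanagari) | awa |
| Aymara | ay |
| Azerbaijani (Latin) | az |
| Bafia | ksf |
| Bagheli | bfy |
| Bambara | bm |
| Bashkir | ba |
| Basque | eu |
| Belarusian (Cyrillic) | be, be-cyrl |
| Belarusian (Latin) | be, be-latn |
| Bemba (Zambia) | bem |
| Bena (Tanzania) | bez |
| Bhojpuri-Hindi (Devanagari) | bho |
| Bikol | bik |
| Bini | bin |
| Bislama | bi |
| Bodo (Devanagari) | brx |
| Bosnian (Latin) | bs |
| Brajbha | bra |
| Breton | br |
| Bulgarian | bg |
| Bundeli | bns |
| Buryat (Cyrillic) | bua |
| Catalan | ca |
| Cebuano | ceb |
| Chamling | rab |
| Chamorro | ch |
| Chechen | ce |
| Chhattisgarhi (Devanagari) | hne |
| Chiga | cgg |
| Chinese Simplified | zh-Hans |
| Chinese Traditional | zh-Hant |
| Choctaw | cho |
| Chukot | ckt |
| Chuvash | cv |
| Cornish | kw |
| Corsican | co |
| Cree | cr |
| Creek | mus |
| Crimean Tatar (Latin) | crh |
| Croatian | hr |
| Crow | cro |
| Czech | cs |
| Danish | da |
| Dargwa | dar |
| Dari | prs |
| Dhimal (Devanagari) | dhi |
| Dogri (Devanagari) | doi |
| Duala | dua |
| Dungan | dng |
| Dutch | nl |
| Efik | efi |
| English | en |
| Erzya (Cyrillic) | myv |
| Estonian | et |
| Faroese | fo |
| Fijian | fj |
| Filipino | fil |
| Finnish | fi |
| Fon | fon |
| French | fr |
| Friulian | fur |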

Handwritten Text Support

Form mode supports the following languages for handwritten text extraction:

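| Language | Code (optional) | Language | Code (optional) |
|---|---|---|---|
| English | en | Japanese | ja |
| Chinese Simplified | zh-Hans | Korean | ko |
| French | fr | Portuguese | pt |
| German | de | Spanish | es |
| Italian | it | Russian | ru |
| Thai | th | Arabic | ar |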
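
Because handwritten support covers far fewer languages than printed support, it can help to check a document's language before expecting handwriting to be extracted. The sketch below hard-codes the twelve codes from the handwritten table; the helper name `supports_handwriting` and the case-insensitive matching are assumptions made for illustration, not documented behavior.

```python
# Handwritten-text language codes copied from the handwritten table above.
HANDWRITTEN_CODES = {
    "en", "ja", "zh-Hans", "ko", "fr", "pt",
    "de", "es", "it", "ru", "th", "ar",
}

# Lowercased copy so lookups do not depend on the caller's casing.
_HANDWRITTEN_LOWER = {code.lower() for code in HANDWRITTEN_CODES}


def supports_handwriting(code: str) -> bool:
    """Return True if the language code appears in the handwritten list."""
    return code.strip().lower() in _HANDWRITTEN_LOWER
```

For example, `supports_handwriting("zh-Hans")` is true, while `supports_handwriting("zh-Hant")` is false because Chinese Traditional appears only in the printed list.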

Table Mode & High Quality Mode

Table and High Quality modes share the same language support.

Printed Text Support

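| Language | Code (optional) |
|---|---|
| Abaza | abq |
| Abkhazian | ab |
| Achinese | ace |
| Acoli | ach |
| Adangme | ada |
| Adyghe | ady |
| Afar | aa |
| Afrikaans | af |
| Akan | ak |
| Albanian | sq |
| Algonquin | alq |
| Angika (Devanagari) | anp |
| Arabic | ar |
| Asturian | ast |
| Asu (Tanzania) | asa |
| Avaric | av |
| Awadhi-Hindi (Devanagari) | awa |
| Aymara | ay |
| Azerbaijani (Latin) | az |
| Bafia | ksf |
| Bagheli | bfy |
| Bambara | bm |
| Bashkir | ba |
| Basque | eu |
| Belarusian (Cyrillic) | be, be-cyrl |
| Belarusian (Latin) | be, be-latn |
| Bemba (Zambia) | bem |
| Bena (Tanzania) | bez |
| Bhojpuri-Hindi (Devanagari) | bho |
| Bikol | bik |
| Bini | bin |
| Bislama | bi |
| Bodo (Devanagari) | brx |
| Bosnian (Latin) | bs |
| Brajbha | bra |
| Breton | br |
| Bulgarian | bg |
| Bundeli | bns |
| Buryat (Cyrillic) | bua |
| Catalan | ca |
| Cebuano | ceb |
| Chamling | rab |
| Chamorro | ch |
| Chechen | ce |
| Chhattisgarhi (Devanagari) | hne |
| Chiga | cgg |
| Chinese Simplified | zh-Hans |
| Chinese Traditional | zh-Hant |
| Choctaw | cho |
| Chukot | ckt |
| Chuvash | cv |
| Cornish | kw |
| Corsican | co |
| Cree | cr |
| Creek | mus |
| Crimean Tatar (Latin) | crh |
| Croatian | hr |
| Crow | cro |
| Czech | cs |
| Danish | da |
| Dargwa | dar |
| Dari | prs |
| Dhimal (Devanagari) | dhi |
| Dogri (Devanagari) | doi |
| Duala | dua |
| Dungan | dng |
| Dutch | nl |
| Efik | efi |
| English | en |
| Erzya (Cyrillic) | myv |
| Estonian | et |
| Faroese | fo |
| Fijian | fj |
| Filipino | fil |
| Finnish | fi |
| Fon | fon |
| French | fr |
| Friulian | fur |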

Handwritten Text Support

Table and High Quality modes support the following languages for handwritten text:

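| Language | Code (optional) | Language | Code (optional) |
|---|---|---|---|
| English | en | Japanese | ja |
| Chinese Simplified | zh-Hans | Korean | ko |
| French | fr | Portuguese | pt |
| German | de | Spanish | es |
| Italian | it | Russian | ru |
| Thai | th | Arabic | ar |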