Pure Javascript OCR for more than 100 Languages 📖🎉🖥

Tesseract Languages

The lang property of the options object passed to Tesseract.recognize can be any of the following values (the default is 'eng'):

| Lang Code | Language | 4.0 traineddata |
| --- | --- | --- |
| afr | Afrikaans | afr.traineddata.gz |
| amh | Amharic | amh.traineddata.gz |
| ara | Arabic | ara.traineddata.gz |
| asm | Assamese | asm.traineddata.gz |
| aze | Azerbaijani | aze.traineddata.gz |
| aze_cyrl | Azerbaijani - Cyrillic | aze_cyrl.traineddata.gz |
| bel | Belarusian | bel.traineddata.gz |
| ben | Bengali | ben.traineddata.gz |
| bod | Tibetan | bod.traineddata.gz |
| bos | Bosnian | bos.traineddata.gz |
| bul | Bulgarian | bul.traineddata.gz |
| cat | Catalan; Valencian | cat.traineddata.gz |
| ceb | Cebuano | ceb.traineddata.gz |
| ces | Czech | ces.traineddata.gz |
| chi_sim | Chinese - Simplified | chi_sim.traineddata.gz |
| chi_tra | Chinese - Traditional | chi_tra.traineddata.gz |
| chr | Cherokee | chr.traineddata.gz |
| cym | Welsh | cym.traineddata.gz |
| dan | Danish | dan.traineddata.gz |
| deu | German | deu.traineddata.gz |
| dzo | Dzongkha | dzo.traineddata.gz |
| ell | Greek, Modern (1453-) | ell.traineddata.gz |
| eng | English | eng.traineddata.gz |
| enm | English, Middle (1100-1500) | enm.traineddata.gz |
| epo | Esperanto | epo.traineddata.gz |
| est | Estonian | est.traineddata.gz |
| eus | Basque | eus.traineddata.gz |
| fas | Persian | fas.traineddata.gz |
| fin | Finnish | fin.traineddata.gz |
| fra | French | fra.traineddata.gz |
| frk | Frankish | frk.traineddata.gz |
| frm | French, Middle (ca. 1400-1600) | frm.traineddata.gz |
| gle | Irish | gle.traineddata.gz |
| glg | Galician | glg.traineddata.gz |
| grc | Greek, Ancient (-1453) | grc.traineddata.gz |
| guj | Gujarati | guj.traineddata.gz |
| hat | Haitian; Haitian Creole | hat.traineddata.gz |
| heb | Hebrew | heb.traineddata.gz |
| hin | Hindi | hin.traineddata.gz |
| hrv | Croatian | hrv.traineddata.gz |
| hun | Hungarian | hun.traineddata.gz |
| iku | Inuktitut | iku.traineddata.gz |
| ind | Indonesian | ind.traineddata.gz |
| isl | Icelandic | isl.traineddata.gz |
| ita | Italian | ita.traineddata.gz |
| ita_old | Italian - Old | ita_old.traineddata.gz |
| jav | Javanese | jav.traineddata.gz |
| jpn | Japanese | jpn.traineddata.gz |
| kan | Kannada | kan.traineddata.gz |
| kat | Georgian | kat.traineddata.gz |
| kat_old | Georgian - Old | kat_old.traineddata.gz |
| kaz | Kazakh | kaz.traineddata.gz |
| khm | Central Khmer | khm.traineddata.gz |
| kir | Kirghiz; Kyrgyz | kir.traineddata.gz |
| kor | Korean | kor.traineddata.gz |
| kur | Kurdish | kur.traineddata.gz |
| lao | Lao | lao.traineddata.gz |
| lat | Latin | lat.traineddata.gz |
| lav | Latvian | lav.traineddata.gz |
| lit | Lithuanian | lit.traineddata.gz |
| mal | Malayalam | mal.traineddata.gz |
| mar | Marathi | mar.traineddata.gz |
| mkd | Macedonian | mkd.traineddata.gz |
| mlt | Maltese | mlt.traineddata.gz |
| msa | Malay | msa.traineddata.gz |
| mya | Burmese | mya.traineddata.gz |
| nep | Nepali | nep.traineddata.gz |
| nld | Dutch; Flemish | nld.traineddata.gz |
| nor | Norwegian | nor.traineddata.gz |
| ori | Oriya | ori.traineddata.gz |
| pan | Panjabi; Punjabi | pan.traineddata.gz |
| pol | Polish | pol.traineddata.gz |
| por | Portuguese | por.traineddata.gz |
| pus | Pushto; Pashto | pus.traineddata.gz |
| ron | Romanian; Moldavian; Moldovan | ron.traineddata.gz |
| rus | Russian | rus.traineddata.gz |
| san | Sanskrit | san.traineddata.gz |
| sin | Sinhala; Sinhalese | sin.traineddata.gz |
| slk | Slovak | slk.traineddata.gz |
| slv | Slovenian | slv.traineddata.gz |
| spa | Spanish; Castilian | spa.traineddata.gz |
| spa_old | Spanish; Castilian - Old | spa_old.traineddata.gz |
| sqi | Albanian | sqi.traineddata.gz |
| srp | Serbian | srp.traineddata.gz |
| srp_latn | Serbian - Latin | srp_latn.traineddata.gz |
| swa | Swahili | swa.traineddata.gz |
| swe | Swedish | swe.traineddata.gz |
| syr | Syriac | syr.traineddata.gz |
| tam | Tamil | tam.traineddata.gz |
| tel | Telugu | tel.traineddata.gz |
| tgk | Tajik | tgk.traineddata.gz |
| tgl | Tagalog | tgl.traineddata.gz |
| tha | Thai | tha.traineddata.gz |
| tir | Tigrinya | tir.traineddata.gz |
| tur | Turkish | tur.traineddata.gz |
| uig | Uighur; Uyghur | uig.traineddata.gz |
| ukr | Ukrainian | ukr.traineddata.gz |
| urd | Urdu | urd.traineddata.gz |
| uzb | Uzbek | uzb.traineddata.gz |
| uzb_cyrl | Uzbek - Cyrillic | uzb_cyrl.traineddata.gz |
| vie | Vietnamese | vie.traineddata.gz |
| yid | Yiddish | yid.traineddata.gz |
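
As a rough illustration of how a code from the table might be used, the sketch below builds the options object for Tesseract.recognize and derives the matching traineddata file name. The helpers buildOptions and trainedDataFile and the KNOWN_LANGS set are hypothetical names introduced here for illustration; they are not part of Tesseract.js, and the set holds only an excerpt of the codes listed above.

```javascript
// Hypothetical helpers for illustration only; not part of Tesseract.js.

// Excerpt of the language codes from the table above (not the full list).
const KNOWN_LANGS = new Set([
  'afr', 'ara', 'chi_sim', 'chi_tra', 'deu', 'eng', 'fra', 'jpn', 'rus', 'spa',
]);

// Build the options object; falls back to 'eng', the documented default.
function buildOptions(lang = 'eng') {
  if (!KNOWN_LANGS.has(lang)) {
    throw new Error(`Unknown lang code: ${lang}`);
  }
  return { lang };
}

// Derive the traineddata file name, following the pattern in the third column.
function trainedDataFile(lang) {
  return `${lang}.traineddata.gz`;
}

// Usage sketch, assuming Tesseract.js is already loaded and `image` is an
// image the library accepts:
//
//   Tesseract.recognize(image, buildOptions('deu'))
//     .then(result => console.log(result));
```

Validating the code before the call is a design choice of this sketch, made so that a typo such as 'ger' instead of 'deu' fails immediately with a clear message.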