Tesseract.js Parameters
In the 3rd argument of recognize(), you can pass a params object to customize the OCR result. The parameters tesseract.js supports so far are listed below.
Example:
import Tesseract from 'tesseract.js';

// OEM and PSM are the enums for the OCR engine mode and page segmentation mode.
const { TesseractWorker, OEM, PSM } = Tesseract;
const worker = new TesseractWorker();

// `image` is the input image, supplied by the caller.
worker
  .recognize(image, 'eng', {
    tessedit_ocr_engine_mode: OEM.LSTM_ONLY,
    tessedit_pageseg_mode: PSM.SINGLE_BLOCK,
  })
  .then(result => console.log(result.text));
name | type | default value | description |
---|---|---|---|
tessedit_ocr_engine_mode | enum | OEM.LSTM_ONLY | OCR engine mode; one of the OEM constants exported by tesseract.js |
tessedit_pageseg_mode | enum | PSM.SINGLE_BLOCK | Page segmentation mode; one of the PSM constants exported by tesseract.js |
tessedit_char_whitelist | string | '' | Restricts the result to the listed characters; useful when the image content is limited to a known character set |
tessjs_create_hocr | string | '1' | Either '0' or '1'; when '1', tesseract.js includes hOCR output in the result |
tessjs_create_tsv | string | '1' | Either '0' or '1'; when '1', tesseract.js includes TSV output in the result |
tessjs_create_box | string | '0' | Either '0' or '1'; when '1', tesseract.js includes box output in the result |
tessjs_create_unlv | string | '0' | Either '0' or '1'; when '1', tesseract.js includes UNLV output in the result |
tessjs_create_osd | string | '0' | Either '0' or '1'; when '1', tesseract.js includes OSD output in the result |
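
Because the tessjs_create_* parameters take the strings '0' and '1' rather than booleans, it can be convenient to build the params object from plain booleans. The following is a minimal sketch under that assumption; `buildOutputParams` and `OUTPUT_FLAGS` are hypothetical names introduced here for illustration and are not part of tesseract.js.

```javascript
// Hypothetical helper (not part of tesseract.js): converts booleans into
// the '0' / '1' strings that the tessjs_create_* parameters expect.
const OUTPUT_FLAGS = ['hocr', 'tsv', 'box', 'unlv', 'osd'];

function buildOutputParams(outputs = {}) {
  const params = {};
  for (const name of OUTPUT_FLAGS) {
    // Only emit flags the caller set explicitly, so that the library
    // defaults listed in the table still apply to everything else.
    if (name in outputs) {
      params[`tessjs_create_${name}`] = outputs[name] ? '1' : '0';
    }
  }
  return params;
}

// Example: digits only, box output on, hOCR and TSV output off.
const params = {
  tessedit_char_whitelist: '0123456789',
  ...buildOutputParams({ hocr: false, tsv: false, box: true }),
};

console.log(params);
```

The resulting object is meant to be passed as the 3rd argument of recognize(), in the same way as the example above. Flags left out of the helper call are omitted from the object instead of being forced to '0'.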