Overview
This guide contains tips and strategies for getting the fastest performance from Tesseract.js. While some of the tips below involve avoiding pitfalls and should be universally implemented, other strategies (changing the language data or recognition model) may harm recognition quality. Therefore, whether these strategies are appropriate depends on the application, and users should always benchmark performance and quality before changing important settings from their defaults.
Reducing Setup Time
Within certain applications, the majority of runtime may be attributable to setup steps (`createWorker`) rather than recognition (`worker.recognize`). Implementing the strategies in this section should reduce the time spent on these steps.
Notably, the time first-time users spend on setup may not be apparent to developers, as Tesseract.js caches language data after it is downloaded for the first time. To experience Tesseract.js as a first-time user would, set `cacheMethod: 'none'` in the `createWorker` options. Be sure to remove this setting before publishing your app.
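For example, a minimal sketch of profiling the first-time experience, assuming the v5-style `createWorker(langs, oem, options)` signature and an ES module context (the image path is a placeholder):

```js
import { createWorker } from 'tesseract.js';

// cacheMethod: 'none' forces language data to be re-downloaded on every run,
// reproducing what a first-time user experiences. Remove before shipping.
const worker = await createWorker('eng', 1, { cacheMethod: 'none' });

console.time('recognize');
const { data: { text } } = await worker.recognize('example.png');
console.timeEnd('recognize');

console.log(text);
await worker.terminate();
```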
Reuse Workers
When recognizing multiple images, some users create/load/destroy a new worker for each image. This is never the correct option. If the images are being recognized one after another, the extra steps required to create/load/destroy a new worker are wasted runtime, as `worker.recognize` can simply be run repeatedly with the same `worker`. Workers do not break after one use.
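A minimal sketch of reusing one worker across sequential images (file names are placeholders; assumes an ES module context):

```js
import { createWorker } from 'tesseract.js';

// Create the worker once, reuse it for every image, then clean up.
const worker = await createWorker('eng');

for (const img of ['page1.png', 'page2.png', 'page3.png']) {
  const { data: { text } } = await worker.recognize(img);
  console.log(text);
}

await worker.terminate();
```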
Alternatively, if images are being recognized in parallel, creating a new worker for each recognition job is likely to cause crashes due to resource limitations. As each Tesseract.js worker uses a large amount of memory, code should never be able to create an arbitrary number of workers. Instead, a scheduler should be used to create a fixed pool of workers (say, 4), and jobs should then be assigned through the scheduler, as in the sketch below.
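A sketch of a fixed worker pool behind a scheduler (pool size and file names are placeholders):

```js
import { createScheduler, createWorker } from 'tesseract.js';

const scheduler = createScheduler();

// Fixed pool: 4 workers, regardless of how many images arrive.
for (let i = 0; i < 4; i++) {
  scheduler.addWorker(await createWorker('eng'));
}

const images = ['page1.png', 'page2.png', 'page3.png', 'page4.png', 'page5.png'];

// The scheduler queues jobs and distributes them across the pool.
const results = await Promise.all(
  images.map((img) => scheduler.addJob('recognize', img))
);
results.forEach(({ data: { text } }) => console.log(text));

await scheduler.terminate();
```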
Set Up Workers Ahead of Time
Rather than waiting until the last minute to load code and data, you can set up a worker ahead of time. Doing so greatly reduces runtime the first time a user runs recognition. An example where a worker is prepared ahead of time can be found here.
The appropriate time to load Tesseract.js workers and data is application-specific. For example, if you have a web app where only 5% of users need OCR, it likely does not make sense to download ~15MB of code and data on every page load. Instead, consider loading Tesseract.js when a user indicates they want to perform OCR, but before they select a specific image, as sketched below.
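A browser-side sketch of that idea (the element IDs and event wiring are illustrative only, not part of any Tesseract.js API):

```js
import { createWorker } from 'tesseract.js';

let workerPromise = null;

// Start downloading code + language data as soon as the user signals intent...
document.getElementById('start-ocr').addEventListener('click', () => {
  workerPromise ??= createWorker('eng');
});

// ...so that by the time an image is chosen, setup is finished or well underway.
document.getElementById('image-input').addEventListener('change', async (event) => {
  const worker = await (workerPromise ??= createWorker('eng'));
  const { data: { text } } = await worker.recognize(event.target.files[0]);
  console.log(text);
});
```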
Do Not Disable Language Data Caching
Language data is one of the largest downloads required to run Tesseract.js. While most language data files (including the default English file) are ~2MB, in the worst case they can be much larger. For example, setting the recognition model (`oem`) to Tesseract Legacy and the language to Chinese (simplified) results in a ~20MB download.
To avoid downloading language data multiple times, Tesseract.js caches `.traineddata` files. In past versions of Tesseract.js, this caching behavior contained bugs, so some users disabled it (setting `cacheMethod: 'none'` or `cacheMethod: 'refresh'`). As these bugs were fixed in v4.0.6, it is now recommended that users keep the default `cacheMethod` value (i.e. simply omit the `cacheMethod` argument).
Reducing Recognition Runtime
Use the Latest Version of Tesseract.js
Old versions of Tesseract.js are significantly slower. Notably, v2 (now deprecated) takes 10x longer than the latest version to recognize certain images.
Do Not Set `corePath` to a Single `.js` File
If you set the `corePath` argument, be sure to set it to a directory that contains all 4 of these files:
tesseract-core.wasm.js
tesseract-core-simd.wasm.js
tesseract-core-lstm.wasm.js
tesseract-core-simd-lstm.wasm.js
Tesseract.js needs to be able to pick between these files; setting `corePath` to a specific `.js` file will significantly degrade performance or compatibility.
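For example, a hedged sketch pointing `corePath` at a directory (the hosting path is a placeholder; assumes the same v5-style options object as the earlier sketches):

```js
import { createWorker } from 'tesseract.js';

// Point corePath at the directory holding all four core builds so Tesseract.js
// can choose the appropriate one (SIMD vs. non-SIMD, LSTM-only vs. full) at runtime.
const worker = await createWorker('eng', 1, {
  corePath: '/vendor/tesseract-core/',
});
```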
Consider Using "Fast" Language Data
By default, Tesseract.js uses language data that is optimized for quality rather than speed. You can experiment with language data optimized for speed by setting `langPath` to https://tessdata.projectnaptha.com/4.0.0_fast. We have not benchmarked the impact this has on performance or accuracy, so feel free to open a GitHub Issue if you do so and wish to share results.
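A sketch of switching to the fast data (same hedged `createWorker` signature as above; benchmark quality before adopting this in production):

```js
import { createWorker } from 'tesseract.js';

// Fetch speed-optimized ("fast") traineddata instead of the default
// quality-optimized data.
const worker = await createWorker('eng', 1, {
  langPath: 'https://tessdata.projectnaptha.com/4.0.0_fast',
});
```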