Tesseract provides an OCR engine and a command-line program. It includes a new neural network (LSTM) based OCR engine that focuses on line recognition, and it still provides the legacy OCR engine, which works by recognizing character patterns. Tesseract supports Unicode (UTF-8) and can recognize more than 100 languages "out of the box"; it can also be trained to recognize other languages. It supports several output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, and TSV.
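As a rough illustration of the engines and output formats listed above, the following Python sketch assembles a tesseract command line without running it. The helper name build_tesseract_command and the file names are hypothetical. The sketch assumes the usual "tesseract IMAGE OUTPUTBASE [options] [configfile...]" invocation, that "--oem 0" selects the legacy engine and "--oem 1" the LSTM engine, and that the invisible-text-only PDF is requested through the textonly_pdf config variable; check "tesseract --help-extra" on the installed version before relying on any of this.

```python
import shlex

# Config file names assumed to ship with Tesseract, one per output format.
FORMAT_CONFIGS = {"text": "txt", "hocr": "hocr", "pdf": "pdf", "tsv": "tsv"}


def build_tesseract_command(image, output_base, lang="eng", formats=("text",),
                            legacy_engine=False, text_only_pdf=False):
    """Return the argument list for one tesseract run (hypothetical helper)."""
    if text_only_pdf and "pdf" not in formats:
        raise ValueError("text_only_pdf only makes sense together with 'pdf'")
    # Assumption: --oem 0 is the legacy engine, --oem 1 the LSTM engine.
    cmd = ["tesseract", image, output_base, "-l", lang,
           "--oem", "0" if legacy_engine else "1"]
    if text_only_pdf:
        # Assumption: this variable makes the PDF carry only the invisible text.
        cmd += ["-c", "textonly_pdf=1"]
    # Output formats are selected by trailing config names.
    cmd += [FORMAT_CONFIGS[name] for name in formats]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command line; nothing is executed here.
    print(shlex.join(build_tesseract_command("scan.png", "scan",
                                             formats=("text", "pdf"))))
```

Returning an argument list rather than a shell string avoids quoting problems if the command is later passed to subprocess.run. The legacy engine may additionally require language data that contains a legacy model.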
Binary packages can be installed with pkg_add(1), which is installed by default, or with the higher-level tool pkgin, which can itself be installed with pkg_add. The NetBSD packages collection is also designed to permit easy installation from source.
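The two binary installation routes can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the package name tesseract is assumed rather than taken from the text above, the helper names are hypothetical, and a real run normally needs root privileges and a configured package repository.

```python
import subprocess


def install_command(tool, package="tesseract"):
    """Build the install command for one of the two tools named above.

    The default package name "tesseract" is an assumption.
    """
    if tool == "pkgin":
        return ["pkgin", "install", package]
    if tool == "pkg_add":
        return ["pkg_add", package]
    raise ValueError("unknown tool: %r" % (tool,))


def install(tool, package="tesseract", dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    cmd = install_command(tool, package)
    if not dry_run:
        # Raises CalledProcessError if the tool exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Bootstrap pkgin with pkg_add, then install the package with pkgin.
    print(install("pkg_add", "pkgin"))
    print(install("pkgin"))
```

The dry-run default keeps the sketch safe to try on a machine where neither tool is present.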
The pkg_admin audit command lists every installed package that security advisories report as vulnerable.
Note that the vulnerability database may not be fully accurate, and not every bug is exploitable in every configuration.
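The audit can also be driven from a script. The sketch below is an illustration only: the helper audit_lines is hypothetical, it assumes that pkg_admin also offers an audit-pkg subcommand for checking a single named package, and it makes no assumption about the format of the report, returning the non-empty output lines unchanged. A local copy of the vulnerability list must already be present for the audit to be meaningful.

```python
import subprocess


def audit_lines(package=None, run=subprocess.run):
    """Run the audit and return its non-empty output lines (hypothetical helper).

    With no package, all installed packages are audited. With a package name,
    the audit-pkg subcommand is used (assumed to be available).
    The run parameter exists so a stand-in can be injected for testing.
    """
    if package is None:
        cmd = ["pkg_admin", "audit"]
    else:
        cmd = ["pkg_admin", "audit-pkg", package]
    # The exit status is deliberately not checked: its meaning is not
    # described in the text above, so only the report lines are used.
    result = run(cmd, capture_output=True, text=True)
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for line in audit_lines():
        print(line)
```

Because the database may be incomplete or overly broad, an empty result should not be read as proof that a package is safe, and a reported entry still needs to be checked against the local configuration.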
Problem reports, updates, or suggestions for this package should be submitted with send-pr.