Tesseract is an open-source OCR engine that was
developed at HP between 1984 and 1994. Like a supernova,
it appeared from nowhere for the 1995 UNLV
Annual Test of OCR Accuracy [1], shone brightly with
its results, and then vanished back under the same
cloak of secrecy under which it had been developed.
Now, for the first time, details of the architecture and
algorithms can be revealed.