TesseractOcrJobRequestImagePreProcessing Properties
The TesseractOcrJobRequestImagePreProcessing type exposes the following members.
Name | Description | |
---|---|---|
AutoCleanBlackBorders | Automatically removes black borders from page images, which helps to reduce image complexity. | |
AutoDeskew | The deskew filter, also called auto-straighten, automatically rotates an image so that the text is vertically aligned. This is useful for straightening scanned documents. | |
AutoDespeckle | Automatically removes small defects caused by dust or scratches on a scanned image, as well as moiré effects on images scanned from a magazine. Note that this filter assumes the text character height is greater than 20 pixels, which in turn assumes the page image was scanned at 250 dpi or higher. If characters are shorter than that, do not use this filter, because it will probably remove valid parts of the text. | |
AutoInvert | Negative documents have reversed color photometry: the text is white and the background is black. Because OCR document recognition assumes the opposite, this filter automatically detects and inverts the color photometry of an image. | |
ImageResize | Resizes a page image according to the TesseractOcrImageResizeSettings. | |
LocalAdaptiveThresholding | Thresholding is the simplest way to segment objects from a background. If the background is relatively uniform, leave this setting null or set its Enabled property to false (the default behaviour); a single global threshold value is then used to binarize the image by pixel intensity. If the background intensity varies widely (as in a camera image), adaptive thresholding (also known as local or dynamic thresholding) may produce better results. | |
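
The difference between global and local adaptive thresholding described for LocalAdaptiveThresholding can be illustrated with a small, library-independent sketch. This is not the implementation behind this type; it is a generic Python illustration. The function names, the mean-of-neighbourhood rule, and the `threshold`, `radius`, and `offset` values are all assumptions chosen for the example.

```python
def global_threshold(image, threshold=128):
    """Binarize with one cutoff for the whole image (1 = background, 0 = ink)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def local_adaptive_threshold(image, radius=1, offset=10):
    """Binarize each pixel against the mean of its surrounding window.

    Hypothetical parameters: `radius` sets a (2*radius+1)-pixel square window,
    and `offset` is how much darker than the local mean a pixel must be
    before it counts as ink.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image edges.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(0 if image[y][x] < local_mean - offset else 1)
        result.append(out_row)
    return result


# Background fades from 220 (lit) to 80 (shadow); ink strokes sit at columns 2 and 6.
row = [220, 200, 100, 160, 140, 120, 20, 80]
image = [row[:] for _ in range(3)]

print(global_threshold(image)[0])          # [1, 1, 0, 1, 1, 0, 0, 0]
print(local_adaptive_threshold(image)[0])  # [1, 1, 0, 1, 1, 1, 0, 1]
```

In this toy image the global cutoff marks the shadowed background (columns 5 and 7) as ink, while the local rule keeps only the two strokes. On an evenly lit scan the two approaches would agree, which is consistent with leaving the setting disabled for uniform backgrounds.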