|
Features |
|
Technical Notes |
- Deploys within .NET as a managed control and is fully compliant with .NET 2.0 and above (see "Building Robust Imaging Components for the Microsoft .NET Platform" white paper)
- ActiveX COM control available for most other development environments
- Sample code included for: VB.NET, C#, VB, Delphi, VC++, HTML
- Object-oriented API for .NET users
- Thread-safe processing for use in multi-threaded environments
- Supports user-specified debug logging levels
- Suitable for client-server Web-based applications
- Processes documents of up to 999 pages
- Free full-featured trial version available for download (trial version watermarks output files)
|
|
Language Recognition |
- Supports English, French, German, Italian, Spanish, Portuguese, Danish, Dutch, Swedish, Norwegian, Hungarian, Polish, and Finnish
- Includes dictionaries for all supported languages
- Custom dictionary support for user-defined words
|
|
File Output Formats |
Output can contain unformatted text, formatted text, or formatted text plus images, in these file formats:
|
- PDF version 1.4 files (Professional Edition only)
- PDF – Original image over hidden text
- PDF – Formatted Text and Graphics (Normal)
- Microsoft Word-compatible RTF
- Excel v2.x (compatible with later versions)
- WordPerfect 5.0 (compatible with later versions)
- HTML, with a sub-folder containing images
- ASCII
- ASCII with no line breaks
- ASCII with line breaks
- ASCII, smart-formatted with spaces
- ASCII, comma- or tab-delimited
|
|
Pattern Matching using Approximate Regular Expressions |
- Search OCR output for occurrences of any defined pattern (such as social security numbers, phone numbers, or dates)
- Approximate matching allows inexact matches to be located
- Located strings can be redacted or highlighted using the included NotateXpress component
- POSIX-compliant regular expression syntax
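To illustrate why approximate matching matters for OCR output, the sketch below finds social security number candidates even when digits were misread as look-alike letters. It is not the OCR Xpress API: the function name `find_ssn_candidates` and the confusion table are hypothetical, and it uses Python's `re` module with a simple character-substitution step rather than a POSIX engine or a true approximate regular expression matcher.

```python
import re

# Hypothetical table of common OCR confusions (misread letter -> intended digit).
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

# A token shaped like NNN-NN-NNNN, allowing look-alike letters where digits belong.
# The lookarounds keep the match from starting or ending inside a longer word.
CANDIDATE = re.compile(
    r"(?<![0-9A-Za-z])"
    r"[0-9OolISB]{3}-[0-9OolISB]{2}-[0-9OolISB]{4}"
    r"(?![0-9A-Za-z])"
)

def find_ssn_candidates(text):
    """Return (start, end, raw, normalized) for each SSN-like token in text."""
    results = []
    for match in CANDIDATE.finditer(text):
        raw = match.group(0)
        # Map each look-alike letter back to the digit it probably represents.
        normalized = raw.translate(CONFUSIONS)
        results.append((match.start(), match.end(), raw, normalized))
    return results
```

The returned offsets refer to positions in the original OCR text, so a caller could pass them on to a redaction or highlighting step.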
|
|
Image Input and Pre-processing |
- ImagXpress Document is included with OCR Xpress
- Opens TIFF, JPEG, GIF, PNG, JBIG2, and many other image formats (see the ImagXpress Document v9 product description)
- Advanced auto binarization evaluates color images to optimize conversion
- Deskew, despeckle, and many other image cleanup functions
- Accepts uncompressed in-memory image data for high performance
|
|
Auto Rotation |
- Automatically rotates the image by 0, 90, 180, or 270 degrees to correct text orientation
- Returns the applied rotation angle
- Highly optimized for speed
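As a small illustration of what a right-angle rotation does to image data, the sketch below rotates a pixel grid by one of the four supported angles. The function name `rotate_grid` is hypothetical, and the clockwise direction is an assumption; the source does not state which direction the returned angle refers to.

```python
def rotate_grid(grid, angle):
    """Rotate a 2-D list of pixels clockwise by 0, 90, 180, or 270 degrees.

    Clockwise is an assumed convention for this sketch.
    """
    if angle == 0:
        return [list(row) for row in grid]
    if angle == 90:
        # Reverse the row order, then transpose.
        return [list(row) for row in zip(*grid[::-1])]
    if angle == 180:
        # Reverse the row order and each row.
        return [list(row[::-1]) for row in grid[::-1]]
    if angle == 270:
        # Transpose, then reverse the row order.
        return [list(row) for row in zip(*grid)][::-1]
    raise ValueError("angle must be 0, 90, 180, or 270")
```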
|
|
Character Position Information |
- Returns character position for all characters
- Can be used to redact or highlight text in the original image
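The sketch below shows how per-character positions could be turned into a single rectangle for redaction or highlighting. The `CharBox` type, its fields, and `span_rectangle` are hypothetical names chosen for illustration, not the OCR Xpress object model, and the characters in the span are assumed to lie on one text line.

```python
from dataclasses import dataclass

@dataclass
class CharBox:
    """Hypothetical record for one recognized character, in pixel coordinates."""
    char: str
    left: int
    top: int
    width: int
    height: int

def span_rectangle(boxes, start, end):
    """Return (left, top, right, bottom) covering boxes[start:end]."""
    span = boxes[start:end]
    if not span:
        raise ValueError("empty span")
    left = min(b.left for b in span)
    top = min(b.top for b in span)
    right = max(b.left + b.width for b in span)
    bottom = max(b.top + b.height for b in span)
    return (left, top, right, bottom)
```

A span that crosses a line break would need one rectangle per line; this sketch does not handle that case.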
|
|
Character Confidence Values |
- Returns confidence for all recognized characters
- Confidence values can be used for combining voting engines
- Alternate suggested characters provided
- Add text proofing and character replacement functions to applications
- Character reinsertion enables text correction prior to document creation
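One way confidence values could be used to combine engines is per-character voting, sketched below. The function name `vote` and the `(char, confidence)` pair format are assumptions for illustration, and the engine outputs are assumed to be already aligned character by character, which real OCR results often are not.

```python
def vote(*engine_outputs):
    """Pick, per character position, the candidate with the highest confidence.

    Each engine output is a list of (char, confidence) pairs. All outputs are
    assumed to be aligned and of equal length.
    """
    lengths = {len(output) for output in engine_outputs}
    if len(lengths) != 1:
        raise ValueError("engine outputs must be aligned and equal in length")
    merged = []
    for candidates in zip(*engine_outputs):
        # Keep the candidate whose confidence is highest at this position.
        best_char, best_conf = max(candidates, key=lambda c: c[1])
        merged.append((best_char, best_conf))
    return merged
```

Positions where the winning confidence is still low are natural candidates for manual proofing or for trying the alternate suggested characters.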
|
|
Segmentation |
- Automatically or manually locate regions of the input image and classify each as either graphics, whose color can be preserved, or recognizable text
- Access each region separately, or recombine into fully-formatted documents such as PDF or RTF files
|
|
Font Generation |
Formatted output based upon recognized text:
|
- Serif, sans serif, or monospaced font
- Normal, bold, italic, or bold-italic
- Scaled to the closest font size
|
|
Edition Descriptions |
- Professional Edition creates PDF output, as well as all other supported formats
- Standard Edition supports all output formats except PDF
|
|