Version 7.7.x
Audience
This article is intended for PSIcapture Administrators.
Overview
Regular Expressions, sometimes shortened to Regex, are used throughout PSIcapture to customize and fine-tune the results of OCR, Character Filtering, and more. They help with a wide variety of tasks that involve sifting through document metadata for the information your business needs. Well-crafted Regular Expressions can save time when processing workflows, since they can account for variables such as page distortion and similar-looking characters, and can restrict matches to specific character sets, such as Alphanumeric or Numeric filters. This helps prevent a buildup of exception batches, increases Classification matching rates, and tunes a Capture Profile toward the best possible results. This article explains how to store global Regular Expressions through the Regular Expressions tab on the Configuration root of PSIcapture.
The Regular Expressions tab opens to the Regular Expressions Global List, a collection of commonly used Regular Expressions that can be reused across multiple Capture Profiles. Regular Expressions saved within a Capture Profile are available only to that profile unless they are exported to the Regular Expressions Global List.
Add or Edit
Users can add new Regular Expressions or edit existing ones. Selecting either option opens the Editor window.
Delete
Users can delete existing Regular Expressions permanently.
Usage
Users can check which Capture Profile(s) are using the selected Regular Expression.
Import/Export
Users can import or export Regular Expressions via an XML file.
Load
Users can load one or more Regular Expressions into the Global List from any Capture Profile.
Regular Expression Options
Match Literal Text
Generates a regular expression that will match the selected literal text (Control+Click to generate a text pattern)
Match Text Pattern
Generates a regular expression that will match the pattern of the selected text (Control+Click to generate a literal pattern)
Case Sensitivity options:
- Do not add a case sensitivity prefix
- Generate a case insensitive pattern (prefixes the generated pattern with (?i))
- Generate a case sensitive pattern (prefixes the generated pattern with (?-i))
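The effect of the (?i) prefix can be sketched in a few lines of Python. This is only an illustration: Python's re module stands in for PSIcapture's own regex engine, which may behave differently, and the sample text is made up. Only the (?i) prefix is demonstrated.

```python
import re

# Hypothetical sample text, not taken from a real PSIcapture batch.
text = "INVOICE Invoice invoice"

# No prefix: Python matches case sensitively by default.
default_matches = re.findall(r"Invoice", text)

# (?i) prefix: the whole pattern is matched case insensitively.
insensitive_matches = re.findall(r"(?i)Invoice", text)

print(default_matches)
print(insensitive_matches)
```

Without the prefix only the exactly cased word is found, while the prefixed pattern finds all three spellings.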
Add Word Boundaries to the pattern
Surrounds the expression with word boundaries (\b)
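As a rough illustration of why word boundaries matter, the Python sketch below compares a pattern with and without \b. Python's re module is used as a stand-in for PSIcapture's engine, and the sample text and number are hypothetical.

```python
import re

# Hypothetical OCR text containing the target number and a longer number
# that merely starts with the same digits.
text = "PO 4947 and PO 49473"

# Without boundaries the pattern also matches inside the longer number.
without_boundaries = re.findall(r"4947", text)

# With \b on both sides only the standalone number matches.
with_boundaries = re.findall(r"\b4947\b", text)

print(len(without_boundaries), len(with_boundaries))
```

The unbounded pattern reports two hits, one of them a false positive inside 49473, while the bounded pattern reports only the standalone value.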
Mark All White Space as Optional
When selected, all whitespace will be marked optional (*), for example \s*
Mark All Punctuation as Optional
When selected, all punctuation will be marked optional (?)
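The sketch below shows, in Python, what optional whitespace and optional punctuation can buy when OCR drops spaces or punctuation marks. It assumes the builder emits \s* for whitespace and follows each punctuation character with ?, which is inferred from the (*) and (?) notes above rather than confirmed. The patterns and sample strings are hypothetical, and Python's re module stands in for PSIcapture's engine.

```python
import re

# Strict pattern: whitespace and punctuation must appear exactly as typed.
strict = re.compile(r"Invoice No\.: 1234")

# Relaxed pattern (assumed output shape): whitespace becomes \s*,
# and each punctuation character is followed by ?.
relaxed = re.compile(r"Invoice\s*No\.?:?\s*1234")

# Hypothetical OCR variants of the same label.
samples = ["Invoice No.: 1234", "InvoiceNo 1234", "Invoice No:1234"]

strict_hits = [s for s in samples if strict.fullmatch(s)]
relaxed_hits = [s for s in samples if relaxed.fullmatch(s)]

print(strict_hits)
print(relaxed_hits)
```

The strict pattern accepts only the first sample, while the relaxed pattern accepts all three. The trade-off is that a more permissive pattern can also admit strings that were never intended, so it is worth validating against real documents.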
Regular Expression Builder
Clicking Add or Edit opens the Regular Expression Editor window where an expression can be defined and validated. NOTE: Only user-defined expressions can be edited.
Users can either enter the desired Regular Expression directly or type the sample data that needs an expression into the window. Once the data is entered, clicking the Generate button makes the builder create the expression, as shown below.
Entered Text: Invoice #494736RT
Generated Expression: (?i)\b[Il1|]nv[o0][il1|]ce\s*#[4A]9[4A]736RT\b
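To see how a generated expression of this kind tolerates common OCR misreads, it can be tried against a few variants in Python. This is a sketch under assumptions: the character class before the 9 is taken to be [4A], the variant strings are invented, and Python's re module may not behave identically to PSIcapture's engine.

```python
import re

# Generated expression from the example, assuming [4A] before the 9.
pattern = re.compile(r"(?i)\b[Il1|]nv[o0][il1|]ce\s*#[4A]9[4A]736RT\b")

# Hypothetical OCR outputs for the same printed text.
should_match = [
    "Invoice #494736RT",     # clean read
    "1nv0ice#A9A736RT",      # I->1, o->0, 4->A, space dropped
    "invoice   #494736rt",   # case changed, extra spaces
]
should_not_match = [
    "Invoice #594736RT",     # 5 is not in the [4A] class
    "Invoice #494736RTX",    # trailing \b fails before the X
]

matched = [s for s in should_match if pattern.search(s)]
rejected = [s for s in should_not_match if not pattern.search(s)]

print(matched)
print(rejected)
```

The character classes absorb look-alike characters such as I/l/1 and 4/A, while the word boundaries and the fixed digits keep unrelated values from matching.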
NOTE: Please see http://www.regular-expressions.info/ for more detailed information and tutorials on regular expressions.