PSIcapture Administrator Guide: Configuration: Regular Expressions

version 7.7.x

 

 Audience

This article is intended for PSIcapture Administrators.

 

Overview

Regular Expressions, sometimes shortened to Regex, are used throughout PSIcapture to customize and fine-tune the results produced by OCR, Character Filtering, and other features. They help with a wide variety of tasks that involve sifting through document metadata for the information your business needs. Well-crafted Regular Expressions can save time when processing workflows: they can account for variables such as page distortion and look-alike characters, and they can restrict matches to specific character sets, as Alphanumeric or Numeric filters do. This in turn reduces the buildup of exception batches, increases Classification matching rates, and helps tune a Capture Profile to produce the best possible results. This article describes how to store global Regular Expressions on the Regular Expressions tab under the Configuration root of PSIcapture.

 

 

The Regular Expressions tab opens to the Regular Expressions Global List, a collection of commonly used Regular Expressions that can be shared across multiple Capture Profiles. Regular Expressions saved within a Capture Profile are available only to that profile unless they are exported to the Regular Expressions Global List.

 

Add or Edit

Users can add new Regular Expressions or edit existing ones. Selecting either option opens the Regular Expression Editor window.

Delete

Users can delete existing Regular Expressions permanently.

Usage

Users can check which Capture Profile(s) are using the selected Regular Expression.

 

 

Import/Export

Users can import or export Regular Expressions via an XML file.

Load

Users can load one or more Regular Expressions into the Global List from any Capture Profile.

 

 

Regular Expression Options

 

 

Match Literal Text

Generates a regular expression that matches the selected literal text. (Control+Click to generate a text pattern instead.)

Match Text Pattern

Generates a regular expression that matches the pattern of the selected text. (Control+Click to generate a literal pattern instead.)

Case Sensitivity options:

  • Do not add case sensitivity prefix

  • Generate a case insensitive pattern (prefixes the generated pattern with (?i))

  • Generate a case sensitive pattern (prefixes the generated pattern with (?-i))
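
The effect of the (?i) prefix can be illustrated outside PSIcapture. The following is a minimal sketch using Python's re module, which is an assumption made for illustration only; the regular expression engine used by PSIcapture may differ in details. Python's re module does not accept a standalone (?-i) prefix, so only (?i) is exercised here. The sample text is hypothetical.

```python
import re

# Hypothetical OCR output in upper case.
ocr_text = "INVOICE #494736RT"

# Without a prefix the match is case sensitive, so lower-case "invoice" is not found.
plain = re.search(r"invoice", ocr_text)

# With the (?i) prefix the same pattern ignores case and matches "INVOICE".
insensitive = re.search(r"(?i)invoice", ocr_text)

print(plain is None)          # no match without the prefix
print(insensitive.group(0))   # the text that matched with the prefix
```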

Add Word Boundaries to the pattern

Surrounds the expression with word boundaries (\b).
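
As a rough illustration of why word boundaries matter, the sketch below compares a pattern with and without \b. It uses Python's re module as a stand-in, and the reference numbers are made up for the example.

```python
import re

# Hypothetical text containing a short reference and a longer one that starts the same way.
text = "Ref INV100 and INV1000"

# Without boundaries, the pattern also matches inside the longer value INV1000.
without_boundaries = re.findall(r"INV100", text)

# With \b on both sides, only the stand-alone value INV100 matches.
with_boundaries = re.findall(r"\bINV100\b", text)

print(len(without_boundaries), len(with_boundaries))
```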

Mark All White Space as Optional

When selected, all white space in the pattern is marked as optional (*).

Mark All Punctuation as Optional

When selected, all punctuation in the pattern is marked as optional (?).
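
To show how these options might combine, here is a minimal sketch in Python. The build_pattern function and its parameters are hypothetical and are not PSIcapture's actual generator; they only mimic the options described above (case prefix, word boundaries, optional white space as \s*, optional punctuation as ?).

```python
import re
import string


def build_pattern(text, case_prefix="(?i)", word_boundaries=True,
                  optional_whitespace=True, optional_punctuation=True):
    """Hypothetical helper: turn literal text into a more tolerant pattern."""
    parts = []
    for ch in text:
        if ch.isspace():
            # Optional white space becomes \s*; otherwise keep the literal character.
            parts.append(r"\s*" if optional_whitespace else re.escape(ch))
        elif ch in string.punctuation:
            # Optional punctuation gets a trailing ? after the escaped character.
            parts.append(re.escape(ch) + ("?" if optional_punctuation else ""))
        else:
            parts.append(re.escape(ch))
    body = "".join(parts)
    if word_boundaries:
        body = r"\b" + body + r"\b"
    return case_prefix + body


pattern = build_pattern("Invoice #494736RT")

# Missing space and missing "#" are tolerated; case is ignored.
tight = bool(re.search(pattern, "invoice494736rt"))
# Extra white space is tolerated.
spaced = bool(re.search(pattern, "INVOICE   #494736RT"))
# A trailing letter breaks the closing word boundary, so this is rejected.
longer = bool(re.search(pattern, "Invoice #494736RTX"))

print(pattern, tight, spaced, longer)
```

Making white space and punctuation optional widens what the pattern accepts, so it is worth pairing those options with word boundaries to keep the match from spilling into neighboring text.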

Regular Expression Builder

Clicking Add or Edit opens the Regular Expression Editor window where an expression can be defined and validated. NOTE: Only user-defined expressions can be edited. 

 


Users can either enter the desired Regular Expression directly or type the sample data that needs an expression into the window. Once the data is entered, click the Generate button and the builder creates the expression, as shown below.

 

Entered Text:
Invoice #494736RT
Generated Expression:
(?i)\b[Il1|]nv[o0][il1|]ce\s*#[4A]9[4A]736RT\b
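
A generated expression of this kind can be tried against likely OCR misreads before it is saved. The sketch below uses Python's re module as a stand-in for the PSIcapture engine and assumes the expression has the character class [4A] on both sides of the 9; the sample strings are invented for the test.

```python
import re

# Assumed form of the generated expression, with [4A] before and after the 9.
GENERATED = r"(?i)\b[Il1|]nv[o0][il1|]ce\s*#[4A]9[4A]736RT\b"
pattern = re.compile(GENERATED)

# Hypothetical OCR results: the original text, two misreads, and two wrong values.
samples = [
    "Invoice #494736RT",     # exact text
    "lnv0ice#A9A736rt",      # l for I, 0 for o, A for 4, no space, lower case
    "1nvo1ce   #494736RT",   # 1 for I and i, extra white space
    "Invoice #494736RTX",    # extra trailing letter, blocked by \b
    "Invoice #494737RT",     # different number
]

results = {s: bool(pattern.search(s)) for s in samples}

for sample, matched in results.items():
    print(f"{sample!r}: {matched}")
```

The character classes absorb common look-alike substitutions, while the literal digits and the closing \b keep the pattern from matching a different invoice number.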

 

NOTE: Please see http://www.regular-expressions.info/ for more detailed information and tutorials on regular expressions.

 

 

Keywords: PSIcapture Regular Expressions Configuration, PSIcapture Configuration Regular Expressions, How to Configure Regular Expressions PSIcapture, PSIcapture Configuration Regex

 
