PSIcapture Administrator Guide: Classification Workflow Step Configuration

Version 7.7.x

 

 Audience

This article is intended for PSIcapture Administrators.

 

Overview

 

PSIcapture's Classification Workflow step enables users to validate and match specific data to incoming documents in a variety of ways. To better understand Classification, we've broken Classification into three major focus areas:

  • Page Validation – When examining forms, users need to decide the type of page validation required when processing forms. Page validation in the Classification engine defines separation and page merging functionality.
  • Forms Identification – Currently in PSIcapture, users can define and classify forms based on OCR match criteria or barcode recognition. This is the most critical planning step, and will ultimately define how pages are classified and documents are created.
  • Data Extraction – The ultimate goal in classification is to identify the correct Form ID, and then extract data based on the assigned Record Type.

 

Classification Form Definitions

The Form Definition in Classification allows users to define all the characteristics of a form, how to identify or classify, and provides key methods for how PSIcapture will behave when a classification occurs. In the Classification Workflow Step, users have a few more options than the Global Classification area of the root Configuration dialog, including:

  • Add, Insert, Edit, Copy and Delete - Classification forms are set up through these options. Insert places the form in order below the selected form.
  • Move Up, Move Down, Move To... - Change the order of the classification forms by either manually adjusting row by row, or using the "Move To..." to specify an exact position in the list.
  • Show Details - Shows a variety of additional details, seen by scrolling to the right, which include Rules, Table Extraction, Description, etc.
  • Select From Global - Opens the Global Classification Form Selection dialog:

    selectfromglobal1.png
     
    • Link to selected Classification Forms - Select a Global Classification Form and link it to the Classification Workflow step of this Capture Profile. Any changes to the Global Classification Form will be reflected in this capture profile.
    • Copy selected Classification Forms - Select a Global Classification Form and copy it to the Classification Workflow step of this Capture Profile. Any changes to the Global Classification Form will not be reflected in this capture profile.
  • Copy to Global - Copy one or more Classification Form(s) to the Global Classification Forms list.
  • Import/Export - Users can Import and Export Classification Form Definitions from other Capture Profiles or PSIcapture Installations.
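
The difference between linking and copying a Global Classification Form can be pictured as reference versus copy semantics. The sketch below is only an analogy in Python: the dictionaries and field names are hypothetical stand-ins and do not describe how PSIcapture stores forms.

```python
import copy

# Hypothetical in-memory stand-in for the Global Classification Forms list.
global_forms = {"Invoice-A": {"record_type": "Invoice", "rules": ["INVOICE"]}}

# "Link": the Capture Profile refers to the same global object.
linked_profile = {"Invoice-A": global_forms["Invoice-A"]}

# "Copy": the Capture Profile holds an independent snapshot.
copied_profile = {"Invoice-A": copy.deepcopy(global_forms["Invoice-A"])}

# A later change to the global form...
global_forms["Invoice-A"]["rules"].append("TAX ID")

# ...is visible through the link but not in the copy.
linked_rules = linked_profile["Invoice-A"]["rules"]
copied_rules = copied_profile["Invoice-A"]["rules"]
```

After the change, `linked_rules` contains both rules while `copied_rules` still contains only the original one.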

 

class1.png

 

Classification Table Display

  • Form ID - This is the name of the form. 
  • Record Type - A record type could be something like an invoice, quote, purchase order, etc. This is another way to separate your forms.
  • Group Type - A group type could be something like manufacturing, tax, HR, construction, etc. This allows the user to group forms together per industry for instance.
  • Validated - This shows whether a Form has been validated or not.
  • Zone Profile Triggered - This shows which Zone Profile is triggered when the application recognizes a form. The gear icon allows users to edit the Zone Profile from Classification Settings.

    Zone Profile Editing Options - When a user clicks the gear icon, three options are available:

    • Edit the Zone Profile associated with the Form. If multiple profiles are associated to the current Form the option of choosing which profile to edit is available.

    • Create a new Zone Profile for the Form.
    • Select a different Zone Profile not associated with the Form.

A statistics area at the bottom of the forms list shows the total number of forms, the number and percentage of forms that are validated, the number of record types, and the number of form groups. Users can also choose whether to show if a form has been validated via a checkbox in the far right column of the forms list.

 

 

NOTE: The View Usage button allows users to view global usage of each form. When clicking on View Usage the following window will pop up allowing users to run a query by timeframe.

 

 

Finally, at the bottom of the Classification Forms section, there are three notable functions:

 

nottom.png

 

Process Classification using all global Classification Forms... - if this option is enabled, new Classification Forms will always be saved to the Global Forms collection. Additionally, when Classification is run, it will always reference the Global Classification list as well as the local classification list.

 

Highlighted rows indicate globally linked Classification Forms... - Rows highlighted in this shade indicate that the locally listed Classification Form is linked to a Global Classification Form. Changes to either the Global Form or the Local Form are reflected in its counterpart.

 

Classification Form Validation options

Require entry of the following setting on Classification Forms: The administrator can require that whenever a Classification Form is created, either in the Configuration dialog or in Accelerated Classification (ACE), the user must fill in each checked field: Group, Record Type, Description, or Tag. If the user attempts to save a Classification Form without completing these fields, an error message is displayed:

 

err.png
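
As a rough illustration of this check, the Python sketch below returns the required settings that are still empty on a form. The function name, the dictionary layout, and the choice of required fields are hypothetical; the product performs this validation internally.

```python
def missing_required_fields(form, required):
    """Return the required setting names that are empty on a form definition."""
    return [name for name in required if not str(form.get(name) or "").strip()]

# Hypothetical administrator choice: Group and Tag are checked as required.
required = ["Group", "Tag"]
form = {"Form ID": "Invoice-A", "Group": "Tax", "Record Type": "Invoice", "Tag": "  "}

# A whitespace-only Tag counts as missing in this sketch.
problems = missing_required_fields(form, required)
```

A non-empty result corresponds to the situation in which saving the form is refused.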

 

Adding Form Definitions

Clicking the Add button will open the Form Definition dialog. As mentioned, this provides an interface for defining all the characteristics of a form. Within this configuration interface, users have the standard template toolbar which allows them to load or scan a template image, as well as a set of zooming tools.

 

class2_1.png

 

Form Settings

 

class2_2.png

  • Form ID – The Form ID is the name of the form these characteristics define. Note: This name will be available as a variable, and be placed in a linked index field.
  • Group – The Group allows users to create subsets of forms and currently is purely for organization within the configuration.
  • Record Type – This dropdown will link to the configured Record Types on step 3 of the configuration wizard, and allows the linking of the Form Definition to the chosen Record.
  • Description – Allows a user defined description of the form.
  • Tag – Tags can be mapped to an index field. Jump to the Classification Settings - General section below for the mapping area.

    NOTE: The Tag value is also available in the following product areas:

    • When using ACE in a Classification Workflow Step:

      ace23.png

      Whether the Tag field is displayed in Standard or Advanced mode, as with other ACE Settings, can be controlled via the "Display Mode Options" > Advanced section of the PSIcapture Administrator Guide: Accelerated Classification Engine (ACE) article:

      tag2.png

    • Also, Tags are displayed on the Classification Form Database Import screen:

      database_import_tag.png

  • Page Count – For forms of specified page lengths, this count will be utilized in page validation.
  • Usage Ranking Behavior - This option allows users to keep the form's current usage-ranked position or to override the usage ranking so that the selected form is processed at the beginning or end of the Form list.
    • Use Ranked position

    • Override Ranking and process Form at the beginning of the Form list

    • Override Ranking and process Form at the end of the Form list

Tool Icons

 

When clicked, a pop-up window opens allowing the user to choose what text will be used to identify the Form.
When clicked, the Barcode Recognition window opens allowing the user to choose what barcode will be used to identify the Form.
When clicked, the application verifies that it recognizes the text or barcode defined.
When clicked, the Edit Regex window opens allowing the user to edit the regex for the rule.
Deletes the rule.

 

Rules

Classification Rules

The Classification Rules section of the module provides the ability to input one or more rules that will define the form. Below are the options:

 

class2_3.png

  • Match – Users can choose a positive or negative match for the rule, and combine them to build a series of rules that will define the form. For instance, users may have a form that has “Form OFS 2” on the top, but there are two versions, with different locations for the required data. One form has “Version 2” on the bottom, one does not. Users can use a negative rule to make sure the form without Version 2 is properly identified.
  • Rule Type – Currently there are two types of rules, OCR Text and Barcode.
  • Search Region - This allows the user to select where on the page the OCR text is searched for.
  • Index Value - This allows the user to select which index field to set the value of using the classification rule.
  • Rule Value – The Rule Value provides an entry point for a regular expression to match either the barcode value or an OCR expression. This will trigger the classification and setting of Record Type.
  • Rule Match Behavior – If users have multiple rules, this drop down will provide a means to logically combine them to define the overall match.
    IMPORTANT NOTE: Rule Matching behavior applies to all corresponding Classification areas, including ACE and Database Import settings.
    Users can either choose to:

    rules1.png

    • Classify based on first matching rule - Classifications will be matched on the first matching rule.
    • Positive Classification if all positive rules match and no negative rules match - make the combination of all the positive rules required, with no matches to negative rules.
    • Positive Classification if any positive rules match and no negative rules match - make any of the positive rules required, with no matches to negative rules.
    • Positive Classification if N or more positive rules match and no negative rules match - specify a minimum threshold for number of rules that must be matched, with no matches to negative rules.
    • Custom matching using Rule Sets - Group different Rules into custom Rule Sets by adding Rules via the "Add" button. The following extension to the dialog window is added:

      rules3.png

      • Rule Set match behavior - Define how the matching process works for your customized Rule set with the following options:

        rules2.png

      • Positive Classification if all Rule Sets match - make the combination of all the positive Rule sets required, with no matches to negative rules.
      • Positive Classification if any Rule Set matches - make any of the positive Rule sets required, with no matches to negative rules.
      • Positive Classification if N Rule Sets match - specify a minimum threshold for number of rule sets that must be matched, with no matches to negative rules.

Note: The order of rules can be used to the user's advantage as rules are processed in the order of entry.
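
The all / any / N-or-more behaviors can be sketched in Python as follows. This is a minimal model under stated assumptions, not PSIcapture's implementation: the `Rule` class, the `classify` function, and the behavior names are hypothetical, rules are treated as plain regular expressions searched in the page text, and the "first matching rule" and Rule Set modes are left out because their exact semantics are not spelled out here.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    pattern: str           # regular expression, as entered in Rule Value
    positive: bool = True  # False models a negative match rule

def classify(text, rules, behavior="all", n=1):
    """Return True when the rules produce a positive classification for text."""
    hits = [bool(re.search(r.pattern, text)) for r in rules]
    # Any matching negative rule blocks the classification in these modes.
    if any(hit for r, hit in zip(rules, hits) if not r.positive):
        return False
    positive_hits = [hit for r, hit in zip(rules, hits) if r.positive]
    matched = sum(positive_hits)
    if behavior == "all":
        return bool(positive_hits) and matched == len(positive_hits)
    if behavior == "any":
        return matched >= 1
    if behavior == "at_least_n":
        return matched >= n
    raise ValueError(f"unknown behavior: {behavior}")

# The "Form OFS 2" example: identify the version WITHOUT "Version 2".
original_form = [Rule(r"Form OFS 2"), Rule(r"Version 2", positive=False)]
page_v1 = "Form OFS 2\nApplicant name"
page_v2 = "Form OFS 2\nApplicant name\nVersion 2"
```

With these rules, `classify(page_v1, original_form)` is positive, while `page_v2` is rejected because the negative rule matches.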

 

Last Page Classification Rules

If Last Page Rule processing is enabled and a Form Definition contains Last Page Rules, then when that Form is classified, all other Page Validation and classification is disabled and classification will only search for a matching last page for that form. Once it is found, all pages up to that page will be added to that Form and classification will switch back to normal processing, looking for matches for all defined forms. The special case where the first page of a Form is also its last page is handled as well.

If a Form Definition does not contain Last Page Rules, then the selected option under Page Validation will be used (Loose, Strict, None). This allows users to mix both types of validation in case they aren't able to use Last Page Rules for all of their forms.
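
The Last Page Rule behavior can be sketched as a simple page loop. The Python below is an illustration only: the function name and data layout are hypothetical, every form in the example has a Last Page Rule, unclassified pages are simply skipped, and a document runs to the end of the batch if its last page is never found (an assumption, since that case is not described here).

```python
import re

def split_with_last_page_rules(pages, forms):
    """Group page texts into (form_id, [page indexes]) documents.

    forms maps form_id -> (first_page_regex, last_page_regex).
    """
    documents = []
    i = 0
    while i < len(pages):
        # Normal processing: look for a match among all defined forms.
        match = next((fid for fid, (first, _) in forms.items()
                      if re.search(first, pages[i])), None)
        if match is None:
            i += 1  # unclassified pages are skipped in this sketch
            continue
        last_rule = forms[match][1]
        end = i
        # Only search for this form's last page; the first page may also be the last.
        while end < len(pages) - 1 and not re.search(last_rule, pages[end]):
            end += 1
        documents.append((match, list(range(i, end + 1))))
        i = end + 1  # switch back to normal processing
    return documents

pages = [
    "CLAIM FORM page 1",
    "details",
    "Signature END OF CLAIM",
    "RECEIPT Total due END OF RECEIPT",
    "CLAIM FORM short version END OF CLAIM",
]
forms = {
    "Claim": (r"CLAIM FORM", r"END OF CLAIM"),
    "Receipt": (r"RECEIPT", r"END OF RECEIPT"),
}
result = split_with_last_page_rules(pages, forms)
```

Here the first three pages form one Claim document, and the last two pages each form a single-page document because their first page is also their last page.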

 

lastpage1.png

 

Table Extraction-Line Items

 

tableex1.png

 

Form Qualifiers

This allows classification based on the page orientation or the size of the form. It can serve as an additional criterion for defining a form, or be used by itself, with no rules, to define a form. For example, when scanning checks and check stubs, users can assign a record type of Check when certain page size criteria are met.

 

 

Importing Classification Forms

Clicking the Import button in the Classification Module settings displays a dialog allowing users to choose which type of import to perform:

 

 

Database Import

 

class2.png

 

Database Connection

  • Database Type  - Specify the database type from the dropdown menu.
  • Database - Manually build a connection string, or use the "Select" button to pull up the database connection dialogs and browse for an available database, which will then automatically build a connection string based on your input and selections.
  • Table or view - Select the table or view applicable to this database import.

Import Definition

  1. Form ID - This field is required, and should be unique. Form ID, Description and Rules all use the standard Build Custom Value dialog to build those values from different database fields/constants.
  2. Description - Build a description, which can be based on an index field.
  3. Tag - Set up a Tag, which can also be mapped to an index field.
  4. Record Type - Set a record type from Existing, Map to a Database Column, or Enter Manually.
  5. Group - Set a group from Existing, Map to a Database Column, or Enter Manually.
  6. Page Count - Map this value to a Database Column or Enter Manually.

The other fields are all optional including Rules. Setting up Rules during this step applies them universally across all imported forms. By making Rules optional, it allows the user to come back later and add rules to individual forms.

When defining Rules, users can either use the values from the table as is, or run the values through the Regex Builder to generate the necessary regular expressions. This behavior is controlled for each rule separately using the “Convert to Regex” option. The global Regex Options can be accessed using the Regular Expression Options button.
NOTE: Additionally, as of PSIcapture 7.7+, once "Convert to Regex" is selected, administrators have the further option to specify the Regex Format:

standardnumeric.png

    • Standard - Uses the global Regular Expression Builder options to generate the regular expression.

    • Numeric - Format Independent - Generates a Regular Expression that will match any text that contains the numeric digits of the value being used to generate the regular expression no matter what extra characters are also in the text.
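
To make the Numeric - Format Independent idea concrete, here is one plausible construction in Python. The exact expression PSIcapture generates is not documented in this article, so the function name and the generated pattern are assumptions: the sketch keeps only the digits of the source value and allows any run of non-digit characters between them.

```python
import re

def numeric_format_independent(value):
    """Build a regex matching the digits of value with any non-digits between them."""
    digits = [ch for ch in value if ch in "0123456789"]
    if not digits:
        raise ValueError("value contains no digits")
    # \D* tolerates spaces, dashes, dots, or nothing at all between digits.
    return r"\D*".join(digits)

pattern = numeric_format_independent("555-12-3456")
same_number = re.search(pattern, "Acct: 555 12 3456") is not None
other_number = re.search(pattern, "Acct: 555-12-9999") is not None
```

The same database value therefore matches whether the document prints it with dashes, spaces, or no separators, while a different number does not match.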

Rule Match Behavior - This corresponds with the Match Type column available on Classification Form rules and last page rules. See the Rules section above for more information. When rules are created during the import, they will be generated with the Match Type defined on the rule definition on the import template.

 

Import Options

  1. Duplicate Form ID Behavior – Users can either skip creation of a form if a duplicate is found or add the rules to an existing form.
  2. “Mark Imported Classification Form Definitions as Not Validated….” – If selected, this option will import the form as Not Validated. If the corresponding option on the Classification Definition settings is selected (see below), documents that match these Non Validated Forms will be treated as Exceptions to be processed on the Classification Validation dialog. To validate the Form, the user will open the Form in the ACE dialog. When they save out of ACE, the form will be validated for that document, any others in the batch of that type of Form and all future documents classified as that Form type.
  3. "Do not create Classification Form Definitions that have no rules" - If selected, a Form Definition for which no rules would be added is not created. The system warns the user and lists which Form Definitions were not created.
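
The two Duplicate Form ID behaviors can be sketched as follows. This Python is illustrative only: the function name, the `"skip"` / `"add_rules"` labels, and the dictionary representation of forms are hypothetical.

```python
def import_forms(existing, incoming, on_duplicate="skip"):
    """Merge imported (form_id, rules) rows into existing {form_id: [rules]}.

    on_duplicate is "skip" or "add_rules", mirroring the two described behaviors.
    Returns the list of Form IDs that were skipped.
    """
    skipped = []
    for form_id, rules in incoming:
        if form_id not in existing:
            existing[form_id] = list(rules)      # new form: create it
        elif on_duplicate == "add_rules":
            existing[form_id].extend(rules)      # duplicate: append its rules
        else:
            skipped.append(form_id)              # duplicate: leave the form untouched
    return skipped

rows = [("INV", ["Invoice No"]), ("PO", ["PURCHASE ORDER"])]

forms_skip = {"INV": ["INVOICE"]}
skipped = import_forms(forms_skip, rows, on_duplicate="skip")

forms_add = {"INV": ["INVOICE"]}
import_forms(forms_add, rows, on_duplicate="add_rules")
```

In the first run the existing INV form keeps its single rule; in the second it gains the imported rule.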

Sample Database Import

 

class3.png

 

Scheduling Recurring Database Imports

 

Once a database connection has been configured, select the "Load/Manage Templates" button at the top of the Classification Forms Database Import dialog:

 

load1.png

 

Once selected, the following screen will appear:

 

db1.png

 

Select the "Calendar" icon next to the configured Database Import profile that you wish to schedule for a recurring import. The following dialog will appear:

 

db2.png

 

Enable Recurring Automated Import - Select this box to enable a regularly scheduled database import. Then, specify whether you wish to import the Classification Profile to the "Global Classification" Forms list, or restrict it to a specific Capture Profile via "Capture Profile", and then pick the corresponding Capture Profile by hitting "Select Capture Profile".

 

Recurrence Settings... - Set up the recurrence of the Database Import via the following dialog:

 

db3.png

 

Customize Interval and time of day to perform the import according to your business needs.

 

Custom Text File Import

All users need to do is Browse to the location of the text file and click the Import button.

 

XML Import

This allows users to select an XML file that they have exported previously from the Form Definitions export option. NOTE: In versions 6.0.2.x and below this import option is only available in the Classification Configuration settings of the main configuration.

Export

This allows users to export an XML file from Classification Workflow Settings.

 

Classification Settings - General

 

class2.png

 

OCR Text Classification Settings

The Classification module works by extracting a specified amount of header and footer text from the processed page, and then searching for match terms. The module allows users to adjust these settings to take in more or less text, depending on the structure of the forms they are processing.

 

 

Users can define the text region either as a number of lines or as an area, and then set how much of either should be read.

NOTE: The more area users consume, the more time it will take the engine to process the text. This can become a performance issue if users are processing large areas or number of lines.
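
For the line-based setting, the idea can be sketched in a few lines of Python. The function name and the default line counts are hypothetical and are not PSIcapture defaults; the point is only that a smaller header and footer window means less text to search.

```python
def classification_text(page_lines, header_lines=10, footer_lines=5):
    """Return the header and footer text that would be searched for match terms."""
    header = page_lines[:header_lines]
    # Avoid returning the same line twice on short pages.
    footer_start = max(len(page_lines) - footer_lines, len(header))
    footer = page_lines[footer_start:]
    return "\n".join(header), "\n".join(footer)

lines = ["L1", "L2", "L3", "L4", "L5", "L6"]
header, footer = classification_text(lines, header_lines=2, footer_lines=2)
```

Only `header` and `footer` would then be checked against the classification rules; the middle of the page is ignored.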

 

Indexing Options

 

indexing1.png

 

  • Index field to populate Form ID - This option allows users to choose to assign the Form ID to an index field or populate the Page Information dialog with Classification OCR Text for troubleshooting.
  • Index field to populate with Group Name - This option allows users to assign the Group Name to an index field.
  • Index field to populate with Record Type - This option allows users to assign the Record Type to an index field.
  • Index field to populate with Description - This option allows users to assign the Description to an index field.
  • Index field to populate with Tag - The "Tag" field, described in the "Form Settings" section of Adding Form Definitions above, can be mapped to an index field. This links index fields to classification form Tags, giving each classification form an independent "Tag" value that can be used for specific identification configurations.

 

OCR Text Viewing Options

These options allow the user to view the OCR text in either the Page Information dialog or the Classification Validation dialog.

 

 

 

Processing Options

 

class3.png

 

Group Processing Options

In PSIcapture 7.7+, Processing Options was moved to its own page. Group processing was put into its own section, and allows users to process Classifications based on group designation.

 

Groups to Process

  • Select Process Groups - Process Documents based on specific group of classification profiles.
    • Example: A PSIcapture Administrator wishes to test a workflow on a new set of Classification profiles. They set up two groups: Production and Testing. To have only Production forms used for classification, they select just that Group here.

 

Form Processing Setup

 

 

This option allows the choice of processing forms in either the order the forms are defined (default) or by group. When selecting By Group, the forms will be separated into their groups and each group will be processed in the order the groups are defined. The By Usage Ranking option processes the forms that are matched most often first, allowing for faster processing.

Form Processing Order

  • Defined Order
  • By Group
  • By Usage Ranking
  • By Usage Ranking and by Groups

Calculate Usage Ranking using (only available when usage ranking is selected)

  • All Usage
  • Usage from the last 3 months
  • Usage from the last 6 months
  • Usage from the last 12 months
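
The four processing orders can be sketched as sort keys. The Python below is a simplified model: the `Form` class, the mode names, and the `override` encoding are hypothetical, `usage` stands for the match count within the selected usage window, and per-form ranking overrides are applied only in plain usage-ranking mode because their interaction with groups is not described here.

```python
from dataclasses import dataclass

@dataclass
class Form:
    form_id: str
    group: str
    usage: int = 0          # match count in the selected usage window
    override: str = "rank"  # "rank", "first", or "last" (hypothetical encoding)

def processing_order(forms, mode, group_order=()):
    """Return form IDs in processing order for one of the described modes."""
    indexed = list(enumerate(forms))  # defined order doubles as a tie-breaker

    def group_pos(form):
        return group_order.index(form.group) if form.group in group_order else len(group_order)

    if mode == "defined":
        key = lambda p: p[0]
    elif mode == "group":
        key = lambda p: (group_pos(p[1]), p[0])
    elif mode == "usage":
        key = lambda p: (-p[1].usage, p[0])
    elif mode == "usage_and_group":
        key = lambda p: (group_pos(p[1]), -p[1].usage, p[0])
    else:
        raise ValueError(f"unknown mode: {mode}")

    ranked = [form for _, form in sorted(indexed, key=key)]
    if mode == "usage":
        # Apply the per-form Usage Ranking Behavior overrides.
        first = [f for f in ranked if f.override == "first"]
        middle = [f for f in ranked if f.override == "rank"]
        last = [f for f in ranked if f.override == "last"]
        ranked = first + middle + last
    return [f.form_id for f in ranked]

forms = [
    Form("A", "HR", usage=1),
    Form("B", "Tax", usage=9),
    Form("C", "HR", usage=5, override="last"),
    Form("D", "Tax", usage=0, override="first"),
]
```

For example, `processing_order(forms, "usage")` puts the overridden form D first and C last, with the remaining forms ordered by how often they matched.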

Page Processing

 

 

Run Classification on:

  • All Pages of document (Default)
  • First Page of Document Only
  • Custom Page List of Document

Enter Pages to Classify

Allows users to enter a list of pages that classification should be run on. NOTE: Available if Custom Page List of Document is selected.

Group Processing Options

This allows the ability to filter which Classification forms are run on a Capture Profile by selecting from a list of form groups.

 

 

Classification - Page Validation

 

classification_7.7.png

 

Page Validation Options

The type of forms that users are processing will determine the type of Page Validation to choose. Page validation in the Classification Module determines how pages will be combined and validated during the classification process. Users have several choices in how pages are treated once a Form Type is identified/matched to a classification rule. In validation methods that require page count, counts are referenced from the Form ID Definition. Below are the types of validation and an explanation of the behavior:

  • Loose Page Count Validation – In this validation method, once a Form is identified, subsequent pages will be added to the form until the page count is reached, or until another Form is identified. This method can be utilized with both fixed page count forms and variable length documents, such as invoices, in mixed batches.
  • Strict Page Count Validation – In Strict mode, the product will count form pages, and if they do not equal the page count defined in the Form Definition, an exception will occur.
  • No Page Count Validation – In this method, page counts are totally ignored, and the combining of pages can occur based on one of the chosen options:
    • Combine non-classified document with previously classified document
    • Combine classified document with previously classified document if same Form ID
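
The Loose and Strict behaviors can be contrasted with a short sketch. This Python is a simplified model, not the product's logic: function names and data layout are hypothetical, and a form with no Page Count is treated as variable length with no upper limit, which is an assumption.

```python
def loose_group(page_form_ids, page_counts):
    """Group pages under Loose Page Count Validation.

    page_form_ids: per-page classification result (Form ID or None).
    page_counts:   {form_id: Page Count from the Form Definition}.
    """
    documents, current = [], None
    for index, form_id in enumerate(page_form_ids):
        if form_id is not None:
            # Another Form was identified: start a new document.
            current = {"form_id": form_id, "pages": [index]}
            documents.append(current)
            continue
        limit = page_counts.get(current["form_id"]) if current else None
        if current and (not limit or len(current["pages"]) < limit):
            current["pages"].append(index)  # page count not yet reached
        else:
            current = None
            documents.append({"form_id": None, "pages": [index]})  # unclassified page
    return documents

def strict_exceptions(documents, page_counts):
    """Return classified documents whose page total differs from the Page Count."""
    return [d for d in documents
            if d["form_id"] and len(d["pages"]) != page_counts[d["form_id"]]]

page_results = ["W9", None, None, "INV", None, "INV"]
docs = loose_group(page_results, {"W9": 2})
failed = strict_exceptions(docs, {"W9": 2, "INV": 2})
```

In this batch the W9 form stops absorbing pages after its second page, the invoice without a page count keeps absorbing pages until the next form starts, and a strict check against a two-page invoice definition flags the final single-page invoice.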

Last Page Rule validation options

  • When a Form is matched that has Last Page Rules defined...

This option enables/disables Last Page Rule processing. 

If Last Page Rule processing is enabled and a Form Definition contains Last Page Rules, then when that Form is classified, all other Page Validation and classification is disabled and classification will only search for a matching last page for that form. Once it is found, all pages up to that page will be added to that Form and classification will switch back to normal processing, looking for matches for all defined forms. The special case where the first page of a Form is also its last page is handled as well.

If a Form Definition does not contain Last Page Rules, then the selected option under Page Validation will be used (described in the above section). This allows users to mix both types of validation in case they aren't able to use Last Page Rules for all of their forms.

  • Classification should fail on a Document if...

When checked, this validates that the page count of each Document that has run through Classification restructuring matches the page count defined on the form that Document was classified as. If the page counts don't match, the Document fails classification and an alert with the appropriate error message is attached to that Document. Because it fails classification, the Classification Validation dialog is displayed with that Document marked as failed.

  • Start new Document if classification of a new form...

When this box is checked a new document will be created if a new Form ID is triggered before the Last Page Rule is processed.

  • Only start a new Document if Form ID...

When checked, a new Document is started ONLY if the newly matched Form ID is different from the Form ID of the Document currently being processed.

 

Classification - Exceptions Processing

If there are any exceptions during the validation process, users can either interactively fix them in the classification module, or auto-reject to an exceptions batch based on the selection in the Exceptions Processing section below:

 

exceptions1.png

 

Enable Exceptions Processing: Enable the ability, when encountering a failed classification, to process the exception by one of the following:

  • Load Batch in Classification module to allow user to manually classify or reject Documents

  • Automatically reject Documents to Exceptions Batch which fail classification

 

User Experience Example: Exceptions that occurred during the classification process will appear with a triangular exclamation icon to their left:

 

 

Check for Classification alerts when user closes Batch - When enabled, PSIcapture will check for Classification Failed alerts on the Batch when a user attempts to close the Batch out of Classification. If Classification Alerts are found, the Batch Errors dialog will be displayed with the list of Documents that have Classification Alerts on them.

 

Note: The action available to the user will depend on the "Do not allow Batch to proceed to next Workflow step if it contains Classification alerts" setting:

  1. Not Checked: User will be asked if they still want to close the Batch:

    screenshot-1.png

  2. Checked: User will not be given the option to close the Batch:

    BatchErrorDialogDontAllowClose.png
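
The interaction between the two settings can be summarized as a small decision function. This Python sketch uses hypothetical parameter names and return labels; it only restates the behavior described above.

```python
def close_batch_action(has_alerts, check_on_close, block_next_step):
    """Return what happens when a user tries to close a Batch out of Classification."""
    if not check_on_close or not has_alerts:
        return "close"
    # Alerts were found: the Batch Errors dialog lists the affected Documents.
    if block_next_step:
        return "errors_dialog_no_close"
    return "errors_dialog_confirm_close"

outcome = close_batch_action(has_alerts=True, check_on_close=True, block_next_step=False)
```

With alerts present and the blocking setting unchecked, the user sees the Batch Errors dialog and may still choose to close the Batch.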


Accelerated Classification Engine Options

 

options2.png

 

We have moved Classification Exceptions Processing to its own tab and added a new feature called Accelerated Classification Engine (ACE). As of PSIcapture 7.4+, and expanded in PSIcapture 7.7+, we have added additional automation options, including:

  • Form ID auto-naming
  • Automated Rule creation
  • Automated Last Page Rule creation

 

For a full breakdown of these options, see:

 

PSIcapture Administrator Guide: Accelerated Classification Engine (ACE)

 
