Extracting data from packets
To process data across related document types, extend your project schema to manage packets.
Packets are sets of related documents processed as a unit, such as a loan application with supporting bank statements and tax documents. While each document type has specific data points, they also share common information like applicant name that you can consolidate at the packet level.
Packet-level data consolidation is achieved with cross-class fields, which leverage fields from existing classes to provide overarching results. Projects with cross-class fields become packet-processing apps. Conversely, removing all cross-class fields reverts the project to a standard automation app.
Understanding packets
When you add cross-class fields to a project, existing project files are automatically organized into packets based on upload batch. You can manually reorganize packets as needed.
During project development, aim for about five representative packets so you can see how your app performs across a variety of packets.
In production, upload structure dictates packet composition. Each upload, whether file, folder, or email, is treated as a packet and processed in its own run. Plan upstream deployment integrations with this constraint in mind.
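Because each upload becomes one packet, an upstream integration typically needs to group related documents before sending them. The following is a minimal sketch of that grouping step only. It assumes a hypothetical file naming scheme of `<application_id>_<document>.pdf`, and the function name `group_into_packets` is illustrative rather than part of the product.

```python
from collections import defaultdict


def group_into_packets(filenames):
    """Group file names into packets keyed by an application ID.

    Assumption for illustration: the application ID is the part of the
    file name before the first underscore.
    """
    packets = defaultdict(list)
    for name in filenames:
        application_id, _, _ = name.partition("_")
        packets[application_id].append(name)
    return dict(packets)


files = [
    "A100_application.pdf",
    "A100_bank_statement.pdf",
    "A200_application.pdf",
    "A100_tax_document.pdf",
]

packets = group_into_packets(files)
for application_id, members in packets.items():
    # Each group would then be sent as its own upload (for example, one
    # folder per application) so that it is processed as a single packet.
    print(application_id, members)
```

How each group is actually uploaded depends on your deployment integration; the sketch covers only the grouping.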
Reorganizing packets
As you develop packet-processing apps, you can manually reorganize project files to reflect a different packet structure if needed.
- From the Cross-class tab, in the aggregated table view, select Reorganize packets.
- In the Reorganize packets window, drag documents to the appropriate packet or click Add packet to create new packets as needed, then click Save.
Saving automatically triggers re-extraction of cross-class fields to provide updated results that reflect the new packet organization.
Creating cross-class fields
Create cross-class fields for data points that need to be consolidated across classes.
Before you begin
You must create classes and fields before you can add cross-class fields, because cross-class fields build on class fields. For details, see Extracting data from documents.
- From the Cross-class tab, click Add cross-class field.
- Enter a field name and select a cross-class field type, then specify required details based on the field type.
- When you're satisfied with the result, exit the cross-class field editing panel and continue adding fields.
Cross-class field types
Choose the cross-class field type that matches how you want to consolidate data across classes.
- Ranked: Used to select the best result among multiple input fields based on criteria you specify: first valid field, highest field confidence, or highest OCR confidence. If your ranking logic specifies First valid field, select input fields in prioritized order. For confidence-based ranking logic, the order of input fields doesn't matter.
- Derived: Used to generate values based on class fields. In the Prompt, reference fields by field name: either type the field name or select it from the dropdown. For example: "Generate a risk score by combining Tax document: Annual income, Bank statement: Average balance, and Application: Loan amount using the formula: (income + balance) / loan amount."
  When referencing table or list extraction fields, derived fields try to match the input format. For consistent results, especially when combining different field types, consider normalizing tables or lists as text first.
- Custom function: Used to compute values or import third-party data with a custom Python function. For more details, see Cross-class custom function fields.
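To make the Ranked criteria and the example Derived formula concrete, here is a minimal Python sketch of the logic they describe. It illustrates the selection rules and is not the product's implementation. The `FieldResult` class, the `rank` and `risk_score` functions, the logic names, and the sample values are all hypothetical. Treating empty values as invalid is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FieldResult:
    """Hypothetical container for one input field's extraction result."""
    name: str
    value: Optional[str]
    field_confidence: float = 0.0
    ocr_confidence: float = 0.0


def rank(inputs: Sequence[FieldResult], logic: str) -> Optional[FieldResult]:
    """Pick one result from the input fields according to the ranking logic."""
    # Assumption: a field is "valid" when it has a non-empty value.
    valid = [f for f in inputs if f.value not in (None, "")]
    if not valid:
        return None
    if logic == "first_valid":
        # Input order matters: the first valid field wins.
        return valid[0]
    if logic == "highest_field_confidence":
        return max(valid, key=lambda f: f.field_confidence)
    if logic == "highest_ocr_confidence":
        return max(valid, key=lambda f: f.ocr_confidence)
    raise ValueError(f"unknown ranking logic: {logic}")


def risk_score(income: float, balance: float, loan_amount: float) -> float:
    """The formula from the example prompt: (income + balance) / loan amount."""
    return (income + balance) / loan_amount


inputs = [
    FieldResult("Application: Applicant name", "", 0.0, 0.0),
    FieldResult("Bank statement: Account holder", "Jane Q. Doe", 0.81, 0.97),
    FieldResult("Tax document: Taxpayer name", "Jane Doe", 0.93, 0.88),
]

print(rank(inputs, "first_valid").value)
print(rank(inputs, "highest_field_confidence").value)
print(rank(inputs, "highest_ocr_confidence").value)
print(risk_score(80_000, 20_000, 250_000))
```

With these sample values, first-valid ranking skips the empty application field and depends on input order, while the two confidence-based rankings give the same answer however the inputs are ordered.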
Cross-class custom function fields
The cross-class custom function field type lets you use a Python function to consolidate or compute values across multiple document classes within a packet.
For example, you might use a cross-class custom function to prioritize applicant name based on multiple sources:
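A minimal sketch of such a function is shown here under stated assumptions. The function name, the single `fields` dictionary parameter, and the "Class: Field" key format are hypothetical and chosen only for illustration. See Writing custom functions for the signature the product supports.

```python
def prioritize_applicant_name(fields: dict) -> str:
    """Return the applicant name from the most trusted source that has one.

    Hypothetical input: a mapping from "Class: Field" labels to extracted
    values, where a missing or blank value means nothing was extracted.
    """
    # Sources listed from most to least trusted (an illustrative ordering).
    priority = [
        "Application: Applicant name",
        "Tax document: Taxpayer name",
        "Bank statement: Account holder",
    ]
    for key in priority:
        value = (fields.get(key) or "").strip()
        if value:
            return value
    # No source produced a usable name.
    return ""


example = {
    "Application: Applicant name": "  ",
    "Tax document: Taxpayer name": "Jane Doe",
    "Bank statement: Account holder": "J. DOE",
}
print(prioritize_applicant_name(example))
```

In this sample the application's name is blank, so the function falls back to the tax document value.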
Cross-class custom function fields accept these parameters:
For additional guidance about custom functions, see Writing custom functions.
Handling multiple documents per class
When a packet contains multiple documents of the same class, the system uses the result with the highest field confidence, or the most frequent value if confidence scores aren't available.
For example, if a packet contains two paystubs, the system selects the gross pay with the highest confidence score for cross-class field calculations.
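The selection rule described above can be sketched in Python as follows. This is an illustration of the stated behavior, not the system's actual code. The `resolve` function and the sample values are hypothetical. Falling back to frequency only when no result carries a confidence score is an assumption about mixed cases.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# One (value, field_confidence) pair per document of the same class.
Result = Tuple[str, Optional[float]]


def resolve(results: Sequence[Result]) -> Optional[str]:
    """Choose a single value from several documents of the same class."""
    if not results:
        return None
    scored = [(value, conf) for value, conf in results if conf is not None]
    if scored:
        # Prefer the value with the highest field confidence.
        return max(scored, key=lambda pair: pair[1])[0]
    # No confidence scores available: use the most frequent value.
    return Counter(value for value, _ in results).most_common(1)[0][0]


# Two paystubs with confidence scores for gross pay.
paystubs = [("4,250.00", 0.91), ("4,520.00", 0.78)]
print(resolve(paystubs))

# Three paystubs without confidence scores.
unscored = [("4,250.00", None), ("4,520.00", None), ("4,250.00", None)]
print(resolve(unscored))
```

In both sample cases the chosen gross pay is "4,250.00": first because it has the higher confidence, then because it occurs most often.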
Viewing packet results
When you select the Cross-class tab in the editing panel, an aggregated table view shows cross-class results across packets. From here, drill into individual packets to view class-level results for documents within that packet.
