Extracting data from packets | Instabase AI Hub Documentation

Enterprise

To process data across related document types, extend your project schema to manage packets.

Packets are sets of related documents processed as a unit, such as a loan application with supporting bank statements and tax documents. While each document type has specific data points, they also share common information like applicant name that you can consolidate at the packet level.

Packet-level data consolidation is achieved with cross-class fields, which leverage fields from existing classes to provide overarching results. Projects with cross-class fields become packet-processing apps. Conversely, removing all cross-class fields reverts the project to a standard automation app.

Understanding packets

When you add cross-class fields to a project, existing project files are automatically organized into packets based on upload batch. You can manually reorganize packets as needed.

During project development, include about five representative packets in your project so you can see how your app performs across a variety of inputs.

In production, upload structure dictates packet composition. Each upload, whether a file, folder, or email, is treated as a packet and processed in its own run. Plan upstream deployment integrations with this constraint in mind.
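Because each upload becomes one packet, an upstream integration usually needs to group related files before uploading them. The following is a minimal sketch of that grouping step only. It assumes a hypothetical file naming convention of `<application-id>_<document>.pdf`; the function name and the naming convention are illustrative and not part of AI Hub.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_files_into_packets(file_names):
    """Group file names by a hypothetical '<application-id>_' prefix.

    Each resulting group would then be uploaded together (for example,
    as one folder) so that it is processed as a single packet.
    """
    packets = defaultdict(list)
    for name in file_names:
        base_name = PurePosixPath(name).name
        # Assumption: everything before the first underscore is the application ID.
        application_id, _, _ = base_name.partition("_")
        packets[application_id].append(name)
    return dict(packets)


files = [
    "A100_application.pdf",
    "A100_bank_statement.pdf",
    "A200_application.pdf",
    "A100_tax_return.pdf",
]
packets = group_files_into_packets(files)
```

Here `packets` maps each application ID to the files that belong in the same upload, which keeps documents from different applicants out of the same packet.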

Reorganizing packets

As you develop packet-processing apps, you can manually reorganize project files to reflect a different packet structure if needed.

From the Cross-class tab, in the aggregated table view, select Reorganize packets.
In the Reorganize packets window, drag documents to the appropriate packet or click Add packet to create new packets as needed, then click Save.

Saving automatically triggers re-extraction of cross-class fields to provide updated results that reflect the new packet organization.

Creating cross-class fields

Create cross-class fields for data points that need to be consolidated across classes.

Before you begin

You must create classes and fields before you can add cross-class fields, because cross-class fields build on class fields. For details, see Extracting data from documents.

From the Cross-class tab, click Add cross-class field.
Enter a field name and select a cross-class field type, then specify required details based on the field type.
When you’re satisfied with the result, click ← to exit the cross-class field editing panel and continue adding fields.

Cross-class field types

Choose the cross-class field type that matches how you want to consolidate data across classes.

Ranked — Used to select the best result among multiple input fields based on criteria you specify: first valid field, highest field confidence, or highest OCR confidence. If your ranking logic specifies First valid field, select input fields in prioritized order. For confidence-based ranking logic, the order of input fields doesn’t matter.
Derived — Used to generate values based on class fields. In the Prompt, reference fields by field name: either type the field name or select it from the dropdown. For example, Generate a risk score by combining Tax document: Annual income, Bank statement: Average balance, and Application: Loan amount using the formula: (income + balance) / loan amount.

When referencing table or list extraction fields, derived fields try to match the input format. For consistent results, especially when combining different field types, consider normalizing tables or lists as text first.

Custom function — Used to compute values or import third-party data with a custom Python function. For more details, see Cross-class custom function fields.
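To make the three Ranked criteria concrete, the selection logic can be pictured roughly as follows. This is an illustrative sketch, not the product's implementation: the candidate dictionary keys (`value`, `field_confidence`, `ocr_confidence`), the logic names, and the rule that an empty value counts as invalid are all assumptions made for the example.

```python
def pick_ranked_value(candidates, logic="first_valid"):
    """Pick one value from a list of candidate field results.

    Each candidate is a dict with hypothetical keys 'value',
    'field_confidence', and 'ocr_confidence'.
    """
    # Assumption: a candidate is "valid" when its value is non-empty.
    valid = [c for c in candidates if c.get("value") not in (None, "")]
    if not valid:
        return None
    if logic == "first_valid":
        # Order matters here: candidates are listed by priority.
        return valid[0]["value"]
    if logic == "highest_field_confidence":
        return max(valid, key=lambda c: c.get("field_confidence", 0.0))["value"]
    if logic == "highest_ocr_confidence":
        return max(valid, key=lambda c: c.get("ocr_confidence", 0.0))["value"]
    raise ValueError(f"Unknown ranking logic: {logic}")


candidates = [
    {"value": "", "field_confidence": 0.99, "ocr_confidence": 0.99},
    {"value": "Jane Q. Doe", "field_confidence": 0.82, "ocr_confidence": 0.95},
    {"value": "Jane Doe", "field_confidence": 0.91, "ocr_confidence": 0.88},
]

first_valid = pick_ranked_value(candidates, "first_valid")
by_field_conf = pick_ranked_value(candidates, "highest_field_confidence")
by_ocr_conf = pick_ranked_value(candidates, "highest_ocr_confidence")
```

With first valid field, reordering the candidates changes the result. With either confidence-based option, the same candidate wins regardless of order.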

Cross-class custom function fields

The cross-class custom function field type lets you use a Python function to consolidate or compute values across multiple document classes within a packet.

For example, you might use a cross-class custom function to prioritize applicant name based on multiple sources:

def consolidate_applicant_name(class_fields, context, keys):
    # Access class field values through the class_fields dictionary
    bank_name = class_fields['Bank Statement']['Account Holder']['value']
    license_name = class_fields['Driver License']['Full Name']['value']

    # Prefer the bank statement name, then fall back to the license name
    if bank_name and bank_name.strip():
        return bank_name
    elif license_name:
        return license_name
    else:
        return "No applicant name found"

Cross-class custom function fields accept these parameters:

Parameter	Required?	Description
`context`	Required	Stores metadata about the packet.
`context['class_fields']`	Optional	When writing cross-class custom functions in automation projects, click Add argument to select class fields to reference in the function. Access using `class_fields['Class Name']['Field Name']['value']`.
`context['document_text']`	Optional	Retrieves the entire text of all documents in the packet.
`context['file_path']`	Optional	Retrieves the path to the uploaded packet.
`keys`	Optional	Access custom variables and organization secrets. Use `keys['custom']['<key-name>']` for custom keys and `keys['secret']['<key-name>']` for secret keys.
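The sketch below shows how these parameters might be used together, and how a function can be exercised locally with stand-in dictionaries before it is pasted into the editor. It reuses the three-argument signature from the applicant name example. The class name `Application`, the field name `Loan amount`, the custom key `review-threshold`, and the returned labels are hypothetical, and the stand-in dictionaries only mimic the access patterns listed in the table.

```python
def needs_manual_review(class_fields, context, keys):
    # Hypothetical class and field names, accessed as described in the table.
    raw_amount = class_fields["Application"]["Loan amount"]["value"]
    # Hypothetical custom key holding a numeric threshold as text.
    threshold = float(keys["custom"]["review-threshold"])

    try:
        amount = float(str(raw_amount).replace("$", "").replace(",", ""))
    except ValueError:
        # An unparseable amount is routed to review rather than guessed at.
        return "Review"

    # document_text holds the text of all documents in the packet.
    mentions_cosigner = "co-signer" in context["document_text"].lower()
    return "Review" if amount > threshold or mentions_cosigner else "No review"


# Stand-in inputs for local testing only; real values are supplied at run time.
class_fields = {"Application": {"Loan amount": {"value": "$250,000"}}}
context = {"document_text": "Loan application for Jane Doe.", "file_path": "packets/A100"}
keys = {"custom": {"review-threshold": "200000"}, "secret": {}}

result = needs_manual_review(class_fields, context, keys)
```

Keeping thresholds in `keys['custom']` rather than in the function body lets the value change without editing the function.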

Return type

When defining a cross-class custom function field, you can set the return type to Text choices to limit valid returned values to a specified set. In the custom function field editor, use the return type dropdown to switch from Text to Text choices, then add the allowed values. Enter the values as a comma-separated list, or use Import as CSV to upload a CSV file with one value per cell (up to 1,000 options).

The custom function must return one of those values; otherwise, a validation error is shown. In human review, fields with text choices display a dropdown of valid options from which reviewers can select.
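One way to avoid validation errors is to write the function so that every code path returns one of the allowed values. The following sketch assumes a Text choices field configured with the values Low, Medium, and High. The class name, field name, and thresholds are hypothetical.

```python
# Assumed to match the values configured for the Text choices return type.
ALLOWED_CHOICES = ("Low", "Medium", "High")


def risk_band(class_fields, context, keys):
    # Hypothetical class and field names.
    raw_score = class_fields["Application"]["Risk score"]["value"]
    try:
        score = float(raw_score)
    except (TypeError, ValueError):
        # Fall back to an allowed value instead of returning free text.
        return "High"

    # Hypothetical thresholds; a higher score is treated as lower risk.
    if score >= 2.0:
        return "Low"
    if score >= 1.0:
        return "Medium"
    return "High"


sample = {"Application": {"Risk score": {"value": "2.5"}}}
band = risk_band(sample, {}, {})
```

Because the fallback branch also returns a configured choice, a missing or malformed score still produces a value that passes validation and that reviewers can correct from the dropdown.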

For additional guidance about creating and using custom functions, see Writing custom functions.

Handling multiple documents per class

When a packet contains multiple documents of the same class, the system uses the result with the highest field confidence, or the most frequent value if confidence scores aren’t available.

For example, if a packet contains two paystubs, the system selects the gross pay with the highest confidence score for cross-class field calculations.
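The selection rule described above can be sketched as follows. This is an illustration of the stated behavior, not the system's actual code. The result dictionaries and their `value` and `confidence` keys are assumptions, as is the choice to use confidence whenever at least one result has a score.

```python
from collections import Counter


def select_value(results):
    """Choose one value from several documents of the same class.

    Each result is a dict with a 'value' and an optional 'confidence'.
    """
    if not results:
        return None
    scored = [r for r in results if r.get("confidence") is not None]
    if scored:
        # Prefer the result with the highest field confidence.
        return max(scored, key=lambda r: r["confidence"])["value"]
    # No confidence scores available: fall back to the most frequent value.
    return Counter(r["value"] for r in results).most_common(1)[0][0]


paystubs = [
    {"value": "4,200.00", "confidence": 0.87},
    {"value": "4,250.00", "confidence": 0.94},
]
gross_pay = select_value(paystubs)

unscored = [{"value": "4,200.00"}, {"value": "4,250.00"}, {"value": "4,200.00"}]
gross_pay_by_frequency = select_value(unscored)
```

In the first case the second paystub wins on confidence. In the second case no scores exist, so the value that appears twice is selected.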

Viewing packet results

When you select the Cross-class tab in the editing panel, an aggregated table view shows cross-class results across packets. From here, drill into individual packets to view class-level results for documents within that packet.