Building apps
In AI Hub Build, you can create a custom document understanding app in just a few steps. Upload sample documents, tell Build what document types and data points you want to identify, then publish your app for use.
Using the Build interface
Understanding the Build interface can help you efficiently build document understanding apps.
The default Build interface includes three main panels:
-
The document list on the left displays all documents included in your project. If your project includes classification, documents are automatically grouped by class, but you can use the filter, sort, and group icons to change the list display.
-
The document view in the center displays the document selected in the document list. The view pane includes a toolbar, auto-hidden by default, with controls for viewing the selected document, including image or text-only views, keyword search, and page selection.
-
The editing panel on the right displays classes and fields in your project, with results for the document selected in the document list. In projects that belong to organization accounts, a validation tab lets you view, create, and edit validation rules across your project, while class and field controls are displayed in a separate schema tab.
Working with projects
A project is a collection of files and artifacts used to create a Build app. Projects correspond to your unique document understanding workflow, with document types and data points that address your specific use case.
You can view and modify project settings by clicking the gear icon at the top right in any project. Project settings include digitization options and, in organization accounts, file-splitting options that impact how your files are processed.
Digitizing documents
When you upload files to a Build project, they’re digitized, or converted to machine-readable text, according to your project settings.
By default, page rotation, skew, and warp are corrected, and signatures and barcodes are detected.
As you work with your project, you might need to modify digitization settings if data isn’t processed accurately. In your project’s digitization settings, you can preview how changes impact machine-readable text with up to three documents from your project. The before-and-after preview shows a heat map overlay using a red-to-green gradient to represent OCR confidence for each word. Additionally, you can see a summary confidence score for the entire document.
Any time you change digitization settings, all files in your project are redigitized.
Choose the digitization settings suitable for your documents and AI Hub subscription. For details about OCR support for various languages, see Supported languages.
-
Tables — Provides better results when extracting information from tables. For organization members, this option must be enabled to use the table data type in extraction prompts.
-
Checkboxes — Provides better results when extracting information from checkboxes.
Table and checkbox recognition change the OCR processor used, which slows digitization slightly and might impact accuracy, particularly with less common languages. Enable tables and checkboxes only if needed.
-
Non-Latin characters Commercial & Enterprise — Enables support for many common languages that use writing systems other than the Latin alphabet (a, b, c…). Support for non-Latin characters is offered in standard and advanced language sets. For details, see Supported languages.
-
Process spreadsheets natively — Processes Excel spreadsheets in their native file format instead of converting to PDF. This option offers better results for wide tables, but doesn’t support embedded objects or source highlighting in results.
-
Treat files as images Commercial & Enterprise — Digitizes files as they appear, discarding any embedded machine-readable text. This option often provides better results for documents that use non-Latin characters, handwritten text, and visually complex documents.
-
Pages Commercial & Enterprise — Limits digitization to specified pages.
Collaborating on projects
You can collaborate on Build projects with other members of a shared workspace by creating your project in that shared workspace. Collaboration in Build uses a combination of exclusive edit access and a timed lock system to mitigate the risk of conflicts or another member overwriting your changes.
Exclusive edit access means that only one user can edit a project at a time. Examples of actions that initiate edit access include clicking Save, uploading a document, and renaming a field or the project. When you hold edit access, a banner displays at the top of the page.
The timed lock system means you can only retain exclusive edit access while actively editing the project. After five consecutive minutes of inactivity, another member of the shared workspace can take over exclusive edit access and initiate their own timed lock. If another user takes over edit access, any unsaved changes made before the five-minute period of inactivity are lost.
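The interaction between exclusive edit access and the timed lock can be pictured with a small model. This is an illustrative sketch only, not Build's implementation: the `EditLock` class, its method, and the use of plain timestamps in seconds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Five minutes of inactivity, as described in the text above.
LOCK_TIMEOUT_SECONDS = 5 * 60


@dataclass
class EditLock:
    """Hypothetical model of exclusive edit access with a timed lock."""

    holder: Optional[str] = None
    last_activity: float = 0.0

    def try_edit(self, user: str, now: float) -> bool:
        """Return True if `user` holds, or can take over, edit access at `now`."""
        idle = now - self.last_activity
        if self.holder in (None, user) or idle >= LOCK_TIMEOUT_SECONDS:
            # The lock is free, already held by this user, or the previous
            # holder has been inactive long enough for a takeover.
            self.holder = user
            self.last_activity = now
            return True
        # Someone else is still actively editing.
        return False
```

In this model, each successful edit resets the inactivity timer, so a second user can take over only after the current holder has been idle for the full five minutes.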
Creating a project
To get started building a custom app, create a project and add files.
Organization members can fast-track project development by copying classes and fields from another project in their organization.
Before you begin
You must have a set of files that represent the types of documents you want to process. Five or so files of each type is a good start.
-
Commercial & Enterprise In the sidebar under your organization name, verify the workspace in which to create the project.
-
Click Create Project or the + icon, then select one of these options based on your AI Hub subscription and project requirements:
-
Blank project — Starts your project in a blank state.
-
Use existing schema Commercial & Enterprise — Copies all classes and fields from another project in your organization. If you select this option, choose the project with the schema you want to duplicate and click Create project.
-
Select supported files to start building with.
Before your files are uploaded, you’re prompted to enable tables and checkboxes if needed.
After confirming your visual object selections, files are uploaded and digitized according to your project settings. Documents are added to the document list as they finish processing.
What's next
If your project is blank, your next step is creating classes and fields. If you copied an existing project schema, you can add more classes and fields as needed, or modify what was imported.
Migrating projects
Community users can migrate projects to their organization account if both accounts share a user ID. Migration is a one-way operation: a copy of the project isn’t retained in your community account.
-
In workspaces, on the Create tab, locate the project you want to migrate.
-
Click the overflow icon, select Migrate project, then select the organization.
-
Select the organization workspace that you want to move the project to, then click Migrate project.
Moving projects
Any member of a workspace can move a project from one workspace to another. After moving a project, only members of the new workspace can access and edit the project.
-
In workspaces, on the Create tab, locate the project you want to move.
-
Click the overflow icon, then select Move to another workspace.
-
Select the workspace that you want to move the project to, then click Move.
Copying projects
Any member of a workspace can copy a project from one workspace to another.
Copying a project duplicates project settings, files, classes, fields, cleaning, and validation, but results aren’t copied. When you open a copied project for the first time, classification and extraction results are generated.
-
In workspaces, on the Create tab, locate the project you want to copy.
-
Click the overflow icon, then select Copy project.
-
Select the workspace that you want to copy the project to, then click Copy.
Creating classes and fields
To process documents, you must specify which data points, or fields, you want to extract. If your project includes different document types, like a mix of passports and driver’s licenses, create a class for each document type and specify fields for each class.
You can create up to 250 classes per project, and up to 100 fields per class.
Creating classes
If your project includes different document types, start by creating a class for each document type. You can then specify a different set of fields for each class.
Organization members can import prebuilt classes from a library of established schemas, such as paystubs, invoices, bank statements, and utility bills.
In projects with classification, a default class called other is assigned to documents that can’t be classified. You can’t delete or modify this class.
-
In the editing panel, on the Schema tab, click the Create classes icon, then select one of these options based on your AI Hub subscription and project requirements:
-
Create classes — Lets you create a custom class without any fields. If you select this option, enter a succinct name for your document type, then click ← to exit the class editing panel.
-
Browse prebuilt classes Commercial & Enterprise — Lets you add common document types and their associated fields based on a library of available schemas. If you select this option, choose the prebuilt classes that you want to add and click Add to project.
-
Use the Create classes icon to add more classes as needed.
-
When you’re done creating classes, click Classify documents.
Commercial & Enterprise If your project includes multipage files, you’re prompted to enable splitting files that include multiple document types. After classifying your documents, page ranges indicate how files are split.
Build assigns classes to your documents and groups documents by class in the document list. Any documents that can’t be classified are assigned the other class.
-
Verify classification. If documents weren’t classified as expected, edit classes to improve your results.
Enterprise users can reference classification confidence scores to help verify classification. Classification confidence scores are displayed for the selected document in the class editing panel.
-
In a class that wasn’t identified accurately, click the overflow icon, then select Edit class.
-
Enter a description to help the model more accurately identify documents in the class, then click ← to exit the class editing panel.
Effective descriptions include unique identifying details about a document class. Use details related to text in the documents, rather than visual elements like color, which the model can’t “see.” For example, use a description like 'Documents include the phrase “Criteria for use”' or 'Drug name, printed in all capital letters, contains “statin”'. As a best practice, limit descriptions to 1,000 characters (4,000 maximum).
-
Use the overflow icon to edit more classes as needed.
-
When you’re done editing classes, click Classify documents.
Creating fields
Create fields for each of the data points you want to identify.
-
In the editing panel, on the Schema tab, click Add field.
-
Enter a field name or select a suggested field name, then press Enter.
Build extracts data based on field name alone and displays the result.
-
Do one of the following, based on whether your result is accurate:
-
Accurate result — Click ← to exit the field editing panel and continue adding fields.
-
Inaccurate result — Edit the field. When you’re done editing, click ← to exit the field editing panel and continue adding fields.
Editing fields
If field name alone doesn’t return the results you expect, you can edit fields to give Build more guidance.
Access the field editor for an existing field by hovering over the field and clicking the edit icon.
In the field editor, first choose the field type appropriate for the data you want to identify.
-
Text extraction — Used to extract a string of text or numbers, such as address, account balance, or filing status.
-
Table extraction Commercial & Enterprise — Used to extract tables. For more details, see Extracting tables.
-
List extraction Commercial & Enterprise — Used to extract a list of items, such as deposits on a banking summary, billing codes on a medical claim form, properties on a broker submission, or items on a receipt. If there are additional data points associated with each item that you want to identify, such as price and SKU for receipt items, you can add an attribute for up to 30 data points. For best results, limit attributes to 10.
-
Document reasoning — Used to generate results that aren’t explicitly found in the document, but can be deduced, summarized, or calculated.
Unless you specify otherwise in your prompt, document reasoning fields assume the current date and time.
-
Derived field Commercial & Enterprise — Used to generate values based on preceding fields in the class. Reference fields by field name: either type the field name or select it from the dropdown. For example, Identify the state in Customer address. If necessary, you can reorder fields to enable referencing.
When referencing table or list extraction fields, the derived field tries to match their format. For consistent results, especially when combining different field types, consider normalizing tables or lists as text first.
-
Custom function Commercial & Enterprise — Used to compute values or import third-party data with a custom Python function. For more details, see Custom function fields.
For extraction and reasoning fields, if necessary, use Description or Natural language prompt to add details about the field. As a best practice, keep field and attribute names under 48 characters and use a description or prompt for longer content up to 1,000 characters (4,000 maximum).
In reasoning fields, you can use the Enhance prompt option to make improvements to your prompt. Prompt enhancement relies on the selected model to optimize your prompt, checking for clarity, concision, and coherence while eliminating contradictions and redundancies.
For extraction and reasoning fields, organization members can change the model from standard to advanced using the Select model icon.
When you’re done editing a field, click Run to see results and further refine your edits if needed.
Extracting objects
You can extract objects such as tables, checkboxes, signatures, and barcodes using specific settings and prompts.
Extracting tables
The method for extracting tables differs for community users and organization members.
Community To extract tables as a community user, use a reasoning prompt and describe the table extraction you want to perform in the prompt.
You can extract multipage tables and perform some table manipulation with reasoning prompts; however, this method requires more trial and error than the commercially supported method.
Here are some examples of reasoning prompts for tables:
-
Extract transactions as a Markdown table
-
Extract transactions as JSON
-
Extract all tables with columns Date, Description, Debit, Credit
-
Extract transactions and filter for amounts greater than $1,000
Commercial & Enterprise To extract tables as a commercial or enterprise user, use an extraction prompt with the table data type and describe the table extraction you want to perform in the prompt. This method extracts tables with a high degree of accuracy and lets you manipulate tables in various ways.
Here are some examples of extraction prompts for tables:
-
Extract tables with columns Date, Description, Debit, Credit
-
Extract transactions and filter for amounts greater than $1,000
-
Extract transactions and return results for 01 April through 15 April
-
Extract transactions and sort amounts from smallest to largest
-
Extract transactions and add a column Flagged with values set to Yes if the debit is greater than $70
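To make the intent of the filter, sort, and flag prompts above concrete, the following sketch applies the equivalent operations to a small table in plain Python. It only illustrates the kind of result each prompt asks for; it isn't how Build processes prompts. The sample rows, the helper names, and the choice of No as the value for unflagged rows are all assumptions.

```python
# Hypothetical sample rows standing in for an extracted transactions table.
transactions = [
    {"Date": "2024-04-02", "Description": "Rent", "Debit": 1200.00},
    {"Date": "2024-04-10", "Description": "Groceries", "Debit": 85.50},
    {"Date": "2024-04-20", "Description": "Coffee", "Debit": 4.75},
]


def filter_over(rows, amount):
    """Keep rows whose debit is greater than `amount`."""
    return [row for row in rows if row["Debit"] > amount]


def sort_by_amount(rows):
    """Sort rows by debit, smallest to largest."""
    return sorted(rows, key=lambda row: row["Debit"])


def add_flag(rows, threshold):
    """Add a Flagged column: Yes if the debit exceeds `threshold`, else No (assumed)."""
    return [
        {**row, "Flagged": "Yes" if row["Debit"] > threshold else "No"}
        for row in rows
    ]
```

For example, `add_flag(transactions, 70)` marks the Rent and Groceries rows as Yes and the Coffee row as No.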
Extracting checkboxes
Checkboxes can be extracted with either an extraction or reasoning prompt.
-
For a group of checkboxes with a label, such as the Filing Status field on a tax form, use the label for your field name.
-
For a standalone checkbox, use a question that indicates whether the checkbox is ticked. For example, Is the filer claiming capital gains or losses?
Extracting signatures
You can extract information about signatures, including whether a document is signed, who the signer was, and the signature date. Extraction of signature images isn’t available.
Signature information can be extracted with either an extraction or reasoning prompt.
Often, a field name like signatory name, signatory title, or signature date is adequate. If field name alone fails to extract the data you want, edit the field and provide a description or natural language prompt. For example:
-
Extract all signatures
-
Return yes if this document is signed
Extracting barcodes
You can extract information about barcodes, including their presence, quantity, and, if printed, their associated numeric values.
Barcode information can be extracted with either an extraction or reasoning prompt.
Often, a field name like barcode value is adequate. If field name alone fails to extract the data you want, edit the field and provide a description or natural language prompt. For example:
-
Return all barcode values
-
Return yes if this document contains a barcode
Custom function fields
The custom function field type lets you use a Python function to compute values or import data to your project schema.
For example, you might use a custom function to calculate total invoice amount using existing subtotal and tax rate fields:
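A minimal sketch of such a function, assuming the subtotal and tax rate fields arrive as strings such as "$1,250.00" and "8.5%". The function name, parameter names, and input formats are hypothetical; check the signature that Build expects before adapting it.

```python
from decimal import Decimal, ROUND_HALF_UP


def total_invoice_amount(subtotal: str, tax_rate: str) -> str:
    """Return subtotal plus tax as a string with two decimal places.

    Hypothetical inputs: `subtotal` like "$1,250.00", `tax_rate` like "8.5%".
    """
    # Strip currency formatting so the value parses as a decimal number.
    amount = Decimal(subtotal.replace("$", "").replace(",", "").strip())
    # Convert a percentage such as "8.5%" to the fraction 0.085.
    rate = Decimal(tax_rate.replace("%", "").strip()) / Decimal(100)
    total = (amount * (1 + rate)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{total:.2f}"
```

Using `Decimal` rather than `float` avoids binary rounding surprises in currency arithmetic.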
Custom function fields accept these parameters:
For additional guidance about custom functions, see Writing custom functions.
Reordering fields
To change the order of fields in the field editor, use the up and down arrows that display when you hover over a field.
Reordering fields can be necessary when creating derived fields, which can reference fields that precede them in the field editor. Additionally, reordering fields can be helpful to speed up reviews or support downstream integrations, because fields are displayed in processed results in the same order as in the field editor.
Viewing results across documents
To quickly scan or compare results, click the Results table icon.
The results table corresponds to the current view in the editing panel, so the results you see change depending on your current task.
Cleaning results
If results for a given field aren’t formatted as needed, you can clean data using quick clean options. For organization members, additional cleaning options include a natural language prompt or a custom cleaning function.
Quick clean
Quick clean options include:
-
Changing character casing to all uppercase, all lowercase, or sentence case.
-
Removing characters by specifying characters to remove, with no comma separator.
-
Reformatting date by selecting from available formatting options.
In numeric-only dates like 06/01/2024, use the Input field to specify whether the original value lists the month or the day first.
Quick clean options use Python functions to reformat data. There is no model processing and no unit cost.
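The quick clean options behave like simple string and date transformations. The sketch below shows roughly equivalent operations in plain Python; the function names, the treatment of sentence case, and the default output format are assumptions, not Build's actual functions.

```python
from datetime import datetime


def change_case(value: str, mode: str) -> str:
    """Convert text to upper, lower, or sentence case."""
    if mode == "upper":
        return value.upper()
    if mode == "lower":
        return value.lower()
    if mode == "sentence":
        # Assumed meaning: capitalize the first character, lowercase the rest.
        return value[:1].upper() + value[1:].lower()
    raise ValueError(f"Unknown mode: {mode!r}")


def remove_characters(value: str, chars: str) -> str:
    """Remove every character listed in `chars` (no separator between them)."""
    return value.translate(str.maketrans("", "", chars))


def reformat_date(value: str, day_first: bool, output_format: str = "%Y-%m-%d") -> str:
    """Reformat a numeric date such as 06/01/2024, given which part comes first."""
    input_format = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(value, input_format).strftime(output_format)
```

The `day_first` flag plays the role of the Input field: the same value, 06/01/2024, resolves to June 1 when the month comes first and to January 6 when the day comes first.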
Cleaning prompt
If quick clean options don’t work for your data, you can instead use a natural language prompt to clean the output. Prompt-based refinement takes the raw output of your extraction or reasoning prompt and applies whatever instructions you specify in the clean prompt. Effective clean prompts are clear, concise, and detailed.
Cleaning function
For advanced cleaning, you can write a custom cleaning function in Python.
For example, you might use a cleaning function to calculate wages minus income tax from a W-2.
Cleaning functions accept these parameters:
Cleaning functions can return any value. The value is converted to a string when it’s passed to subsequent refinement lines or validation rules. If the cleaning function encounters issues, it must raise an exception.
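As a rough sketch of the W-2 example, the function below subtracts income tax from wages and raises an exception when either value can't be parsed. The function and parameter names and the input formats are hypothetical; the parameters Build actually passes to cleaning functions may differ.

```python
from decimal import Decimal, InvalidOperation


def _to_decimal(value: str) -> Decimal:
    """Parse a currency string such as "$52,000.00" into a Decimal."""
    try:
        return Decimal(value.replace("$", "").replace(",", "").strip())
    except InvalidOperation as exc:
        # Signal the problem with an exception, as cleaning functions must.
        raise ValueError(f"Cannot parse amount: {value!r}") from exc


def wages_minus_income_tax(wages: str, income_tax: str) -> Decimal:
    """Return wages minus income tax withheld (hypothetical field inputs)."""
    return _to_decimal(wages) - _to_decimal(income_tax)
```

The sketch returns a `Decimal` because, per the text above, the returned value is converted to a string before it reaches later refinement lines or validation rules.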
For additional guidance about custom functions, see Writing custom functions.
Confidence scores
Confidence scores are percentage values that indicate the level of certainty in results, including classification, digitization (OCR), or field results. Confidence scores are calculated by the model or OCR processor.
Field and OCR confidence scores are displayed in the field editor.
For enterprise users, classification confidence scores are displayed in the class editor.
Higher percentages suggest greater confidence. You can use confidence scores to help you fine-tune prompts, or to establish validation rules.
Non-OCR confidence is calculated using log probabilities, which provide a mathematical measure of how likely a result is. In internal testing, this method proves reliable when compared to benchmarks.
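For intuition, a log probability converts to an ordinary probability by exponentiation. The sketch below shows that conversion, plus one possible way to combine several token-level values. How Build aggregates log probabilities isn't described here, so the joint-probability approach and both function names are assumptions.

```python
import math
from typing import Sequence


def logprob_to_percent(logprob: float) -> float:
    """Convert a single log probability to a percentage."""
    return math.exp(logprob) * 100.0


def joint_confidence_percent(token_logprobs: Sequence[float]) -> float:
    """Assumed aggregation: joint probability of all tokens, as a percentage."""
    # Summing log probabilities is equivalent to multiplying probabilities.
    return math.exp(sum(token_logprobs)) * 100.0
```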
Validating results
Organization members can validate results based on confidence scores, a prompt, or a custom validation function. In production, fields that fail validation are flagged for human review.
Classification confidence
Classification confidence indicates the model’s certainty in predicting the class of a document.
You can set a classification confidence threshold across all classes in your project, or you can set thresholds that apply to individual classes.
If multiple confidence validation rules are configured for a class, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a class threshold of 95%, the 95% confidence threshold is used.
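The stricter-rule behavior amounts to taking the highest configured threshold. A minimal sketch, assuming thresholds are percentages and that a score equal to the threshold passes (the text doesn't specify the comparison); both function names are hypothetical.

```python
from typing import Optional


def effective_threshold(
    project_threshold: Optional[float], specific_threshold: Optional[float]
) -> Optional[float]:
    """Return the stricter (higher) of the configured thresholds, if any."""
    configured = [t for t in (project_threshold, specific_threshold) if t is not None]
    return max(configured) if configured else None


def passes_confidence(
    confidence: float,
    project_threshold: Optional[float] = None,
    specific_threshold: Optional[float] = None,
) -> bool:
    """Check a confidence score against the effective threshold."""
    threshold = effective_threshold(project_threshold, specific_threshold)
    # With no rule configured, nothing is flagged. Using >= is an assumption.
    return threshold is None or confidence >= threshold
```

With a project-wide threshold of 85 and a class threshold of 95, a document classified at 90 percent confidence fails validation in this model.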
-
In the editing panel, select the Validations tab.
-
Specify validation rules as needed.
To see how a validation rule performs across documents, click any rule in the Validations tab to see an aggregate view.
Field confidence
Field confidence indicates the model’s certainty in predicting results for a given field.
You can set a field confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.
If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.
-
In the editing panel, select the Validations tab.
-
Specify validation rules as needed.
To see how a validation rule performs across documents, click any rule in the Validations tab to see an aggregate view.
OCR confidence
OCR confidence indicates the OCR processor’s certainty in digitization accuracy. For instance, a high OCR confidence for Amy Cooper in an account holder field means the OCR processor is confident it read those letters correctly, not that Amy Cooper is actually the account holder.
In multi-word results, the lowest confidence score across all words is used. For example, if Amy returns a confidence score of 95 percent, but Cooper returns a confidence score of 65 percent, the reported OCR confidence score for the account holder field is 65 percent.
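The lowest-score rule can be expressed in one line of Python. The function name and the dictionary input shape below are hypothetical.

```python
from typing import Mapping


def field_ocr_confidence(word_confidences: Mapping[str, float]) -> float:
    """Return the lowest per-word OCR confidence for a multi-word result."""
    return min(word_confidences.values())
```

For the example above, `field_ocr_confidence({"Amy": 95, "Cooper": 65})` returns 65.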
You can set an OCR confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.
If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.
-
In the editing panel, select the Validations tab.
-
Specify validation rules as needed.
To see how a validation rule performs across documents, click any rule in the Validations tab to see an aggregate view.
Validation prompt
Validation prompts let you tell Build how you want to validate a field in your own words. Your prompt is used to generate a custom validation rule written in Python code.
You can convert a prompt-based rule to a custom function, but the conversion is permanent: after converting, you can’t edit the rule using a prompt.
-
In the editing panel, select the Validations tab.
-
In the class that you want to add a validation rule for, click + Add rule, then select Custom rule > Validation prompt.
-
In the Prompt, describe how you want to validate fields, using autocomplete to select field names.
-
Click Run.
Your prompt is used to generate a custom validation rule in Python code. Rule generation can take several minutes.
When complete, you can view the code by clicking Show generated code.
-
If necessary, iterate on the rule using one of these methods.
-
Modify your prompt — Change your prompt as needed and click Run to regenerate the rule.
-
Convert and edit the rule — While viewing the generated code, hover over the code and click Convert to custom function. After conversion, edit the custom function as needed.
Validation function
For advanced validation, you can write a custom validation function in Python.
For example, you might use a validation function to check that a date is within a certain range.
Validation functions accept these parameters:
Validation functions must return None if the validation rule passes, and an error string if the validation rule fails.
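A minimal sketch of the date-range example, following the return convention above: None when the rule passes, an error string when it fails. The function name, parameters, default range, and ISO date input format are hypothetical; the parameters Build actually passes to validation functions may differ.

```python
from datetime import date, datetime
from typing import Optional


def validate_date_in_range(
    value: str,
    start: date = date(2024, 1, 1),
    end: date = date(2024, 12, 31),
) -> Optional[str]:
    """Return None if `value` falls within [start, end], else an error string."""
    try:
        # Assumes the field was already cleaned to YYYY-MM-DD.
        parsed = datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return f"Could not parse date: {value!r}"
    if not (start <= parsed <= end):
        return (
            f"Date {parsed.isoformat()} is outside "
            f"{start.isoformat()} to {end.isoformat()}"
        )
    return None
```

In this sketch, an unparseable date is treated as a validation failure rather than an exception, so the field would be flagged for review instead of stopping the run.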
For additional guidance about custom functions, see Writing custom functions.
Creating your app
Creating an app lets you reuse, automate, and share your project functionality.
-
From your Build project, click Create app.
-
Confirm the name of your app and specify optional details, then click Next.
-
Name — By default, apps are assigned the same name as the corresponding Build project. You can change app name as needed when you create the app, but it can’t be changed later.
-
Description — Enter a description for the app to help users understand its purpose.
-
App icon — Upload an icon to represent your app. Icons can be up to 2 MB.
-
Sample files — Upload up to three representative files that users can use to preview app functionality.
-
Confirm version details, then click Create app.
-
Version — By default, the first version of your app is numbered 0.0.1.
-
Release state — By default, apps are created in the Production state, which gives access to other users you share the app with. To restrict access to only yourself, select Pre-production.
Your app is created and published to the Apps tab.
-
To version an app, make any needed changes to the corresponding project, then click Update. To change the release state of an existing version, from the Version history tab, hover over an app version and click the Edit release state icon.