Working with automation projects
Understanding the automation project interface can help you efficiently create automation apps.
The default interface includes these elements:
-
The project header displays your project's workspace and project name, with a dropdown indicator to access project settings. The processing mode indicator shows whether your project uses agent mode or legacy mode. Projects in legacy mode can update to agent mode using the dropdown. Workflow selectors at the right outline development phases from project to app to deployment. Use the selectors to navigate between stages and to review required steps for each stage.
-
The document list on the left displays all documents included in your project. If your project includes classification, documents are automatically grouped by class, but you can use the filter, sort, and group icons to change the list display.
-
The document view in the center displays the document selected in the document list. The view pane includes a toolbar, auto-hidden by default, with controls for viewing the selected document, including image or text-only views, keyword search, and page selection.
Use the icons in the Documents header to switch between viewing modes: single document, document grid, or results table.
-
The editing panel on the right displays classes and fields in your project, with results for the document selected in the document list. A validation tab lets you view, create, and edit validation rules across your project, while class and field controls are displayed in a separate schema tab.
Creating projects
A project is a collection of files and artifacts used to create an automation app. Projects correspond to your unique document understanding workflow, with document types and data points that address your specific use case.
You can create a project in several ways, depending on your AI Hub subscription and project requirements.
-
Create a blank project: Starts your project in a blank state.
-
Use an existing schema: Copies all classes and fields from another project in your organization.
-
Customize an existing app: Copies the source project for an enabled app, including all project settings, classes, fields, validations, and up to three sample files, if available.
Creating a blank project
To get started building a new app, create a blank automation project and add files.
Before you begin
You must have a set of files that represent the types of documents you want to process. Five or so files of each type is a good start.
-
In Workspaces, select the Create tab.
-
In the sidebar under your organization name, verify the workspace to create the project in.
-
Click Create > App > Blank project.
-
Select supported files to start building with.
Before your files are uploaded, you're prompted to enable tables and checkboxes if needed.
After confirming your visual object selections, files are uploaded and digitized according to your project settings. Documents are added to the document list as they finish processing.
Visual tutorial: Creating a blank project
What's next
Create classes and fields.
Using an existing schema
Commercial & Enterprise
Fast-track project development by copying classes and fields from another project in your organization.
Before you begin
You must have a set of files that represent the types of documents you want to process. Five or so files of each type is a good start.
-
In Workspaces, select the Create tab.
-
In the sidebar under your organization name, verify the workspace to create the project in.
-
Click Create > App > Use existing schema.
-
Select the project with the schema to duplicate and click Create project.
-
Select supported files to start building with.
Before your files are uploaded, you're prompted to enable tables and checkboxes if needed.
After confirming your visual object selections, files are uploaded and digitized according to your project settings. Documents are added to the document list as they finish processing.
Visual tutorial: Creating a project from an existing schema
Customizing an existing app
Commercial & Enterprise
Edit any customization-enabled app by saving it as a new project, so you can modify it to suit your needs.
Customization copies all project settings, classes, fields, validations, and up to three sample files, if available. Customized apps are independent copies that aren't automatically updated when the original app is modified.
Customizable apps display a Customize app option below the app name and owner.
-
From the Hub, open the app to customize.
-
Click Customize app.
-
Enter a project name and select the workspace to create the project in, then click Customize.
A notification appears indicating that your project is being created.
-
When the notification indicates that your project is ready, click Open project.
What's next
-
Consider adding more files to verify results with a larger or more representative sample.
-
Add classes and fields as needed, or edit existing fields.
Collaborating on projects
You can collaborate on automation projects with other members of a shared workspace by creating your project in that shared workspace. Collaboration uses a combination of exclusive edit access and a timed lock system to mitigate the risk of conflicts or another member overwriting your changes.
Exclusive edit access means that only one user can edit a project at a time. Examples of actions that initiate edit access include changing project settings, uploading a document, and renaming a field or the project. When you hold edit access, a banner displays at the top of the page.
The timed lock system means you can only retain exclusive edit access while actively editing the project. After five consecutive minutes of inactivity, another member of the shared workspace can take over exclusive edit access and initiate their own timed lock. If another user takes over edit access, any unsaved changes made before the five-minute period of inactivity are lost.
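The interaction between exclusive edit access and the timed lock can be sketched in a few lines of Python. This is an illustrative model only, not how AI Hub implements locking: the `EditLock` class, its `try_edit` method, and the injectable `clock` are hypothetical names, and the only detail taken from the description above is the five-minute inactivity window.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

# Five minutes of inactivity, as described in the text above.
LOCK_TIMEOUT_SECONDS = 5 * 60


@dataclass
class EditLock:
    """Hypothetical model of exclusive edit access with an inactivity timeout."""

    clock: Callable[[], float] = time.monotonic
    holder: Optional[str] = None
    last_activity: float = 0.0

    def try_edit(self, user: str) -> bool:
        """Record an edit attempt; return True if `user` holds the lock afterward."""
        now = self.clock()
        expired = (
            self.holder is not None
            and now - self.last_activity >= LOCK_TIMEOUT_SECONDS
        )
        if self.holder in (None, user) or expired:
            # Either nobody holds the lock, the holder is still editing,
            # or the previous holder has been inactive long enough.
            self.holder = user
            self.last_activity = now
            return True
        return False


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example runs instantly
    lock = EditLock(clock=lambda: now[0])
    print(lock.try_edit("ana"))  # True: nobody held the lock
    now[0] = 120.0
    print(lock.try_edit("ben"))  # False: ana edited two minutes ago
    now[0] = 300.0
    print(lock.try_edit("ben"))  # True: five minutes since ana's last edit
```

The sketch doesn't model unsaved changes. In practice, saving before you step away is the simplest way to avoid losing work if another member takes over.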
Relocating and duplicating projects
There are several options for moving or duplicating project work across workspaces, organizations, and environments.
-
Move: Change which workspace owns an existing project from the Workspaces list. The project leaves the source workspace; only members of the destination workspace can access it afterward.
-
Copy: Create a duplicate project in another workspace, including files and settings. Extraction results aren't copied; they're regenerated when you open the copy.
-
Export and import: Download a ZIP file from a project, then import it to a blank project in another workspace, organization, or environment. Use this for handoffs, for backups, or when the source project isn't available where you're working.
Moving projects
Any member of a workspace can move a project from one workspace to another. After moving a project, only members of the new workspace can access and edit the project.
-
In Workspaces, select the Create tab.
-
Locate the project to move, click its overflow icon, then select Move project.
-
Select the workspace to move the project to, then click Move.
Copying projects
Any member of a workspace can copy a project from one workspace to another.
Copying a project duplicates project settings, files, classes, fields, cleaning, and validation, but results aren't copied. When you open a copied project for the first time, classification and extraction results are generated.
-
In Workspaces, select the Create tab.
-
Locate the project to copy, click its overflow icon, then select Copy project.
-
Select the workspace to copy the project to, then click Copy.
Exporting and importing projects
Export and import projects to reuse work across workspaces, organizations, and environments, or to create backups. The exported ZIP file contains the project details, schema, validation rules, and custom functions, including shared functions if the project uses them. Sample files or previously uploaded files aren't included.
-
In Workspaces, select the Create tab and open the project to export.
-
In the project header, click the dropdown indicator next to the project name.
-
Select Export project.
-
Click Export, then save the ZIP file when prompted.
-
Navigate to the target workspace.
-
Create or open a blank project.
-
In the editing panel, click Import project.
-
Select the exported ZIP file and complete the import flow.
After import, add documents and run processing as you would for any new project.
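Before importing an exported ZIP file, it can be useful to confirm that the archive is intact and to see what it contains. The following is a minimal sketch using Python's standard `zipfile` module. The function name `list_export_contents` is hypothetical, and the sketch makes no assumption about the file names or layout inside an AI Hub export; it only lists whatever entries are present.

```python
import io
import zipfile
from typing import BinaryIO, List, Union


def list_export_contents(source: Union[str, BinaryIO]) -> List[str]:
    """Return the sorted entry names of a ZIP archive after an integrity check."""
    with zipfile.ZipFile(source) as archive:
        corrupt = archive.testzip()  # name of the first bad entry, or None
        if corrupt is not None:
            raise ValueError(f"Corrupt entry in archive: {corrupt}")
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Build a tiny in-memory archive as a stand-in for a real export.
    # The entry name below is invented for the example.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("placeholder.json", "{}")
    buffer.seek(0)
    print(list_export_contents(buffer))
```

To check a real export, pass the path of the downloaded ZIP file instead of the in-memory buffer.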
Project settings
Access project settings from the dropdown indicator next to your project name.
File digitization
Legacy mode
As you work with projects in legacy mode, you might need to modify digitization settings if default digitization processes don't provide high-quality results.
Edit digitization settings from the Processing tab. Here, you can preview how changes impact machine-readable text with up to three documents from your project. The before-and-after preview shows a heat map overlay using a red-to-green gradient to represent OCR confidence for each word. Additionally, you can see a summary confidence score for the entire document.
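To make the red-to-green gradient concrete, the sketch below maps a per-word confidence value to a color by linear interpolation. This is an assumption for illustration only: the function name `confidence_to_rgb`, the 0 to 1 confidence scale, and the linear mapping are not taken from AI Hub, whose actual color scale might differ.

```python
from typing import Tuple


def confidence_to_rgb(confidence: float) -> Tuple[int, int, int]:
    """Map a confidence in [0, 1] to an (R, G, B) color from red to green.

    Values outside the range are clamped. The linear blend is an assumption.
    """
    c = min(1.0, max(0.0, confidence))
    return (round(255 * (1.0 - c)), round(255 * c), 0)


if __name__ == "__main__":
    print(confidence_to_rgb(0.0))  # (255, 0, 0): low confidence, red
    print(confidence_to_rgb(1.0))  # (0, 255, 0): high confidence, green
```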
Any time you change digitization settings, all files in your project are redigitized.
Choose the digitization settings suitable for your documents and AI Hub subscription.
-
Tables: Provides better results when extracting information from tables, and enables table highlighting, which lets you enlarge, copy, or download highlighted tables directly from the document viewer.
This option must be enabled to use table extraction fields in organization projects.
-
Checkboxes: Provides better results when extracting information from checkboxes.
Table and checkbox recognition change the OCR processor used, which slows digitization slightly and might impact accuracy, particularly with less common languages. Enable tables and checkboxes only if needed.
-
Non-Latin characters (Commercial & Enterprise): Enables support for many common languages that use writing systems other than the Latin alphabet (a, b, c…). Support for non-Latin characters is offered in standard and advanced language sets. For details, see Supported languages.
-
Process spreadsheets natively: Processes Excel spreadsheets in their native file format instead of converting to PDF. This option offers better results for wide tables, but doesn't support embedded objects or source highlighting in results.
-
Process attachments: Processes attachments as separate files, or removes them if disabled. This setting doesn't affect emails processed via upstream integration, which is managed by deployment configuration.
-
Treat files as images (Commercial & Enterprise): Digitizes files as they appear, discarding any embedded machine-readable text. This option often provides better results for documents that use non-Latin characters, handwritten text, and visually complex documents.
-
Pages (Commercial & Enterprise): Limits digitization to specified pages.
Processing
Agent mode
Projects in agent mode offer simplified file processing configuration options that optimize performance and resource usage.
Any time you change file processing settings, all files in your project are reprocessed.
Choose the file processing settings suitable for your documents and use case.
-
Disable OCR: Skips optical character recognition (OCR) during file processing. When enabled, files are processed without text extraction, meaning no document text is available for extraction or processing. This setting significantly improves processing speed and reduces memory usage, making it ideal for use cases with large documents and workflows that don't require the text content of documents.
When OCR is disabled, any custom functions or functionality that require document text don't work. Only enable this setting if your extraction workflow doesn't depend on document text content.
When OCR is disabled, file splitting is supported for larger file sizes. For details about requirements and constraints, see file limitations.
-
Force image OCR: Treats all pages as images and processes them with OCR during file processing. When enabled, text extraction from machine-readable PDFs is skipped, which can improve processing speed and reduce memory usage for certain documents. Extracted text might vary slightly from the default path because it's generated through OCR.
-
Advanced checkbox detection: Uses enhanced OCR to improve detection accuracy for checkbox fields. While optimized for checkboxes, you might see improved results for table fields as well.
Consider the following before enabling this setting:
-
Token usage: The full extracted text is included in each request, which increases input token consumption. If using your own LLM endpoint, monitor token usage and be aware of potential increased costs, particularly for high document volumes.
-
Token limits: Because the full extracted text is included in requests, this setting is best suited for small to medium-sized documents. Large documents are more likely to hit token limits, which can cause information to be missed.
-
Additional error sources: Because the LLM relies on the OCR output, any OCR misreads can propagate into extraction errors, in addition to any reasoning errors the model might make on its own.
-
Field scope: This setting affects all fields in your project, not just checkboxes. Most fields stay the same or improve, but accuracy can decrease slightly for fields that contain only plain text.
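If you want a rough sense of whether a document's extracted text is likely to approach a token limit, a back-of-the-envelope estimate can help. The sketch below uses the common rule of thumb of about four characters per token for English text. That ratio, the function names, and any budget you pass in are assumptions; actual counts depend on the tokenizer your LLM endpoint uses.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of `text`.

    The default ratio is a rule of thumb, not a tokenizer-accurate figure.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text: str, token_budget: int, chars_per_token: float = 4.0) -> bool:
    """Return True if the estimated token count is within `token_budget`."""
    return estimate_tokens(text, chars_per_token) <= token_budget


if __name__ == "__main__":
    sample = "Checkbox A: yes. Checkbox B: no. " * 100
    print(estimate_tokens(sample))
    print(fits_budget(sample, token_budget=8000))  # the budget is a made-up example
```

For billing or hard limits, rely on the token counts reported by your LLM provider instead of an estimate like this.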
File splitting
Classification settings control how classes are assigned to documents. If your project includes multipage files, you can choose whether and how they're split into documents by turning on the Split multipage files toggle.
-
Classes: Uses an LLM to split files based on class names and descriptions.
-
Page breaks: Uses programmatic logic to split files at each page break.
-
Splitting function: Uses a custom function to split files.
When you select Splitting function, you can create or edit a custom Python function to define how files are split. The function editor opens automatically when you select this option. For guidance, see Splitting function.
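As a rough illustration of the kind of logic a splitting function might contain, the sketch below groups pages into documents by starting a new document whenever a page matches a caller-supplied rule. The signature shown here is hypothetical and is not the interface AI Hub expects: `split_pages`, its parameters, and its return format are invented for the example, so consult the Splitting function documentation for the actual contract.

```python
from typing import Callable, List, Tuple


def split_pages(
    pages: List[str],
    starts_document: Callable[[str], bool],
) -> List[Tuple[int, int]]:
    """Return inclusive (first_page, last_page) index ranges, one per document.

    A new document begins at any page (other than the first) for which
    `starts_document(page_text)` is True.
    """
    ranges: List[Tuple[int, int]] = []
    start = 0
    for index, text in enumerate(pages):
        if index > 0 and starts_document(text):
            ranges.append((start, index - 1))
            start = index
    if pages:
        ranges.append((start, len(pages) - 1))
    return ranges


if __name__ == "__main__":
    pages = ["INVOICE 001", "line items, continued", "INVOICE 002"]
    # Hypothetical rule: a page whose text starts with "INVOICE" opens a document.
    print(split_pages(pages, lambda text: text.startswith("INVOICE")))
    # [(0, 1), (2, 2)]
```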
Cross-class
Cross-class settings determine how results are selected if packets include multiple documents from a class. For more information about packets and cross-class fields, see Extracting data from packets.
-
Field priority (default): Uses the result with the highest field confidence and greatest frequency across all documents in a class.
-
Document priority: Uses results from one document based on criteria you specify.
-
Fewest empty values: Selects the document with the fewest empty field values.
-
Highest average confidence score across all fields: Selects the document with the highest average confidence score across all fields.
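The two document-level criteria above can be sketched as simple selection functions. This is an illustrative model under stated assumptions, not AI Hub's implementation: the `Document` shape (a mapping of field name to value and confidence), the function names, and the tie-breaking behavior (first document wins) are all hypothetical. Field priority isn't modeled because the description doesn't specify how confidence and frequency are combined.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical shape: field name -> (value, confidence).
# A value of None or "" counts as empty.
Document = Dict[str, Tuple[Optional[str], float]]


def count_empty(doc: Document) -> int:
    """Count fields whose value is missing or blank."""
    return sum(1 for value, _ in doc.values() if value in (None, ""))


def average_confidence(doc: Document) -> float:
    """Mean confidence across all fields, or 0.0 for a document with no fields."""
    if not doc:
        return 0.0
    return sum(conf for _, conf in doc.values()) / len(doc)


def fewest_empty_values(docs: List[Document]) -> Document:
    """Pick the document with the fewest empty field values (first wins ties)."""
    return min(docs, key=count_empty)


def highest_average_confidence(docs: List[Document]) -> Document:
    """Pick the document with the highest mean confidence (first wins ties)."""
    return max(docs, key=average_confidence)


if __name__ == "__main__":
    doc_a: Document = {
        "name": ("Ana", 0.95),
        "date": ("2024-01-01", 0.95),
        "total": (None, 0.0),
    }
    doc_b: Document = {
        "name": ("Ana", 0.5),
        "date": ("2024-01-01", 0.5),
        "total": ("10", 0.5),
    }
    print(fewest_empty_values([doc_a, doc_b]) is doc_b)         # True
    print(highest_average_confidence([doc_a, doc_b]) is doc_a)  # True
```

The example shows that the two criteria can disagree: one document is more complete, while the other has higher confidence in the fields it does contain.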
General
Move, copy, or delete projects from the General tab.
