Running and deploying apps
The Hub displays all apps available to you, including prebuilt apps, apps you created, apps shared within your organization, and advanced apps customized for your enterprise. Open any app to run it or see results from previous runs.
From any app, use the left sidebar to review version history, app info, and other details.
Running apps
You can run any app from the Hub on demand.
1. From the Hub, open the app you want to run.
2. (Optional) If the app has sample files and you want to preview app functionality, click Run with sample files, then click Run. When the run completes, click Sample run to view results.
   Sample runs incur usage charges at the same rate as regular app runs.
3. When you’re ready to run the app, click Run app.
4. If you’re an organization member, verify the workspace you want to run the app in. Run results are available only in the selected workspace, and are viewable by all members of that workspace.
5. Select files to process and click Run. When the run completes, click the run ID to view results.
Sharing apps
You can share apps you create with other AI Hub users.
Sharing settings impact all production versions of an app. Pre-production versions of apps are never shared. When you share an app with a link, users are directed to the latest version of the app.
Other users with access to your app can run the app and view their results. Additionally, organization members can view run results in any workspace they have access to, regardless of whether they initiated the app run. The account that initiates an app run is responsible for any consumption units used.
Access sharing settings from the homepage of apps you created by clicking Share.
Sharing functionality differs based on your AI Hub subscription.
- Community — App sharing is enabled with a link. Any AI Hub user with the link can use your shared app. Shared apps aren’t listed in the Hub.
- Commercial & Enterprise — App sharing is enabled through organization membership. Any member of your organization can use your shared app. Shared apps are listed in the Hub.
Creating deployments
Deployments let you configure an app to run at scale with automation, integration, and human review.
1. In Workspaces, select the Deploy tab, then click Add deployment.
2. Specify options for your deployment, then click Save.
   - Name — Specify a unique name to help users differentiate the deployment across all workspaces they have access to.
   - Description — Specify an optional description for the deployment.
   - App — Select the app that you want to run at scale for this deployment. Available apps include all apps that are accessible to you, whether prebuilt, shared within your organization, or created by you.
   - Workspace — Select the workspace where you want to run the deployment and store run results. If you enable reviews, only members of this workspace can review results.
   - Integrations — Configure pre- and post-processing options, either pulling files or folders from upstream systems or sending results to downstream systems. For details, see Configuring integrations.
   - Notifications — Configure email or webhook notifications when runs start, complete, or fail, or when items are queued for review. For details, see Configuring notifications.
   - Review — To manually verify results that fail validation, select Enable human review, then select a review strategy.
     - Review by file — Sends only files that fail validation for human review, and assigns reviews by file.
     - Review by run — Sends entire runs for human review if any file fails validation, and assigns reviews by run.
     Whether you review by file or by run, required reviews for a given run must be complete before you can close the review.
     Enterprise organizations can configure additional review options:
     - Review queue — Assign a group within the deployment workspace to conduct initial reviews. You can select whether reviews are assigned manually or round robin, with reviews assigned to group members in turn. If you select round robin assignment, admins and managers are excluded from reviews by default, but you can optionally include them.
     - Escalation queue — Assign a group within the deployment workspace to review files flagged for further evaluation. As with review queues, you can select the assignment method and optionally include admins and managers in reviews.
     Queue options aren’t available in personal workspaces.
   - Service-level agreement — Specify efficiency targets for human review in minutes, hours, or days. Timing begins when a deployment run begins, and the SLA is satisfied on a given file when it’s marked as reviewed. The Review tab indicates time remaining against the SLA to help reviewers prioritize, as illustrated in the sketch after this list.
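For context, the time remaining shown in the Review tab corresponds to the run start time plus the SLA window, minus the current time. A minimal Python sketch with hypothetical values:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical example: an 8-hour review SLA on a run that started at 09:00 UTC.
sla_window = timedelta(hours=8)
run_started_at = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)

deadline = run_started_at + sla_window
remaining = deadline - datetime.now(timezone.utc)

if remaining.total_seconds() > 0:
    print(f"Time remaining to review: {remaining}")
else:
    print(f"SLA exceeded by: {-remaining}")
```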
Configuring integrations
Use integrations to pull files from upstream systems for processing or send results to downstream systems. Results are sent only after required reviews are closed.
Supported integrations include:
- Email — Send results to an email address in CSV, XLSX, or JSON format. In projects with classes, separate CSV files are generated for each class.
- Connected drive — Pull files or folders from a workspace or organization drive for processing, or send results in CSV, XLSX, or JSON format. In projects with classes, separate CSV files are generated for each class. For upstream integrations, you can specify whether to run the deployment on a set schedule or any time a new file is detected.
- Custom function — Send results in JSON format using a custom Python function.
During configuration, you can test the connection by sending the results from a previous app run to your downstream integration.
Integration function
For advanced integrations, you can write a custom integration function in Python.
For example, you might use an integration function to send results to a webhook:
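A minimal sketch of such a function, using only the Python standard library. The function name, the shape of the results argument, and the endpoint URL are illustrative assumptions; the actual parameters your integration function receives are documented in Writing custom functions.

```python
import json
import urllib.request

# Hypothetical endpoint URL; replace with your downstream system's webhook.
WEBHOOK_URL = "https://example.com/hooks/ai-hub-results"

def send_results_to_webhook(results: dict) -> int:
    """Post run results as JSON to a webhook and return the HTTP status code.

    The `results` argument is assumed to be the run output passed to the
    integration function as a dictionary.
    """
    request = urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(results).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```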
For the parameters that integration functions accept and additional guidance about custom functions, see Writing custom functions.
Configuring notifications
Notifications inform you when a deployment run starts, completes, fails, or when items are queued for review.
Supported notifications include:
- Email — Send a rich text notification to specified email addresses when runs reach designated checkpoints. Messages include a link to access run results or reviews, as applicable. You can preview and test notification emails, but you can’t change the subject or content of messages.
- Webhook — Send HTTP POST requests to a specified endpoint URL when runs reach designated checkpoints. Payloads contain event details like timestamp, run ID, status, and contextual information. You can add custom headers, preview the payload format, and send test notifications to validate the integration.
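You can preview the exact payload format and send test notifications from the configuration UI. As a rough sketch of a receiving endpoint, assuming a JSON payload (the field names used here are illustrative; rely on the payload preview for the real ones):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class NotificationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON payload posted by the webhook notification.
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")

        # Field names are illustrative; confirm them against the payload preview.
        print(event.get("run_id"), event.get("status"), event.get("timestamp"))

        # Acknowledge receipt so the notification isn't treated as failed.
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), NotificationHandler).serve_forever()
```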
Running deployments
While deployments are most beneficial when automated with upstream integrations, you can run them on demand if necessary.
1. In Workspaces, select the Deploy tab, then click the name of the deployment you want to run.
2. Click Run deployment.
3. Select files to process.
   When the run completes, click the run ID to view results.
Managing app versions
Managing apps throughout the Software Development Life Cycle (SDLC) involves two key aspects: accuracy testing of apps and integration testing of deployments. This approach ensures both the quality of your app’s core functionality and its smooth integration into various environments.
A robust SDLC in AI Hub consists of these components:
- Workspaces that correspond to your organization’s development process, for example:
  - Development (`dev`) — Used to create apps and conduct preliminary testing.
  - Testing (`test`) — Used for thorough testing before apps are promoted to production.
  - Production (`prod`) — Used to run tested apps for operational use.
  Organization admins can manage access to these workspaces with customized access controls, restricting who can view, edit, test, or deploy app versions in each environment.
- Ground truth datasets for each app in your pre-production environments.
  Ground truth datasets are used for accuracy testing as you iterate on apps. Datasets are tied to specific workspaces, so you must create datasets in each environment where you want to conduct accuracy testing.
- Deployments for each app in all environments.
  Each deployment can have unique integration settings, so you can pull files from upstream systems or send results to downstream systems as appropriate to the environment. As you test and promote new app versions, you can update the version used by each deployment.
As you iterate on apps, create new app versions and test them progressively through your workspaces following these high-level steps.
1. Develop or iterate on an app in Build and create a new app version with the Production release state.
   Your new app version is stored in the Hub. Share the app to enable other organization members to access app versions with the Production release state.
2. In your `dev` workspace, conduct accuracy testing on the new app version using your `dev` ground truth datasets.
3. When you’re satisfied with the results of accuracy testing, update the app version in your `dev` deployment to reflect the new version, and conduct integration testing.
   Verify that any upstream or downstream integrations are functioning as expected, and that your human review settings match your expected workflow.
4. When you’re satisfied with the results of all testing in `dev`, repeat steps 2 and 3 in your `test` workspace.
   Expand your testing as needed to include larger or more varied ground truth datasets, stricter accuracy thresholds, or additional integration scenarios.
   If testing fails at any stage, make necessary adjustments to the Build project, create a new app version, and restart the testing process in the `dev` workspace.
5. When all tests pass, deploy the new version in your `prod` workspace.
Following this process ensures that each app version is thoroughly tested for accuracy and integration before reaching production.
Monitoring deployments
Deployment metrics help you monitor consumption, handling time, and automation rates, giving you insight into deployment efficiency.
In Workspaces, you can enable Show automation metrics to display key metrics and trends over the past 7 days for each deployment.
- Documents processed shows the total number of documents processed from submission to completion of any reviews.
- Avg handling time shows the average time to process a document from submission to when the run is complete or, if human review is required, when the document is marked reviewed.
- Avg automation rate shows the average percent of all fields extracted accurately as measured by unmodified human review results.
To see additional metrics with visualizations, click the name of a deployment to view its deployment overview page, then select the Metrics tab.
The deployment metrics page reiterates the key metrics shown in Workspaces. To display an alert when these metrics deviate more than a specified amount, click Configure alert. To adjust the settings for a specific metric, hover over the metric type and click the edit icon.
The detailed report provides in-depth information about deployment metrics over the period you specify: last 6 hours, last 24 hours, last 7 days, or last 30 days. You can download the detailed report as a ZIP file containing CSV files for individual metrics.
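Because the detailed report is a ZIP archive of CSV files, you can load it into your own tooling for further analysis. A minimal sketch using the Python standard library (the archive path and the CSV file names inside it depend on what you download):

```python
import csv
import io
import zipfile

# Path to a downloaded detailed report; adjust to match your download.
REPORT_PATH = "deployment_metrics.zip"

with zipfile.ZipFile(REPORT_PATH) as archive:
    for name in archive.namelist():
        if not name.endswith(".csv"):
            continue
        # Each CSV holds one metric; inspect its columns and row count.
        with archive.open(name) as member:
            rows = list(csv.DictReader(io.TextIOWrapper(member, encoding="utf-8")))
        columns = list(rows[0].keys()) if rows else []
        print(f"{name}: {len(rows)} rows, columns: {columns}")
```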
Consumption metrics
Consumption indicates how many documents, pages, or runs were processed by a deployment. If the deployment classifies documents, you can filter by class to see consumption for specific document types.
Handling time metrics
Handling time measures the average time to process a document from submission to when the run is complete or, if human review is required, when the document is marked reviewed.
The main handling time chart displays average human review processing time versus average total processing time (including human review) for documents or runs. Data is plotted across the time range you specify, with yellow representing human review and blue showing the total. Spikes in the chart indicate longer processing times, which might represent anomalies or particularly complex cases. Use this chart to quickly gain insight into trends over time and to understand processing efficiency for automation and human review.
The Handling time distribution chart presents a histogram of processing times for documents or runs. Use the toggle to display total handling time or human review times only. The x-axis shows time intervals in minutes, while the y-axis displays the number of runs or documents. The chart includes key statistics such as mean handling time and a trimmed mean that excludes outliers above a specified percentile. A vertical red line represents the percentile cutoff. Use this chart to understand the distribution of handling times, identify common durations and outliers, and assess overall efficiency.
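To make the trimmed mean concrete, here is a short sketch of the calculation it describes: drop handling times above a percentile cutoff, then average what remains. The values and the 90th-percentile cutoff are illustrative.

```python
import math

# Handling times in minutes for a batch of documents (illustrative values).
handling_times = [2.5, 2.7, 2.8, 2.9, 3.0, 3.1, 3.3, 3.6, 4.0, 45.0]

def mean(values):
    return sum(values) / len(values)

def trimmed_mean(values, percentile=90):
    # Nearest-rank percentile cutoff; drop values above it before averaging.
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    cutoff = ordered[rank - 1]
    return mean([v for v in values if v <= cutoff])

print(f"Mean handling time: {mean(handling_times):.1f} min")                      # 7.3 min
print(f"Trimmed mean (90th percentile): {trimmed_mean(handling_times):.1f} min")  # 3.1 min
```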
The Handling time by class chart lets you compare processing times across document types. Use the toggle to display total handling time or human review times only. Additionally, you can search by class name or sort the data by various criteria. A vertical dashed line indicates the overall average handling time across all classes. Use this chart to identify classes that require more processing time, which might suggest the need for app improvements or additional human review bandwidth.
Automation metrics
Automation measures how accurately fields are processed, using unmodified human review results as the benchmark.
The Automation accuracy by field chart shows the automation state of individual fields. You can search by field name or sort the data by various criteria. Use the toggle to show runtime accuracy, which is the percent of validated fields that were extracted correctly as measured by unmodified human review results. Use this chart to measure validation accuracy based on human review outcomes.
The Extraction automation rate | All fields chart shows the percent of all fields that were extracted accurately as measured by unmodified human review results. Unlike runtime accuracy, automation rate includes fields without validation rules, and fields that failed extraction. High automation rates indicate fields that are extracted accurately without needing human intervention. Low automation rates indicate fields that are extracted incorrectly or that require human correction. You can search by field name or sort the data by various criteria. Use this chart to compare automation success across fields and identify fields that require improvements. A gap between a field’s automation rate and its runtime accuracy indicates fields that have no validation rules or that failed extraction.
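The difference between runtime accuracy and automation rate comes down to the denominator: validated fields only versus all fields. A small sketch over illustrative field results (the data structure is hypothetical, not an AI Hub API):

```python
# Each record notes whether the field had validation rules applied and
# whether a reviewer modified the extracted value (illustrative data).
fields = [
    {"validated": True,  "modified": False},
    {"validated": True,  "modified": False},
    {"validated": True,  "modified": True},
    {"validated": False, "modified": False},  # no validation rules
    {"validated": False, "modified": True},   # no validation rules, corrected
]

validated = [f for f in fields if f["validated"]]

# Runtime accuracy: unmodified results among validated fields only.
runtime_accuracy = sum(not f["modified"] for f in validated) / len(validated)

# Automation rate: unmodified results among all fields, including fields
# without validation rules or that failed extraction.
automation_rate = sum(not f["modified"] for f in fields) / len(fields)

print(f"Runtime accuracy: {runtime_accuracy:.0%}")  # 67%
print(f"Automation rate:  {automation_rate:.0%}")   # 60%
```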
The Extraction automation rate chart shows the automation rate for a specific field over time. The x-axis shows the specified time range, while the y-axis displays the automation rate. The graph includes two lines: one representing the automation rate for the selected field and another showing the average automation rate across all fields. Use this chart to visualize performance over time, particularly for lower performing fields identified in the adjacent chart.
Automation states
Automation state evaluates the effectiveness of automation through validation rules and human review.
Automation state includes two key measures:
- Validation outcome (valid or invalid) indicates whether a field passed validation rules. Fields are also considered valid if no validation rules apply.
- Human review outcome (unmodified or modified) indicates whether a field was changed during human review.
Combined, these measures provide four automation states:
- Valid and unmodified (dark green) — Result passed validation and wasn’t corrected in human review. This state indicates a high degree of extraction accuracy.
- Invalid and unmodified (lighter green) — Result failed validation but wasn’t corrected because it was actually valid. This state indicates effective human review, but suggests a need to improve validation rules.
- Invalid and modified (yellow) — Result failed validation and was corrected in human review. This state indicates both effective validation and effective human review, but suggests a need to improve extraction accuracy.
- Valid and modified (red) — Result passed validation but was corrected in human review. This state indicates effective human review, but suggests a need to improve validation rules.
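Put differently, each field’s automation state is just the combination of its two outcomes. A small sketch of the mapping (the function and labels are illustrative, not an AI Hub API):

```python
# Map (passed_validation, modified_in_review) to the automation state and
# what it suggests, following the definitions above.
AUTOMATION_STATES = {
    (True,  False): "Valid and unmodified: high extraction accuracy",
    (False, False): "Invalid and unmodified: improve validation rules",
    (False, True):  "Invalid and modified: improve extraction accuracy",
    (True,  True):  "Valid and modified: improve validation rules",
}

def automation_state(passed_validation: bool, modified_in_review: bool) -> str:
    return AUTOMATION_STATES[(passed_validation, modified_in_review)]

print(automation_state(passed_validation=False, modified_in_review=True))
# Invalid and modified: improve extraction accuracy
```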
Viewing logs
Logs provide detailed insights into app and deployment runs, helping you troubleshoot issues, monitor performance, and understand how your documents are being processed.
You can access logs from the runs page of any app or deployment. To view a log, hover over the run you want to investigate, click the overflow icon, then open the log from the menu that appears.
Each log entry includes a timestamp, log level, and detailed message. Log levels indicate the severity and type of information:
- `INFO` — General operational information, such as processing status and model calls.
- `WARNING` — Potential issues that don’t stop execution but might require attention.
- `ERROR` — Serious problems that might cause failures or unexpected behavior.
When troubleshooting, focus on error and warning messages first, as they often indicate the root cause of issues. Info messages provide context about standard operations and can help trace document flow through your application.
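If you copy log lines out of the UI for offline triage, a quick way to apply that advice is to sort entries so errors and warnings appear first. A minimal sketch, assuming each copied line contains its level as plain text:

```python
# Surface ERROR and WARNING entries before INFO when triaging copied log lines.
SEVERITY = {"ERROR": 0, "WARNING": 1, "INFO": 2}

def severity(line: str) -> int:
    for level, rank in SEVERITY.items():
        if level in line:
            return rank
    return len(SEVERITY)  # lines with no recognizable level sort last

# Hypothetical file containing log lines copied from the runs page.
with open("run_log.txt", encoding="utf-8") as log_file:
    lines = [line.rstrip("\n") for line in log_file if line.strip()]

for line in sorted(lines, key=severity):
    print(line)
```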
Using advanced apps
Advanced apps are custom apps created by Instabase to address complex enterprise use cases.
Advanced apps are available from the Hub and tagged with Advanced. You can test, run, and deploy advanced apps just like any other app, but you can’t edit them or access an underlying Build project.
If required for your use case, advanced apps might be designed with multiple review checkpoints. In this case, each review must be closed before the run can proceed or complete.