Validating documents

Commercial & Enterprise

Ensure high-quality output by implementing validation rules that flag questionable results for human review. Validation rules are also factored into accuracy metrics, supporting more accurate performance measurements.

About confidence scores

Confidence scores are percentage values that indicate the level of certainty in a result. They are reported for classification, digitization (OCR), and field results, and are calculated by the model or the OCR processor.

Field and OCR confidence scores are displayed in the field editor.

For enterprise users, classification confidence scores are displayed in the class editor.

Higher percentages suggest greater confidence. You can use confidence scores to help you fine-tune prompts, or to establish validation rules.

Non-OCR confidence is calculated using log probabilities, which provide a mathematical measure of how likely a result is. In internal testing, this method proves reliable when compared to benchmarks.
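To illustrate how log probabilities relate to a percentage score, here is a minimal sketch. It assumes the result's probability is the product of its per-token probabilities (the exponential of the summed log probabilities). The function name and the aggregation method are illustrative assumptions, not the product's documented calculation.

```python
import math

def confidence_from_logprobs(token_logprobs):
    """Convert per-token log probabilities into a percentage confidence.

    Assumption: the joint probability of the result is exp(sum of the
    token log probabilities). The actual aggregation may differ.
    """
    if not token_logprobs:
        raise ValueError("at least one log probability is required")
    joint_probability = math.exp(sum(token_logprobs))
    return round(joint_probability * 100, 1)

# Two tokens, each predicted with probability 0.9
score = confidence_from_logprobs([math.log(0.9), math.log(0.9)])
print(score)
```

Because probabilities multiply, a single uncertain token lowers the score for the whole result.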

Creating validation rules

Organization members can validate results based on confidence scores, a prompt, or a custom validation function.

To see how a validation rule performs across documents, click the rule in the Validations tab to open an aggregate view.

Classification confidence

Enterprise

Classification confidence indicates the model’s certainty in predicting the class of a document.

You can set a classification confidence threshold across all classes in your project, or you can set thresholds that apply to individual classes.

If multiple confidence validation rules are configured for a class, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a class threshold of 95%, the 95% confidence threshold is used.
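The stricter-rule behavior can be sketched as taking the higher of the configured thresholds. The function names below are hypothetical, and treating a score equal to the threshold as passing is an assumption.

```python
def effective_threshold(project_threshold=None, class_threshold=None):
    """Return the stricter (higher) configured threshold, or None if none is set."""
    configured = [t for t in (project_threshold, class_threshold) if t is not None]
    return max(configured) if configured else None

def needs_review(confidence, project_threshold=None, class_threshold=None):
    """Flag a result whose confidence falls below the effective threshold."""
    threshold = effective_threshold(project_threshold, class_threshold)
    # Assumption: a score exactly at the threshold passes
    return threshold is not None and confidence < threshold

# Project-wide threshold of 85% and class threshold of 95%
print(effective_threshold(85, 95))  # prints 95
print(needs_review(90, 85, 95))     # prints True
```

The same logic applies to field and OCR confidence rules, with a field threshold in place of the class threshold.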

  1. In the editing panel, select the Validations tab.

  2. Specify validation rules as needed.

    • To set a threshold across all classes in the project:

      1. In the Apply to all group, click + Add rule, then select Confidence rule > Classification confidence.
      2. Enter a project classification confidence threshold and click Save.

    • To set a threshold for an individual class:

      1. In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > Classification confidence.
      2. Enter a classification confidence threshold and click Save.

Field confidence

Field confidence indicates the model’s certainty in predicting results for a given field.

You can set a field confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.

If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.

  1. In the editing panel, select the Validations tab.

  2. Specify validation rules as needed.

    • To set a threshold across all fields in the project:

      1. In the Apply to all group, click + Add rule, then select Confidence rule > Field confidence.
      2. Enter a project field confidence threshold and click Save.

    • To set a threshold for an individual field:

      1. In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > Field confidence.
      2. Select the field to apply the validation rule to, enter a confidence threshold, and click Save.

OCR confidence

OCR confidence indicates the OCR processor’s certainty in digitization accuracy. For instance, a high OCR confidence for Amy Cooper in an account holder field means the OCR processor is confident that it read those letters correctly, not that Amy Cooper is actually the account holder.

In multi-word results, the lowest confidence score across all words is used. For example, if Amy returns a confidence score of 95 percent, but Cooper returns a confidence score of 65 percent, the reported OCR confidence score for the account holder field is 65 percent.
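The multi-word rule amounts to taking the minimum over the per-word scores. A minimal sketch, using a hypothetical function name and a list of (word, score) pairs as an assumed input shape:

```python
def field_ocr_confidence(word_confidences):
    """Return the lowest per-word score as the field's OCR confidence.

    word_confidences is a list of (word, score) pairs; this input shape
    is an assumption made for illustration.
    """
    if not word_confidences:
        raise ValueError("at least one word is required")
    return min(score for _word, score in word_confidences)

account_holder_words = [("Amy", 95), ("Cooper", 65)]
print(field_ocr_confidence(account_holder_words))  # prints 65
```

Using the minimum means one poorly read word is enough to flag the whole field.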

You can set an OCR confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.

If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.

  1. In the editing panel, select the Validations tab.

  2. Specify validation rules as needed.

    • To set a threshold across all fields in the project:

      1. In the Apply to all group, click + Add rule, then select Confidence rule > OCR confidence.
      2. Enter a project OCR confidence threshold and click Save.

    • To set a threshold for an individual field:

      1. In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > OCR confidence.
      2. Select the field to apply the validation rule to, enter a confidence threshold, and click Save.

Validation prompt

Validation prompts let you describe how you want to validate a field in your own words. Your prompt is used to generate a custom validation rule written in Python code.

You can convert a prompt-based rule to a custom function, but the conversion is permanent: after converting, you can no longer edit the rule using a prompt.

  1. In the editing panel, select the Validations tab.

  2. In the class that you want to add a validation rule for, click + Add rule, then select Custom rule > Validation prompt.

  3. In the Prompt, describe how you want to validate fields, using autocomplete to select field names.

  4. Click Run.

    Your prompt is used to generate a custom validation rule in Python code. Rule generation can take several minutes.

    When complete, you can view the code by clicking Show generated code.

  5. If necessary, iterate on the rule using one of these methods.

    • Modify your prompt — Change your prompt as needed and click Run to regenerate the rule.

    • Convert and edit the rule — While viewing the generated code, hover over the code and click Convert to custom function. After conversion, edit the custom function as needed.

Validation function

For advanced validation, you can write a custom validation function in Python.

For example, you might use a validation function to check that a date is within a certain range:

def validate_date_in_range(date_value, context):
    """
    Validates that a date value falls within an acceptable range.
    Returns None if valid, error message if invalid.
    """
    import datetime

    # An empty value fails validation
    if not date_value:
        return "Date value is missing"

    # Try each supported format in turn (MM/DD/YYYY, then YYYY-MM-DD)
    date_obj = None
    for date_format in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            date_obj = datetime.datetime.strptime(date_value, date_format).date()
            break
        except ValueError:
            continue
    if date_obj is None:
        # Date couldn't be parsed
        return f"Invalid date format: {date_value}"

    # Define acceptable date range (example: between 1 year ago and today)
    today = datetime.date.today()
    try:
        one_year_ago = today.replace(year=today.year - 1)
    except ValueError:
        # Today is February 29 and the previous year has no February 29
        one_year_ago = today.replace(year=today.year - 1, day=28)

    # Check if date falls within acceptable range
    if date_obj < one_year_ago:
        return f"Date {date_value} is more than one year old"
    if date_obj > today:
        return f"Date {date_value} is in the future"

    # Date is valid - return None
    return None

Validation functions accept these parameters:

| Parameter | Required? | Description |
| --- | --- | --- |
| <field-name> | Required | Value of the field that the custom function validates. |
| context | Required | Stores metadata about the document. |
| context['document_text'] | Optional | Retrieves the entire text of the document. |
| context.get('runtime_config', {}) | Optional | Validates results passed as structured data when running an app via API. Returns an empty dictionary if not present. Nested parameters are sent as structured JSON objects and accessed using dictionary notation. |
| context['file_path'] | Optional | Retrieves the path to the uploaded file. |
| keys | Optional | Access custom variables and organization secrets. Use keys['custom']['<key-name>'] for custom keys and keys['secret']['<key-name>'] for secret keys. |
| <additional-field-name> | Optional | When writing custom functions in automation projects, click Add argument to select additional fields in the class to use in the function. |

Validation functions must return None if the validation rule passes, and an error string if the validation rule fails.
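As a further illustration of the return contract and the context parameter, here is a minimal sketch of a function that cross-checks a field against the document text and an optional runtime value. The function name, the account_holder field, and the expected_account_holder key in runtime_config are hypothetical and chosen only for this example.

```python
def validate_account_holder(account_holder, context):
    """Return None if the value passes, or an error string if it fails."""
    if not account_holder:
        return "Account holder is missing"

    # Check that the extracted value appears in the document text, if available
    document_text = context.get("document_text", "")
    if document_text and account_holder.lower() not in document_text.lower():
        return f"'{account_holder}' not found in document text"

    # Hypothetical runtime_config key, passed when running an app via API
    runtime_config = context.get("runtime_config", {})
    expected = runtime_config.get("expected_account_holder")
    if expected and account_holder.strip().lower() != expected.strip().lower():
        return f"Expected account holder '{expected}', got '{account_holder}'"

    # All checks passed
    return None
```

Each failing check returns a distinct message, so reviewers can see which condition triggered the flag.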

For additional guidance about custom functions, see Writing custom functions.