Validating documents
Ensure high-quality output by implementing validation rules that flag questionable results for human review. Validation rules are also factored into accuracy metrics, supporting more accurate performance measurements.
About confidence scores
Confidence scores are percentage values that indicate the level of certainty in a result. They apply to classification, digitization (OCR), and field results. Confidence scores are calculated by the model or OCR processor.
Field and OCR confidence scores are displayed in the field editor.
For enterprise users, classification confidence scores are displayed in the class editor.
Higher percentages suggest greater confidence. You can use confidence scores to help you fine-tune prompts, or to establish validation rules.
Non-OCR confidence (classification and field confidence) is calculated using log probabilities, which provide a mathematical measure of how likely a result is. In internal testing, this method proved reliable when compared against benchmarks.
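As a rough illustration of how log probabilities relate to a percentage score, the sketch below converts per-token log probabilities into a confidence value. The function name and the aggregation method (multiplying token probabilities, which is the same as summing their logs) are assumptions made for illustration. The product's actual calculation is not documented here and may differ.

```python
import math


def confidence_from_logprobs(token_logprobs):
    """Convert per-token log probabilities into a percentage confidence.

    Assumption: the probability of the whole result is the product of its
    token probabilities, i.e. exp(sum of the log probabilities).
    """
    if not token_logprobs:
        raise ValueError("at least one log probability is required")
    return 100.0 * math.exp(sum(token_logprobs))


# Two tokens with probabilities 0.9 and 0.8 give a joint probability of 0.72.
score = confidence_from_logprobs([math.log(0.9), math.log(0.8)])
print(f"{score:.1f}%")
```

Because log probabilities are never positive, adding more tokens can only keep the score the same or lower it under this assumption.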
Creating validation rules
Organization members can validate results based on confidence scores, a prompt, or a custom validation function.
Classification confidence
Classification confidence indicates the model’s certainty in predicting the class of a document. It is available to enterprise users.
You can set a classification confidence threshold across all classes in your project, or you can set thresholds that apply to individual classes.
If multiple confidence validation rules are configured for a class, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a class threshold of 95%, the 95% confidence threshold is used.
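The "stricter rule wins" behavior can be sketched as taking the higher of the configured thresholds. The function names below are hypothetical. The sketch also assumes that a score strictly below the threshold is flagged, since the text does not say how a score exactly equal to the threshold is treated.

```python
def effective_threshold(project_threshold=None, class_threshold=None):
    """Return the stricter (higher) configured threshold, or None if none is set."""
    configured = [t for t in (project_threshold, class_threshold) if t is not None]
    return max(configured) if configured else None


def needs_review(confidence, project_threshold=None, class_threshold=None):
    """Flag a result whose confidence falls below the effective threshold.

    Assumption: a score equal to the threshold passes.
    """
    threshold = effective_threshold(project_threshold, class_threshold)
    return threshold is not None and confidence < threshold


# Project-wide threshold of 85% and a class threshold of 95%: 95% is used.
print(effective_threshold(85, 95))
print(needs_review(90, project_threshold=85, class_threshold=95))
```

With these inputs, a document classified at 90% confidence would be flagged even though it clears the project-wide threshold.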
- In the editing panel, select the Validations tab.
- Specify validation rules as needed.
  Add a validation rule that applies to all classes:
  - In the Apply to all group, click + Add rule, then select Confidence rule > Classification confidence.
  - Enter a project classification confidence threshold and click Save.
  Add a validation rule for a specific class:
  - In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > Classification confidence.
  - Enter a classification confidence threshold and click Save.
Field confidence
Field confidence indicates the model’s certainty in predicting results for a given field.
You can set a field confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.
If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.
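To show how a project-wide threshold and per-field thresholds might combine across a whole document, here is a minimal sketch. The function, field names, and scores are all hypothetical, and the sketch assumes a field is flagged when its confidence is strictly below the stricter applicable threshold.

```python
def flag_low_confidence_fields(field_confidences, project_threshold=None,
                               field_thresholds=None):
    """Return the names of fields whose confidence is below their threshold.

    For each field, the stricter (higher) of the project-wide threshold and
    the field-specific threshold applies. Fields with no threshold are skipped.
    """
    field_thresholds = field_thresholds or {}
    flagged = []
    for name, confidence in field_confidences.items():
        candidates = [
            t for t in (project_threshold, field_thresholds.get(name))
            if t is not None
        ]
        if candidates and confidence < max(candidates):
            flagged.append(name)
    return flagged


flagged = flag_low_confidence_fields(
    {"account_holder": 91.0, "account_number": 97.0, "balance": 80.0},
    project_threshold=85,
    field_thresholds={"account_holder": 95},
)
print(flagged)
```

In this example, account_holder is flagged by its 95% field threshold and balance by the 85% project threshold, while account_number passes.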
- In the editing panel, select the Validations tab.
- Specify validation rules as needed.
  Add a validation rule that applies to all fields:
  - In the Apply to all group, click + Add rule, then select Confidence rule > Field confidence.
  - Enter a project field confidence threshold and click Save.
  Add a validation rule for a specific field:
  - In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > Field confidence.
  - Select the field to apply the validation rule to, enter a confidence threshold, and click Save.
OCR confidence
OCR confidence indicates the OCR processor’s certainty in digitization accuracy. For instance, a high OCR confidence for Amy Cooper in an account holder field means the OCR processor is sure it read those letters correctly, not that Amy Cooper is actually the account holder.
In multi-word results, the lowest confidence score across all words is used. For example, if Amy returns a confidence score of 95%, but Cooper returns a confidence score of 65%, the reported OCR confidence score for the account holder field is 65%.
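The lowest-score rule for multi-word results amounts to taking a minimum over the per-word scores. The sketch below uses a hypothetical function name and a hypothetical input shape of (word, confidence) pairs.

```python
def field_ocr_confidence(word_confidences):
    """Return the lowest per-word OCR confidence for a multi-word result.

    word_confidences is a list of (word, confidence) pairs.
    """
    if not word_confidences:
        raise ValueError("at least one word is required")
    return min(confidence for _, confidence in word_confidences)


ocr_score = field_ocr_confidence([("Amy", 95), ("Cooper", 65)])
print(ocr_score)
```

Using the minimum means a single poorly read word is enough to bring the whole field below a threshold.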
You can set an OCR confidence threshold across all fields in your project, or you can set thresholds that apply to individual fields.
If multiple confidence validation rules are configured for a field, the stricter rule is applied. For example, if you set a project-wide confidence threshold of 85% and a field threshold of 95%, the 95% confidence threshold is used.
- In the editing panel, select the Validations tab.
- Specify validation rules as needed.
  Add a validation rule that applies to all fields:
  - In the Apply to all group, click + Add rule, then select Confidence rule > OCR confidence.
  - Enter a project OCR confidence threshold and click Save.
  Add a validation rule for a specific field:
  - In the class that you want to add a validation rule for, click + Add rule, then select Confidence rule > OCR confidence.
  - Select the field to apply the validation rule to, enter a confidence threshold, and click Save.
Validation prompt
Validation prompts let you describe how you want to validate a field in your own words. Your prompt is used to generate a custom validation rule written in Python code.
You can convert a prompt-based rule to a custom function, but the conversion is permanent. After converting, you can no longer edit the rule using a prompt.
- In the editing panel, select the Validations tab.
- In the class that you want to add a validation rule for, click + Add rule, then select Custom rule > Validation prompt.
- In the Prompt, describe how you want to validate fields, using autocomplete to select field names.
- Click Run.
  Your prompt is used to generate a custom validation rule in Python code. Rule generation can take several minutes.
  When complete, you can view the code by clicking Show generated code.
- If necessary, iterate on the rule using one of these methods.
  - Modify your prompt: Change your prompt as needed and click Run to regenerate the rule.
  - Convert and edit the rule: While viewing the generated code, hover over the code and click Convert to custom function. After conversion, edit the custom function as needed.
Validation function
For advanced validation, you can write a custom validation function in Python.
For example, you might use a validation function to check that a date is within a certain range.
Validation functions accept these parameters:
Validation functions must return None if the validation rule passes, and an error string if the validation rule fails.
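A date-range check that follows this return contract might look like the sketch below. The function name, its single string parameter, the date format, and the range bounds are all hypothetical, because the parameters that validation functions actually receive are not listed here. Only the return convention (None on success, an error string on failure) comes from the text above.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical bounds and date format, chosen only for illustration.
EARLIEST = date(2020, 1, 1)
LATEST = date(2030, 12, 31)
DATE_FORMAT = "%Y-%m-%d"


def validate_date_in_range(value: str) -> Optional[str]:
    """Return None if the date is within range, otherwise an error string."""
    try:
        parsed = datetime.strptime(value.strip(), DATE_FORMAT).date()
    except ValueError:
        return f"'{value}' is not a valid date in YYYY-MM-DD format."
    if not EARLIEST <= parsed <= LATEST:
        return (
            f"Date {parsed.isoformat()} is outside the allowed range "
            f"{EARLIEST.isoformat()} to {LATEST.isoformat()}."
        )
    return None


print(validate_date_in_range("2024-06-15"))
print(validate_date_in_range("2019-12-31"))
```

Returning a descriptive error string, instead of raising an exception, gives the reviewer a readable reason for why the result was flagged.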
For additional guidance about custom functions, see Writing custom functions.