Apply classifier
The Apply classifier step runs split classification using a custom Python classifier. Use it when you want deterministic, code-driven class assignments, such as rule-based routing, keyword or layout heuristics, or cases where external context is needed before classification. The step links to a classifier module (a classifier file) in your flow.
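To illustrate the kind of keyword heuristic such a classifier might apply, here is a minimal, standalone sketch. It does not use the product's classifier module interface, which is not described here; the `RULES` table, `DEFAULT_CLASS`, the class names, and the `classify_page` function are all hypothetical names chosen for the example.

```python
import re

# Hypothetical rule table: class name -> keywords that indicate that class.
# Rules are checked in insertion order, so the first match wins.
RULES = {
    "invoice": ("invoice", "amount due"),
    "bank_statement": ("account summary", "closing balance"),
}

# Hypothetical fallback class for pages that match no rule.
DEFAULT_CLASS = "other"


def classify_page(text: str) -> str:
    """Return a class name for one page of text using keyword matching."""
    # Collapse whitespace and lowercase so matching ignores layout and case.
    normalized = re.sub(r"\s+", " ", text).lower()
    for class_name, keywords in RULES.items():
        if any(keyword in normalized for keyword in keywords):
            return class_name
    return DEFAULT_CLASS
```

Because the rules are plain data checked in a fixed order, the same page text always yields the same class, which is the deterministic behavior this step is meant for.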
For classification that uses a large language model with a schema you define, use the Agent classifier step.
Split PDF, TIF, and TIFF files
The Split PDF and TIF source files option, if enabled, splits and groups pages in PDF, TIF, and TIFF files into separate records according to their class. Split documents are assigned file names that indicate the original document, the assigned class, and the range of pages included.
For example, if pages 1–5 of a 10-page PDF file named input.pdf are classified as class_name_1 and pages 6–10 are class_name_2, then:
- labeled_outputs/class_name_1/input.class_name_1-1-5.pdf contains pages 1–5 of input.pdf
- labeled_outputs/class_name_2/input.class_name_2-6-10.pdf contains pages 6–10 of input.pdf
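The naming pattern in this example can be reproduced with a short sketch. This is not the product's implementation: the `split_output_paths` function is hypothetical, and it assumes that each run of consecutive pages with the same class becomes one output file, which matches the example above but is not confirmed for other cases such as non-contiguous pages of the same class.

```python
from itertools import groupby
from pathlib import PurePosixPath


def split_output_paths(filename, page_classes):
    """Build output paths from a file name and a per-page list of classes.

    page_classes[i] is the class assigned to page i + 1.
    Assumption: consecutive pages with the same class form one output file.
    """
    source = PurePosixPath(filename)
    paths = []
    page = 1  # Pages are numbered from 1, as in the example.
    for class_name, run in groupby(page_classes):
        count = len(list(run))
        first, last = page, page + count - 1
        # Pattern: <original stem>.<class>-<first page>-<last page><extension>
        name = f"{source.stem}.{class_name}-{first}-{last}{source.suffix}"
        paths.append(str(PurePosixPath("labeled_outputs") / class_name / name))
        page = last + 1
    return paths


# Pages 1-5 are class_name_1 and pages 6-10 are class_name_2.
paths = split_output_paths(
    "input.pdf", ["class_name_1"] * 5 + ["class_name_2"] * 5
)
```

For the 10-page example, `paths` holds the two file paths listed above.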
