Automate use case

You’re ready to use the SDK to do real work. This use case demonstrates an AI Hub feature called automate, which performs automated document processing to classify documents and extract key information.

Processing receipts with an AI Hub app

This example processes two receipts with an AI Hub app called Meal Receipt. The app looks at images of restaurant receipts and extracts key information such as the restaurant name, items ordered, and amount paid.

Each step of this workflow uses the SDK.

  1. Make an empty batch to hold input documents.

  2. Upload two JPG files to that batch.

  3. Run the Meal Receipt app to process the receipts in the batch.

  4. Retrieve and print the results of the app run.

This tutorial first writes code for the “happy path,” which assumes everything works perfectly. Starting there helps you focus on understanding what the SDK does.

After you’re comfortable with the basic flow, the tutorial shows you how to make the code more robust by identifying and handling things that might go wrong.

Implementing the use case

1

Make a file to hold your Python program

With a text editor, create and save an empty file called automate_with_sdk.py. Put it anywhere on your computer. As each snippet of test code is introduced, paste it at the bottom of this file.

2

Import and authorize the SDK

For any SDK-enabled program, you must first make the SDK available and configure authorization details.

from aihub import AIHub

client = AIHub(api_root="PASTE YOUR API ROOT HERE",
               api_key="PASTE YOUR API KEY HERE",
               ib_context="PASTE YOUR IB-CONTEXT HERE")

Remember to replace the three authorization placeholders with your own values.

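
If you’d rather not paste secrets directly into automate_with_sdk.py, one option is to read them from environment variables. The sketch below is only an illustration: the variable names AIHUB_API_ROOT, AIHUB_API_KEY, and AIHUB_IB_CONTEXT and the load_credentials() helper are made up for this example and are not part of the SDK.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "api_root": "AIHUB_API_ROOT",
    "api_key": "AIHUB_API_KEY",
    "ib_context": "AIHUB_IB_CONTEXT",
}


def load_credentials(environ=os.environ):
    """Return the three authorization values as keyword arguments for AIHub()."""
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {arg: environ[name] for arg, name in ENV_NAMES.items()}


# Demonstration with a plain dictionary standing in for the real environment.
demo_env = {"AIHUB_API_ROOT": "root", "AIHUB_API_KEY": "key", "AIHUB_IB_CONTEXT": "context"}
print(load_credentials(demo_env))
```

With the variables set in your shell, you could then create the client with client = AIHub(**load_credentials()). The rest of this tutorial keeps the pasted placeholders for simplicity.
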
3

Make an empty batch

This code is similar to the code you used to test the SDK, but with a different batch name.

print("creating an empty batch")
create_batch_resp = client.batches.create(name="receipt batch",
                                          workspace="SDK-Tutorial")

4

Save the batch ID

When you used the client.batches.create() method to test the SDK, you didn’t care what data it returned, so you didn’t store its response. In this use case you need to capture the batch ID, which is returned by the method. That’s why you store the response in the create_batch_resp variable.

The ID is stored as a field—or piece of data—within the object that’s returned. Python’s syntax for accessing a field is OBJECT.FIELD. So in this case you can access the ID with create_batch_resp.id and store it in a variable called batch_id.

batch_id = create_batch_resp.id

How are you supposed to know that the response includes the batch ID, or that the field with the batch ID is called id? How can you tell if the response includes any other interesting information?

The answers to these questions are in the documentation for the create batch SDK operation: the operation returns the batch ID in an integer field called id, and returns no other data.

The documentation page for an SDK operation is the best source of information for what to pass in to the method and what data to expect in its response.
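
To see the OBJECT.FIELD syntax in isolation, here is a small sketch that doesn’t touch AI Hub at all. It uses Python’s SimpleNamespace as a stand-in for the object that client.batches.create() returns; the ID value 1234 is made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for the response object; a real response comes from the SDK.
create_batch_resp = SimpleNamespace(id=1234)

# OBJECT.FIELD reads the field named "id" from the object.
batch_id = create_batch_resp.id
print(f"batch ID: {batch_id}")

# Asking for a field that doesn't exist raises AttributeError,
# which is why the operation's documentation matters.
try:
    create_batch_resp.name
except AttributeError:
    print("this response has no field called name")
```
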

5

Download files to process

Download two JPG files of receipts, receipt-1.jpg and receipt-2.jpg, to your computer so you can upload them to AI Hub later for processing.

6

Upload files to the batch

You can upload any number of files to a batch, but this example uploads two so you can see how to handle multiple files.

Scroll through the API and SDK documentation for managing batches and find an operation called upload file to batch.

Operation names (in this case, upload file to batch) aren’t always identical to their SDK method calls (in this case, client.batches.add_file()). Names can change over time, but they refer to the same operation and are both found on the documentation page for that operation.

The documentation says you need to pass the local file to the method as a Python file object and pass a name for the file on the AI Hub filesystem.

The file doesn’t need to have the same name on the AI Hub filesystem as on your computer.
If you’re not familiar with how Python handles files, learn more at the RealPython site.

In this example, upload receipt-1.jpg from your computer and call it receipt-a.jpg on the AI Hub filesystem, and upload receipt-2.jpg from your computer and call it receipt-b.jpg.

You also need to pass in the ID of the batch to upload the file to, which you saved in batch_id earlier.

You now have all the information you need to upload files to a batch.

  • Call the client.batches.add_file() method.

  • Pass in the ID of the batch to upload to.

  • Pass a Python file object created from the file on your local computer.

  • Pass the name for the file in the AI Hub filesystem.

  • The documentation shows that client.batches.add_file() doesn’t return anything, so you don’t need to store the response in a variable.

Because the method uploads only one file, repeat this process to upload the second file, changing values as needed.

print("uploading two files to the batch")

client.batches.add_file(
    batch_id=batch_id,
    file_name="receipt-a.jpg",
    file=open("PATH/ON/YOUR/COMPUTER/TO/receipt-1.jpg", "rb"))

client.batches.add_file(
    batch_id=batch_id,
    file_name="receipt-b.jpg",
    file=open("PATH/ON/YOUR/COMPUTER/TO/receipt-2.jpg", "rb"))

Replace the path placeholders in this code with paths to wherever you downloaded receipt-1.jpg and receipt-2.jpg.

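
The upload code passes open(...) straight to the method, which leaves it to Python to close the files eventually. A common alternative is a with block, which closes each file as soon as its upload finishes. The sketch below shows that pattern in a loop. It assumes add_file() has finished reading the file by the time it returns. The upload_files() helper and the _FakeBatches class are made up for this example; the fake stands in for client.batches so the sketch runs without an AI Hub account.

```python
import os
import tempfile


class _FakeBatches:
    """Hypothetical stand-in for client.batches that only records uploads."""

    def __init__(self):
        self.uploaded = []

    def add_file(self, batch_id, file_name, file):
        self.uploaded.append((batch_id, file_name, len(file.read())))


def upload_files(batches, batch_id, uploads):
    """Upload each local path under its new name, closing every file afterwards."""
    for local_path, remote_name in uploads.items():
        with open(local_path, "rb") as local_file:  # closed when the block ends
            batches.add_file(batch_id=batch_id,
                             file_name=remote_name,
                             file=local_file)


# Demonstration with two small temporary files instead of real receipts.
demo_dir = tempfile.mkdtemp()
uploads = {}
for local_name, remote_name in [("receipt-1.jpg", "receipt-a.jpg"),
                                ("receipt-2.jpg", "receipt-b.jpg")]:
    path = os.path.join(demo_dir, local_name)
    with open(path, "wb") as demo_file:
        demo_file.write(b"not a real image")
    uploads[path] = remote_name

fake_batches = _FakeBatches()
upload_files(fake_batches, batch_id=1, uploads=uploads)
print(fake_batches.uploaded)
```

In your own program you would pass client.batches and your real batch_id instead of the fake object.
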
7

Run the app

Now that your batch is set up, use the SDK to run the Meal Receipt app to process the documents in the batch.

To trigger an app run, call the client.apps.runs.create() method. Pass it the name of the app, the person or organization that created the app, and the ID of the batch to process.

Use Instabase as the owner for all Instabase-provided apps, including Meal Receipt.

Store the response in a variable called run_resp so you can later refer to the run ID included in that response.

print("running the app")
run_resp = client.apps.runs.create(app_name="Meal Receipt",
                                   owner="Instabase",
                                   batch_id=batch_id)

8

Retrieve the results

Running an AI Hub app is an asynchronous process. This means you don’t get results immediately after running it.

Analogy

Asynchronous methods are like ordering from an online store. There’s no immediate gratification after you click “buy.” Instead, keep checking your front porch until the items arrive.

Asynchronous methods don’t return values immediately. Keep checking their status until they’re done.

The whole process of asynchronously running an app and getting the results looks like this.

  1. Run the app.

  2. Pause for a few seconds and then check the status of the app. As long as the status is PENDING or RUNNING (meaning that the app isn’t done yet), repeat this step.

  3. When the app is no longer in PENDING or RUNNING state, get the results.

You’ve already accomplished step 1 with the previous piece of code. Here’s code for steps 2 and 3.

Don’t be misled by the line numbers in this snippet starting at 1. Paste this code at the end of automate_with_sdk.py, not the beginning, and leave out the line numbers, which are only there to make the code easier to discuss.

 1  import time
 2
 3  print("checking the app status until it finishes")
 4  while True:
 5      time.sleep(3)
 6      status_resp = client.apps.runs.status(run_resp.id)
 7      print(f"status: {status_resp.status}")
 8      if status_resp.status not in ["PENDING", "RUNNING"]:
 9          break
10
11  print("fetching the app results")
12  results_resp = client.apps.runs.results(run_resp.id)

Line 1 makes Python’s built-in time library available to your program, so you can add pauses.

Line 4 starts an infinite loop for checking the app status. Break out of this loop when the app finishes running.

Line 5 pauses for 3 seconds. Without the pause, your program would swamp the AI Hub instance with rapid-fire requests for the app status.

Line 6 asks AI Hub for the app status, passing it the ID of the app run.

Line 7 prints the app run status so the user can keep track of what’s happening.

The SDK rarely returns bare data; it’s usually wrapped in some sort of Python object. In this case, the status is stored in the status field of the object that client.apps.runs.status() returns.

As mentioned before, the only way to know what field to look in is to consult the documentation for the SDK operation.

Line 8 inspects the app run status to see if it indicates that the app is still running.

  • PENDING means the app hasn’t started yet.

  • RUNNING means the run is in progress.

App run status is always one of several predefined values, each with a specific meaning.

Line 9 breaks out of the infinite loop if the status reveals that the app run is finished.

Line 12 asks the SDK for the results of the app run.

Notice that you’re invoking two separate SDK operations: client.apps.runs.status() to check the run’s status, and client.apps.runs.results() to retrieve the results when the app finishes.
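
The polling loop in this step runs forever if the app never leaves the PENDING or RUNNING state. One way to guard against that is to give up after a time limit. The sketch below is an illustration, not part of the SDK: wait_for_run() is a made-up helper, the 600-second default is an arbitrary choice, and a scripted list of statuses stands in for a real app run.

```python
import time


def wait_for_run(get_status, poll_seconds=3, timeout_seconds=600,
                 in_progress=("PENDING", "RUNNING")):
    """Call get_status() until it returns a status that isn't in in_progress.

    Raises TimeoutError if the run is still in progress after timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status not in in_progress:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Demonstration with scripted statuses instead of a real app run.
scripted = iter(["PENDING", "RUNNING", "RUNNING", "STOPPED_AT_CHECKPOINT"])
final_status = wait_for_run(lambda: next(scripted), poll_seconds=0)
print(f"final status: {final_status}")
```

With the real SDK, the first argument could be lambda: client.apps.runs.status(run_resp.id).status. Unlike the tutorial’s loop, this helper checks the status first and pauses afterwards.
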

9

Display the results

1  for file in results_resp.files:
2      print(f"file name: {file.original_file_name}")
3      for document in file.documents:
4          for field in document.fields:
5              print(f" {field.field_name}: {field.value}")
6          print("---")

In the previous step you called client.apps.runs.results(). The response from this method contains a field called files, which holds a list of files that the Meal Receipt app processed. In line 1 you iterate across each file to print its results.

Line 2 prints the name of the file.

Line 3 iterates across the documents within a file. This might be confusing: you need to iterate across files because your processed batch contains two files, but why do you need to iterate across documents within each file?

The reason is that each file can potentially contain more than one document. For example, imagine a three-page PDF with a separate receipt on each page. AI Hub splits that single file into three separate documents and processes each document separately.

Each file in the batch in this example contains only one document, so it isn’t necessary to iterate across the documents in each file. But it’s a good general practice considering batches can contain multi-document files.

Line 4 introduces more iteration: you’re looking at each extracted field in a single document. Fields are stored as a list in the fields field on the document variable.

Line 5 prints the name and value of each field, indented to make it easier to understand which values belong to which documents. This line is where the rubber meets the road—it shows the data that the Meal Receipt app was able to pull out of each receipt.

Line 6 prints a visual separator before moving on to the next document.
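
To make the files, documents, and fields nesting concrete, here is a sketch that builds a stand-in response by hand and flattens it into one dictionary per document. The stand-in uses only the field names described in this step (files, original_file_name, documents, fields, field_name, value). The file name receipts.pdf, the sample values, and the collect_fields() helper are made up for illustration.

```python
from types import SimpleNamespace as NS

# Stand-in response: one file that contains two documents.
results_resp = NS(files=[
    NS(original_file_name="receipts.pdf", documents=[
        NS(fields=[NS(field_name="Vendor", value="Flo's Greasy Spoon Diner"),
                   NS(field_name="Tax", value="11.47")]),
        NS(fields=[NS(field_name="Vendor", value="CJ's Burger Shack"),
                   NS(field_name="Tax", value="414.22")]),
    ]),
])


def collect_fields(results):
    """Return one dictionary per document, keyed by field name."""
    rows = []
    for file in results.files:
        for index, document in enumerate(file.documents):
            row = {"file": file.original_file_name, "document": index}
            for field in document.fields:
                row[field.field_name] = field.value
            rows.append(row)
    return rows


rows = collect_fields(results_resp)
for row in rows:
    print(row)
```

Collecting the values into dictionaries like this makes them easier to pass to other code than printing them directly.
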

10

Run the SDK-enabled program

You’ve now seen all the code for this use case. It’s time to run it.

  1. If you haven’t been entering the code into a file on your computer called automate_with_sdk.py, do so now.

  2. Run it the same way you ran the short program to test if the SDK was installed, only with a different program name. Follow the instructions for your operating system.

    1. Open a terminal.

    2. Type cd PATH/TO/DIRECTORY/WITH/automate_with_sdk.py, replacing the placeholders with the path to wherever you stored automate_with_sdk.py.

    3. Type python3 automate_with_sdk.py.

The output looks similar to this.

creating an empty batch
uploading two files to the batch
running the app
checking the app status until it finishes
status: RUNNING
status: RUNNING
status: RUNNING
status: STOPPED_AT_CHECKPOINT
fetching the app results
file name: receipt-a.jpg
 Total_Amount: 164.46
 Vendor: Flo's Greasy Spoon Diner
 Date: 07/25/2025
 Transaction_ID: N/A
 Vendor_Address: 123 Main St Springfield, Missouri
 Subtotal: 152.99
 Tax: 11.47
 Tip: N/A
 Receipt_Item_Details: [{"Description": "Beluga Caviar", "Quantity": "1", "Amount": "150.0"}, {"Description": "Cherry Coke", "Quantity": "1", "Amount": "2.99"}]
---
file name: receipt-b.jpg
 Total_Amount: 9619.21
 Vendor: CJ's Burger Shack
 Date: 07/16/2025
 Transaction_ID: N/A
 Vendor_Address: 17 Montgomery Road Dallas, Texas
 Subtotal: 9204.99
 Tax: 414.22
 Tip: N/A
 Receipt_Item_Details: [{"Description": "Lg Basket Fries", "Quantity": "6", "Amount": "5.99"}, {"Description": "Bot Roman\u00e9e-Conti", "Quantity": "1", "Amount": "9199.00"}]
---

You might wonder about the STOPPED_AT_CHECKPOINT status. That means the Meal Receipt app isn’t fully confident about some values it extracted from the receipt and wants a human to double-check the results. For this example, you can ignore that step.

After running the program yourself, look at the JPG files for each receipt and compare them to the output. Do you see the values you expected? If not, or if you see error messages, see the Troubleshooting section below.
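
In the sample output, the Receipt_Item_Details value looks like a JSON-encoded list. Assuming the SDK hands you that value as a string (check the documentation for the results operation to confirm), Python’s built-in json library can turn it into a list of dictionaries. This sketch parses the value copied from the first receipt in the sample output and adds up the item amounts.

```python
import json
from decimal import Decimal

# Copied from the sample output for receipt-a.jpg; assumed to be a JSON string.
raw_value = ('[{"Description": "Beluga Caviar", "Quantity": "1", "Amount": "150.0"}, '
             '{"Description": "Cherry Coke", "Quantity": "1", "Amount": "2.99"}]')

items = json.loads(raw_value)
for item in items:
    print(f'{item["Quantity"]} x {item["Description"]}: {item["Amount"]}')

# Decimal avoids the rounding surprises that floats can cause with money.
items_total = sum(Decimal(item["Amount"]) for item in items)
print(f"items total: {items_total}")  # 152.99, the Subtotal in the sample output
```
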

11

Add exception handling

It’s a good practice to predict what might go wrong with a program and add logic to handle those situations. This tutorial would be too long if it tried to predict every glitch that could occur with automate_with_sdk.py. Instead, it explains how to deal with one type of error so you can learn a model for exception handling.

Authorizing the SDK with incorrect values for the API Key or IB-Context causes the first SDK operation to throw aihub.exceptions.UnauthorizedException. If the exception isn’t handled, Python prints an error message and halts the program. There are different ways to handle exceptions in Python, but the most common technique is to use a try...except block.

RealPython has an excellent introduction to Python exception handling.

In your existing code, replace these lines:

create_batch_resp = client.batches.create(
    name="receipt batch",
    workspace="SDK-Tutorial")

with these lines:

 1  from aihub.exceptions import UnauthorizedException
 2
 3  try:
 4      create_batch_resp = client.batches.create(
 5          name="receipt batch",
 6          workspace="SDK-Tutorial")
 7  except UnauthorizedException:
 8      import sys
 9      sys.exit("ERROR: SDK not authorized. "
10               "Check the API Key and IB-Context values.")

Line 1 makes your program aware of a specific exception that AI Hub throws when authorization fails.

Lines 3 through 6 embed your original client.batches.create() method call in a try block. This alerts Python that the code within that block might throw an exception.

Line 7 tells Python what kind of exception you’d like to watch for and handle.

If the code within your try block could throw several different types of exceptions, you can have a separate except line for each type of exception and handle them all differently.

In this example, you’re only watching out for UnauthorizedException.

Lines 8 through 10 quit the automate_with_sdk.py program and display an error message to the user that explains what went wrong and how to fix the problem.

Some except blocks fix the problem that’s causing the exception and continue running the program. Misconfigured authorization isn’t a problem that can be fixed while the program is still running, so this except block aborts the program.

Watch your exception handling logic in action.

  1. Break the SDK authorization by adding a single character to the API Key in the line of code that configures the SDK client (this is probably line 4 in automate_with_sdk.py).

  2. Re-run automate_with_sdk.py.

Do you see the user-friendly error message you added?

To make automate_with_sdk.py robust enough for a production environment, you would need to think about other things that might go wrong in this program, and add try...except blocks to handle each possibility.

This can dramatically increase the length of your program, but is an important part of responsible programming.
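
As an example of handling several exception types in one try block, the sketch below guards the step where a receipt file is opened. It uses only Python’s built-in exceptions, so it doesn’t depend on anything the SDK might throw. The open_receipt() helper is made up for this example, and a temporary file stands in for a real receipt.

```python
import os
import sys
import tempfile


def open_receipt(path):
    """Open a receipt for reading, or exit with a user-friendly message."""
    try:
        return open(path, "rb")
    except FileNotFoundError:
        sys.exit(f"ERROR: {path} not found. Check the path to the receipt.")
    except PermissionError:
        sys.exit(f"ERROR: not allowed to read {path}. Check the file permissions.")


# Demonstration with a temporary file standing in for a real receipt.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(b"demo")

with open_receipt(tmp.name) as receipt:
    print(f"read {len(receipt.read())} bytes")

os.remove(tmp.name)
```

Each except line handles one kind of failure with its own message, so the user learns exactly what to fix.
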

Complete program

So far you’ve only seen snippets of code. Here’s a fully assembled version of the automate_with_sdk.py script. The only difference between this complete program and the snippets is that this version moves all the import statements to the beginning and adds explanatory comments throughout.

Replace the code you’ve typed in so far with this version. The third use case expects and builds on this code.

Remember to replace the placeholder values for API Root, API Key, IB-Context, and the paths to the receipt-1.jpg and receipt-2.jpg files.

# prepare to use standard Python libraries
import sys
import time

# prepare to use the SDK
# and an exception that the SDK throws when authorization fails
from aihub import AIHub
from aihub.exceptions import UnauthorizedException

# authorize the SDK
client = AIHub(api_root="PASTE YOUR API ROOT HERE",
               api_key="PASTE YOUR API KEY HERE",
               ib_context="PASTE YOUR IB-CONTEXT HERE")

print("creating an empty batch")
try:
    # make an empty batch with a specific name in a specific workspace
    create_batch_resp = client.batches.create(
        name="receipt batch",
        workspace="SDK-Tutorial")
except UnauthorizedException:
    # exit the program while printing a user-friendly error message and
    # instructions on how to fix the problem
    sys.exit("ERROR: SDK not authorized. "
             "Are the API ROOT, API KEY, and IB-Context values correct?")

# store batch_id in an easy-to-read variable, since we'll use it several times
batch_id = create_batch_resp.id

print("uploading two files to the batch")

# upload the first file to the batch
client.batches.add_file(batch_id=batch_id,
                        file_name="receipt-a.jpg",
                        file=open("PATH/ON/YOUR/COMPUTER/TO/receipt-1.jpg", "rb"))

# upload a second file to the batch
client.batches.add_file(batch_id=batch_id,
                        file_name="receipt-b.jpg",
                        file=open("PATH/ON/YOUR/COMPUTER/TO/receipt-2.jpg", "rb"))

print("running the app")
# trigger an app run, specifying which app, who wrote it, and which batch it should process
run_resp = client.apps.runs.create(app_name="Meal Receipt",
                                   owner="Instabase",
                                   batch_id=batch_id)

print("checking the app status until it finishes")
while True:  # loop until explicitly told to leave the loop
    time.sleep(3)  # pause a few seconds between each app status check
    status_resp = client.apps.runs.status(run_resp.id)  # get the app status
    print(f"status: {status_resp.status}")  # update the user on the app status
    if status_resp.status not in ["PENDING", "RUNNING"]:  # these statuses mean the app is still running
        break  # the app is done, so stop looping

print("fetching the app results")
results_resp = client.apps.runs.results(run_resp.id)  # get the app results

for file in results_resp.files:  # iterate across all processed files
    print(f"file name: {file.original_file_name}")
    for document in file.documents:  # iterate across all documents in a file
        for field in document.fields:  # iterate across all fields in a document
            print(f" {field.field_name}: {field.value}")  # print the field name and value
        print("---")  # visual separator between documents

Troubleshooting

If the program fails, it displays an error message. These can be quite long, but you can get the gist of the problem by looking at the very end of the message.

Here are some problems you might run into with this program and how to fix them.

  • Check your API Root value.

  • You’re passing client.batches.create() a workspace name that doesn’t exist, for example SDK Tutorial instead of SDK-Tutorial.

  • When calling client.batches.add_file() you passed an incorrect path to the file you’re trying to upload.

  • When calling client.apps.runs.create() you passed an incorrect value for the app_name or owner parameters, for example Meal Receipts instead of Meal Receipt.

Next steps

For practice writing exception handling code, try writing more try...except blocks for any of the conditions listed in the Troubleshooting section above.

Go to the next page to see a different AI Hub use case: analyzing a document by having a conversation about it.