Calling LLMs from custom functions in flows

Enterprise Single-tenant

In single-tenant environments with advanced view enabled, you can call an LLM from custom function code to get text or structured output—for example, to drive extraction, classification, or validation with AI. Implement your Python functions in the project's legacy scripts folder or module, register entrypoints, and reference the functions by name in the step configuration. The model and provider come from the tenant's configured LLM provider and the AI runtime model.

Calling LLMs from custom functions works differently in the app editor; see Calling LLMs from custom functions.

Availability

In the flow editor, llm_client is supported in select flow steps and event hooks.

The Agent classifier and Agent extract steps natively support calling LLMs.

Design considerations

When you use the LLM client in flow steps, consider the following:

  • The LLM client bypasses the platform’s built-in guardrails. Use it only where the dedicated agent steps can’t meet the requirement — for example, where custom preprocessing logic must run before an LLM call, or where an external data source must be queried and incorporated into a prompt. In all other cases, use the Agent classifier and Agent extract steps.

  • Contain LLM client calls to a single, well-defined step. Don’t scatter custom LLM calls across multiple UDFs throughout the flow.

  • When you call an LLM directly through the client rather than through a dedicated agent step, the following don’t apply automatically:

    • Grounding — The model isn't constrained to return values from the source document. Enforce grounding through careful prompting. For example, if you use the LLM client for classification, the prompt must explicitly instruct the model to return only one of the defined class names and handle cases where the document doesn't match any category (see the sketch after this list).

    • Schema validation — The platform doesn’t validate that LLM output matches an expected structure. Output parsing and error handling are the developer’s responsibility. Error handling must account for both external API failures and LLM output validation failures.

    • Rate limit management — Multiple LLM client calls across a flow at high document volume can quickly exhaust Azure OpenAI or Vertex AI rate limits, causing flows to fail.
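
The following sketch illustrates the first two points for a classification use case. It is a minimal example, not platform-provided code: the function name, label set, and fallback value are hypothetical, and careful prompting plus output validation stand in for the guardrails the agent steps would otherwise provide.

from instabase.provenance.registration import register_fn

# Hypothetical label set; the prompt must enumerate it explicitly
# so the model can't invent class names (grounding).
CLASS_NAMES = ["invoice", "pay_stub", "bank_statement"]


@register_fn(provenance=False)
def classify_with_llm(CLIENTS, *args, **kwargs):
    if CLIENTS.llm_client is None:
        return "other"

    resp = CLIENTS.llm_client.generate_content(
        prompt=(
            "Classify the following document. Respond with exactly one of: "
            + ", ".join(CLASS_NAMES)
            + ". If it matches no category, respond with 'other'.\n\n<document text>"
        ),
    )

    # Schema validation is the developer's responsibility: reject any
    # output outside the allowed label set instead of passing it downstream.
    label = str(resp).strip().lower()
    if label not in CLASS_NAMES:
        return "other"
    return label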

Using the LLM client in flow steps

Calling an LLM from a flow step involves the following components and processes.

  • Legacy scripts — Add your Python function to the project’s legacy scripts folder or module, then reference the function by name in the step configuration.

  • Registration — Decorate the entrypoint with @register_fn(provenance=False) from instabase.provenance.registration so the platform can register and invoke it.

  • Clients — Declare CLIENTS as a positional argument in your function signature. The flow runtime passes it in automatically. Use attribute access to reference clients — for example, CLIENTS.llm_client and CLIENTS.ibfile — rather than dict-style access with .get('llm_client').

  • LLM usage — Call generate_content on llm_client using the same parameters described in the Call generate_content subsection below.

from instabase.provenance.registration import register_fn


@register_fn(provenance=False)
def llm_step(CLIENTS, *args, **kwargs):
    llm_client = CLIENTS.llm_client
    resp_text = llm_client.generate_content(
        prompt="<Task and source text the model must use.>",
    )
    return resp_text

When you wire the function into the flow—for example, for a pre-flow or post-flow event—invoke it by passing CLIENTS according to your step's formula, such as llm_step(CLIENTS). Use the function name you registered and the variable order your step expects (see Custom functions in flow).

1. Get llm_client from CLIENTS

When the LLM client is available, the CLIENTS object exposes an LLM client (llm_client) and a file client (ibfile). Use the file client’s read_file to read document content. Use the LLM client’s generate_content method to send a prompt to the model and get a response (optionally with file data or a response schema for structured output). If the LLM client isn’t available for a given run (for example, the step doesn’t support it), guard the call before use, for example if CLIENTS.llm_client is None.
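
A minimal sketch of that guard, assuming a simple step entrypoint; the function name, file path, and prompt are placeholders:

from instabase.provenance.registration import register_fn


@register_fn(provenance=False)
def guarded_llm_step(CLIENTS, *args, **kwargs):
    # The LLM client can be None if the step doesn't support it.
    if CLIENTS.llm_client is None:
        return "LLM client unavailable for this step"

    # Read the document with the file client, then prompt the model.
    file_content, err = CLIENTS.ibfile.read_file("<path/to/document.pdf>")
    if err:
        raise IOError("Could not read file")

    return CLIENTS.llm_client.generate_content(
        prompt="Summarize the attached document in one paragraph.",
        file_data=file_content,
        mime_type="application/pdf",
    )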

2. Call generate_content

When you have the client, call generate_content for either text-only or file-aware generation:

generate_content(
    prompt='str',
    file_data=None,
    mime_type=None,
    file_path=None,
    response_schema=None,
    enable_thinking=True,
    enable_logprobs=False,
)
  • prompt (string, required) — The text prompt to send to the model.

  • file_data (bytes, optional) — File data. Must be provided with mime_type. Default: None.

  • mime_type (string, optional) — The multipurpose internet mail extensions (MIME) type of the file. Must be provided with file_data. Default: None.

  • file_path (string, optional) — The file path. Default: None.

  • response_schema (dict, optional) — Schema for structured output. Default: None.

  • enable_thinking (boolean, optional) — Whether to enable thinking/reasoning mode. Default: True.

  • enable_logprobs (boolean, optional) — Whether to include log probabilities (per-token confidence scores from the model) in the response. Default: False.
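
For example, a text-only call that turns off thinking mode and requests per-token log probabilities might look like the following; the prompt is illustrative, and the exact shape of the logprobs-enabled response depends on the configured provider:

resp = CLIENTS.llm_client.generate_content(
    prompt="Classify the sentiment of this review as positive, negative, or neutral: <review text>",
    enable_thinking=False,
    enable_logprobs=True,
)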

Example: Map UDF with structured output

The following example uses @register_fn, takes CLIENTS as a positional argument together with the Map UDF variables, reads a file with CLIENTS.ibfile, and writes model output into out_files. Set the Map UDF step formula to match your declared parameters—for example, map_with_llm(INPUT_RECORD, STEP_FOLDER, CLIENTS).

import json

from instabase.provenance.registration import register_fn


@register_fn(provenance=False)
def map_with_llm(input_record, step_folder, CLIENTS, *args, **kwargs):
    if CLIENTS.llm_client is None:
        return {"out_files": []}

    input_filepath = input_record["input_filepath"]
    file_content, err = CLIENTS.ibfile.read_file(input_filepath)
    if err:
        raise IOError("Could not read file {}".format(input_filepath))

    resp = CLIENTS.llm_client.generate_content(
        prompt="Return the deductions table. Remove any extra symbols from the amount deducted.",
        file_data=file_content,
        mime_type="application/pdf",
        file_path=input_filepath,
        response_schema={
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount deducted": {"type": "float"},
                },
            },
            "description": "deductions table",
        },
    )

    if isinstance(resp, (dict, list)):
        payload = json.dumps(resp).encode("utf-8")
    else:
        payload = str(resp).encode("utf-8")

    return {
        "out_files": [
            {
                "filename": "deductions.json",
                "content": payload,
            }
        ]
    }

Adjust filenames, MIME type, schema, formula, and return shape to match your step.