APIs have become indispensable in today’s digital economy. As early as 2019, API calls already accounted for 83 percent of all web traffic, largely driven by cloud applications, a clear consequence of digital transformation. There is no indication of this trend slowing down. On the contrary, the API economy continues to evolve at a rapid pace, and we can assume that APIs will remain one of the key factors for the success of digital business models. By enabling the provisioning of data and functionality, APIs pave the way for digital openness. Those who open up gain the opportunity to build an ecosystem or platform:
APIs make it possible to create a comprehensive ecosystem in which various partners, developers, and third-party providers are integrated and encouraged to contribute their own applications and services. New services can be developed, the functionality of the overall platform can be expanded, interoperability is simplified, and additional revenue streams can emerge along the path toward innovation and flexibility. This may also include the integration of advanced technologies such as artificial intelligence. Perhaps the best example is the wide range of services that have emerged around OpenAI’s APIs.
It becomes clear that, from a business perspective, value is rarely created by a single API in isolation. Integration is the magic word: multiple APIs are connected and orchestrated to create a seamless and efficient system in which data and functions can be leveraged across system boundaries.
Seamless? How is that supposed to work? How can I represent and describe it? Until recently, these questions were not easy to answer.
For an individual API, the situation is perfectly clear: with OpenAPI (formerly known as Swagger), we have a standardized specification available for describing web APIs. In both human- and machine-readable form, we can document how the API works, which endpoints exist, which parameters are required, and which data formats are used. In other words: OpenAPI allows us to precisely describe what an API offers and how it works—typically at the level of individual endpoints.
However, when functionality requires the interplay of multiple endpoints to generate real business value, we reach the limits of OpenAPI. We need to represent a workflow, but with OpenAPI alone this can only be done by enriching the description of individual endpoints with prose, so that the associated process can be understood.
Example: File Upload and Processing
Let’s look at a simple example.
Imagine we want to upload a file in order to have it processed later—whether for image compression, text recognition in a scan, or format conversion. This type of process can be easily automated and elegantly orchestrated using APIs.
In the following, we will walk through step by step how such a workflow can be specified and implemented. It includes the following steps:
- Upload file
- Process file
- Retrieve processing status
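Before we formalize anything, the three steps can be sketched as a small client script. The function names and return shapes below are illustrative assumptions that stand in for real HTTP calls, not an actual implementation:

```python
# Sketch of the three-step workflow: upload, trigger processing, poll for the result.
# The endpoint stubs simulate an asynchronous backend (two "processing" responses
# before "completed"); in practice these would be HTTP requests.
import itertools

_status_sequence = itertools.chain(["processing", "processing"],
                                   itertools.repeat("completed"))

def upload_file(file_bytes: bytes) -> str:
    """POST /upload: returns a unique file ID."""
    return "abc123"

def process_file(file_id: str) -> str:
    """POST /process: starts asynchronous processing."""
    return "processing"

def get_processing_status(file_id: str) -> str:
    """GET /status/{fileId}: returns processing, completed, or failed."""
    return next(_status_sequence)

def run_workflow(file_bytes: bytes) -> str:
    file_id = upload_file(file_bytes)      # step 1: upload file
    process_file(file_id)                  # step 2: start processing
    while True:                            # step 3: poll until finished
        status = get_processing_status(file_id)
        if status in ("completed", "failed"):
            return status

print(run_workflow(b"scan.pdf"))  # -> completed
```

The polling loop in step 3 is exactly the kind of cross-endpoint logic that an endpoint-level API description cannot capture on its own.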
A simple OpenAPI specification for an API with the described functionality could look as follows in YAML format:
openapi: 3.0.0
info:
  title: File Processing API
  version: 1.0.0
  description: |
    This API enables the automation of a workflow for file upload, processing, and status retrieval.
    A typical workflow comprises the following steps:
    1. **Upload file**: A file is uploaded via the `/upload` endpoint. The API returns a unique file ID.
    2. **Process file**: The uploaded file is processed via the `/process` endpoint. The processing status is returned.
    3. **Retrieve processing status**: The current status of the file can be queried via the `/status/{fileId}` endpoint.
    Example workflow:
    - A file is uploaded.
    - The workflow waits until the file has been processed successfully.
    - Finally, a confirmation of the completed process is returned.
paths:
  /upload:
    post:
      summary: Upload file
      description: |
        Uploads a file and returns a unique file ID that is used in the subsequent steps.
        The first step in the workflow is the file upload. The file ID is required for processing and for the status query.
      operationId: uploadFile
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                file:
                  type: string
                  format: binary
                  description: The file to be uploaded.
      responses:
        '200':
          description: File uploaded successfully.
          content:
            application/json:
              schema:
                type: object
                properties:
                  fileId:
                    type: string
                    description: Unique ID identifying the uploaded file.
              example:
                fileId: "abc123"
        '400':
          description: Invalid request, e.g. missing file.
  /process:
    post:
      summary: Process file
      description: |
        Starts the processing of an uploaded file based on its ID.
        This workflow step is triggered once the upload is complete. Processing happens asynchronously, and the processing status can be queried later via the `/status` endpoint.
      operationId: processFile
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                fileId:
                  type: string
                  description: The ID of the file to be processed.
            example:
              fileId: "abc123"
      responses:
        '200':
          description: File successfully submitted for processing.
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    enum: [processing, completed, failed]
                    description: Status of the processing.
              example:
                status: "processing"
        '400':
          description: Invalid request, e.g. missing file ID.
  /status/{fileId}:
    get:
      summary: Retrieve processing status
      description: |
        Returns the current processing status of a file.
        This workflow step is used to check for completion of the processing. Once the status is "completed", the file has been fully processed.
      operationId: getProcessingStatus
      parameters:
        - name: fileId
          in: path
          required: true
          schema:
            type: string
          description: The ID of the file whose processing status is to be queried.
      responses:
        '200':
          description: The current processing status of the file.
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    enum: [processing, completed, failed]
                    description: The current status of the file.
              example:
                status: "completed"
        '404':
          description: File not found.
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
                    description: Error description.
              example:
                error: "File not found."
OpenAPI for Describing API Workflows?
Although OpenAPI is fundamentally machine-readable—in some cases even more easily read by machines than by humans—we find that the example above is, at best, only human-readable. While we can extract the sequence of requests from the documentation, the description of the workflow is in no way standardized. As a result, this specification is of limited use when it comes to describing workflows or documenting an integration scenario.
Now imagine that our API does not just contain the exact endpoints needed for our small example workflow, but instead provides a large number of endpoints for a wide variety of workflows. In such a case, this type of description quickly reaches its limits—and very soon becomes unhelpful even for humans.
Arazzo for Describing API Workflows!
This is where Arazzo comes into play—a new specification from the OpenAPI Initiative for describing API workflows. The goal of this new standard is the machine-readable definition of API workflows. These workflows are oriented around use cases and represent a sequence of API calls that, taken together, create value and serve business objectives.
In other words: workflows specified with Arazzo are deterministic “recipes” for how APIs should be used. They allow us to express exactly how the APIs are intended to be consumed. This type of explicit information will, sooner or later, also benefit various AI tools. The precise description of the steps required for a business use case can be executed by artificial intelligence, thereby delivering the actual value of the business scenario.
We can also move beyond simple code generation based solely on OpenAPI specifications. Until now, the output of such tools has typically been generated clients that expose the operations from the API documentation one-to-one. However, how those operations should be combined or in which order they should be used remained unclear and had to be manually implemented. With proper tool support, the Arazzo workflow specification can contribute here as well, enabling targeted code generation. The function contained in the generated SDK would then no longer be just uploadFile() but could instead be something like executeFileProcessingWorkflow().
Interaction with the API thus becomes directly aligned with the value provided by the API itself, making it possible to specify a new level of abstraction.
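To illustrate the shift in abstraction, the following sketch contrasts an endpoint-level client, as generated from an OpenAPI description alone, with a workflow-level function that Arazzo-aware tooling could generate. All class and function names are hypothetical, not the output of any real generator:

```python
# Hypothetical generated SDK surfaces. The endpoint-level client mirrors the
# API documentation one-to-one; the workflow-level function encodes the call
# order taken from an Arazzo workflow definition.

class FileProcessingClient:
    """Endpoint-level client: one method per operation, no ordering knowledge."""
    def upload_file(self, file_bytes: bytes) -> str: ...
    def process_file(self, file_id: str) -> str: ...
    def get_processing_status(self, file_id: str) -> str: ...

def execute_file_processing_workflow(client: FileProcessingClient,
                                     file_bytes: bytes) -> str:
    """Workflow-level function: upload, process, then check the status."""
    file_id = client.upload_file(file_bytes)       # step 1
    client.process_file(file_id)                   # step 2
    return client.get_processing_status(file_id)   # step 3
```

A caller no longer needs to know in which order the three operations must be combined; that knowledge lives in the generated workflow function.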
Example: File Processing as an Arazzo Workflow
Let’s return to the example of file upload followed by processing. In the following, we will specify the previously outlined workflow using Arazzo.
It quickly becomes clear that the specification bears the hallmark of the OpenAPI Initiative and leverages familiar concepts. This lowers the barrier to entry when working with the API workflow specification.
Step 1: Defining the Arazzo Specification
We start by defining the basic metadata of the workflow. This includes the workflow name, a description, and the version. Just like an OpenAPI specification, an Arazzo specification begins with an info block that contains metadata about the workflow. The fields title and version are mandatory.
Next comes a list of API descriptions (sourceDescriptions) required for the workflow. In our example, this consists of a single OpenAPI specification:
arazzo: '1.0.0'
info:
  title: File Processing Workflow
  version: '1.0.0'
  description: |
    This workflow automates the process of uploading a file, processing it, and checking its status.
sourceDescriptions:
  - name: fileProcessingAPI
    url: https://example.com/openapi.yaml
    type: openapi
Step 2: Basic Definition of the Workflow
The next section contains the description of the workflows. Yes, an Arazzo specification can indeed include multiple workflows based on shared APIs. This means that it is not necessary to create a separate specification for each individual workflow. In this context, the reusable components already known from OpenAPI also prove to be valuable.
In the Arazzo workflow specification, a workflow is a structured description of a sequence of steps that are executed automatically. Workflows are defined as entries in the workflows list; each workflow object contains the following key elements:
- workflowId: A unique identifier for the workflow.
- summary: A brief summary of the workflow.
- description: A detailed description of the workflow, including its context and purpose.
- inputs: The input data required for the workflow to run.
- steps: A list of steps executed in a defined order.
The specification of a workflow therefore begins with an ID, a summary, and a description:
workflows:
  - workflowId: fileUploadProcessing
    summary: Automates the file upload and processing.
    description: |
      This workflow uploads a file, starts its processing, and checks for completion of the processing.
Step 3: Defining Input Parameters
A workflow requires certain parameters in order to be executed. In our example, this is the file that needs to be processed. These parameters are defined as inputs:
inputs:
  type: object
  properties:
    file:
      type: string
      format: binary
      description: The file to be uploaded and processed.
Step 4: Adding the Steps
Next come the individual steps of the workflow. These actions are defined in the steps list. Each step has a unique stepId, a description, and references to the corresponding API endpoints.
Each step is uniquely identifiable by its ID and, at the same time, references an operation of an API.
Step 4.1: Upload File
The operationId, together with the parameters object, establishes the link to the API specification. In the following example, the value field refers back to the workflow’s defined input parameters:
steps:
  - stepId: uploadFile
    description: Uploads the file and receives a unique file ID.
    operationId: uploadFile
    parameters:
      - name: file
        in: body
        value: $inputs.file
    successCriteria:
      - condition: $statusCode == 200
    outputs:
      fileId: $response.body.fileId
The successCriteria then defines the conditions under which the step is considered successfully completed. The outputs collect relevant results that are needed for subsequent steps. In our example, this is the fileId field from the response of the API call used to upload the file.
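The runtime expressions used here ($inputs.file, $response.body.fileId, and later $steps.uploadFile.outputs.fileId) all follow the same pattern: a leading $ followed by a dotted path into the workflow context. A minimal resolver can make this mechanism concrete; the following sketch is illustrative and not part of any official Arazzo tooling:

```python
# Minimal resolver for the dotted runtime expressions used in the workflow
# ($inputs.*, $response.body.*, $steps.<stepId>.outputs.*). Illustrative only.
def resolve(expr: str, context: dict):
    if not expr.startswith("$"):
        return expr                      # literal value, passed through unchanged
    value = context
    for key in expr[1:].split("."):      # walk the dotted path into the context
        value = value[key]
    return value

# Example context after the uploadFile step has run.
context = {
    "inputs": {"file": b"...binary..."},
    "response": {"body": {"fileId": "abc123"}},
    "steps": {"uploadFile": {"outputs": {"fileId": "abc123"}}},
}

print(resolve("$response.body.fileId", context))             # -> abc123
print(resolve("$steps.uploadFile.outputs.fileId", context))  # -> abc123
```

This is how an output such as fileId can be captured from one response and fed into the parameters of a later step.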
Step 4.2: Process File
The output fileId from the first step is then used as an input parameter in the step responsible for processing the uploaded file:
- stepId: processFile
  description: Starts the processing of the uploaded file.
  operationId: processFile
  parameters:
    - name: fileId
      in: body
      value: $steps.uploadFile.outputs.fileId
  successCriteria:
    - condition: $statusCode == 200
  outputs:
    status: $response.body.status
In this step, the status field from the response is defined as an output.
Step 4.3: Check Processing Status
After the file processing has been initiated in the previous step, the next step checks the status of the (asynchronous) processing.
In the simplest case, this step can be defined as follows:
- stepId: checkStatus
  description: Checks the current processing status of the file.
  operationId: getProcessingStatus
  parameters:
    - name: fileId
      in: path
      value: $steps.uploadFile.outputs.fileId
  successCriteria:
    - condition: $statusCode == 200
    - condition: $response.body.status == 'completed'
  outputs:
    finalStatus: $response.body.status
However, we must assume that processing will not be completed immediately—in other words, that the status field in the first response will not yet be set to completed, as required by the successCriteria.
This means we need to account for repeated polling. To achieve this, we can adjust the definition of the checkStatus step by introducing a retry mechanism using onFailure:
- stepId: checkStatus
  description: Checks the current processing status of the file.
  operationId: getProcessingStatus
  parameters:
    - name: fileId
      in: path
      value: $steps.uploadFile.outputs.fileId
  successCriteria:
    - condition: $statusCode == 200
    - condition: $response.body.status == 'completed'
  outputs:
    status: $response.body.status
  onFailure:
    - name: retryCheckStatus
      type: retry
      stepId: checkStatus
      retryAfter: 5
      retryLimit: 10
      criteria:
        - condition: $response.body.status == 'processing'
    - name: terminateWorkflow
      type: end
      criteria:
        - condition: $response.body.status == 'failed'
If the request for the file’s processing status does not return an HTTP status 200 OK, or the status field in the response is not yet set to completed, the success criteria are not met. The step is then retried after 5 seconds (up to a maximum of ten times) as long as status is still set to processing. If status is failed, the workflow is terminated.
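The retryAfter/retryLimit semantics can be expressed as a short polling loop. The sketch below assumes a pluggable status function standing in for the real GET /status/{fileId} call:

```python
# Sketch of the retry semantics of the checkStatus step: poll up to
# retry_limit additional times, waiting retry_after seconds between attempts,
# and stop early once the status is completed or failed.
import time

def poll_status(file_id: str, get_processing_status,
                retry_after: float = 5, retry_limit: int = 10) -> str:
    status = "processing"
    for attempt in range(retry_limit + 1):   # initial attempt plus retries
        status = get_processing_status(file_id)
        if status in ("completed", "failed"):
            return status                    # success criterion met, or terminate
        if attempt < retry_limit:
            time.sleep(retry_after)
    return status                            # still "processing" after all retries

# Example: a stub backend that reports "processing" twice before completing.
responses = iter(["processing", "processing", "completed"])
print(poll_status("abc123", lambda fid: next(responses), retry_after=0))  # -> completed
```

A real Arazzo runtime would evaluate the successCriteria and onFailure criteria against each response instead of the hard-coded status checks shown here.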
Complete Workflow Specification
Here is the full workflow specification:
arazzo: '1.0.0'
info:
  title: File Processing Workflow
  version: '1.0.0'
  description: |
    This workflow automates the process of uploading a file, processing it, and checking its status.
sourceDescriptions:
  - name: fileProcessingAPI
    url: https://example.com/openapi.yaml
    type: openapi
workflows:
  - workflowId: fileUploadProcessing
    summary: Automates the file upload and processing.
    description: |
      This workflow uploads a file, starts its processing, and checks for completion of the processing by repeating the status check several times.
    inputs:
      type: object
      properties:
        file:
          type: string
          format: binary
          description: The file to be uploaded and processed.
    steps:
      - stepId: uploadFile
        description: Uploads the file and receives a unique file ID.
        operationId: uploadFile
        parameters:
          - name: file
            in: body
            value: $inputs.file
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          fileId: $response.body.fileId
      - stepId: processFile
        description: Starts the processing of the uploaded file.
        operationId: processFile
        parameters:
          - name: fileId
            in: body
            value: $steps.uploadFile.outputs.fileId
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          status: $response.body.status
      - stepId: checkStatus
        description: Checks the current processing status of the file.
        operationId: getProcessingStatus
        parameters:
          - name: fileId
            in: path
            value: $steps.uploadFile.outputs.fileId
        successCriteria:
          - condition: $statusCode == 200
          - condition: $response.body.status == 'completed'
        outputs:
          status: $response.body.status
        onFailure:
          - name: retryCheckStatus
            type: retry
            stepId: checkStatus
            retryAfter: 5
            retryLimit: 10
            criteria:
              - condition: $response.body.status == 'processing'
          - name: terminateWorkflow
            type: end
            criteria:
              - condition: $response.body.status == 'failed'
Options for Describing Workflow Logic
The example has shown that Arazzo provides us with simple ways to represent logic within workflow definitions through onSuccess and onFailure. In addition to retry, branching to other steps or even other workflows is possible using goto.
In more complex scenarios, dependencies between different workflows can also be defined: the dependsOn attribute of a workflow description within workflows makes it possible to specify workflows that must be completed before the respective workflow can be executed. Accordingly, parameter values can also reference other workflows.
The ability to chain and align workflows makes the specification particularly powerful and provides a clear structure for automation processes and integration scenarios. This deterministic description can also benefit us when testing our APIs, since dependencies and branching are explicitly defined.
Outlook
The Arazzo specification is still very new. As such, not all requirements are met yet, nor are all conceivable scenarios represented. For instance, support for APIs other than synchronous HTTP APIs specified in OpenAPI almost made it into version 1.0.0 of the Arazzo standard. However, this was postponed in favor of releasing a stable first version. Event-driven APIs, nevertheless, are expected to find their way into the specification in a future iteration of the standard.
The clear goal is to achieve wide adoption of Arazzo for describing API workflows. The relevance is already there, as is the interest.
To truly bring the standard to life, appropriate tooling support is helpful—if not essential. Considering the broad adoption of the related OpenAPI specification, it is reasonable to assume that compatible tools will not take long to appear.
With Redocly, a linter for Arazzo is already available—both as a command-line tool and as a Visual Studio Code extension. This marks the first step toward adoption of the new standard: specification-compliant workflows can already be created, and further tools will follow. The Redocly roadmap already provides a preview of what’s to come.
If this has sparked your interest, you can find more information and a collection of examples in the Arazzo repository of the OpenAPI Initiative. And feel free to reach out to us to explore together the possibilities that arise with API workflows and Arazzo as an open, standardized way of defining them!




