Autogenerating TypeScript types and data validation for OpenAPI Schemas

Date: Fri Dec 15 2023

Tags: Node.JS »»»» TypeScript »»»» OpenAPI

OpenAPI lets one describe web API methods and data types in a language-neutral format. While we can manually write code matching the schema, automatic code generation is more agile. This article focuses on autogeneration of data types and data validation for TypeScript on Node.js.

TypeScript is a valuable addition to the JavaScript ecosystem. By adding types to JavaScript code, we can write more expressive code, and the compiler helps us catch certain classes of bugs before we run a line of code.

However, TypeScript does not check data validity at run time, so incorrect data arriving at run time still leads to errors. This can be fixed with runtime data verification. While TypeScript does not support runtime data verification, several packages do. For example, (www.npmjs.com) runtime-data-validation offers decorators for use with TypeScript classes that automatically run data verification on method calls or when setting property values. Other libraries, like Zod and Joi, provide a function to call which validates data against a schema description you've written.
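To make that gap concrete, here is a minimal sketch. The Reading type and the isReading guard are hypothetical names made up for this illustration, not part of any library. The cast compiles cleanly, yet nothing inspects the data until we check it ourselves at run time.

```typescript
// Hypothetical type, for illustration only.
interface Reading {
  sensor: string;
  value: number;
}

// The compiler trusts this cast; no check happens at run time.
const untrusted = JSON.parse('{"sensor": "meter-1", "value": "12.5"}') as Reading;

// A hand-written guard performs the check the type system cannot.
function isReading(data: unknown): data is Reading {
  if (typeof data !== "object" || data === null) return false;
  const candidate = data as Record<string, unknown>;
  return typeof candidate.sensor === "string"
      && typeof candidate.value === "number";
}

// The declared type says number, but the data holds a string.
console.log(typeof untrusted.value);
console.log(isReading(untrusted));
```

Writing such guards by hand for every type is exactly the chore that validation libraries, and the generators discussed below, are meant to remove.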

It is possible to automate creating the data validation schema. The goal is not only runtime validation, but to automatically keep the validation in sync with type declarations.

For example, we can use OpenAPI to describe object schemas, REST API methods, and the responses to those methods. The API specification has all the information for generating API implementation and type or class definitions. The OpenAPI specification becomes a single source of truth on APIs and object types from which one can generate a wide variety of software engineering artifacts (documentation, code, etc.).

While we can manually create that code by reading the specification, we'd be in the situation of having two sources of truth: the specification and the code. But, autogenerating code from the specification, using appropriate generator tools, preserves the API spec as the one source of truth.

Image by David Herron

In this article we'll explore a variety of tools that read OpenAPI specifications and autogenerate either TypeScript type definitions or data validation schemas using Zod and Joi.

The ideal result is that, every time we modify the OpenAPI specification, those parts of our application code are automatically rebuilt. The IDE environment immediately responds by telling us what application code must be refactored due to the specification change.

This article grew out of creating an npm package containing data types and validation code for a specification written in OpenAPI (OpenADR v3). In the process I tried several code generation tools for TypeScript on Node.js. The goal is solid data type and data validation code with which TypeScript programmers on Node.js can implement both OpenADR v3 servers (VTNs) and clients.

Therefore, the scope is:

  • Autogeneration of TypeScript types for OpenAPI schema objects
  • Autogeneration of both Joi and Zod validation code for the same

What follows is an evaluation of several tools for that purpose, tool usage, and a discussion of how to make use of the generated code.

There are packages for autogenerating both Joi and Zod schemas from OpenAPI specifications. Both are suitable for use in TypeScript applications, and support supplying the default values from the specification, if needed.

A sample OpenAPI object schema

We're not going to build an application, but instead evaluate some tools. To aid the evaluation, it will be helpful to have a sample object type, specified in OpenAPI, and to compare the generated code from the various tools.

# SOURCE: OpenADR v3 (version 3.0.1) specification
#   The source specification is under Apache 2 license
intervalPeriod:
  type: object
  description: |
    Defines temporal aspects of intervals.
    A duration of default null indicates infinity.
    A randomizeStart of default null indicates no randomization.
  required:
    - start
  properties:
    start:
      $ref: '#/components/schemas/dateTime'
      #  The start time of an interval or set of intervals.
    duration:
      $ref: '#/components/schemas/duration'
      #  The duration of an interval or set of intervals.
    randomizeStart:
      $ref: '#/components/schemas/duration'
      #  Indicates a randomization time that may be applied to start.
# ...
dateTime:
  type: string
  format: date-time
  description: datetime in ISO 8601 format
  example: 2023-06-15T09:30:00Z
  default: "0000-00-00"
duration:
  type: string
  pattern: '^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$'
  description: duration in ISO 8601 format
  example: PT1H
  default: PT0S

The IntervalPeriod object describes time intervals. A request to modify energy consumption or production often covers a time period, given as a start time and a duration. The randomizeStart value avoids overloading the electric grid when a period of energy curtailment ends, by randomizing when each device restarts. The fields rely on the ubiquitous ISO 8601 formats for date-time and duration strings.
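To get a feel for what the duration pattern accepts, here is a small sketch that runs it against a few strings. The pattern is copied from the schema above. PT1H and PT0S are the example and default from the spec, while the remaining samples are made up for this illustration.

```typescript
// Pattern copied from the duration schema above.
const durationPattern =
  /^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$/;

// The last four samples are illustrative, not taken from the specification.
const samples = ["PT1H", "PT0S", "P1D", "-P2W", "1 hour", "PT"];

for (const sample of samples) {
  // The first four match; "1 hour" and the bare "PT" do not.
  console.log(sample, durationPattern.test(sample));
}
```

Any validation code generated from the spec should accept and reject the same strings, which makes a list like this a handy sanity check when comparing tools.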

Generating TypeScript data validation code from OpenAPI with openapi-to-zod

The (www.npmjs.com) openapi-to-zod package is a wrapper around json-schema-to-zod. It extracts JSON Schema from an OpenAPI file, and from that generates Zod schemas in the form of JavaScript code.

The way to use this tool is:

$ npm install openapi-to-zod --save-dev
$ npx openapi-to-zod -x ts \
      -i ./path/to/openapi-spec.yml \
      -o ./path/to/src/zod/

The first line installs the tool into an npm-based project, adding it to the devDependencies section of the package.json. This should be done once for every project.

The second runs the tool to convert the OpenAPI spec into a directory full of source files containing Zod schema declarations (TypeScript files, given the -x ts option). In my project, it created the following files:

zod-dateTime.ts                zod-intervalPeriod.ts
zod-objectTypes.ts             zod-reportDescriptor.ts
zod-subscription.ts            zod-duration.ts
zod-interval.ts                zod-point.ts
zod-reportPayloadDescriptor.ts zod-valuesMap.ts
zod-eventPayloadDescriptor.ts  zod-notification.ts
zod-problem.ts                 zod-report.ts
zod-ven.ts                     zod-event.ts
zod-objectID.ts                zod-program.ts
zod-resource.ts

Each of these has a corresponding declaration in the components.schemas section of the OpenADR specification.

The individual files are in the ES module format making it easy to compile them with TypeScript. The -x ts option causes the file names to have the .ts extension.

The code generated for the IntervalPeriod object is:

import { z } from "zod";

export default z
  .object({
    start: z.string().datetime().describe("datetime in ISO 8601 format"),
    duration: z
      .string()
      .regex(
        new RegExp(
          "^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$"
        )
      )
      .describe("duration in ISO 8601 format")
      .default("PT0S"),
    randomizeStart: z
      .string()
      .regex(
        new RegExp(
          "^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$"
        )
      )
      .describe("duration in ISO 8601 format")
      .default("PT0S"),
  })
  .describe(
    "Defines temporal aspects of intervals.\nA duration of default null indicates infinity.\nA randomizeStart of default null indicates no randomization.\n"
  );

This schema declaration is straightforward and is a good translation from the specification. The object definitions for DateTime and Duration objects were pulled into this schema declaration, and not used by reference.

An application using these schema declarations needs two things for every object:

  1. A type, e.g. named IntervalPeriod
  2. A validator, e.g. named parseIntervalPeriod

In the directory above those generated files, I created an index.ts containing a series of declarations like this:

// Needed once at the top of index.ts, for z.infer
import { z } from 'zod';

import parseIntervalPeriod from './zod/zod-intervalPeriod.js';
export { default as parseIntervalPeriod } from './zod/zod-intervalPeriod.js';
export type IntervalPeriod = z.infer<typeof parseIntervalPeriod>;

The z.infer type helper extracts the TypeScript type from a Zod schema. To view the type, open the project in Visual Studio Code (which has TypeScript support installed) and hover the mouse over the type declaration. A popup window shows the generated type.
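Because the list of generated files changes along with the specification, the index.ts declarations could themselves be generated. The following is only a sketch: indexEntry is a hypothetical helper, and it assumes the ./zod/zod-<name>.js layout and the parse<Name> naming used above.

```typescript
// Hypothetical helper: builds the three index.ts lines for one schema.
// Assumes generated files are named ./zod/zod-<schemaName>.js.
function indexEntry(schemaName: string): string {
  const typeName = schemaName.charAt(0).toUpperCase() + schemaName.slice(1);
  const parser = `parse${typeName}`;
  const file = `./zod/zod-${schemaName}.js`;
  return [
    `import ${parser} from '${file}';`,
    `export { default as ${parser} } from '${file}';`,
    `export type ${typeName} = z.infer<typeof ${parser}>;`,
  ].join("\n");
}

// A few of the schema names from the generated file list.
const names = ["dateTime", "duration", "intervalPeriod"];
const indexSource = ["import { z } from 'zod';"]
  .concat(names.map(indexEntry))
  .join("\n\n");

console.log(indexSource);
```

A build script could write indexSource to index.ts after each run of the generator, so that the exports never drift from the generated files.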

This is a one-step process. With openapi-to-zod, we generate Zod schemas. From those we extract the TypeScript types.

It's cool to have one tool to serve both needs. But, there are some problems.

The first is that .optional() is not generated for some of the fields in the schema. This means editing the generated code to add that, or else editing the specification in a way which causes the tool to generate .optional().

The second I ran across is the code generated for regular expression patterns in the OpenAPI specification. The OpenADR spec currently has one object where slashes were put around the regular expression. The code generated in that case is:

.regex(
  new RegExp(
    "/^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$/"
  ))

That regular expression is accepted by JavaScript but can never match, because the literal slashes sit outside the ^ and $ anchors. As a result it does not match Duration strings as expected. The problem might be in the specification rather than the tool. But, couldn't the tool detect this case? The cure in this case is to modify the specification, and to recommend this change to the spec maintainer.
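The effect of the extra slashes is easy to reproduce. This sketch uses a shortened stand-in for the duration pattern, and stripRegexSlashes is a hypothetical cleanup function that a generator or a pre-processing step could apply. It is not something openapi-to-zod provides.

```typescript
// Shortened stand-in for the duration pattern; the slashes are the issue.
const wrapped = new RegExp("/^PT(\\d+)H$/");
const plain = new RegExp("^PT(\\d+)H$");

console.log(wrapped.test("PT1H"));    // false
console.log(wrapped.test("/PT1H/"));  // false: ^ cannot follow a consumed "/"
console.log(plain.test("PT1H"));      // true

// Hypothetical cleanup: drop one leading and one trailing slash, if both exist.
function stripRegexSlashes(pattern: string): string {
  const wrappedInSlashes = pattern.length >= 2
    && pattern.charAt(0) === "/"
    && pattern.charAt(pattern.length - 1) === "/";
  return wrappedInSlashes ? pattern.slice(1, -1) : pattern;
}

console.log(new RegExp(stripRegexSlashes("/^PT(\\d+)H$/")).test("PT1H"));  // true
```

Note that a pattern which legitimately begins and ends with a slash would be mangled by this cleanup, which may be why a tool would hesitate to apply it automatically.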

The third problem we can call code bloat. Since it dereferences all object schemas, every schema declaration pulls in all the code for all referenced schema objects, increasing the resulting code size.

However, as we see later with ts-to-zod, this results in correctly generating default values for missing fields.

The last problem has to do with not generating type declarations as text, and not capturing information about data formats or allowed values. A textual data type can be inspected for review when needed. An inferred Zod type cannot be inspected that way. Further, as we see later, JSDoc tags are useful in multiple ways for presenting data formats and allowed values. The inferred Zod type does not have any place for those tags.

Generating TypeScript type declarations from OpenAPI using openapi-typescript

The (www.npmjs.com) openapi-typescript package is an interesting tool for generating TypeScript declarations for not just the schemas, but also the REST API parameters.

Usage:

$ npx openapi-typescript ./path/to/spec.yaml \
    -o ./path/to/types.ts

What's generated are nested objects within which are type declarations. This means we do not have a clean type declaration, like type IntervalPeriod, but an object nested within a tree of objects that contains the type declaration.

For example, the definition for IntervalPeriod is referenced & defined like so:

components: {
  schemas: {
    // ...
    interval: {
      // ...
      intervalPeriod?: components["schemas"]["intervalPeriod"];
      // ...
    };
    /**
     * @description Defines temporal aspects of intervals.
     * A duration of default null indicates infinity.
     * A randomizeStart of default null indicates no randomization.
     */
    intervalPeriod: {
      start: components["schemas"]["dateTime"];
      duration?: components["schemas"]["duration"];
      randomizeStart?: components["schemas"]["duration"];
    };
    // ...
    /**
     * Format: date-time
     * @description datetime in ISO 8601 format
     * @example "2023-06-15T09:30:00.000Z"
     */
    dateTime: string;
    /**
     * @description duration in ISO 8601 format
     * @default PT0S
     * @example PT1H
     */
    duration: string;
    // ...
  }
}

The documentation for this package gives this example:

import { paths, components } from "./path/to/my/schema";

// Schema Obj
type MyType = components["schemas"]["MyType"];

// Path params
type EndpointParams = paths["/my/endpoint"]["parameters"];

Information about an API endpoint is stored at paths['/my/endpoint']. One of the items in that object is parameters. For example, the API endpoint "/vens/{venID}/resources/{resourceID}" has this parameters object:

parameters: {
  path: {
    /** @description object ID of the associated ven. */
    venID: components["schemas"]["objectID"];
    /** @description object ID of the resource. */
    resourceID: components["schemas"]["objectID"];
  };
};

The two path entries refer to the parameters you see in the API endpoint. Each has the type objectID, referenced from components["schemas"]. The parameters object can also have a query object, for query string parameters.
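The lookup syntax in these examples is ordinary TypeScript, known as indexed access types. As a self-contained sketch, the interface below is a hand-written stand-in for a small part of the generated file, not the actual generated output.

```typescript
// Hand-written stand-in for a small part of the generated declarations.
interface components {
  schemas: {
    dateTime: string;
    duration: string;
    intervalPeriod: {
      start: components["schemas"]["dateTime"];
      duration?: components["schemas"]["duration"];
      randomizeStart?: components["schemas"]["duration"];
    };
  };
}

// An indexed access type pulls a named alias out of the nested structure.
type IntervalPeriod = components["schemas"]["intervalPeriod"];

const period: IntervalPeriod = {
  start: "2023-06-15T09:30:00Z",
  duration: "PT1H",
};

console.log(period.start, period.duration);
```

The alias behaves like any other type for the compiler. What it does not carry along is the doc comment attached to the nested declaration.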

In other words, openapi-typescript extracts a lot of useful information from an OpenAPI spec, presenting that information as JavaScript/TypeScript code.

But, my attempt to use this as a Type failed:

export type IntervalPeriod
        = components["schemas"]["intervalPeriod"];

This is what the documentation suggests that I do. However, two problems arose. The first is that the Zod generator tool crashed with a weird error message. The second is that the generated code did not include JSDoc tags, which we'll see later are extremely useful.

Kicking the tires of openapi-client-axios-typegen

The (www.npmjs.com) openapi-client-axios-typegen is the core of the openapicmd typegen tool. Openapicmd is part of the OpenAPI Stack suite of tools.

The OpenAPI Stack collects together a very useful set of tools for developing applications around OpenAPI specifications. The Typegen feature tries to generate useful type declarations for use in writing TypeScript code.

$ npm install openapicmd --save-dev
$ npx openapicmd typegen ./path/to/spec.yaml \
      >./path/to/package-name.d.ts 

This command generates a TypeScript declaration file containing type declarations for the object schemas in the API spec.

But the glaring problem is that it loses track of important information such as string formats, allowed values, and the like.

To see what I mean, let's look at the IntervalPeriod, DateTime and Duration object definitions. The generated declarations are:

/**
 * datetime in ISO 8601 format
 * example:
 * 2023-06-15T09:30:00.000Z
 */
export type DateTime = string; // date-time
/**
 * duration in ISO 8601 format
 * example:
 * PT1H
 */
export type Duration = string; // ^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$
// ...
/**
 * Defines temporal aspects of intervals.
 * A duration of default null indicates infinity.
 * A randomizeStart of default null indicates no randomization.
 *
 */
export interface IntervalPeriod {
    start: /**
      * datetime in ISO 8601 format
      * example:
      * 2023-06-15T09:30:00.000Z
      */
    DateTime /* date-time */;
    duration?: /**
      * duration in ISO 8601 format
      * example:
      * PT1H
      */
    Duration /* ^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$ */;
    randomizeStart?: /**
      * duration in ISO 8601 format
      * example:
      * PT1H
      */
    Duration /* ^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$ */;
}
// ...

All TypeScript knows is that DateTime and Duration are strings. But, for each of these it is extremely important that they are further limited to the ISO 8601 date-time format or duration format. The comments in the source code are not something software can use to enforce these restrictions.

Therefore, this package is not useful for the purpose of runtime data validation. The type declarations are very useful. But, to have certainty that data fits the constraints of the API requires runtime data validation which knows the allowed values.

Another tool, (www.npmjs.com) dtsgenerator, produced nearly identical output.

Usage:

$ npm install dtsgenerator --save-dev
$ npx dtsgen -o openadr-301.d.ts \
      -t ESNEXT \
      ../oadr3.0.1.yaml

The .d.ts file contained type declarations for the schema objects. The comments showed that it had gathered information about specific formats and allowed values. But it did nothing useful with that information, and simply left it in comments.

Generating TypeScript types with openapi-codegen

The (www.npmjs.com) OpenAPI-CodeGen package generates code from OpenAPI specifications. It doesn't just generate type declarations, but also client libraries and React Query components.

Usage:

$ npm install @openapi-codegen/cli --save-dev
$ npx @openapi-codegen/cli init

The first command installs the package locally, and the second command interactively generates a configuration object.

The configuration object is a JavaScript source file describing what code generation tasks to perform. It can be generated using the init command, or you can edit the config file yourself. Unfortunately the documentation is weak about details of the config file.

The init command interactively queries you for the particulars you're interested in. The GitHub repository has an animated display showing what that's like. In my case, after running this command, I edited the config file to implement some other features, and ended up with this:

import {
  generateSchemaTypes,
  generateFetchers
} from "@openapi-codegen/typescript";
import {
  defineConfig
} from "@openapi-codegen/cli";
export default defineConfig({
  openADR: {
    from: {
      relativePath: "../oadr3.0.1.yaml",
      source: "file",
    },
    outputDir: "./foo",
    to: async (context) => {
      const { schemasFiles } = await generateSchemaTypes(context, {
        filenamePrefix: "openADR",
      });
      await generateFetchers(context, {
        /* config */
        schemasFiles,
      });
    },
  },
});

The string openADR is the name for the API, since it is derived from the OpenADR v3 specification. The from section says where to retrieve the OpenAPI specification, and the outputDir setting says where to place the generated source.

The to section is where we specify what to generate. The generateSchemaTypes function generates a file containing TypeScript types. Notice it returns a variable schemasFiles that is used later.

The generateFetchers function generates a client library for accessing the API.

Once you've created the configuration, use it like so:

$ npx openapi-codegen gen {namespace}

The namespace parameter appears to be the tag in the configuration file. It is also used in the names for generated files. Using the above configuration, these files are generated:

openadr3ApiComponents.ts  openadr3ApiFetcher.ts
openAdRSchemas.ts

The last file contains the TypeScript types. For example:

/**
 * Defines temporal aspects of intervals.
 * A duration of default null indicates infinity.
 * A randomizeStart of default null indicates no randomization.
 */
export type IntervalPeriod = {
  start: DateTime;
  duration?: Duration;
  randomizeStart?: Duration;
};

/**
 * datetime in ISO 8601 format
 *
 * @format date-time
 * @example "2023-06-15T09:30:00.000Z"
 * @default 0000-00-00
 */
export type DateTime = string;

/**
 * duration in ISO 8601 format
 *
 * @pattern '^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$'
 * @example PT1H
 * @default PT0S
 */
export type Duration = string;

This is almost exactly what we would write by hand, but it takes just a couple seconds to generate.

Notice that the comments contain JSDoc annotations describing the allowed format. This supports other attributes such as minimum/maximum string length, and minimum/maximum values.

The file openadr3ApiComponents.ts contains definitions for the API operations defined in the OpenAPI specification. This includes type definitions for the parameters passed through the API request, definitions of the return values from the request, and functions for invoking the REST call. The file openadr3ApiFetcher.ts contains support code for these functions. Together they comprise REST client code, as implied by the Fetcher name.

That the @openapi-codegen/cli package adds JSDoc tags is very important. They will be used for generating the Zod schema code in the next section, but there are many tools which read these tags for many purposes. For example, VSCode can use them while writing code, and of course JSDoc uses them in generating documentation.

Generating Zod code using ts-to-zod

The (www.npmjs.com) ts-to-zod package generates very nice Zod code from TypeScript type declarations. You simply give it a source file containing types, and it generates a file containing Zod schema code. Additionally, ts-to-zod looks at JSDoc tags while generating Zod code, which is why the JSDoc tags emitted by @openapi-codegen/cli, shown in the previous section, matter.

Like openapi-to-zod it generates Zod schemas, but it does not work from an OpenAPI specification. Instead, ts-to-zod works from TypeScript types and can use JSDoc tags. Further, the ts-to-zod package does not generate bloated code.

For example, the generated schema for IntervalPeriod is:

export const dateTimeSchema = z.string().datetime();

export const durationSchema = z
  .string()
  .regex(
    /^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$/,
  );

export const intervalPeriodSchema = z.object({
  start: dateTimeSchema,
  duration: durationSchema.optional(),
  randomizeStart: durationSchema.optional(),
});

That is, a schema containing a field for an existing object type will reuse the schema for that type.

But, we've gotten ahead of ourselves. Let's take a step back to see how ts-to-zod is used.

$ npm install ts-to-zod --save-dev
$ npx ts-to-zod ./path/to/types.ts ./path/to/zod-types.ts

The file types.ts is a TypeScript file containing type declarations. The tool supports reading JSDoc annotations so that it can generate the correct Zod schema based on all the constraints.

But, how is types.ts to be generated?

It's possible to manually restructure the output from openapi-typescript or openapicmd typegen to construct a file with clean type declarations. But, that is tedious, and would have to be redone every time you modify the specification.

There's another tool, @openapi-codegen/cli, which does a much better job and is a great fit for ts-to-zod. Refer back to the previous section to see what it does.

That makes the workflow for generating both TypeScript types and Zod schemas the following:

$ npx openapi-codegen gen {namespace}
$ npx ts-to-zod ./path/to/types.ts ./path/to/zod-types.ts

The first step generates good-looking TypeScript types, while the second generates good-looking Zod schemas. In both cases the generated code is nearly identical to what an experienced coder would create by hand, but the result is generated quickly enough that it could be done every time we save the OpenAPI spec file.

There is, unfortunately, a very serious problem with the schema objects generated by ts-to-zod. It is not generating the default values for input objects lacking a field.

Refer to the schema shown above. In intervalPeriodSchema, it refers to durationSchema with the .optional() function.

Example of expected and actual behavior:

const period = intervalPeriodSchema.parse({
  start: '2023-11-30T10:20:30Z',
  duration: 'PT3M'
});

// Expected value
// {
//    start: '2023-11-30T10:20:30Z',
//    duration: 'PT3M',
//    randomizeStart: 'PT0S'
// }
//
// Actual value
// {
//    start: '2023-11-30T10:20:30Z',
//    duration: 'PT3M'
// }

The default value in this case is contained within the nested schema object. Thinking about this, it appears the nested schema is not invoked for a missing field when it has .optional() attached, so its default is never applied.

Refer back to the schema generated by openapi-to-zod. With the same test case, it produces the expected object shown here. The difference between the two is whether the subsidiary objects are parsed with a subsidiary schema, or not.
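The contrast can be reproduced with two small schemas. This is a simplified sketch that assumes zod is installed. A plain z.string() stands in for the duration pattern, and the two objects mirror only the shape of the two generated outputs shown earlier: a default attached directly to the field, versus a shared schema marked optional with no default.

```typescript
import { z } from "zod";

// Simplified: a plain string stands in for the duration pattern.
const durationSchema = z.string();

// Shape of the openapi-to-zod output: default attached to the field itself.
const inlined = z.object({
  start: z.string(),
  randomizeStart: z.string().default("PT0S"),
});

// Shape of the ts-to-zod output: shared schema marked optional, no default.
const referenced = z.object({
  start: z.string(),
  randomizeStart: durationSchema.optional(),
});

const input = { start: "2023-11-30T10:20:30Z" };

// The first result carries randomizeStart: "PT0S"; the second leaves it unset.
console.log(inlined.parse(input));
console.log(referenced.parse(input));
```

If the default matters to your application, one option is to post-process the ts-to-zod output so the field-level schema carries the default, though that reintroduces hand edits to generated code.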

Joifully generating Joi schemas from OpenAPI

So far data validation has focused on Zod schemas. There is a package, (www.npmjs.com) @savotije/openapi-to-joi which generates Joi validation schemas from OpenAPI specifications.

It is relatively easy to use, and produces output similar to openapi-typescript. Namely, a TypeScript file is produced containing nested objects which in turn contain Joi schemas.

Usage:

$ npm install joi --save
$ npm install @types/hapi__joi --save-dev
$ npm install @savotije/openapi-to-joi --save
$ npm install joi-iso-datestring --save
$ npx openapi-to-joi \
    path/to/openapi-spec.yml \
    -o path/to/output-file.ts

Installing @types/hapi__joi alongside joi was the magic cure for several problems.

There are three forks of the openapi-to-joi package. The one maintained by @savotije is the most up-to-date, but even it has some problems which are detailed below.

The joi-iso-datestring package is the cure for one of those problems.

The output file is structured like so:

import Joi from "joi";

export const schemas = {
  parameters: {
    // One object per Operation listed
    // in the specification
    // Each contains Joi schemas for
    // path, query, header, and cookie schemas
  },
  components: {
    // One object per schema object
    // Each contains the Joi schema
    // corresponding to the object schema
  }
};

For example, this is the definition of the IntervalPeriod object.

import Joi from "joi";

export const schemas = {
  parameters: {
    // ...
  },
  components: {
    // ...
    intervalPeriod: Joi.object({
      start: Joi.date().description("datetime in ISO 8601 format").required(),
      duration: Joi.string()
        .allow("")
        .default("PT0S")
        .description("duration in ISO 8601 format")
        .pattern(
          /^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$/,
          {}
        )
        .min(0),
      randomizeStart: Joi.string()
        .allow("")
        .default("PT0S")
        .description("duration in ISO 8601 format")
        .pattern(
          /^(-?)P(?=\d|T\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)([DW]))?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$/,
          {}
        )
        .min(0),
    })
      .description(
        "Defines temporal aspects of intervals.\nA duration of default null indicates infinity.\nA randomizeStart of default null indicates no randomization.\n"
      )
      .unknown(),
    // ...
  }
};

This is a useful and straightforward schema object. Notice that it does not reference the DateTime and Duration schemas, but directly pulls them into the IntervalPeriod schema.

After studying this result, I recommend creating another TypeScript file - such as index.ts that's the main entry to your package - containing this pattern:

import { schemas } from './joi/oadr3.js';

export const joiDateTime = schemas.components.dateTime;
export const joiDuration = schemas.components.duration;

export const joiEvent = schemas.components.event;

export const joiIntervalPeriod
        = schemas.components.intervalPeriod;
// ...

This makes it easier for your code to make use of the Joi schemas. They're referenced using a name like joiEvent which is far more expressive than schemas.components.event.

Using Joi schemas in your code follows this pattern:

const result = joiEvent.validate({
    // Event object
});
if (result.error) {
    // error handling
}

The Joi validate function does not throw exceptions. Instead, it returns an object with error and value fields. If error is undefined, no error was detected, and value contains the validated object.

The error object is a Joi ValidationError object and contains a lot of useful information. As an Error subclass, it can be thrown as an exception.
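For code that prefers exceptions, the error/value result can be wrapped in a small helper. This is a sketch only: validateOrThrow and the Validator interface are hypothetical names, and a stub object stands in for a Joi schema so the example runs without Joi installed. A real Joi schema should fit the same shape, though that is an assumption not verified here.

```typescript
// Minimal shape of what validate() returns, per the description above.
interface ValidationOutcome<T> {
  error?: Error;
  value: T;
}
interface Validator<T> {
  validate(input: unknown): ValidationOutcome<T>;
}

// Hypothetical helper: returns the validated value or throws the error.
function validateOrThrow<T>(schema: Validator<T>, input: unknown): T {
  const { error, value } = schema.validate(input);
  if (error) throw error;
  return value;
}

// Stub standing in for a Joi schema such as joiEvent.
const stubEventSchema: Validator<{ id: string }> = {
  validate(input) {
    const id = (input as { id?: unknown } | null)?.id;
    return typeof id === "string"
      ? { value: { id } }
      : { error: new Error('"id" must be a string'), value: input as { id: string } };
  },
};

console.log(validateOrThrow(stubEventSchema, { id: "event-1" }));
```

Whether to throw or to return the error is a design choice. In a REST server, throwing lets a single error handler turn validation failures into 400 responses.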

All the above looks good. However, there is a problem with this rendering of the dateTime schema object:

export const schemas = {
  // ...
  components: {
    // ...
    dateTime: Joi.date()
      .description("datetime in ISO 8601 format"),
    // ...
  }
};

The schema definition for dateTime is:

# SOURCE: OpenADR v3 (version 3.0.1) specification
#   The source specification is under Apache 2 license
components:
  schemas:
    dateTime:
      type: string
      format: date-time
      description: datetime in ISO 8601 format
      example: 2023-06-15T09:30:00Z
      # default: "0000-00-00"

The schema definition clearly says it is a string matching the date-time format. But, Joi.date() converts such strings to a Date object. Indeed, testing this schema definition results in a Date object. (github.com) See the GitHub issue.
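Why does a Date matter when the specification says string? The sketch below uses a plain JavaScript Date as a stand-in for the value Joi.date() hands back. Joi itself is not used here. It shows that the value is no longer a string, and that serializing it again does not reproduce the original text.

```typescript
// Plain Date stands in for what Joi.date() returns; Joi is not used here.
const original = "2023-06-15T09:30:00Z";
const converted = new Date(original);

console.log(typeof original);            // "string"
console.log(converted instanceof Date);  // true

// Serializing again adds milliseconds, so the text no longer matches.
console.log(converted.toISOString());               // "2023-06-15T09:30:00.000Z"
console.log(converted.toISOString() === original);  // false
```

Code typed against DateTime = string would therefore receive a value of the wrong type after validation, with nothing in the compiler to flag it.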

To support fixing this, the Joi extension (www.npmjs.com) joi-iso-datestring has been created. It correctly validates more ISO 8601 date/time formats than does Joi.date(), plus it returns a string rather than a Date.

Unfortunately, openapi-to-joi does not offer a way to override or customize the generated code. One could fork that project to change the generated code. Or, one can patch the generated code.

To use joi-iso-datestring, first install it in your package. Then add this to the code generated by openapi-to-joi:

import _Joi, { Extension } from "joi";
import {
    isoDate, isoDateTime, isoTime, isoYearMonth
} from 'joi-iso-datestring';

const Joi = _Joi.extend(isoDate as unknown as Extension)
                .extend(isoDateTime)
                .extend(isoTime)
                .extend(isoYearMonth);

The joi-iso-datestring package contains four extensions to Joi as shown here. They handle validating Date, Time, DateTime, and YearMonth formats. The Joi.extend function returns the extended Joi, and the most straightforward method to apply the extensions is as shown here.

The cast on the first .extend call (as unknown as Extension) is needed because TypeScript threw a strange error during compilation:

error TS2345: Argument of type '(joi: Root) => Extension | ExtensionFactory' is not assignable to parameter of type 'Extension | ExtensionFactory'

The error message is fairly clear. The isoDate object is declared as an arrow function which apparently is not the declared signature for the .extend call. But, the documentation clearly says to declare Joi extensions in the way isoDate is declared, and to use the extension in this way. Further, for the other three extensions this error is not thrown.

The cure shown here came from a (stackoverflow.com) StackOverflow discussion.

To put joi-iso-datestring to use requires modifying the generated code. For every instance of either Joi.string().date() or Joi.date(), convert to the following:

  // ...
  createdDateTime: Joi.isoDateTime().description("datetime in ISO 8601 format"),
  modificationDateTime: Joi.isoDateTime().description(
        "datetime in ISO 8601 format"
      ),
  // ...
  dateTime: Joi.isoDateTime().description("datetime in ISO 8601 format"),
  // ...

The question is, how to automate these changes.

Starting point:

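sed --in-place 's/Joi\.date()/Joi.isoDateTime()/g' ../../openadr/openadr-3-ts-types/package/src/joi/oadr3.ts

The venerable sed program can modify the source code. Note that in sed's default (basic) regular expression syntax, plain parentheses match literally, while \( and \) form a group. The g flag replaces every occurrence on a line.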
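For projects that would rather keep the build in Node.js, the same substitution can be sketched in TypeScript. The function name patchJoiDates is hypothetical, and the sample text is modeled on the generated code shown above. It handles both forms mentioned earlier, Joi.string().date() and Joi.date().

```typescript
// Hypothetical post-processing step for the generated Joi source text.
// In a JavaScript regular expression, \( and \) match literal parentheses.
function patchJoiDates(source: string): string {
  return source
    .replace(/Joi\.string\(\)\.date\(\)/g, "Joi.isoDateTime()")
    .replace(/Joi\.date\(\)/g, "Joi.isoDateTime()");
}

// Sample modeled on the generated code shown earlier.
const generated = [
  'start: Joi.date().description("datetime in ISO 8601 format").required(),',
  "dateTime: Joi.date()",
  '  .description("datetime in ISO 8601 format"),',
].join("\n");

console.log(patchJoiDates(generated));
```

In a build script, the source text would be read from the generated file, passed through this function, and written back, right after openapi-to-joi runs.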

Another issue which arose is about stripping unknown values, or failing if unknown values are present. Remember that some APIs want strict adherence to the specification, while other APIs are very loose. This means the policy to follow is up to the application.

With Joi, there are two ways to specify these options:

// Option 1: attach the preferences to the schema itself
const joiSchema = Joi.object({
        // define schema
   })
   .prefs({
     allowUnknown: false,
     stripUnknown: true,
   });

const result = joiSchema.validate(object);

// OR

// Option 2: pass the preferences on each validate call
const joiSchema = Joi.object({
        // define schema
   });

const result = joiSchema.validate(object, {
     allowUnknown: false,
     stripUnknown: true,
   });

There are several options to the validate function, and the same options can be added to a schema using .prefs. The allowUnknown option says whether to allow unknown fields, while the stripUnknown option says whether to strip out such fields.

These are roughly equivalent to the Zod passthrough and strict options.

My initial test showed that these options did nothing. That was confusing, especially as a small test program worked perfectly. But, after installing the @types/hapi__joi package, these options began to work as advertised.

Here's a little program for exploring the Joi validate options:

// Source: https://github.com/hapijs/joi/issues/2735
// This file is for exploring the
// stripUnknown and allowUnknown options.
// The second should throw errors on
// unknown data.
// The first should strip them.

// In this example, both work correctly.

import Joi from 'joi';

const schema = Joi.object({
    foo: Joi.string(),
    items: Joi.array().items(Joi.object({
        bar: Joi.string(),
    })),
})
// .prefs({
//     allowUnknown: false,
//     stripUnknown: true,
// })

const options = {
    // abortEarly: false,
    // allowUnknown: false,
    stripUnknown: true,
};

const input = {
    foo:  'five', // 5, // invalid value
    unknownValue: 'strip',
    items: [
        {
            bar: 'bar five', // 5, // invalid value
            stripValue: 'strip'
        },
    ],
}

const { error, value } = schema
          .validate(input, options);
console.log(error);
console.log(value);

// With the input and options shown above, this prints:
// undefined
// { foo: 'five', items: [ { bar: 'bar five' } ] }

You can save this with a .mjs file extension to run it with Node.js without having to recompile it with TypeScript on every change. To explore, comment or uncomment lines, change values, and add or remove fields in input. Then, see what errors and output are generated.

The last issue with openapi-to-joi is incorrectly generated code where a schema has a field that uses oneOf to support several field types. What I mean is this:

    # SOURCE: OpenADR v3 (version 3.0.1) specification
    #   The source specification is under Apache 2 license
    # ...
    notification:
      type: object
      # ...
      properties:
        # ...
        object:
          type: object
          description: the object that is the subject of the notification.
          example: {}
          oneOf:
            - $ref: '#/components/schemas/program'
            - $ref: '#/components/schemas/report'
            - $ref: '#/components/schemas/event'
            - $ref: '#/components/schemas/subscription'
            - $ref: '#/components/schemas/ven'
            - $ref: '#/components/schemas/resource'
          discriminator:
            propertyName: objectType
        # ...

The oneOf feature is an important part of OpenAPI. It allows us to store several different types of objects in a given field. In this particular case the notification object is used to send notifications to client programs, and the object field contains the object for which the notification is sent.

Each type referenced with $ref has a field named objectType containing an enumerated value describing what kind of object it is. That way an application can quickly determine which kind of object is stored in this field.
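In TypeScript terms this is a discriminated union. The following is a minimal sketch, assuming simplified object shapes. The type names, the fields other than objectType, and the 'PROGRAM' and 'EVENT' literal values are hypothetical stand-ins rather than copies from the OpenADR specification.

```typescript
// Simplified, hypothetical stand-ins for two of the schemas listed under oneOf.
interface ProgramObject {
    objectType: 'PROGRAM';
    programName: string;
}

interface EventObject {
    objectType: 'EVENT';
    eventName: string;
}

type NotificationSubject = ProgramObject | EventObject;

// Checking objectType narrows the union, so each branch sees the right fields.
function describeSubject(subject: NotificationSubject): string {
    switch (subject.objectType) {
        case 'PROGRAM':
            return `program ${subject.programName}`;
        case 'EVENT':
            return `event ${subject.eventName}`;
    }
}

console.log(describeSubject({ objectType: 'EVENT', eventName: 'peak-load' }));
// prints "event peak-load"
```

The compiler rejects access to eventName inside the 'PROGRAM' branch, which is the compile-time counterpart of what the discriminator does for a validator at run time.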

If you know Joi, you know about the Joi.alternatives().match('one').try(...) feature. This allows a Joi schema to match one of the several object types listed within the try(...) method.

The problem is that openapi-to-joi instead generates this structure:

      object: Joi.alternatives()
        .match("all")
        .try(
          Joi.object({})
            .description("the object that is the subject of the notification.")
            .unknown(),
          Joi.alternatives()
            .match("one")
            .try(
              // ...
            )
        )

Then, when using this schema to validate an object, an error is generated: "object" does not match all of the required types

The cure is to modify the generated code, as I discuss in this bug filing: (github.com) Incorrect code generated for schema with oneOf including several objects.

The tally is that we have two classes of modification to make to the generated code. The first addresses ISO date strings becoming Date objects; one cure is to use the joi-iso-datestring extension and modify the generated code to use isoDateTime(). The second addresses the oneOf constructs, which also requires modifying the generated code.

While openapi-to-joi is an excellent tool, and Joi has some advantages over Zod in data validation, having to manually verify generated code is a bummer.

Issues

There is a pair of issues which haven't been discussed so far.

Unfortunate treatment of default null and default "0000-00-00"

Two declarations in the OpenAPI spec I'm working with ended up as bad code in the Zod schemas.

In the first case, the DateTime schema includes default: 0000-00-00, which seems meant to be a dummy value shaped like a date string. But, in the generated TypeScript code notice this:

* @default 0000-00-00

It's not a quoted string, and could accidentally be treated as a number. In the generated Zod schema this turned into default(0), which is a number rather than the expected string. I found it best to remove this default. Coincidentally the specification maintainer independently removed that default as well.

In the second case many fields contain default: null. This turned into default("null") in the Zod schema, which obviously has a completely different meaning. Fixing this required editing the Zod schema to change it to default(null).
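That edit can be scripted rather than made by hand. The following is a minimal sketch, assuming the generated file contains the literal text .default("null"). The function name fixNullDefaults is hypothetical, and the sample line is made up for illustration. It would also rewrite a field whose default is intentionally the string "null", so the result should be reviewed.

```typescript
// Hypothetical helper: turn the string default "null" back into a real null
// in generated Zod source text.
function fixNullDefaults(source: string): string {
    // Match .default("null") or .default('null'), tolerating inner whitespace.
    return source.replace(/\.default\(\s*(["'])null\1\s*\)/g, '.default(null)');
}

// A made-up line in the style of a generated Zod schema.
const generatedZod = 'description: z.string().nullable().default("null"),';

console.log(fixNullDefaults(generatedZod));
// prints: description: z.string().nullable().default(null),
```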

Zod's .passthrough() is shallow

This issue is not about the generated code, but how Zod treats the .passthrough() function. This function tells Zod to not strip extra values in objects, and instead to pass them through.

What's at stake is how to handle extra data included in an object. The JavaScript duck-typing policy (if it looks like a duck, it's a duck) means additional data items are okay, because some ducks have laser cannons. But whether this is desired actually depends on your application.

This is the closest image that Stable Diffusion would produce for the prompt 'Mallard duck carrying laser canon'

Some API specifications explicitly allow implementors to extend the objects. That's the policy for OpenADR, to allow the API to be used in contexts we didn't consider in the working group.

Zod's default behavior when parsing/validating an object is to strip out data that's not specified by the schema. That's useful for many applications to increase code stability. But, as just said, it's just as useful in other applications to leave that extra data in place.

The .passthrough() method tells Zod to not strip out that data.

With the generated Zod schema it is easily used like so:

const period = parseIntervalPeriod
                .passthrough().parse(data);

But, the .passthrough() function only affects the top-level items in the data object. It does not affect nested objects. If your application must pass additional values in a nested object, those values will not make it through the Zod parsing phase.

As it stands, to use Zod for validation and to allow extensions, you must also use a privately modified version of the specification. Your package of data types and validation code must be generated from the modified specification, and then distributed to all applications using the modified API.

Joi's allowUnknown and stripUnknown are deep

In contrast to the shallow impact of Zod's passthrough and strict functions, these two Joi options have a deep effect. You can readily see this with the test program shown earlier.

You apply the options at the top level, either in the schema definition with a .prefs method call, or as the options parameter to validate. The impact is readily seen at every level of the object being validated.
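The difference between a shallow and a deep policy can be illustrated without either library. The following is a plain TypeScript model, not the implementation used by Zod or Joi. The Shape type and the parseWithPolicy function are hypothetical. The shallow variant mirrors the behavior described above for .passthrough(), where only top-level extras survive, and the deep variant mirrors a policy applied at every level.

```typescript
// A Shape lists the known keys: `true` marks a known leaf value,
// a nested Shape describes a known nested object.
type Shape = { [key: string]: Shape | true };
type Data = { [key: string]: unknown };

function isData(value: unknown): value is Data {
    return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// keepUnknown is the policy for the current level.
// When deep is false, nested levels fall back to stripping unknown keys.
// Arrays are copied through untouched to keep the sketch short.
function parseWithPolicy(data: Data, shape: Shape, keepUnknown: boolean, deep: boolean): Data {
    const out: Data = {};
    for (const key of Object.keys(data)) {
        const known = shape[key];
        const value = data[key];
        if (known === undefined) {
            if (keepUnknown) out[key] = value;   // unknown key: keep or strip
        } else if (known !== true && isData(value)) {
            out[key] = parseWithPolicy(value, known, deep ? keepUnknown : false, deep);
        } else {
            out[key] = value;                    // known leaf value
        }
    }
    return out;
}

const sampleShape: Shape = { id: true, interval: { start: true } };
const samplePayload: Data = { id: 1, extra: 'a', interval: { start: 'x', extra: 'b' } };

// Shallow: the nested "extra" is lost.
console.log(JSON.stringify(parseWithPolicy(samplePayload, sampleShape, true, false)));
// prints {"id":1,"extra":"a","interval":{"start":"x"}}

// Deep: the nested "extra" survives.
console.log(JSON.stringify(parseWithPolicy(samplePayload, sampleShape, true, true)));
// prints {"id":1,"extra":"a","interval":{"start":"x","extra":"b"}}
```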

Honorable Mentions

(github.com) Swagger Codegen -- Swagger is the project from which OpenAPI was originally derived. Swagger Codegen is software written in Java that can generate code from OpenAPI specifications, targeting several programming languages. Unfortunately it does not support OpenAPI 3.1, and I was not able to generate code from the OpenADR specification.

One option is to use the Docker container like so:

$ docker run --rm -v  ./docker:/local \
    swaggerapi/swagger-codegen-cli \
    generate \
    -i /local/oadr3.0.1.yaml \
    -l nodejs-server \
    -o /local/js

This requires a host machine directory, which I named docker, that mounts into the Docker container as /local. Hence the swagger-codegen command reads the specification from /local and writes the generated code to /local/js. But, this failed with a long list of errors.

(openapi-generator.tech) OpenAPI Generator -- This project was forked from Swagger Codegen, which is why the usage is very similar. But, it has more capabilities than the Swagger tool.

Unfortunately it cannot generate any code from the OpenADR specification.

$ docker run --rm -v  ./docker:/local \
    openapitools/openapi-generator-cli \
    generate \
    -g markdown \
    -i /local/oadr3.0.1.yaml \
    -o /local/js

The -g option selects which of the available generators to use. The list command shows the available generators.

I tried the typescript-node (client using TypeScript), javascript (client in JavaScript), nodejs-express-server (server code for ExpressJS), and markdown (documentation) generators. Every one gave these errors:

-attribute components.schemas.program.default is not of type `array`
-attribute components.schemas.ven.default is not of type `array`
-attribute components.schemas.event.default is not of type `array`
-attribute components.schemas.reportDescriptor.default is not of type `array`
-attribute components.schemas.report.default is not of type `array`
-attribute components.schemas.notification.default is not of type `array`
-attribute components.schemas.subscription.default is not of type `array`

These errors do not make sense, since none of those schemas has a default attribute.

(github.com) AutoRest -- This package promises to build code for REST APIs specified with OpenAPI. It looks like it can generate code for a wide variety of languages.

It describes itself as a code generation framework for converting OpenAPI 2.0 and 3.0 specifications into client libraries for the services described by those specifications. That does not make me trust it can do so with a specification using OpenAPI 3.1.

Summary

An important leg of building stable, secure software systems is runtime data validation. Checking that the data in your application is what your code expects reduces the risk of failure or mistakes. Having correct, validated data eliminates a large body of possible failures.

OpenAPI specifications contain the information to drive runtime data validation. The best approach is to automatically generate that code from the specification.

The exploration described here found four tools for automating generation of TypeScript type declarations and runtime data validation (using Zod or Joi). These are:

  1. openapi-to-zod - Generates Zod schemas directly from OpenAPI.
  2. @openapi-codegen/cli - Generates type declarations.
  3. ts-to-zod - Generates Zod schemas from type declarations. But, it has a fatal flaw in that default values in nested schemas are ignored.
  4. openapi-to-joi - Generates Joi schemas directly from OpenAPI.

As a result the recommended procedure is to use openapi-to-zod to generate Zod schemas, then use @openapi-codegen/cli to generate TypeScript type declarations. The latter will have JSDoc annotations.

Another route is to still generate TypeScript types with @openapi-codegen/cli, and to generate Joi schemas with openapi-to-joi.

Having the JSDoc tags as part of the type declarations adds a lot of value. Those tags can be used in generating the Zod or Joi schema validators, and so much more. VSCode knows how to inspect and use those tags, for example.

May your software contain fewer errors.

About the Author(s)

(davidherron.com) David Herron : David Herron is a writer and software engineer focusing on the wise use of technology. He is especially interested in clean energy technologies like solar power, wind power, and electric cars. David worked for nearly 30 years in Silicon Valley on software ranging from electronic mail systems, to video streaming, to the Java programming language, and has published several books on Node.js programming and electric vehicles.
