Runtime data validation in TypeScript using decorators and reflection metadata

Date: Tue Feb 22 2022

Tags: TypeScript

TypeScript decorators allow us to intercept calls to both accessor functions and methods. That lets us spy on the data passed to either kind of function, or even supply default values for any that are missing. One practical use is automatic validation of data arriving at an accessor or method. By instrumenting these functions, a validation package can act as a gatekeeper, ensuring that data in objects is always correct.

In this article we'll explore the implementation of automatic data validation using TypeScript decorators. The result will be objects where assigning data to properties, or invoking methods, validates the data and throws an exception if an invalid value is provided. TypeScript brings compile-time type checking to JavaScript, but provides no such checking at runtime.

We know there's an objection many will raise. Namely, the common wisdom is that runtime data validation causes bloat, due to additional code, and slows execution, because every data access is validated. Let's address that concern at the beginning.

It's a valid concern, obviously. Data validation requires executing additional code, to check data values, requiring more memory for the code, and more execution time. But, consider three things:

  1. Defensive coding means writing the input side of a function to correctly deal with any bad or implausible data.
  2. Type Guard functions are recommended by the TypeScript team as a means of implementing defensive coding. It's recommended to create these functions for every type, using them to check incoming data.
  3. By choosing critical paths to instrument with data validation, we can minimize the overhead.

A type guard function is an example of the overhead, or bloat, that we're warned about. By developing a type guard function, we can validate data before using it, which should reduce bugs and increase stability in our applications. But, each type guard adds to the amount of code to maintain, the memory size of the running process, and the execution time.

That right there is the tradeoff just mentioned. The type guard function adds to code size and execution time, while improving application quality.
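
For concreteness, here is a minimal sketch of a type guard and its defensive use at an input boundary. The CarInfo shape and both function names are made up for this illustration and do not come from any package.

```typescript
// Hypothetical shape used only for this illustration.
interface CarInfo {
    make: string;
    year: number;
}

// Type guard: returns true only when `value` has the expected shape,
// and narrows its static type to CarInfo for the caller.
function isCarInfo(value: unknown): value is CarInfo {
    if (typeof value !== 'object' || value === null) return false;
    const v = value as { make?: unknown; year?: unknown };
    return typeof v.make === 'string'
        && typeof v.year === 'number'
        && isFinite(v.year);
}

// Defensive use: refuse to proceed with data that fails the guard.
function describeCar(input: unknown): string {
    if (!isCarInfo(input)) {
        throw new Error('Invalid car data');
    }
    return `${input.year} ${input.make}`;
}
```

The guard is a few extra lines that must be maintained and executed on every call, which is exactly the cost being weighed against the bugs it prevents.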

The overhead can be lessened by intelligently choosing which code paths will, or will not, use data validation.

The benefits of defensive coding

I first heard of the defensive coding concept in the classic programming book, Elements of Programming Style by Kernighan and Plauger. Kernighan, in this case, is the same Kernighan who helped to develop the C programming language and the UNIX operating system. Chapter 5, Input and Output, begins with a news article describing how a city employee in Woonsocket, RI, made a "keypunching" (a.k.a. data entry) mistake that cost the city $290,000 in tax revenue. In that timeframe, 1972, data was entered by keypunch on punched cards, which of course is a distant memory most of you did not experience. The real issue is detecting data entry errors, and recognizing that data entry problems are just as possible today as then.

How many of us have written an application that did the wrong thing because we failed to validate inputs?

What is the potential cost to ourselves or our business (hence to our job) if our application does an insane thing due to a failure to validate inputs?

To put that in perspective, see https://xkcd.com/327/ (xkcd.com). Bobby Tables, they call him.

Using defensive coding, checking and validating input data before using it, is a very good idea. Type guard functions are an excellent tool for defensive coding. The guard function can be written once and used a thousand times, versus reimplementing data validation everywhere. The drawback is that developers must remember to invoke type guards everywhere they are required. A developer might forget to use type guards on a critical path, again risking insane behavior due to a failure to validate data.

Consider a class

class CarLicense {
    #make: string;

    set make(nt: string) { this.#make = nt; }
    get make() { return this.#make; }

    #model: string;

    set model(nm: string) { this.#model = nm; }
    get model() { return this.#model; }

    #year: number;

    set year(ny: number) { this.#year = ny; }
    get year() { return this.#year; }

    #vin: string;

    set vin(nv: string) { this.#vin = nv; }
    get vin() { return this.#vin; }

    #license: string;

    set license(nl: string) { this.#license = nl; }
    get license() { return this.#license; }
}

These are some important attributes of a car. It is important that the car be correctly identified in all attributes. Most of these attributes have known values - manufacturer names, model names, VIN number formats, and license plate formats. It is easy to verify these things. But the code as it stands does not validate anything. It means an instance of this class could be holding incorrect data, and the application cannot know.

My suggestion is to install data validation on the set side of each of these attributes. Using decorator functions we can reliably override the set method, and therefore know that every value assigned to these fields will be correct. Because the data is stored in a JavaScript private field, the only avenue for giving it a new value is through the set method. Therefore, validation installed on that method will ensure that only correct, validated data is stored.

In other words, what if validation could be as simple as adding corresponding decorators to the accessor functions? A decorator like @IsIn([ 'Ford', 'GM', 'Chevy', 'Tesla', ... ]) could validate a car manufacturer name to a list of known names, or the model year might be validated using @IsIntRange(1920, 2030).
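
To see what such decorators would save us, here is a sketch of the same two checks written by hand inside the setters. The class name and the manufacturer list are hypothetical, and ordinary private members are used instead of # fields only to keep the sketch compatible with older compiler targets.

```typescript
// Hypothetical list of accepted manufacturer names.
const KNOWN_MAKES = ['Ford', 'GM', 'Chevy', 'Tesla'];

class ManualCarLicense {
    private _make = '';
    private _year = 0;

    set make(nm: string) {
        // Hand-written equivalent of a list-membership check
        if (KNOWN_MAKES.indexOf(nm) < 0) {
            throw new Error(`Value '${nm}' is not a known manufacturer`);
        }
        this._make = nm;
    }
    get make() { return this._make; }

    set year(ny: number) {
        // Hand-written equivalent of an integer range check
        const isInt = isFinite(ny) && Math.floor(ny) === ny;
        if (!isInt || ny < 1920 || ny > 2030) {
            throw new Error(`Value ${ny} is not an integer between 1920 and 2030`);
        }
        this._year = ny;
    }
    get year() { return this._year; }
}
```

Every setter repeats the same check-then-throw shape, which is the repetition a declarative decorator is meant to remove.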

In this article we will show how to implement runtime data validation using TypeScript decorators. The validation will automatically execute on the set side of accessor functions, and parameters to class methods. Validation will only occur for accessors or methods where validation decorators are attached.

To use decorators, two features must be enabled in TypeScript, so be sure to review the introduction to decorators article in this series.

Reviewing accessor, method and parameter decorators

The core technique required is to override the functions attached to PropertyDescriptor objects. Both the accessor decorator function and the method decorator function receive a PropertyDescriptor for the item being decorated. In both cases the descriptor holds one or more functions, and we can replace those with our own: for an accessor the get and set functions, and for a method the function stored in value.

This requires three groups of decorator functions:

  • Validation decorators that can be attached either to accessor functions, or to method parameters. These describe the validation required for the object they're attached to
  • @ValidateAccessor<type>() - is attached to accessors, and is what attaches the override function which in turn performs the validation
  • @ValidateParams - is attached to methods, and is what attaches the override function which in turn performs the validation
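
The override idea can be tried without any decorator syntax, by fetching the PropertyDescriptor from the prototype ourselves. This is only a sketch: wrapSetter and the Thermostat class are hypothetical names invented for the illustration, and a real accessor decorator would be handed the descriptor rather than looking it up.

```typescript
class Thermostat {
    private _target = 20;
    set target(nt: number) { this._target = nt; }
    get target() { return this._target; }
}

// Replace the `set` function of an accessor with one that runs `check`
// first. `wrapSetter` is a hypothetical helper, not part of any package.
function wrapSetter(
    proto: object, name: string,
    check: (value: unknown) => void
): void {
    const descriptor = Object.getOwnPropertyDescriptor(proto, name);
    if (!descriptor || !descriptor.set) {
        throw new Error(`${name} has no set accessor`);
    }
    const originalSet = descriptor.set;
    descriptor.set = function (this: unknown, value: unknown) {
        check(value);                    // throws on bad data
        originalSet.call(this, value);   // only reached for good data
    };
    Object.defineProperty(proto, name, descriptor);
}

wrapSetter(Thermostat.prototype, 'target', (value) => {
    if (typeof value !== 'number' || value < 5 || value > 35) {
        throw new Error(`Value ${String(value)} not a number between 5 and 35`);
    }
});
```

After the call to wrapSetter, assigning 100 to the target property of any Thermostat throws, and the previously stored value is left untouched because the original set function never runs.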

For reference, there is a simple data validation example in Deep introduction to accessor decorators in TypeScript

The last consideration is enabling validation decorators to be attached either to set accessors or to method parameters. We discussed the implementation technique in Implementing hybrid decorator functions in TypeScript

The state of runtime data validation

There are several packages that handle runtime data validation. These include Joi, AJV, and Zod. An issue with all of them is the one mentioned earlier: the programmer must remember to invoke data validation, and can forget to do so.

The class-validator package uses decorators attached to properties. This package served as an inspiration when developing the package to be described below. However, the programmer must remember to invoke its validate or validateOrReject method.

One package, validator, has a long list of validation functions. The class-validator package uses the validation functions in this package. The package to be described below does the same.

This article describes a different package, runtime-data-validation. It contains a long list of data validation decorators, and support for automatically executing data validation in the normal course of using class properties and methods.

The test class

What we'll use for a test case is this class definition:

class ValidateExample {

    #year: number;

    @ValidateAccessor<number>()
    @IsIntRange(1990, 2050)
    @IsInt()
    set year(ny: number | string) {
        this.#year = ToInt(ny);
    }
    get year() { return this.#year; }

    @ValidateParams
    area(
        @IsFloatRange(0, 1000)
        width: number | string,

        @IsFloatRange(0, 1000)
        height: number | string
    ) {
        return ToFloat(width) * ToFloat(height);
    }

}

There is an accessor pair covering the year property. We've defined this as an integer between 1990 and 2050. And, there is a function named area that takes a width and height to compute the area. Both are defined as a floating point value between 0 and 1000.

We've defined the parameters as number | string, because often we're reading from a data source which might contain a string representation of the number. The validation decorators must take care of recognizing the numbers in either format. Further, we use ToInt and ToFloat to convert a possible string value to a number.
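
As an illustration of accepting a number in either format, the following sketch checks for an integer and then converts it. The names looksLikeInt and toIntSketch are hypothetical stand-ins; the IsInt and ToInt functions in the package may accept or reject different inputs.

```typescript
// Hypothetical stand-ins; the package's own IsInt and ToInt may differ.
function looksLikeInt(value: number | string): boolean {
    if (typeof value === 'number') {
        return isFinite(value) && Math.floor(value) === value;
    }
    // For strings, accept an optional sign followed by digits only.
    return /^[-+]?\d+$/.test(value.trim());
}

function toIntSketch(value: number | string): number {
    if (!looksLikeInt(value)) {
        throw new Error(`Value '${value}' not an integer`);
    }
    return typeof value === 'number' ? value : parseInt(value, 10);
}
```

Validating before converting matters because parseInt on its own would quietly turn '16.6' into 16 rather than reporting a problem.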

The @ValidateParams decorator will look for decorators attached to parameters. Likewise the @ValidateAccessor decorator will look for other decorators attached to the accessor. In both cases it will override the correct function in the PropertyDescriptor, so that it can validate the values.

Implementing hybrid validation decorators

We have two types of decorator functions to implement:

  • Validation decorators must be capable of attaching to either accessors or method parameters.
  • Execution decorators are @ValidateParams, which is attached to methods, and @ValidateAccessor, which is attached to accessors.

For the validation decorators, we have a technique discussed in our article on hybrid decorators. In that article we developed five functions that are useful for determining what kind of object a decorator has been attached to. Those functions are available from the decorator-inspectors package.

This means implementing something like:

import {
    isClassDecorator, isPropertyDecorator, isParameterDecorator,
    isMethodDecorator, isAccessorDecorator
} from 'decorator-inspectors';

function IsFloatRange(min: number, max: number) {

    return (target: Object, 
        propertyKey?: string | symbol,
        descriptor?: number | PropertyDescriptor) => {

        if (isAccessorDecorator(target, propertyKey, descriptor)) {
            // Record in metadata a function to check that a value is within range
        } else if (isParameterDecorator(target, propertyKey, descriptor)) {
            // Record in metadata a function to check that a value is within range
        }
    }
}

This is a decorator factory function. The outer function takes arguments, min and max, that customize the behavior of this decorator. The inner function is the actual decorator. The signature for this function is one we determined while studying hybrid decorators, which lets us create a decorator that can be attached to any decoratable object. With those five isXYZZYDecorator functions, we determine the context in which this decorator is being used, and then do the correct thing for each context.

In other words the decorator function must test how it is being used, and take the correct action for each context.

But, implementing a large number of hybrid decorator functions this way is not scalable. Considering the number of validation functions in the validator package, we need a more compact implementation, especially since there will be a lot of repetitious code.

Validation decorator implementation

What I came up with is for the outer decorator function to follow this pattern:

export function IsInt() {
    // console.log(`params.IsInt`);
    return generateValidationDecorator(
                (value) => numbers.IsInt(value),
                `Value :value: not an integer`);
}

The inner function is encapsulated by generateValidationDecorator, which in turn takes two parameters. One is the core of the validation function, the other is a message to use in the error which is thrown. In this case numbers.IsInt is a function in an internal module that handles the validation.

It is in generateValidationDecorator where we test what kind of decorator this is:

export function generateValidationDecorator(
                validator: Function, message: string) {
    return (target: Object, propertyKey?: string | symbol,
        descriptor?: number | PropertyDescriptor) => {

        if (isAccessorDecorator(target, propertyKey, descriptor)) {
            generateAccessorDecorator(validator, message,
                    target, propertyKey,
                    <PropertyDescriptor>descriptor);
        } else if (isParameterDecorator(target, propertyKey, descriptor)) {
            generateParameterDecorator(validator, message,
                    target, propertyKey, <number>descriptor);
        }
    }
}

For an accessor decorator, we use generateAccessorDecorator, otherwise generateParameterDecorator. Each contains the code specific to each decorator type.

Since descriptor can be either a number or PropertyDescriptor, we cast it to the correct type depending on which function we're calling.
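
To show why the third argument is enough to tell the two cases apart, here is a sketch that classifies a decorator's use from that argument alone. The classifyUse function is hypothetical, and the functions in decorator-inspectors may apply different or stricter checks.

```typescript
// Classify how a legacy (experimentalDecorators) decorator was applied,
// judging only by the third argument. `classifyUse` is a hypothetical
// helper; the decorator-inspectors functions may use different checks.
type DecoratorUse = 'parameter' | 'accessor' | 'method' | 'other';

function classifyUse(descriptor?: number | PropertyDescriptor): DecoratorUse {
    if (typeof descriptor === 'number') {
        return 'parameter';          // third argument is the parameter index
    }
    if (typeof descriptor === 'object' && descriptor !== null) {
        if (descriptor.get || descriptor.set) return 'accessor';
        if (typeof descriptor.value === 'function') return 'method';
    }
    return 'other';                  // for example a class or property decorator
}
```

Inside each typeof branch the compiler has already narrowed the union, so code written this way can use the value as a number or a PropertyDescriptor without an explicit cast.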

Then, let's take a look at generateAccessorDecorator:

function generateAccessorDecorator(
    validator: Function, message: string,
    target: Object, propertyKey: string | symbol,
    descriptor: PropertyDescriptor) {

    // Start from the validators already recorded for this accessor, if any
    const existing = Reflect.getMetadata(ACCESSOR_VALIDATORS,
                target, propertyKey)
        || [];
    // Wrap the validator so that a false result becomes a thrown Error
    const vfunc = function(value: any) {
        if (!validator(value)) {
            throw new Error(
                message.replace(':value:',
                    util.inspect(value)));
        }
    };
    existing.push(vfunc);

    // Store metadata
    Reflect.defineMetadata(ACCESSOR_VALIDATORS,
        existing, target, propertyKey);
    
}

This is where we get down to business. The decorator is attached to an object identified by target and propertyKey. We use Reflect.getMetadata to retrieve any validator functions already stored in the ACCESSOR_VALIDATORS metadata for that target, starting with an empty array if there are none. We generate a function that executes the validator function, passing in a value, and if the validator returns false then an error is thrown. This function is pushed onto the array, which Reflect.defineMetadata then stores back into the ACCESSOR_VALIDATORS metadata.

Put another way, for the accessor identified by target and propertyKey, we maintain a metadata value named ACCESSOR_VALIDATORS. This contains an array of validator functions. The goal is to execute those functions to validate values before the set function executes, and to avoid executing set if the value is invalid.

In generateParameterDecorator we do roughly the same thing, but the metadata is PARAMETER_VALIDATORS instead. In that case, target and propertyKey refer to the method containing the parameter being decorated.

For both cases, a validation function is supplied. That validation function is stored in reflection metadata, and that's all these functions do. These two functions do not have the ability to set up execution of the validation functions. They can only store the functions, so that other code can set up their execution.
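
Since generateParameterDecorator is not listed here, the following sketch shows one plausible way to record a check per parameter index. It is an assumption for illustration only: a plain object stands in for reflection metadata, the store is keyed by method name alone rather than by target and propertyKey, JSON.stringify stands in for util.inspect, and the names Check, paramChecks and recordParameterCheck are invented.

```typescript
type Check = (value: unknown) => void;

// Stand-in for reflection metadata: method name -> parameter index -> checks.
// An assumption for illustration; the package stores its validators with
// Reflect.defineMetadata under PARAMETER_VALIDATORS.
const paramChecks: Record<string, Record<number, Check[]>> = {};

function recordParameterCheck(
    methodName: string, index: number,
    validator: (value: unknown) => boolean, message: string
): void {
    const byIndex = paramChecks[methodName] || (paramChecks[methodName] = {});
    const checks = byIndex[index] || (byIndex[index] = []);
    // Store a function that throws when the validator returns false.
    checks.push((value) => {
        if (!validator(value)) {
            // JSON.stringify stands in for util.inspect here.
            throw new Error(message.replace(':value:', JSON.stringify(value)));
        }
    });
}

// Record a range check for parameter 0 of a method named `area`.
recordParameterCheck('area', 0,
    (v) => typeof v === 'number' && v >= 0 && v <= 1000,
    'Value :value: not a float between 0 and 1000');
```

Nothing is executed at this point. The checks simply sit in the store until some other code looks them up and runs them against real arguments.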

Execution decorator implementation

For accessors, the execution is arranged by this accessor decorator:

export function ValidateAccessor<T>() {
    return (target: Object, propertyKey: string,
        descriptor: PropertyDescriptor) => {
        
        const originals = {
            get: descriptor.get,
            set: descriptor.set
        };
        if (originals.set) {
            descriptor.set = function(newval: T) {
                let validators =
                    Reflect.getMetadata(ACCESSOR_VALIDATORS,
                            target, propertyKey)
                    || [];
                // const validators = AccessorValidators(target, propertyKey);
                // console.log(`AccessorValidation validators`, validators);
                for (const func of validators) {
                    func(newval);
                }
                originals.set.call(this, newval);
            };
        }
    }
}

For accessors we arrange to execute validator functions by attaching the ValidateAccessor decorator. This gets a PropertyDescriptor object where the get and set functions correspond to the get and set accessors. What we do is save these in the originals object. We then add a set function which retrieves the ACCESSOR_VALIDATORS metadata, which you remember contains the validator functions for this specific target.

Notice that this implementation allows multiple validation functions for each target. This is so we can add any number of validations.

This overriding function executes each of the validation functions. If they all execute, it means none of them failed, because a failed validation causes an error to be thrown which will abort this function. Therefore, if all execute correctly, then originals.set is called to perform the original set function.

Since every accessor decorator function receives the PropertyDescriptor, you might ask why we had to create ValidateAccessor at all.

The answer has to do with the need to pass the data type through ValidateAccessor. Notice that we use a generic type parameter to pass in a data type, which is then used in the overriding set function. We would be unable to do this using the validation decorator functions. Therefore, ValidateAccessor was required.

To review more about accessor decorators: Deep introduction to accessor decorators in TypeScript

In ValidateParams we have a similar implementation. The difference is that this decorator is attached to methods, which receive a PropertyDescriptor with different content.

export function ValidateParams(
    target: Object, propertyKey: string | symbol,
    descriptor: PropertyDescriptor,
) {
    // console.log(`ValidateParams ${target} ${String(propertyKey)}`, descriptor);
    // Store the original value
    const savedValue = descriptor.value;
    // Attach validation logic
    descriptor.value = function(...args: any[]) {
        let validators = Reflect.getMetadata(PARAMETER_VALIDATORS,
                                target, propertyKey)
                        || [];
        // const validators = ParamValidators(target, propertyKey) || {};

        // Iterate over the parameter indexes that have validators
        for (const key of Object.keys(validators)) {
            if (key === 'length') continue;
            const funclist = validators[key];
            const value = args[Number(key)];
            // console.log(`ValidateParams ${target} ${String(propertyKey)} ${key} value ${value} funclist`, funclist);
            for (const func of funclist) {
                func(value);
            }
        }
        // Actually call the function
        return savedValue.call(this, ...args);
    };
}

With method decorators, the value field contains the function for the method. We save that away, then create an overriding function. In this case we look at every parameter for which validator functions were supplied. For each, we get the list of validator functions, then run each validator function against the value stored in that parameter. If all execute correctly, then we call the original function, supplying the original argument list.
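
The same wrapping can be tried by hand on a method's descriptor, without decorator syntax or reflection metadata. In this sketch the checks are passed in directly as an object keyed by parameter index; wrapMethod, ParamCheck, Shapes and inRange are hypothetical names used only for the illustration.

```typescript
type ParamCheck = (value: unknown) => void;

class Shapes {
    area(width: number, height: number): number {
        return width * height;
    }
}

// Hypothetical helper: wrap a method so the checks registered for each
// parameter index run before the original function.
function wrapMethod(
    proto: object, name: string,
    checksByIndex: Record<number, ParamCheck[]>
): void {
    const descriptor = Object.getOwnPropertyDescriptor(proto, name);
    if (!descriptor || typeof descriptor.value !== 'function') {
        throw new Error(`${name} is not a method`);
    }
    const original: (...args: unknown[]) => unknown = descriptor.value;
    descriptor.value = function (this: unknown, ...args: unknown[]) {
        for (const key of Object.keys(checksByIndex)) {
            const index = Number(key);
            for (const check of checksByIndex[index]) {
                check(args[index]);     // throws on bad data
            }
        }
        return original.apply(this, args);
    };
    Object.defineProperty(proto, name, descriptor);
}

const inRange: ParamCheck = (value) => {
    if (typeof value !== 'number' || value < 0 || value > 1000) {
        throw new Error(`Value ${String(value)} not a float between 0 and 1000`);
    }
};
// Only the second parameter has checks, so index 0 is not validated.
wrapMethod(Shapes.prototype, 'area', { 1: [inRange] });
```

Keying the checks by parameter index means a method can have validators on some parameters and none on others, and the wrapper only touches the arguments that were registered.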

To review more about method decorators, see: Deep introduction to method decorators in TypeScript -- For parameter decorators, see: Deep introduction to parameter decorators in TypeScript

Testing runtime data validation decorators

To test what we've created, create this script:

import {
    IsIntRange, IsInt, IsFloatRange, IsFloat,
    ToFloat, ToInt,
    Contains,
    ValidateParams, ValidateAccessor
} from 'runtime-data-validation';

// The above code - ValidateExample

const ve = new ValidateExample();

ve.year = 1990;
ve.year = 2000;
ve.year = 2020;
// ve.year = 2060;
// ve.year = 1980;

console.log({
    width: 10,
    height: 10,
    area: ve.area(10, 10)
});

console.log({
    width: '20',
    height: '16.6',
    area: ve.area('20', '16.6')
});

console.log({
    width: 'twenty',
    height: '16.6',
    area: ve.area('twenty', '16.6')
});

We have several lines assigning values to the year property. We then make several calls to the area method. For each we supply both valid data, and invalid data.

Running this, we get the following output:

$ npx ts-node lib/validation/validate2.ts 
{ width: 10, height: 10, area: 100 }
{ width: '20', height: '16.6', area: 332 }
.../runtime-data-validation-typescript/lib/validators.ts:51
            throw new Error(
                  ^
Error: Value 'twenty' not a float between 0 and 1000

Notice that the strings '20' and '16.6' are interpreted as numbers, and the result is correct. This is because the implementation recognizes numeric strings, and converts them to numbers before performing the math. In any case, the main point is that each of these assignments and method calls is validated. Further, when we do have an error, the error message is fairly useful.

Summary

This demonstrates it is possible to automate runtime data validation in TypeScript. The method involves attaching decorators to accessor functions, and regular class methods. These decorators handle overriding calls to either, and ensure that data meets validity constraints designed by the application developer.

Therefore, as long as you can corral your data through an accessor or a method in an object, that data can be automatically validated.

The technique shown in this article is the core of a package: https://www.npmjs.com/package/runtime-data-validation This package contains a long list of validation decorators, and makes it easy to create your own validation decorators.

About the Author(s)

David Herron (davidherron.com) is a writer and software engineer focusing on the wise use of technology. He is especially interested in clean energy technologies like solar power, wind power, and electric cars. David worked for nearly 30 years in Silicon Valley on software ranging from electronic mail systems, to video streaming, to the Java programming language, and has published several books on Node.js programming and electric vehicles.
