Validate a JSON file with JSON Schema


JSON files are widely used for many purposes. Before consuming the data, we need to validate the file in some way. If we don’t need to report which property is missing or has the wrong type, it is fine to access the target property directly.


JSON object validation without schema

If we don’t use JSON Schema, we have to check the properties one by one. Let’s assume that we have the following JSON object.

All JSON objects shown in this article are written as JavaScript/TypeScript object literals. Therefore, the property names are not wrapped in double quotes.

const json = {
    version: 0.1,
    name: "Yuto",
    timestamp: "2021-12-12T10:10:10.111Z",
    contactInfo: {
        tel: "000-1111-2222",
        address: "somewhere",
        postalCode: 12345,
    },
    languages: [
        {
            language: "Japanese",
            native: true
        },
        {
            language: "English",
            native: false
        },
        {
            language: "German",
            native: false
        }
    ],
};
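
When the data comes from an actual file, the text has to be parsed before any property can be checked. The following is a minimal sketch, assuming the file content has already been read into a string. The `parseJson` helper is a hypothetical name used only for this illustration.

```typescript
// Hypothetical helper: parse JSON text and report syntax errors
// as a value instead of throwing.
type ParseResult =
    | { ok: true; value: unknown }
    | { ok: false; error: string };

function parseJson(text: string): ParseResult {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (e) {
        return { ok: false, error: e instanceof Error ? e.message : String(e) };
    }
}

// In JSON text, property names must be wrapped in double quotes.
const parsed = parseJson('{ "version": 0.1, "name": "Yuto" }');
const broken = parseJson('{ version: 0.1 }');
console.log(parsed.ok); // true
console.log(broken.ok); // false
```

Only the syntax is checked at this point. Whether the parsed value has the expected properties is a separate question.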

Some of the properties might be missing. If we want to check without a schema that the JSON object has all required properties, the implementation looks as follows.

function isValid(json: any): boolean {
    const check = (obj: any, name: string) => {
        if (!Object.prototype.hasOwnProperty.call(obj, name)) {
            console.error(`missing property [${name}]`);
            return false;
        }
        return true;
    }
    return check(json, "version")
        && check(json, "name")
        && check(json, "timestamp")
        && check(json, "contactInfo")
        && check(json.contactInfo, "tel")
        && check(json.contactInfo, "address")
        && check(json.contactInfo, "postalCode");
}

This implementation only checks that the properties exist; it does not check their data types. If we also want to check the values, the code gets more complicated.
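
To give an idea of how the code grows, here is a minimal sketch that adds type checks by hand. The `checkType` helper, the `isValidWithTypes` function and the shortened `sample` object are made up for this illustration, and value constraints such as the telephone format are still not covered.

```typescript
type Expected = "string" | "number" | "boolean" | "object";

// Hypothetical helper: the property must exist and match the expected typeof result.
function checkType(obj: any, name: string, expected: Expected): boolean {
    if (obj === null || typeof obj !== "object"
        || !Object.prototype.hasOwnProperty.call(obj, name)) {
        console.error(`missing property [${name}]`);
        return false;
    }
    // typeof null is "object", so null is rejected explicitly.
    if (obj[name] === null || typeof obj[name] !== expected) {
        console.error(`property [${name}] must be ${expected}`);
        return false;
    }
    return true;
}

function isValidWithTypes(json: any): boolean {
    return checkType(json, "version", "number")
        && checkType(json, "name", "string")
        && checkType(json, "timestamp", "string")
        && checkType(json, "contactInfo", "object")
        && checkType(json.contactInfo, "tel", "string")
        && checkType(json.contactInfo, "address", "string")
        && checkType(json.contactInfo, "postalCode", "number")
        && Array.isArray(json.languages)
        && json.languages.every((item: any) =>
            checkType(item, "language", "string")
            && checkType(item, "native", "boolean"));
}

const sample = {
    version: 0.1,
    name: "Yuto",
    timestamp: "2021-12-12T10:10:10.111Z",
    contactInfo: { tel: "000-1111-2222", address: "somewhere", postalCode: 12345 },
    languages: [{ language: "Japanese", native: true }],
};
console.log(isValidWithTypes(sample));                       // true
console.log(isValidWithTypes({ ...sample, version: "0.1" })); // false
```

Every new property or constraint needs another hand-written line, which is the part a schema is meant to replace.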


JSON Schema and validator

Let’s create a JSON Schema to make the validation easier. JSON Schema is defined at json-schema.org, which also explains how to write one.

Ajv JSON schema validator

To check whether data matches a schema, we need a validator. Ajv seems to be the fastest validator, so let’s use it.

The usage is simple. Instantiate a validator, compile the schema, and then call the compiled function with the data.

import Ajv from "ajv";

const ajv = new Ajv({ strict: true });

const validateBySchema = (json: any, schema: any) => {
    const validate = ajv.compile(schema);
    const result = validate(json);
    console.log(result);
    if (!result) {
        console.log(validate.errors);
    }
}

We will use this function to check whether JSON objects match a schema.

Primitive value check

Let’s create the simplest schema. The following schema expects the data to be a number.

const schema = { type: "number" };
validateBySchema(42, schema);   //  true
validateBySchema(0xFF, schema); //  true
validateBySchema("42", schema); //  false
// [
//   {
//     instancePath: '',
//     schemaPath: '#/type',
//     keyword: 'type',
//     params: { type: 'number' },
//     message: 'must be number'
//   }
// ]

As you can see, if the value is a string, the result is false and the error shows the reason and the path.

JSON object validation

The previous example is too simple to handle a JSON object because the data is just a single value. Let’s create a schema for a JSON object.

const schema = {
    type: "object",
    properties: {
        foo: {
            type: "string",
        }
    }
}
validateBySchema({ foo: "foo" }, schema);   // true
validateBySchema({ foo: "foo", hoge: 12 }, schema); // true
validateBySchema({ foo: ["foo", "foo2"] }, schema); // false
// [
//   {
//     instancePath: '/foo',
//     schemaPath: '#/properties/foo/type',
//     keyword: 'type',
//     params: { type: 'string' },
//     message: 'must be string'
//   }
// ]

The schema expects the “foo” property, if it is present, to be a string. In the third call the array contains strings, but the value itself is an array, so the result is false. The second call has an extra “hoge” property but the result is true because additionalProperties is not set to false. If we don’t want to allow additional properties, we need to set additionalProperties to false.

const schema2 = { ...schema, additionalProperties: false };
validateBySchema({ foo: "foo", hoge: 12 }, schema2);    // false
// [
//   {
//     instancePath: '',
//     schemaPath: '#/additionalProperties',
//     keyword: 'additionalProperties',
//     params: { additionalProperty: 'hoge' },
//     message: 'must NOT have additional properties'
//   }
// ]

The same property list

A JSON object can contain several nested objects that share the same list of properties. How can we write the schema in this case?

const json = {
    obj1: {
        foo: 1,
        hoge: "hoge"
    },
    obj2: {
        foo: 2,
        hoge: "hoge2"
    },
    values: [22, 42, 55]
};
const schema = {
    type: "object",
    properties: {
        obj1: {
            type: "object",
            properties: {
                foo: { type: "integer" },
                hoge: { type: "string", }
            }
        },
        obj2: {
            type: "object",
            properties: {
                foo: { type: "integer" },
                hoge: { type: "string", }
            }
        },
        values: {
            type: "array",
            items: { type: "integer", }
        }
    }
};
validateBySchema(json, schema); // true

The schema duplicates the same definition in obj1 and obj2. Let’s extract it and reference it.

const schema2 = {
    $defs: {
        objProperties: {
            type: "object",
            properties: {
                foo: { type: "integer" },
                hoge: { type: "string", }
            }
        }
    },
    type: "object",
    properties: {
        obj1: { $ref: "#/$defs/objProperties" },
        obj2: { $ref: "#/$defs/objProperties" },
        values: {
            type: "array",
            items: { type: "integer", }
        }
    }
};
validateBySchema(json, schema2); // true

It looks simpler than before.

Date time validation

The first JSON object shown at the top has a timestamp. We could write a regular expression, but JSON Schema offers the “format” keyword with predefined formats.
The details are described in the JSON Schema documentation.

For date and time, the “date-time” format is available, but the ajv module can’t handle it by default. We need the ajv-formats module to add it.

import Ajv from "ajv";
import addFormats from "ajv-formats";

const ajv = new Ajv({ strict: true });
addFormats(ajv);

Once we call the addFormats function, we can use “date-time”. If we don’t call it, compiling the schema throws the following error.

Error: unknown format "date-time" ignored in schema at path "#"

The result looks like this.

const schema = {
    type: "string",
    format: "date-time"
}
validateBySchema("2020-10-10T11:11:11",schema);         // true
validateBySchema("2020-10-10T11:11:11.123Z",schema);    // true
validateBySchema("2020-10-10T11:11:11+02:00",schema);   // true
validateBySchema("2020.10.10 11:11:11",schema);         // false
// [
//   {
//     instancePath: '',
//     schemaPath: '#/format',
//     keyword: 'format',
//     params: { format: 'date-time' },
//     message: 'must match format "date-time"'
//   }
// ]

If we want a stricter check here, we need to write a regular expression.
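
As a rough sketch of such a check, the regular expression below accepts only the `YYYY-MM-DDThh:mm:ss` form with optional fractional seconds and a mandatory `Z` or numeric offset. The expression and the names `strictDateTime`, `isStrictDateTime` and `strictSchema` are assumptions for this example. It only checks the shape of the string, not whether the date exists on the calendar.

```typescript
// Assumed pattern: date, "T", time, optional fraction, mandatory time zone.
const strictDateTime =
    /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

function isStrictDateTime(value: string): boolean {
    return strictDateTime.test(value);
}

// The same expression can be handed to a schema through the "pattern" keyword.
const strictSchema = {
    type: "string",
    pattern: strictDateTime.source,
};

console.log(isStrictDateTime("2020-10-10T11:11:11.123Z"));  // true
console.log(isStrictDateTime("2020-10-10T11:11:11+02:00")); // true
console.log(isStrictDateTime("2020-10-10T11:11:11"));       // false (no time zone)
console.log(isStrictDateTime("2020.10.10 11:11:11"));       // false
```

The `^` and `$` anchors matter here because the “pattern” keyword matches anywhere in the string unless the expression is anchored.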

Allow only pre-defined values by enum

If we want to allow only predefined values, we can use the enum keyword. In this case, the type keyword is not necessary.

const schema = {
    enum: ["Japanese", "English", "German"],
}
validateBySchema("Japanese", schema);   // true
validateBySchema("English", schema);    // true
validateBySchema("German", schema);     // true
validateBySchema("Italian", schema);    // false
// [
//   {
//     instancePath: '',
//     schemaPath: '#/enum',
//     keyword: 'enum',
//     params: { allowedValues: [Array] },
//     message: 'must be equal to one of the allowed values'
//   }
// ]

JSON object validation example with schema

Let’s validate the first JSON object shown at the top with a schema.

const json = {
    version: 0.1,
    name: "Yuto",
    timestamp: "2021-12-12T10:10:10.111Z",
    contactInfo: {
        tel: "000-1111-2222",
        address: "somewhere",
        postalCode: 12345,
    },
    languages: [
        {
            language: "Japanese",
            native: true
        },
        {
            language: "English",
            native: false
        },
        {
            language: "German",
            native: false
        }
    ],
};
const schema = {
    type: "object",
    properties: {
        version: {
            type: "number"
        },
        name: {
            type: "string"
        },
        timestamp: {
            type: "string",
            format: "date-time"
        },
        contactInfo: {
            type: "object",
            properties: {
                tel: {
                    type: "string",
                    pattern: "^\\d{3}-\\d{4}-\\d{4}$"
                },
                address: {
                    type: "string",
                },
                postalCode: {
                    type: "integer",
                    maximum: 99999
                },
            }
        },
        languages: {
            type: "array",
            items: {
                type: "object",
                properties: {
                    language: {
                        enum: ["Japanese", "English", "German"],
                    },
                    native: {
                        type: "boolean"
                    }
                }
            }
        }
    }
};
validateBySchema(json, schema); // true

no schema with key or ref “https://json-schema.org/draft/2020-12/schema”

When $schema: "https://json-schema.org/draft/2020-12/schema" is added to our schema, the following error occurs.

Error: no schema with key or ref "https://json-schema.org/draft/2020-12/schema"

To use this draft, we need a different class. Import the Ajv2020 class instead of the default Ajv class.

import Ajv2020 from "ajv/dist/2020";

The rest of the code doesn’t change.

Refer a different schema

To refer to a different JSON schema, we need to give the schema a name by specifying “$id”. It needs to be a URI, for example “http://example.com/base” or simply “/base”.

const schema = {
    $schema: "https://json-schema.org/draft/2020-12/schema",
    $id: "/base",
    type: "object",
    ...
};

We can refer to the base schema by giving its id to the “$ref” keyword. The base schema has no mandatory properties, but this schema adds the “required” keyword. “name” is now mandatory, and so is the new “additionalProp” property.

const schema2 = {
    $schema: "https://json-schema.org/draft/2020-12/schema",
    $id: "/schema2",
    $ref: "base#",
    required: [
        "name",
        "additionalProp",
    ],
    type: "object",
    properties: {
        additionalProp: {
            type: "string",
        }
    }
};
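
The value of “$ref” is resolved relative to the “$id” of the schema that contains it, in the same way a relative link is resolved against the address of the current page. The sketch below uses the standard `URL` class to show the idea. The `https://example.com` origin is a placeholder added only because `URL` needs an absolute base, `resolveRef` is a hypothetical name, and Ajv’s own resolution is not called here.

```typescript
// Placeholder origin (assumption): the schemas in this article use path-only ids.
const origin = "https://example.com";

// Resolve a $ref against the $id of the schema that contains it.
function resolveRef(ref: string, id: string): string {
    const url = new URL(ref, origin + id);
    return url.pathname + url.hash;
}

console.log(resolveRef("base#", "/schema2"));            // "/base"
console.log(resolveRef("base/contactInfo", "/schema2")); // "/base/contactInfo"
```

So "base#" written inside the schema with id “/schema2” points at the schema whose id is “/base”.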

OK, it’s ready to test. However, an error occurs in the following way.

const myAjv = new Ajv2020({
    allowUnionTypes: true,
});
addFormats(myAjv);
const validate = myAjv.compile(schema2);
// Error: can't resolve reference base# from id /schema2

This is because the base schema has not been added to the Ajv instance yet. Let’s add it like this.

const myAjv = new Ajv2020({
    allowUnionTypes: true,
});
addFormats(myAjv);
// Add the base schema
myAjv.addSchema(schema);
const validate = myAjv.compile(schema2);

Let’s check multiple json objects.

{
    const result = validate({
        name: "Yuto",
        additionalProp: "value1",
    });
    console.log(result, validate.errors);
    // true null
}

{
    const result = validate({
        name: "Yuto",
        additionalProp: 12345,
    });
    console.log(result, validate.errors);
    // false [
    //   {
    //     instancePath: '/additionalProp',
    //     schemaPath: '#/properties/additionalProp/type',
    //     keyword: 'type',
    //     params: { type: 'string' },
    //     message: 'must be string'
    //   }
    // ]
}

{
    const result = validate({
        additionalProp: "value1",
    });
    console.log(result, validate.errors);
    // false [
    //   {
    //     instancePath: '',
    //     schemaPath: '#/required',
    //     keyword: 'required',
    //     params: { missingProperty: 'name' },
    //     message: "must have required property 'name'"
    //   }
    // ]
}

{
    const result = validate({
        name: "Yuto",
    });
    console.log(result, validate.errors);
    // false [
    //   {
    //     instancePath: '',
    //     schemaPath: '#/required',
    //     keyword: 'required',
    //     params: { missingProperty: 'additionalProp' },
    //     message: "must have required property 'additionalProp'"
    //   }
    // ]
}

Yes, it works as expected.

Refer a nested property from another schema

If we want to refer to a deeply nested property, we first need to give it an ID. “contactInfo” is an object; assume that we want to make its “tel” property mandatory. We need to add the “$id” keyword to contactInfo.

const schema = {
    $schema: "https://json-schema.org/draft/2020-12/schema",
    $id: "/base",
    type: "object",
    properties: {
        ...
        contactInfo: {
            $id: "/base/contactInfo",
            type: "object",
            properties: {
                tel: {
                    type: "string",
                    pattern: "^\\d{3}-\\d{4}-\\d{4}$"
                },
                address: {
                    type: "string",
                },
                postalCode: {
                    type: "integer",
                    maximum: 99999
                },
            }
            ...

We can refer to the object like this.

const schema2 = {
    $schema: "https://json-schema.org/draft/2020-12/schema",
    $id: "/schema2",
    $ref: "base#",
    type: "object",
    properties: {
        contactInfo: {
            $ref: "base/contactInfo",
            type: "object",
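            required: ["tel", "additional"]
        }
    }
};

We need to repeat the same property name because the “required” keyword must be defined at the same level as the object it applies to. $ref: "base/contactInfo" pulls in the referenced definition, much like a reference to $defs. We can of course add further constraints as well; here the extra “additional” property is also made mandatory.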

const myAjv = new Ajv2020({ allErrors: true });
addFormats(myAjv);
myAjv.addSchema(schema);
const validate = myAjv.compile(schema2);
{
    const result = validate({
        name: "Yuto",
        contactInfo: {
            tel: "000-1111-2222",
            address: "somewhere",
            postalCode: 12345,
            additional: "additional-value",
        }
    });
    console.log(result, validate.errors);
    // true null
}
{
    const result = validate({
        name: "Yuto",
        contactInfo: {
            address: "somewhere",
            postalCode: 12345,
        }
    });
    console.log(result, validate.errors);
    // false [
    //   {
    //     instancePath: '/contactInfo',
    //     schemaPath: '#/properties/contactInfo/required',
    //     keyword: 'required',
    //     params: { missingProperty: 'tel' },
    //     message: "must have required property 'tel'"
    //   },
    //   {
    //     instancePath: '/contactInfo',
    //     schemaPath: '#/properties/contactInfo/required',
    //     keyword: 'required',
    //     params: { missingProperty: 'additional' },
    //     message: "must have required property 'additional'"
    //   }
    // ]
}

Two errors are shown because both the “tel” and “additional” properties are mandatory and allErrors is enabled.

Show Multiple errors

Ajv reports only the first error by default, but in many cases we want to see all errors at once so that we don’t have to repeat the same cycle: load a file, fix an error, load the file again, and so on. It’s annoying.
Set the allErrors option to true in this case.

const myAjv = new Ajv2020({
    allowUnionTypes: true,
    allErrors: true,
});
addFormats(myAjv);
const validate = myAjv.compile(schema);
const result = validate({
    version: "string-version",
    name: 123,
    languages: "French",
});
console.log(result, validate.errors);
// false [
//   {
//     instancePath: '/version',
//     schemaPath: '#/properties/version/type',
//     keyword: 'type',
//     params: { type: 'number' },
//     message: 'must be number'
//   },
//   {
//     instancePath: '/name',
//     schemaPath: '#/properties/name/type',
//     keyword: 'type',
//     params: { type: 'string' },
//     message: 'must be string'
//   },
//   {
//     instancePath: '/languages',
//     schemaPath: '#/properties/languages/type',
//     keyword: 'type',
//     params: { type: 'array' },
//     message: 'must be array'
//   }
// ]
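
The error array is verbose when printed as is. The following is a minimal sketch of turning it into one line per error, using only the `instancePath` and `message` fields that appear in the output above. The `ValidationError` interface and the `formatErrors` function are names invented for this example and are not part of Ajv.

```typescript
// Only the fields used below; modelled on the error objects printed above.
interface ValidationError {
    instancePath: string;
    message?: string;
}

// Build one readable line per error. An empty path means the root object.
function formatErrors(errors: ValidationError[] | null | undefined): string[] {
    return (errors ?? []).map(
        (e) => `${e.instancePath || "(root)"}: ${e.message ?? "invalid"}`
    );
}

const lines = formatErrors([
    { instancePath: "/version", message: "must be number" },
    { instancePath: "", message: "must have required property 'name'" },
]);
console.log(lines.join("\n"));
// /version: must be number
// (root): must have required property 'name'
```

After a failed validation, the call would look like `formatErrors(validate.errors)`.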

End

The schema can be written in a separate JSON file. If we don’t have to give the schema to our users, it’s better to write it in JavaScript/TypeScript because the compiler or IDE can detect errors while coding.
If we need to give it to our users, write the schema in a JSON file.
