Photo by Pankaj Patel

Introduction

When deserializing JSON, it's always a good idea to validate that it was in fact deserialized correctly. But how do you do that using System.Text.Json?

We have the following JSON that we fetch from a third party, meaning we have no control over it. Usually, the JSON contains all the properties that we care about. But sometimes, properties that we deem required are missing, and we get a bunch of exceptions when trying to use them.

How can we validate that the JSON was in fact deserialized correctly?
A good approach would be to validate the received JSON against a JSON Schema, but I've only done that using Newtonsoft, and I'm not sure whether a System.Text.Json solution is available.

This is the JSON we're dealing with:

[
  {
    "brand": "Ferrari",
    "model": "Enzo",
    "horsePower": 651,
    "customParts": [
      {
        "partNumber": "ABC123",
        "price": 1999.99,
        "manufactured":"2021-01-12"
      }
    ]
  },
  {
    "brand": "Ferrari",
    "model": "F50",
    "horsePower": 512,
    "customParts": [
      {
        "partNumber": "DEF123",
        "price": 3050.99,
        "manufactured":"2021-02-03"
      }
    ]
  },
  {
    "brand": "Ferrari",
    "model": "488 Spider",
    "horsePower": 661,
    "customParts": [
      {
        "partNumber": "GHI123",
        "price": 999.99,
        "manufactured": "2021-07-10"
      }
    ]
  }
]

YOLO-solution

As you can probably guess from the name of this "solution", this is not a solution I recommend😆. I only included it because it's, unfortunately, very common to code only for the happy path when it comes to deserialization of JSON (and other stuff as well).

We create two classes that map to the JSON.

public class CarDto
{
    public string Brand { get; set; }
    public string Model { get; set; }
    public uint Horsepower { get; set; }
    public List<CustomPartDto> CustomParts { get; set; }
}
public class CustomPartDto
{
    public string PartNumber { get; set; }
    public decimal Price { get; set; }
    public DateOnly Manufactured { get; set; }
}

EmbeddedResourceQuery is a helper class I use to read embedded resources; I've written about that here.
We then write a test to ensure that we can deserialize the JSON correctly:

public class DeserializationTests
{
    private readonly EmbeddedResourceQuery _embeddedResourceQuery;

    public DeserializationTests()
    {
        _embeddedResourceQuery = new EmbeddedResourceQuery();
    }
    
    [Fact]
    public async Task CanDeserializeValidCarsJson()
    {
        await using var json = _embeddedResourceQuery.Read<EmbeddedResourceQuery>("_03Deserialization.Cars.json");

        var result = await JsonSerializer.DeserializeAsync<List<CarDto>>(json, CarsJsonSerializerOptions.Options);

        result.ShouldNotBeNull();
        result.Count.ShouldBe(3);
        result[0].Brand.ShouldBe("Ferrari");
        result[0].Model.ShouldBe("Enzo");
        result[0].Horsepower.ShouldBe(651u);
        result[0].CustomParts.Count.ShouldBe(1);
        result[0].CustomParts[0].PartNumber.ShouldBe("ABC123");
        result[0].CustomParts[0].Price.ShouldBe(1999.99M);
        result[0].CustomParts[0].Manufactured.ShouldBe(DateOnly.Parse("2021-01-12"));
        
        result[1].Brand.ShouldBe("Ferrari");
        result[1].Model.ShouldBe("F50");
        result[1].Horsepower.ShouldBe(512u);
        result[1].CustomParts.Count.ShouldBe(1);
        result[1].CustomParts[0].PartNumber.ShouldBe("DEF123");
        result[1].CustomParts[0].Price.ShouldBe(3050.99M);
        result[1].CustomParts[0].Manufactured.ShouldBe(DateOnly.Parse("2021-02-03"));
        
        result[2].Brand.ShouldBe("Ferrari");
        result[2].Model.ShouldBe("488 Spider");
        result[2].Horsepower.ShouldBe(661u);
        result[2].CustomParts.Count.ShouldBe(1);
        result[2].CustomParts[0].PartNumber.ShouldBe("GHI123");
        result[2].CustomParts[0].Price.ShouldBe(999.99M);
        result[2].CustomParts[0].Manufactured.ShouldBe(DateOnly.Parse("2021-07-10"));
    }
}
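The tests pass CarsJsonSerializerOptions.Options to the serializer, but that class is not shown in this post. A minimal sketch of what it could look like, assuming it is based on the web defaults (the camelCase "customParts" in the error message further down suggests a camelCase naming policy, and "horsePower" binding to Horsepower suggests case-insensitive matching). The body below is a guess, not the actual implementation.

```csharp
using System.Text.Json;

// Hypothetical stand-in for the CarsJsonSerializerOptions class used in the tests.
public static class CarsJsonSerializerOptions
{
    // JsonSerializerDefaults.Web enables camelCase property names and
    // case-insensitive property matching, so "horsePower" binds to Horsepower.
    public static readonly JsonSerializerOptions Options =
        new(JsonSerializerDefaults.Web);
}
```

Keeping the options in a single static instance also lets System.Text.Json reuse its cached type metadata between calls.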

The test is green, great.
Now, the third-party API we integrate with is really shaky; sometimes the CustomParts list is missing. Let's see what happens when the JSON looks like this:

[
  {
    "brand": "Ferrari",
    "model": "Enzo",
    "horsePower": 651
  },
  {
    "brand": "Ferrari",
    "model": "F50",
    "horsePower": 512
  },
  {
    "brand": "Ferrari",
    "model": "488 Spider",
    "horsePower": 661
  }
]

[Fact]
public async Task ShouldThrowArgumentNullExceptionWhenTryingToConsumeCustomPartsWhenCustomPartsIsMissingFromJson()
{
    await using var json =
        _embeddedResourceQuery.Read<EmbeddedResourceQuery>("_03Deserialization.CarsWithoutRequiredProperties.json");
        
    var result = await JsonSerializer.DeserializeAsync<List<CarDto>>(json, CarsJsonSerializerOptions.Options);

    var exception = Should.Throw<ArgumentNullException>(() => result!.First().CustomParts.First());
    exception.Message.ShouldBe("Value cannot be null. (Parameter 'source')");
}

This test is also green: deserialization succeeds, but CustomParts is left as null, so we only get an ArgumentNullException when we try to access the first CustomPart, after the deserialization has already happened.
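The parameter name 'source' in that message has nothing to do with our DTOs. It comes from LINQ, because First() is an extension method that receives the null list as its source argument. A small standalone sketch, using a plain List<string> as a stand-in for the missing CustomParts:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a CarDto.CustomParts that was never populated.
List<string>? customParts = null;

try
{
    // First() is an extension method, so this is really Enumerable.First(customParts).
    customParts!.First();
}
catch (ArgumentNullException e)
{
    Console.WriteLine(e.ParamName); // source
}
```

Nothing in the exception points back to the JSON, which is what makes this kind of failure hard to trace.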

How can we catch this error earlier?

Constructor validation

System.Text.Json has supported immutable types through parameterized constructors since .NET 5.

Let's change our CarDto and CustomPartDto to the following:

public class CarDto
{
    public CarDto(string brand, string model, List<CustomPartDto> customParts, uint horsepower)
    {
        Brand = brand ?? throw new ArgumentNullException(nameof(brand));
        Model = model ?? throw new ArgumentNullException(nameof(model));
        CustomParts = customParts ?? throw new ArgumentNullException(nameof(customParts));
        Horsepower = horsepower;
    }

    public string Brand { get; }
    public string Model { get; }
    public uint Horsepower { get; }
    public List<CustomPartDto> CustomParts { get; }
}
public class CustomPartDto
{
    public CustomPartDto(string partNumber, decimal price, DateOnly manufactured)
    {
        PartNumber = partNumber ?? throw new ArgumentNullException(nameof(partNumber));
        Price = price;
        Manufactured = manufactured;
    }

    public string PartNumber { get; }
    public decimal Price { get; }
    public DateOnly Manufactured { get; }
}

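We are now validating the parameters in our constructor. When a property is missing from the JSON, System.Text.Json passes the default value of the parameter's type to the constructor, which is null for reference types. If something we deem required is null, we throw an exception. This means that it's impossible to create a CarDto with a null CustomParts list.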

Let's verify this with a test:

[Fact]
public async Task ShouldThrowIfRequiredPropertyIsNotSet()
{
    await using var json =
            _embeddedResourceQuery.Read<EmbeddedResourceQuery>("_03Deserialization.CarsWithoutRequiredProperties.json");

    var exception = await
        Should.ThrowAsync<ArgumentNullException>(
                async () => await JsonSerializer.DeserializeAsync<List<CarDto>>(json, CarsJsonSerializerOptions.Options));

    exception.Message.ShouldBe("Value cannot be null. (Parameter 'customParts')");
}

We now get an exception when trying to deserialize; it's now impossible to create a CarDto in an invalid state. As a bonus, we also get a better error message here compared to the ArgumentNullException in the YOLO solution.
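One caveat worth keeping in mind: the null checks only cover reference types. A missing value-type property such as price or horsePower arrives in the constructor as the type's default value, so it passes through unnoticed. A minimal sketch of that behavior, using a hypothetical PartDto that is a trimmed-down CustomPartDto, and assuming web-default serializer options:

```csharp
using System;
using System.Text.Json;

var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);

// "price" is missing from this payload.
var part = JsonSerializer.Deserialize<PartDto>("{\"partNumber\":\"ABC123\"}", options);

// No exception was thrown; Price silently fell back to default(decimal).
Console.WriteLine(part!.Price == 0m); // True

// Hypothetical, trimmed-down variant of CustomPartDto.
public class PartDto
{
    public PartDto(string partNumber, decimal price)
    {
        PartNumber = partNumber ?? throw new ArgumentNullException(nameof(partNumber));
        Price = price; // a missing "price" cannot be told apart from a real 0 here
    }

    public string PartNumber { get; }
    public decimal Price { get; }
}
```

If a zero price or zero horsepower is never valid in your domain, an extra range check in the constructor is one way to close that gap.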

Required keyword (C# 11)

The previous approach has been my preferred solution for a while. However, that might change now that C# 11 is readily available.

Let's change our CarDto and CustomPartDto again.

public class CarDto
{
    public required string Brand { get; init; }
    public required string Model { get; init; }
    public required uint Horsepower { get; init; }
    public required List<CustomPartDto> CustomParts { get; init; }
}
public class CustomPartDto
{
    public required string PartNumber { get; init; }
    public required decimal Price { get; init; }
    public required DateOnly Manufactured { get; init; }
}

Notice the required keyword that we've added to all properties. System.Text.Json honors it from .NET 7 onwards.

Let's add a test and see what happens when we try to deserialize without the CustomParts array.

[Fact]
public async Task ShouldThrowIfRequiredPropertyIsNotSet()
{
    await using var json =
        _embeddedResourceQuery.Read<EmbeddedResourceQuery>("_03Deserialization.CarsWithoutRequiredProperties.json");

    var exception = await
        Should.ThrowAsync<JsonException>(
                async () => await JsonSerializer.DeserializeAsync<List<CarDto>>(json, CarsJsonSerializerOptions.Options));

    exception.Message.ShouldBe(
        "JSON deserialization for type 'JOS.TipsAndTrix._03Deserialization.Required.CarDto' was missing required " +
        "properties, including the following: customParts");
}

We still get an exception, but now the message contains even more details:
...CarDto was missing required properties, including the following: customParts.
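Note that, as far as I know, required only checks that the property is present in the JSON, not that its value is non-null. With default options, a payload containing "customParts": null should therefore still deserialize without an exception. A minimal sketch, using a hypothetical GarageCarDto that is a trimmed-down CarDto, and assuming .NET 7 or later with web-default serializer options:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);

// "customParts" is present, but explicitly null.
var car = JsonSerializer.Deserialize<GarageCarDto>(
    "{\"brand\":\"Ferrari\",\"customParts\":null}", options);

// Reports whether the required property ended up null after deserialization.
Console.WriteLine(car!.CustomParts is null);

// Hypothetical, trimmed-down variant of CarDto.
public class GarageCarDto
{
    public required string Brand { get; init; }
    public required List<string> CustomParts { get; init; }
}
```

If the third party can send explicit nulls as well as omit properties, the constructor approach (or an additional null check after deserialization) may still be needed alongside required.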

Conclusion

I've shown two approaches to protect yourself against "bad" JSON. It's better to get an exception when deserializing the JSON than to get an obscure null reference exception later on in your code. Fail fast!

All code can be found over at GitHub