The problem

We're using EF Core to handle our database. We're using it to migrate our database schema/data in all our environments. In our dev environments we also use it to seed and clear/reset the database.

We want to fullfill the following requirements:

  • Keep the migrations in a separate project.
  • The migrator should be a separate application.

The following DbContext will be used in this post:

public class MyDbContext : DbContext
{
    public MyDbContext(DbContextOptions<MyDbContext> options) : base(options)
    {
    }

    public DbSet<Brand> Brands { get; set; } = null!;
    public DbSet<Car> Cars { get; set; } = null!;

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.ApplyConfigurationsFromAssembly(typeof(MyDbContext).Assembly);
    }
}

public class BrandEntityConfiguration : IEntityTypeConfiguration<Brand>
{
    public void Configure(EntityTypeBuilder<Brand> builder)
    {
        builder.HasKey(x => x.Id);
        builder.Property(x => x.Name).IsRequired().HasMaxLength(256);
    }
}

public class CarEntityConfiguration : IEntityTypeConfiguration<Car>
{
    public void Configure(EntityTypeBuilder<Car> builder)
    {
        builder.HasKey(x => x.Id);
        builder.HasOne<Brand>(x => x.Brand).WithMany().IsRequired();
        builder.Property(x => x.Model).IsRequired().HasMaxLength(256);
    }
}

public class Brand
{
    public Brand(Guid id, string name)
    {
        Id = id;
        Name = name;
    }

    public Guid Id { get; }
    public string Name { get; }
}

public class Car
{
    private Car(Guid id, string model) : this(id, null!, model)
    {
        // Only used by EF.
    }

    public Car(Guid id, Brand brand, string model)
    {
        Id = id;
        Brand = brand;
        Model = model;
    }

    public Guid Id { get; }
    public Brand Brand { get; private set; }
    public string Model { get; }
}

Meet the Migrator

The migrator is a simple console application. To enable easy handling of the command line arguments, System.CommandLine is used. The arguments migrate, seed and clear are mapped to different commands (Migrate/Seed/ClearDatabaseCommand).

var builder = Host.CreateDefaultBuilder(args);
builder.UseDefaultServiceProvider(options =>
{
    options.ValidateScopes = true;
    options.ValidateOnBuild = true;
});
builder.ConfigureServices((ctx, services) =>
{
    services.AddScoped<MigrateDatabaseCommand>();
    services.AddScoped<ClearDatabaseCommand>();
    services.AddScoped<SeedDatabaseCommand>();
    services.AddDatabase(ctx.Configuration);
});

IHost app = builder.Build();
var hostEnv = app.Services.GetRequiredService<IHostEnvironment>();
ILogger logger = app.Services.GetRequiredService<ILogger<Program>>();
logger.LogInformation("Running in {HostEnv} mode", hostEnv.EnvironmentName);

await using var scope = app.Services.CreateAsyncScope();
var serviceProvider = scope.ServiceProvider;
var rootCommand = new RootCommand("JOS.Database.Migrator");
AddMigrateDatabaseCommand(rootCommand, serviceProvider);
AddClearDatabaseCommand(rootCommand, serviceProvider);
AddSeedDatabaseCommand(rootCommand, serviceProvider);
await rootCommand.InvokeAsync(args);

static void AddMigrateDatabaseCommand(RootCommand rootCommand, IServiceProvider serviceProvider)
{
    var command = new Command("migrate", "Migrates the database");
    command.SetHandler(() => serviceProvider.GetRequiredService<MigrateDatabaseCommand>().Execute());
    rootCommand.Add(command);
}

static void AddClearDatabaseCommand(RootCommand rootCommand, IServiceProvider serviceProvider)
{
    var command = new Command("clear", "Clears all tables and resets id counter, foreign keys etc");
    command.SetHandler(() => serviceProvider.GetRequiredService<ClearDatabaseCommand>().Execute());
    rootCommand.Add(command);
}

static void SeedDatabaseCommand(RootCommand rootCommand, IServiceProvider serviceProvider)
{
    var command = new Command("seed", "Seeds database with pre-defined data");
    command.SetHandler(() => serviceProvider.GetRequiredService<SeedDatabaseCommand>().Execute());
    rootCommand.Add(command);
}

In the ConfigureServices method, the commands are registered to enable dependency injection of dependencies.
The AddDatabase method resides in the JOS.Database project and looks like this:

public static class DatabaseServiceCollectionExtensions
{
    public static void AddDatabase(this IServiceCollection services, IConfiguration configuration)
    {
        services.AddDbContext<MyDbContext>((_, options) =>
        {
            options.ConfigureDbContext(configuration);
        });
        services.AddScoped<DbContext>(x => x.GetRequiredService<MyDbContext>());
    }

    private static DbContextOptionsBuilder ConfigureDbContext(
        this DbContextOptionsBuilder builder, IConfiguration configuration)
    {
        var connectionString = GetPostgresConnectionString(configuration);
        return builder.UseNpgsql(connectionString, optionsBuilder =>
        {
            optionsBuilder.MigrationsAssembly("JOS.Database");
        }).UseSnakeCaseNamingConvention();
    }

    private static string GetPostgresConnectionString(IConfiguration configuration)
    {
        return configuration.GetRequiredValue<string>("Postgres:ConnectionString");
    }
}

One thing to note here is the AddDbContext call directly followed by the AddScoped<DbContext> call. The AddScoped<DbContext> is added to allow for injection of IEnumerable<DbContext> in our different commands.

We're also specifying which assembly that holds the migration files (JOS.Database).

Create migrations

The migrations are created using the ef dotnet tool

The following command creates an initial migration and stores the migrations in a separate project, JOS.Database.

dotnet ef migrations add Initial -c MyDbContext -o Migrations --project ./src/JOS.Database --startup-project ./src/JOS.Database.Migrator

The startup-project parameter used in the command is important:

The startup project is the one that the tools build and run. The tools have to execute application code at design time to get information about the project, such as the database connection string and the configuration of the model.

The project parameter combined with the o (output-dir) parameter decides where the migration files will be placed.

Migrate database

The MigrateDatabaseCommand looks like this:

public class MigrateDatabaseCommand
{
    private readonly IEnumerable<DbContext> _dbContexts;
    private readonly ILogger<MigrateDatabaseCommand> _logger;

    public MigrateDatabaseCommand(IEnumerable<DbContext> dbContexts, ILogger<MigrateDatabaseCommand> logger)
    {
        _dbContexts = dbContexts ?? throw new ArgumentNullException(nameof(dbContexts));
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    public async Task Execute()
    {
        foreach(var dbContext in _dbContexts)
        {
            var dbContextName = dbContext.GetType().Name;
            _logger.LogInformation("Migrating {DbContextName}...", dbContextName);
            await dbContext.Database.MigrateAsync();
            _logger.LogInformation("Migration of {DbContextName} done", dbContextName);
        }
    }
}

It loops through all registered instances of DbContext and calls MigrateAsync, and that's it.

Seed database

public class SeedDatabaseCommand
{
    private readonly MyDbContext _dbContext;

    public SeedDatabaseCommand(MyDbContext dbContext)
    {
        _dbContext = dbContext ?? throw new ArgumentNullException(nameof(dbContext));
    }

    public async Task Execute()
    {
        var bmw = new Brand(Guid.Parse("570f8b18-d515-4b46-b177-771b0ac5a809"), "BMW");
        var bmwCars = new List<Car>
        {
            new(Guid.Parse("75a4085b-d5d7-4187-9999-1910f24c98d5"), bmw, "M3"),
            new(Guid.Parse("2bb7f756-976e-4305-a8b0-dd6487ce9078"), bmw, "iX M6")
        };
        var mercedes = new Brand(Guid.Parse("22bcb03f-92ad-4864-a821-64c41bc3229e"), "Mercedes");
        var mercedesCars = new List<Car>
        {
            new(Guid.Parse("73115269-711e-4617-aadf-54eb0faf9c94"), mercedes, "G-Wagon"),
            new(Guid.Parse("b439d707-f523-44c0-8a2d-3a835a402355"), mercedes, "S")
        };

        _dbContext.Add(bmw);
        _dbContext.Add(mercedes);
        _dbContext.AddRange(bmwCars);
        _dbContext.AddRange(mercedesCars);
        await _dbContext.SaveChangesAsync();
    }
}

Here we're adding a bunch of predefined brands and cars. It's just to showcase that we have total control of the data that gets added to the database, no need to write any custom SQL. We can just use the DbContext as usual.

Clear database

public class ClearDatabaseCommand
{
    private readonly IEnumerable<DbContext> _dbContexts;
    private readonly ILogger<ClearDatabaseCommand> _logger;

    public ClearDatabaseCommand(IEnumerable<DbContext> dbContexts, ILogger<ClearDatabaseCommand> logger)
    {
        _dbContexts = dbContexts ?? throw new ArgumentNullException(nameof(dbContexts));
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    public async Task Execute()
    {
        foreach(var dbContext in _dbContexts)
        {
            var dbSets = dbContext
                         .GetType()
                         .GetProperties(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance)
                         .Where(IsDbSet)
                         .Select(x => ((IQueryable<object>)x.GetValue(dbContext)!, x.Name));
            var remainingTables = new Queue<(IQueryable<object> DbSet, string Name)>(dbSets);

            while(remainingTables.TryDequeue(out var dbSet))
            {
                try
                {
                    var deletedRows = await dbSet.DbSet.ExecuteDeleteAsync();
                    _logger.LogInformation(
                        "Removed {NumberOfRows} rows from DbSet {DbSetName}", deletedRows, dbSet.Name);
                }
                catch(DbException)
                {
                    remainingTables.Enqueue(dbSet);
                }
            }
        }
    }

    private static bool IsDbSet(PropertyInfo propertyInfo)
    {
        return propertyInfo.PropertyType.IsGenericType &&
               (typeof(DbSet<>).IsAssignableFrom(propertyInfo.PropertyType.GetGenericTypeDefinition()));
    }
}

This is a fun one. The approach used here is a bit..."bruteforcy" at best. But since this code will only be used in local/dev environments, that's fine. The problem with this code is that it calls ExecuteDeleteAsync on whatever table comes first. In our case, that's not a problem since we're using Cascade on all our relationships so we don't need to worry about any foreign key constraints etc.

If you would like to do something like this in production, I think it would be better to write custom SQL or use a tool like Respawn.

Anyhow, the code above works like this:

  1. Find all the DbSet properties on the DbContext.
  2. Cast it to IQueryable<object> to allow us to use the ExecuteDeleteAsync method.
  3. Call the ExecuteDeleteAsync method to remove all rows in the table.
  4. If it fails, re-add the DbSet to the back of the queue so that it can be retried. This is the bruteforce I was talking about earlier. Since we're catching DbExceptions, we hope that whatever relation that made the call failed will have been removed before trying it again. There's a potential of an infinite loop here, but as I said, this code will only be used in local/dev environments.

I haven't found any database agnostic way of resetting auto incrementing columns, but in our case we don't need to worry about that since we're always using guid as keys.

Init container

The main reason for using a different application for migrating our database is that we can use it as an init container.

Before our apps are deployed in production, Kubernetes will run our migrator as an init container (as a job). Another common approach is to let the main application handle the migration of the database on startup, but that doesn't scale well when using multiple instances of the application.

By using an init container, you don't run into those kind of problems.