Seeding Entity Framework Database from CSV

Ever since Entity Framework Code First was released, we have had a very simple and convenient mechanism for seeding our databases. Most examples show us only how to seed the database by adding a few items directly from .NET code. This approach may be cumbersome if we are dealing with a larger amount of data. A good example of this is populating tables for Countries and Provinces/States. In this post, I will show how we can populate data from CSV files.

Assume our model consists of 2 classes: Country and ProvinceState.

public class Country
{
public int Id { get; set; }
public string Code { get; set; }
public string Name { get; set; }
public virtual IList<ProvinceState> ProvinceStates { get; set; }
}

public class ProvinceState
{
public int Id { get; set; }
public string Code { get; set; }
public string Name { get; set; }

public virtual Country Country { get; set; }
}

Our database context simply has a DbSet for both classes:

public class LocationContext : DbContext
{
public IDbSet<Country> Countries { get; set; }
public IDbSet<ProvinceState> ProvinceStates { get; set; }

}

1. Seeding with Code First Migrations

If you are using Code First Migrations, you can seed our database by adding code to the Seed method of the migration’s Configuration class. The advantage of using this method is the Seed method is executed at the time the database is updated instead of at runtime. The method will be executed anytime a developer runs the Update-Database command or anytime we deploy a new version of the application using Web Deploy.

You can learn how to setup and use Code First Migrations over on MSDN.

Loading Data from CSV

First, let’s create a CSV named countries.csv that contains the data we want to load into our Countries table. The file might look something like this:

We will also create a second CSV file named provincestates.csv with the headers CountryCode, Code, and Name:

Next, we will include these files in our project in Visual Studio and set the Build Action to Embedded Resource for both files.

Install the CsvHelper Nuget package, a handy package for reading data from csv files.

Install-Package CsvHelper

Now we will modify the Seed method as follows:

protected override void Seed(SeedingDataFromCSV.Domain.LocationContext context)
{

Assembly assembly = Assembly.GetExecutingAssembly();
string resourceName = "SeedingDataFromCSV.Domain.SeedData.countries.csv";
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{
CsvReader csvReader = new CsvReader(reader);
csvReader.Configuration.WillThrowOnMissingField = false;
var countries = csvReader.GetRecords<Country>().ToArray();
context.Countries.AddOrUpdate(c => c.Code, countries);
}
}

resourceName = "SeedingDataFromCSV.Domain.SeedData.provincestates.csv";
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{
CsvReader csvReader = new CsvReader(reader);
csvReader.Configuration.WillThrowOnMissingField = false;
while (csvReader.Read())
{
var provinceState = csvReader.GetRecord<ProvinceState>();
var countryCode = csvReader.GetField<string>("CountryCode");
provinceState.Country = context.Countries.Local.Single(c => c.Code == countryCode);
context.ProvinceStates.AddOrUpdate(p => p.Code, provinceState);
}
}
}
}

Now any time we run the Update-Database command, the Countries table will be synchronized with the data in the countries.csv file.

And the ProvinceStates table will be synchronized with the data in the provincestates.csv file.

This sample code is available on GitHub:

dpaquette/SeedingDatabaseFromCSV

Seeding with a Database Initializer

Another option for seeding data is to create a database initializer for your context. The database initializer also has a seed method where you could use the same code as above to load data from a CSV file.

You can find more information on here.