Ever since Entity Framework Code First was released, we have had a very simple and convenient mechanism for seeding our databases. Most examples show us only how to seed the database by adding a few items directly from .NET code. This approach may be cumbersome if we are dealing with a larger amount of data. A good example of this is populating tables for Countries and Provinces/States. In this post, I will show how we can populate data from CSV files.
Assume our model consists of 2 classes: Country and ProvinceState.
public class Country |
Our database context simply has a DbSet for both classes:
public class LocationContext : DbContext |
1. Seeding with Code First Migrations
If you are using Code First Migrations, you can seed our database by adding code to the Seed method of the migration’s Configuration class. The advantage of using this method is the Seed method is executed at the time the database is updated instead of at runtime. The method will be executed anytime a developer runs the Update-Database command or anytime we deploy a new version of the application using Web Deploy.
You can learn how to setup and use Code First Migrations over on MSDN.
Loading Data from CSV
First, let’s create a CSV named countries.csv that contains the data we want to load into our Countries table. The file might look something like this:
We will also create a second CSV file named provincestates.csv with the headers CountryCode, Code, and Name:
Next, we will include these files in our project in Visual Studio and set the Build Action to Embedded Resource for both files.
Install the CsvHelper Nuget package, a handy package for reading data from csv files.
Install-Package CsvHelper
Now we will modify the Seed method as follows:protected override void Seed(SeedingDataFromCSV.Domain.LocationContext context)
{
Assembly assembly = Assembly.GetExecutingAssembly();
string resourceName = "SeedingDataFromCSV.Domain.SeedData.countries.csv";
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{
CsvReader csvReader = new CsvReader(reader);
csvReader.Configuration.WillThrowOnMissingField = false;
var countries = csvReader.GetRecords<Country>().ToArray();
context.Countries.AddOrUpdate(c => c.Code, countries);
}
}
resourceName = "SeedingDataFromCSV.Domain.SeedData.provincestates.csv";
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{
CsvReader csvReader = new CsvReader(reader);
csvReader.Configuration.WillThrowOnMissingField = false;
while (csvReader.Read())
{
var provinceState = csvReader.GetRecord<ProvinceState>();
var countryCode = csvReader.GetField<string>("CountryCode");
provinceState.Country = context.Countries.Local.Single(c => c.Code == countryCode);
context.ProvinceStates.AddOrUpdate(p => p.Code, provinceState);
}
}
}
}
Now any time we run the Update-Database command, the Countries table will be synchronized with the data in the countries.csv file.
And the ProvinceStates table will be synchronized with the data in the provincestates.csv file.
This sample code is available on GitHub:
dpaquette/SeedingDatabaseFromCSV
Seeding with a Database Initializer
Another option for seeding data is to create a database initializer for your context. The database initializer also has a seed method where you could use the same code as above to load data from a CSV file.
You can find more information on here.