Entity Framework is an ORM layer for .NET applications. Sometimes, we import data from a flat file. Technically, this boils down to: Delete all records from the table. And insert new records into the table.
Import data
Below is the first cut of code that I wrote. A logical use of Entity framework. But unexpectedly, yields very poor performance.
// delete data foreach(Employee emp in context.Employees) { context.Employees.Remove(emp); } context.SaveChanges(); // insert new data foreach(string line in lines) { context.Employees.Add(emp); } context.SaveChanges();
We delete records using the Remove
method of DbSet
class. Then add new records using the Add
method.
Deleting the records
Run the above code snippet with 10000, 100000 and million records. The performance decreases exponentially. Deleting each record from the context yields very poor performance. For each record, a SQL query is fired. Let us optimise deleting the records. We will use a single SQL query.
context.Database.ExecuteSQLCommand("DELETE FROM Employees");
There is a significant performance boost. From 105 minutes, the time taken has reduced to 60 minutes. But, that is not enough. We will now optimise adding the records.
Adding new records
Entity Framework does a lot of work in the background. It detects changes in data by comparing existing data with new data. Detect changes can be turned off by setting a configuration property, AutoDetectChangesEnabled
, to false. While adding records, we know that there is no existing data. So, it is safe to turn off change detection. Remember to turn on the setting after adding records.
context.Configuration.AutoDetectChangesEnabled = false; foreach(string line in lines) { context.Employees.Add(emp); } context.SaveChanges(); context.Configuration.AutoDetectChangesEnabled = true;
With the above optimisation, it took just 15 minutes to import data from the file.