BulkInsert extension method in Entity Framework
Install Z.EntityFramework.Extensions:
Now, I’ll show you how to use the Z.EntityFramework.Extensions package to perform bulk insert, update, and delete operations through Entity Framework. First, open the NuGet Package Manager window and search for the Z.EntityFramework.Extensions package. Select “Z.EntityFramework.Extensions”, then select your project, choose the latest version, and finally click the “Install” button as shown in the image below.
After you click the Install button, NuGet will take a few moments to download the package and install the Z.EntityFramework.Extensions DLL into your project.
Note: Z.EntityFramework.Extensions extends our DbContext object with high-performance bulk operations such as BulkSaveChanges, BulkInsert, BulkUpdate, BulkDelete, and BulkMerge. It supports SQL Server, MySQL, Oracle, PostgreSQL, SQLite, and more.
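As a quick illustrative sketch of that method family (assuming the EF_Demo_DBEntities context used later in this article, and hypothetical entity lists named studentsToAdd, studentsToEdit, studentsToDrop, and studentsToSync):

```csharp
using (var context = new EF_Demo_DBEntities())
{
    // Each call operates on a whole collection in a minimal number of round trips.
    // The list names below are placeholders for your own entity collections.
    context.BulkSaveChanges();          // like SaveChanges, but batched
    context.BulkInsert(studentsToAdd);  // insert without calling SaveChanges
    context.BulkUpdate(studentsToEdit); // update existing rows
    context.BulkDelete(studentsToDrop); // delete existing rows
    context.BulkMerge(studentsToSync);  // insert-or-update (upsert)
}
```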
Advantages of using Entity Framework extensions:
- Easy to use.
- Flexible.
- Improves performance.
- Improves application responsiveness.
- Reduces database load by reducing the number of database round trips.
BulkInsert extension method:
Z.EntityFramework.Extensions provides two methods, BulkInsert and BulkInsertAsync, which allow us to insert a large number of entities into the database at once. The syntax for using the bulk insert extension methods is as follows:
context.BulkInsert(listStudents);
context.BulkInsertAsync(listStudents, cancellationToken);
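As a minimal sketch of the asynchronous variant (the wrapper method InsertStudentsAsync is a hypothetical name for illustration; the Student entity and EF_Demo_DBEntities context are the ones used later in this article):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static async Task InsertStudentsAsync(IList<Student> listStudents, CancellationToken cancellationToken)
{
    using (var context = new EF_Demo_DBEntities())
    {
        // Await the bulk insert; the token lets the caller cancel a long-running operation
        await context.BulkInsertAsync(listStudents, cancellationToken);
    }
}
```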
Example of using the BulkInsert method of Entity Framework Extensions:
In the following example, we use the BulkInsert extension method to insert a list of students (i.e., newStudents) into the database. Note that we don’t need to call the SaveChanges method when performing a bulk insert operation. In this example, the context class performs the INSERT operation using a single round trip.
using System;
using System.Collections.Generic;
using System.Linq;

namespace DBFirstApproach
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("BulkInsert Method Started");

            IList<Student> newStudents = new List<Student>()
            {
                new Student() { FirstName = "John", LastName = "Taylor", StandardId = 1 },
                new Student() { FirstName = "Sara", LastName = "Taylor", StandardId = 1 },
                new Student() { FirstName = "Pam", LastName = "Taylor", StandardId = 1 },
            };

            using (var context = new EF_Demo_DBEntities())
            {
                // Log the generated SQL to the console
                context.Database.Log = Console.Write;

                // Easy to use: inserts all entities without a SaveChanges call
                context.BulkInsert(newStudents);
            }

            Console.WriteLine("BulkInsert Method Completed");
            GetStudents("Taylor");
            Console.Read();
        }

        public static void GetStudents(string lastName)
        {
            using (var context = new EF_Demo_DBEntities())
            {
                // The Equals overload taking a StringComparison cannot be translated by
                // LINQ to Entities, so a plain equality comparison is used here (SQL Server
                // performs it case-insensitively under the default collation)
                var studentsList = context.Students.Where(std => std.LastName == lastName);

                foreach (var std in studentsList)
                {
                    Console.WriteLine($"FirstName : {std.FirstName}, LastName : {std.LastName}, StandardId : {std.StandardId}");
                }
            }
        }
    }
}
When you run the above code, you will get the following output. If you look closely at the SQL, you will see that it uses a SQL MERGE statement to perform the bulk operation. Because it reduces the number of round trips to the database server, it greatly improves application performance.
BulkInsert Method Started
Opened connection at 12-12-2022 19:45:45 +05:30
-- Executing Command:
/* SELECT server information */
SELECT @@VERSION
/* SELECT table information */
SELECT DestinationTable.Name AS DestinationName,
       (SELECT 1 WHERE EXISTS (SELECT 1 FROM sys.triggers AS X WHERE X.parent_id = A.object_id AND X.is_disabled = 0 AND OBJECTPROPERTY(X.object_id, 'ExecIsInsertTrigger') = 1)) AS HasInsertTrigger,
       (SELECT 1 WHERE EXISTS (SELECT 1 FROM sys.triggers AS X WHERE X.parent_id = A.object_id AND X.is_disabled = 0 AND OBJECTPROPERTY(X.object_id, 'ExecIsUpdateTrigger') = 1)) AS HasUpdateTrigger,
       (SELECT 1 WHERE EXISTS (SELECT 1 FROM sys.triggers AS X WHERE X.parent_id = A.object_id AND X.is_disabled = 0 AND OBJECTPROPERTY(X.object_id, 'ExecIsDeleteTrigger') = 1)) AS HasDeleteTrigger
FROM (SELECT @Table_0 AS Name) AS DestinationTable
LEFT JOIN sys.synonyms AS B ON B.object_id = OBJECT_ID(DestinationTable.Name) AND COALESCE(PARSENAME(base_object_name,4), @@SERVERNAME) = @@SERVERNAME AND COALESCE(PARSENAME(base_object_name,3), DB_NAME(DB_ID())) = DB_NAME(DB_ID())
INNER JOIN sys.tables AS A ON A.object_id = OBJECT_ID(DestinationTable.Name) OR A.object_id = OBJECT_ID(B.base_object_name)
ORDER BY DestinationName
/* SELECT column information */
SELECT DestinationTable.Name AS DestinationName, C.name AS ColumnName, C.column_id AS ColumnOrder, C.precision AS Precision, C.scale AS Scale, C.max_length AS MaxLength, C.collation_name AS Collation, C.Is_Identity AS IsIdentity,
       (CASE WHEN EXISTS (SELECT 1 FROM sys.index_columns AS X WHERE X.index_id = B.index_id AND X.object_id = B.object_id AND X.column_id = C.column_id) THEN 1 ELSE 0 END) AS IsPrimaryKey,
       C.system_type_id AS System_Type_Id, D.Name AS TypeName, (CASE WHEN E.base_object_name IS NOT NULL THEN 1 ELSE 0 END) AS IsSynonym, D.is_user_defined, F.name,
       CASE WHEN C.default_object_id = 0 THEN 'ZZZ_NO_DEFAULT' ELSE ISNULL(OBJECT_DEFINITION(C.default_object_id), 'ZZZ_ERROR_DEFAULT_ZZZ') END AS DefaultValueSql,
       C.is_nullable
FROM (SELECT @Table_0 AS Name) AS DestinationTable
LEFT JOIN sys.synonyms AS E ON E.object_id = OBJECT_ID(DestinationTable.Name) AND COALESCE(PARSENAME(base_object_name,4), @@SERVERNAME) = @@SERVERNAME AND COALESCE(PARSENAME(base_object_name,3), DB_NAME(DB_ID())) = DB_NAME(DB_ID())
INNER JOIN sys.tables AS A ON A.object_id = OBJECT_ID(DestinationTable.Name) OR A.object_id = OBJECT_ID(E.base_object_name)
LEFT JOIN sys.indexes AS B ON B.object_id = A.object_id AND B.is_primary_key = 1
INNER JOIN sys.columns AS C ON C.object_id = A.object_id
INNER JOIN sys.types AS D ON D.system_type_id = C.system_type_id AND D.user_type_id = C.user_type_id
INNER JOIN sys.schemas AS F ON D.schema_id = F.schema_id
ORDER BY DestinationName, ColumnOrder
-- @Table_0: [dbo].[Student] (Type = String, Size = 15)
-- CommandTimeout:30
-- Executing at 12-12-2022 19:45:45
-- Completed at 12-12-2022 19:45:45
-- Result: SqlDataReader
-- Executing Command:
MERGE INTO [dbo].[Student] AS DestinationTable
USING (SELECT TOP 100 PERCENT * FROM
        (SELECT @0_0 AS [StudentId], @0_1 AS [FirstName], @0_2 AS [LastName], @0_3 AS [StandardId], @0_4 AS ZZZ_Index
         UNION ALL SELECT @1_0 AS [StudentId], @1_1 AS [FirstName], @1_2 AS [LastName], @1_3 AS [StandardId], @1_4 AS ZZZ_Index
         UNION ALL SELECT @2_0 AS [StudentId], @2_1 AS [FirstName], @2_2 AS [LastName], @2_3 AS [StandardId], @2_4 AS ZZZ_Index) AS StagingTable
       ORDER BY ZZZ_Index) AS StagingTable
ON 1 = 2
WHEN NOT MATCHED THEN
    INSERT ([FirstName], [LastName], [StandardId])
    VALUES ([FirstName], [LastName], [StandardId])
OUTPUT $action, StagingTable.ZZZ_Index, INSERTED.[StudentId] AS [StudentId_zzzzinserted];
-- @0_0: 0 (Type = Int32, Size = 4)
-- @0_1: John (Type = AnsiString, Size = 100)
-- @0_2: Taylor (Type = AnsiString, Size = 100)
-- @0_3: 1 (Type = Int32, Size = 4)
-- @0_4: 0 (Type = Int32, Size = 0)
-- @1_0: 0 (Type = Int32, Size = 4)
-- @1_1: Sara (Type = AnsiString, Size = 100)
-- @1_2: Taylor (Type = AnsiString, Size = 100)
-- @1_3: 1 (Type = Int32, Size = 4)
-- @1_4: 1 (Type = Int32, Size = 0)
-- @2_0: 0 (Type = Int32, Size = 4)
-- @2_1: Pam (Type = AnsiString, Size = 100)
-- @2_2: Taylor (Type = AnsiString, Size = 100)
-- @2_3: 1 (Type = Int32, Size = 4)
-- @2_4: 2 (Type = Int32, Size = 0)
-- CommandTimeout:120
-- Executing at 12-12-2022 19:45:45
-- Completed at 12-12-2022 19:45:46
-- Result: 3 rows
Closed connection at 12-12-2022 19:45:46 +05:30
BulkInsert Method Completed
FirstName : John, LastName : Taylor, StandardId : 1
FirstName : Sara, LastName : Taylor, StandardId : 1
FirstName : Pam, LastName : Taylor, StandardId : 1
Performance comparison of SaveChanges and BulkInsert in Entity Framework:
Let’s look at a performance benchmark comparing the Entity Framework SaveChanges method with the BulkInsert extension method through an example. The SaveChanges method generates and executes a separate SQL command for each entity, i.e., multiple database round trips, while the BulkInsert extension method performs the same task in a single round trip.
We will insert 1000 students using two approaches, the AddRange method (followed by SaveChanges) and the BulkInsert extension method, and measure the time each takes to complete the task. In the example below, the FirstTimeExecution method is excluded from the performance measurement: it is only a warm-up run, because the first execution of any operation carries extra one-time costs.
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace DBFirstApproach
{
    class Program
    {
        static void Main(string[] args)
        {
            // Warm-up run; not included in the performance measurements
            FirstTimeExecution();

            // Generate 1000 Students
            List<Student> studentList = GenerateStudents(1000);

            Stopwatch saveChangesStopwatch = new Stopwatch();
            Stopwatch bulkInsertStopwatch = new Stopwatch();

            using (EF_Demo_DBEntities context1 = new EF_Demo_DBEntities())
            {
                // Add the student collection using the AddRange method
                context1.Students.AddRange(studentList);

                saveChangesStopwatch.Start();
                context1.SaveChanges();
                saveChangesStopwatch.Stop();
                Console.WriteLine($"SaveChanges, Entities : {studentList.Count}, Time Taken : {saveChangesStopwatch.ElapsedMilliseconds} MS");
            }

            using (EF_Demo_DBEntities context2 = new EF_Demo_DBEntities())
            {
                bulkInsertStopwatch.Start();
                // Performance can be improved further with options
                context2.BulkInsert(studentList, options => options.AutoMapOutputDirection = false);
                bulkInsertStopwatch.Stop();
                Console.WriteLine($"BulkInsert, Entities : {studentList.Count}, Time Taken : {bulkInsertStopwatch.ElapsedMilliseconds} MS");
            }

            Console.Read();
        }

        public static void FirstTimeExecution()
        {
            List<Student> studentsList = GenerateStudents(20);

            // SaveChanges
            using (var context = new EF_Demo_DBEntities())
            {
                context.Students.AddRange(studentsList);
                // Call the SaveChanges method to INSERT the data into the database
                context.SaveChanges();
                // Delete the newly inserted data
                context.BulkDelete(studentsList);
            }

            // BulkInsert
            using (var context = new EF_Demo_DBEntities())
            {
                context.BulkInsert(studentsList, options => options.AutoMapOutputDirection = false);
                // Delete the newly inserted data
                context.BulkDelete(studentsList);
            }
        }

        public static List<Student> GenerateStudents(int count)
        {
            var listOfStudents = new List<Student>();
            for (int i = 0; i < count; i++)
            {
                listOfStudents.Add(new Student()
                {
                    FirstName = "FirstName_" + i,
                    LastName = "LastName_" + i,
                    StandardId = 1
                });
            }
            return listOfStudents;
        }
    }
}
Now, run the above code and you will get the following output. As you can see, the SaveChanges method took 490 milliseconds to insert 1000 entities into the database, while the BulkInsert extension method took only 20 milliseconds to insert the same 1000 entities. You can imagine how costly the SaveChanges approach becomes when performance matters.
Note: Many factors affect these baseline times, such as indexes, column types, latency, and constraints.
Why is the BulkInsert extension method faster than SaveChanges?
Inserting thousands of entities for an initial load or a file import is a typical scenario. The SaveChanges method makes this very slow, sometimes practically impossible, because of the number of database round trips required: SaveChanges performs one round trip per entity inserted. So if 10,000 entities need to be inserted, 10,000 database round trips are performed, which makes it slow.
The BulkInsert counterpart requires as few database round trips as possible. For example, behind the scenes on SQL Server it executes SqlBulkCopy, the fastest method available for inserting 10,000 entities.
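For comparison, here is a minimal sketch of using the raw ADO.NET SqlBulkCopy API directly, which is the kind of mechanism such libraries use under the covers for SQL Server. The connection string is a placeholder, and the DataTable is assumed to have columns matching the dbo.Student table from this article.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class StudentBulkLoader
{
    public static void BulkCopyStudents(DataTable studentRows, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                // All rows are streamed to the server in one bulk operation,
                // not one INSERT round trip per row
                bulkCopy.DestinationTableName = "dbo.Student";
                bulkCopy.WriteToServer(studentRows);
            }
        }
    }
}
```

This is why bulk approaches scale: the per-row cost is a stream write, not a full command round trip.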