ADO.NET 2.0 Tutorial : SqlBulkCopy Revisited for Transferring Data at High Speeds

ADO.NET 2.0 Tutorial : SqlBulkCopy Revisited for Transferring Data at High Speeds

by David Hayden ( .NET Developer )

 

I answered a question in the MSDN Forums today regarding SqlBulkCopy and for some reason I was under the impression that you had to give SqlBulkCopy an IDataReader as a source of data. I don't have a clue as to where that little bit of nonsense came from :)

It turns out the SqlBulkCopy.WriteToServer Method, which is responsible for writing the data to a database table, will accept the following as a source of data:

  • DataTable
  • DataRow[]
  • IDataReader

So, if you have a DataTable to begin with, no sense using the DataTable.CreateDataReader() method when SqlBulkCopy will accept a DataTable directly.  Here is a bit of code, but note that I would not waste the time in populating a DataTable if you are pulling the records from an existing database. The question in the MSDN Forums had to deal with the records already existing in a DataTable and I am just creating a DataTable to show it as a data source.  See my other post:

for another example.

 

// Get data in a DataTable
DataTable table = new DataTable();

string northwindConnectionString = "...Northwind...";

using (SqlConnection connection =
new SqlConnection(northwindConnectionString)) { using (SqlCommand command = connection.CreateCommand()) { command.CommandText = "SELECT * FROM Categories"; connection.Open(); using (IDataReader dr = command.ExecuteReader
(CommandBehavior.CloseConnection)) { table.Load(dr); } } }
// Upload DataTable to a Database Table string destinationConnectionString = "..."; using (SqlBulkCopy copy =
new SqlBulkCopy(destinationConnectionString)) { copy.DestinationTableName = "Categories"; copy.WriteToServer(table); }

 

While I am talking about SqlBulkCopy again, I should mention Pablo Castro's, the Program Manager for the ADO.NET Team at Microsoft, great explanation as to why SqlBulkCopy is faster than using normal insert commands:

"There are a number of reasons why bulk-copy is faster. Here is a summary:

  • No per-row statement execution. When you do multiple inserts without bulk-copy, each insert is a statement in itself (regardless of whether it's batched together with other statements). With bulk-copy, we don't incur the cost of executing a statement for each row, the whole copy operation is a single thing.
  • No multiple network round-trips. Once the bulk-insert operation is setup, we send rows from the client to the server continously, without going back-and-forth over the wire.
  • Server storage engine also can greatly optimize how rows are inserted when performing a bulk-copy operation. How much can be optimized depends a lot on the recovery model the tarder database is set to; in "simple" and "bulk logged" the overhead of logging is greatly reduced during bulk-copy operations, helping a lot with performance.

Now, as to "how" that happens, it's hard to describe without going down to the details of the SQL Server client-server protocol. The short story is that we setup a BCP operation by sending a special statement to the server that includes metadata about the row-set we're about to set; that switches the session to bulk-copy mode, at which point the client and start sending the row stream to the server one after the other; as rows come the server does minimal processing in the upper layers, rows go straight to the storage layer. Once the client is done it sends a "done" marker and the operation is completed in the server. This operation can happen in "batches" of rows or all at once." - Pablo Castro

ADO.NET 2.0 is just plain cool. I could talk about it for days :)

Source:  David Hayden ( .NET Developer )

 

posted on Wednesday, March 08, 2006 7:16 PM

Main

News

Green Tea

.NET Development

Enterprise Library

Patterns & Practices