LINQ to SQL - Performance Tradeoffs: Less Database Roundtrips but Duplicated Data in Each Row??
by David Hayden ( Microsoft MVP C# ), Filed: LINQ to SQL
I talked about how LINQ to SQL will prefetch or lazy load relationships via a LinqDataSource based on setting the LoadOptions on the DataContext in the following post:
LinqDataSource - High Performance Queries Using DataLoadOptions - Avoiding Database Roundtrips
Avoiding Database Roundtrips has its performance advantages, but you certainly can't use it as the only judge of data access layer performance. You also don't want to get into the habit of doing premature optimization, which I often look at as sacrificing maintainability for performance improvements based on no performance requirements to do so.
Still, we can't be using LINQ to SQL in production applications if we don't have a clue how it generates queries to the database. Keeping in mind that I am playing here to get a feel for how LINQ to SQL generates queries, etc., I was quite amazed at how LINQ to SQL will handle returning a blog and prefetching its categories.
First thing I did was generate some code that specified fetching a Blog and its categories with a BlogId = 1. Notice the use of the DataLoadOptions to specify I want Categories prefetched with the Blog:
using (BlogDataContext context = new BlogDataContext())
{
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Blog>(c => c.Categories);
context.LoadOptions = options;
Blog blog = context.Blogs.Single<Blog>(c => c.BlogId == 1);
}
If the Lambda Expressions above are a bit confusing, they are just a simpler way of writing anonymous delegates that I talked about in the following post:
C# 3.0 Features - C# 3.0 Examples - Query Expressions - Anonymous Types - Lamba Expressions - Extension Methods
So the question is how do the results come back from the prefetch? The results may or may not suprise you:
Notice that the results are returned in a single resultset, with the Blog data duplicated and returned 5 times in this case - once for each Category associated with the Blog. I didn't ask for a Count, but obviously LINQ to SQL felt it was necessary.
In this case, the repetition of the Blog Data is not real heavy, but think about if you had say a large description field ( NVARCHAR(MAX)) or binary data of some sort being returned several times. This could cause large amounts of data being sent over and over again.
Interesting stuff.
Also Read:
Source: David Hayden ( Microsoft MVP C# )
Filed: LINQ to SQL