Deferred execution with LINQ

Querying collections of objects in C# is a job for LINQ, the popular component introduced about ten years ago in the .NET Framework.

With LINQ we can write queries on objects with a SQL-like syntax. It gives us a lot of flexibility but, as with all powerful tools, we need to take care of some aspects that can affect the performance of our queries.

Recently I faced a problem in a piece of code whose first version performed badly. The code simply mapped a list of objects to another one. It looked correct, but processing a list of a thousand objects was very slow, about ten times slower than expected. The cause was LINQ's deferred execution.

The models

Suppose that I have a person model like this:

    class Person
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public Address Address { get; set; }
    }

    class Address
    {
        public string Street { get; set; }
        public string Zip { get; set; }
        public string City { get; set; }
    }

I want to map this model to another one:

    class PersonModel
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Address { get; set; }
    }

So I want to denormalize a person object with an address into another one with a flat representation of the address itself.
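The rest of the article works on a list called persons. How that list is built is not shown, so the following is only a sketch of one way to create test data. The SampleData class, the CreatePersons helper and all the generated values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Address
{
    public string Street { get; set; }
    public string Zip { get; set; }
    public string City { get; set; }
}

class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Address Address { get; set; }
}

static class SampleData
{
    // Hypothetical helper: builds `count` persons with made-up values.
    public static List<Person> CreatePersons(int count)
    {
        return Enumerable.Range(1, count)
            .Select(i => new Person
            {
                FirstName = $"First{i}",
                LastName = $"Last{i}",
                Address = new Address
                {
                    Street = $"{i} Main Street",
                    Zip = (10000 + i).ToString(),
                    City = "Sample City"
                }
            })
            .ToList(); // materialize, so the test data is created only once
    }
}

class Program
{
    static void Main()
    {
        List<Person> persons = SampleData.CreatePersons(1000);
        Console.WriteLine(persons.Count); // 1000
    }
}
```

The ToList() at the end matters here too: without it, every enumeration of the test data would create a fresh set of Person objects.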

The code

Now suppose that I have a list of 1000 persons and I want to map them to person models. I can do this with a simple LINQ Select statement:

    var personsModels = persons.Select(p => new PersonModel()
    {
        FirstName = p.FirstName,
        LastName = p.LastName,
        Address = $"{p.Address.Street} ({p.Address.Zip}) {p.Address.City}"
    });
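It is worth noting that a Select call like the one above only defines the query. The lambda does not run until something enumerates the result, and it runs again on every new enumeration. The following minimal sketch makes this visible with a counter. The DeferredDemo class and the calls counter are illustrative names, and a list of integers stands in for the persons list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Records how many times the projection has run after each step.
    public static int[] CountCalls()
    {
        var numbers = new List<int> { 1, 2, 3 };
        int calls = 0;

        // Defining the query does not run the lambda.
        IEnumerable<int> doubled = numbers.Select(n => { calls++; return n * 2; });
        int afterDefinition = calls;

        // Enumerating the query runs the lambda once per element.
        foreach (int d in doubled) { }
        int afterFirstPass = calls;

        // Enumerating it a second time runs the lambda all over again.
        foreach (int d in doubled) { }
        int afterSecondPass = calls;

        return new[] { afterDefinition, afterFirstPass, afterSecondPass };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", CountCalls())); // 0, 3, 6
    }
}
```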

Now I can use NUnit to check the result, like this:

    for (int i = 0; i < personsModels.Count(); i++)
    {
        // Check the element at the current index
        Assert.IsNotEmpty(personsModels.ElementAt(i).Address);
    }

The execution time of this test is around 200 milliseconds for 1000 objects.

At first this sounds fine, but it is not. First of all, 200 milliseconds to map 1000 objects is unacceptable. Second, look at what happens if, instead of deferred execution, I adopt immediate execution, like this:

    var personsModels = persons.Select(p => new PersonModel()
    {
        FirstName = p.FirstName,
        LastName = p.LastName,
        Address = $"{p.Address.Street} ({p.Address.Zip}) {p.Address.City}"
    }).ToList();

The result is that the same code runs in 20 milliseconds, ten times faster than the previous test.

The explanation is that in the first version the query is re-evaluated every time the test touches it: each call to Count() and ElementAt() enumerates the Select over the persons list again and creates new PersonModel objects. What I wrote in the first piece of code is only an expression, whose evaluation is deferred until the results are requested.
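One way to see the re-evaluation is to count how often the projection lambda runs instead of measuring time. The sketch below repeats the same index-based loop over a deferred query and over a materialized list. SelectorCallCounter and CountSelectorCalls are hypothetical names, and a list of integers stands in for the persons list. The exact number of calls in the deferred case depends on the .NET runtime in use, so no figure is given for it; in the materialized case the lambda runs exactly once per element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectorCallCounter
{
    // Runs the index-based loop over a query of `size` elements and
    // returns how many times the projection lambda was invoked.
    public static int CountSelectorCalls(int size, bool materialize)
    {
        List<int> source = Enumerable.Range(1, size).ToList();
        int calls = 0;

        IEnumerable<string> query = source.Select(n => { calls++; return $"item {n}"; });
        if (materialize)
        {
            // Run the projection once and keep the results in a list.
            query = query.ToList();
        }

        // Same access pattern as the test: Count() and ElementAt() per iteration.
        for (int i = 0; i < query.Count(); i++)
        {
            _ = query.ElementAt(i);
        }

        return calls;
    }

    static void Main()
    {
        Console.WriteLine($"deferred:     {CountSelectorCalls(1000, false)} selector calls");
        Console.WriteLine($"materialized: {CountSelectorCalls(1000, true)} selector calls");
    }
}
```

After ToList() the loop works on a List<string>, so Count() and ElementAt() read the stored results and the lambda is never called again.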

With the second version, the expression is evaluated immediately and the results are materialized in a list.
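Materializing is not the only way to avoid the repeated evaluation. Assuming the results only need to be read once, a single foreach over the deferred query enumerates it exactly once, so the projection also runs once per element. This is a sketch of that alternative, not the approach used in the original code. SinglePassDemo and CheckAll are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SinglePassDemo
{
    // Checks every projected value in one pass and returns how many
    // times the projection lambda was invoked.
    public static int CheckAll(List<string> names)
    {
        int calls = 0;
        IEnumerable<string> greetings = names.Select(n => { calls++; return $"Hello {n}"; });

        // One foreach means one enumeration of the deferred query.
        foreach (string greeting in greetings)
        {
            if (string.IsNullOrEmpty(greeting))
            {
                throw new InvalidOperationException("Empty greeting");
            }
        }

        return calls;
    }

    static void Main()
    {
        var names = new List<string> { "Ada", "Grace", "Linus" };
        Console.WriteLine(CheckAll(names)); // 3
    }
}
```

If the results are needed more than once, or by index, ToList() remains the simpler choice.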

This is a common mistake that even expert programmers can make, and tracking it down later may not be easy.

 
