Beware of multiple enumeration of IEnumerable

Posted on February 17, 2023

We've all been there, one day or another, debugging a performance issue in production. Such issues are hard to predict and especially difficult to track down the first time. Fortunately, though, there is a lot we can do to prevent them, and today we'll talk about the IEnumerable interface in C#.

If you're using ReSharper or Rider from JetBrains, you're probably already aware of the famous warning: "Possible multiple enumeration of IEnumerable". Microsoft eventually implemented an equivalent rule as well, CA1851: https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1851. It's something that, I would say, a lot of developers easily overlook. As JetBrains' article mentions, not every occurrence of the warning means something is actually wrong in the code. Still, it's a reminder that shouldn't be ignored and should be taken seriously.

I'm still often being asked today: "But Dave, what does it mean, why do I have this warning? What can I do to resolve it?" These are all legitimate questions. To help you understand why it's important, I'll take an example using Entity Framework. Assuming we have the following DbContext & table:

public class ExampleDbContext : DbContext
{
    public virtual DbSet<User> Users { get; set; }
}

public class User
{
    public long Id { get; set; }
    public string FullName { get; set; }
    public string PhoneNumber { get; set; }
}

And the following service:

public interface IExampleService
{
    IEnumerable<User>? GetUsers();
}

public class ExampleService : IExampleService
{
    private readonly ExampleDbContext _exampleDbContext;

    public ExampleService(ExampleDbContext exampleDbContext)
    {
        _exampleDbContext = exampleDbContext;
    }
    
    public IEnumerable<User>? GetUsers()
    {
        return _exampleDbContext.Users;
    }
}

Here is a pattern I've seen in the past, and it's a code smell:

// Assuming IExampleService is injected in the IoC container. 
IExampleService exampleService = default;

var myUsers = exampleService.GetUsers();

if (myUsers.Any())
{
    foreach (var myUser in myUsers)
    {
        // do something per user.
    }
}

Normally, under JetBrains Rider or ReSharper, you would see the "Possible multiple enumeration of IEnumerable" warning underlining the variable "myUsers".

In our previous example, we're asking Entity Framework to run two database queries: the first one only to find out whether at least one user exists, and the second one to fetch all users. The code behaves this way because of how an IEnumerable works. Contrary to what we might expect, an IEnumerable doesn't necessarily hold its content in memory: it can be lazy, producing its items one by one, and only when something actually asks for them. And I do mean only when asked. See, for the following line of code:

var myUsers = exampleService.GetUsers();

It will technically do "nothing" against the database. Try it under your debugger: the call returns the DbSet right away, and if you watch the SQL that Entity Framework sends (through its logging or a database profiler), nothing shows up once you step past the variable assignment. A query will show up, though, if you force the IEnumerable to load in your debugger, for example by expanding its results view. When you ask the IEnumerable whether there are "Any" objects inside it, it executes the enumeration, and thus the underlying concrete code responsible for loading its content. The same goes for the foreach loop and for every LINQ extension that has to produce a result right away (Count, First, Sum, Aggregate, ToList, etc.). Extensions such as Where and Select are deferred instead: they only wrap the source and enumerate it when their own result is enumerated. So, all in all, since we call "Any" and then iterate over our enumerable in a foreach loop, we're enumerating our IEnumerable twice, thus asking Entity Framework for our users twice, causing two SQL queries. I think you can see where I'm going with that.
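To see the double enumeration without a database, here is a minimal sketch. The names DeferredDemo, LoadNames and Enumerations are made up for the illustration, and the counter simply stands in for "a SQL query was sent"; it is not how Entity Framework works internally, only the same lazy behaviour reproduced with an iterator method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical counter: incremented each time the sequence is started from the top.
    public static int Enumerations;

    // Iterator method: its body does not run until someone enumerates the result.
    public static IEnumerable<string> LoadNames()
    {
        Enumerations++; // stands in for "a query was sent to the database"
        yield return "Ada";
        yield return "Grace";
    }

    public static void Main()
    {
        var names = LoadNames();          // nothing has run yet
        Console.WriteLine(Enumerations);  // prints 0

        if (names.Any())                  // first enumeration (stops after one item)
        {
            foreach (var name in names)   // second enumeration, from the start
            {
                Console.WriteLine(name);
            }
        }

        Console.WriteLine(Enumerations);  // prints 2
    }
}
```

The counter going from 0 to 2 is the in-memory equivalent of the two SQL queries described above.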

In the past, I've resolved production performance issues by simply enumerating once, up front, with "ToArray" or "ToList". There were occasions where I found bits of code doing N+1 requests against the database.

So, with the previous example, we could have done something like the following to prevent multiple enumerations:

// Assuming IExampleService is injected in the IoC container. 
IExampleService exampleService = default;

var myUsers = exampleService.GetUsers();
var users = myUsers as User[] ?? myUsers.ToArray();

if (users.Any())
{
    foreach (var myUser in users)
    {
        // do something per user.
    }
}

Since the variable users holds the result of a single enumeration of my IEnumerable, any further manipulation of the array only works with the in-memory values.
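The effect of materializing first can be checked the same way, again without a database. This is only a sketch: MaterializeDemo, LoadIds and the Enumerations counter are hypothetical names, and the counter stands in for the number of queries the source would have to run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Hypothetical counter: how many times the lazy source was enumerated.
    public static int Enumerations;

    // Lazy source, standing in for a query that hits the database.
    public static IEnumerable<int> LoadIds()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        int[] ids = LoadIds().ToArray(); // the only enumeration of the source

        // Everything below reads the array, never the lazy source.
        bool any = ids.Any();
        int total = ids.Sum();
        int count = ids.Length;

        Console.WriteLine($"{any} {total} {count} {Enumerations}"); // prints "True 6 3 1"
    }
}
```

However many times the array is read afterwards, the source itself has been enumerated exactly once.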

Some people ask me whether they should be using IEnumerable at all. As much as I understand the concern, it depends on what you do in your code. It's not bad per se; you just shouldn't use it systematically for everything.

For our previous example, I see two other ways you could expose your users through the ExampleService. You could decide, design-wise, to allow additional querying on top of this API call; in that case I would use an IQueryable<User> return type. This explicitly tells your consumer: hey, be aware that you need to conclude the query with a "ToArray" or something like that. Alternatively, you could simply return an array of User or, if you want to be cool, an IReadOnlyCollection<User>. This approach tells consumers of your API that the users are already loaded, so no further enumeration of the source is needed and the in-memory list is enough.
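Here is a rough sketch of what those two return types could look like side by side. Everything in it is hypothetical (IUserCatalog, InMemoryUserCatalog, UserDto), and it uses an in-memory list with AsQueryable instead of a real DbContext so that it runs on its own; with Entity Framework, the Where clause would be translated into SQL rather than evaluated in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified user type for the illustration.
public record UserDto(long Id, string FullName);

// Hypothetical service contract showing both options.
public interface IUserCatalog
{
    // Option 1: the caller may keep composing the query and must materialize it.
    IQueryable<UserDto> QueryUsers();

    // Option 2: the data is already loaded; the caller gets an in-memory snapshot.
    IReadOnlyCollection<UserDto> GetUsers();
}

public class InMemoryUserCatalog : IUserCatalog
{
    private readonly List<UserDto> _users = new()
    {
        new UserDto(1, "Ada Lovelace"),
        new UserDto(2, "Grace Hopper"),
    };

    public IQueryable<UserDto> QueryUsers() => _users.AsQueryable();

    public IReadOnlyCollection<UserDto> GetUsers() => _users.ToArray();
}

public static class ReturnTypeDemo
{
    public static void Main()
    {
        IUserCatalog catalog = new InMemoryUserCatalog();

        // IQueryable: compose first, then conclude with ToArray.
        UserDto[] filtered = catalog.QueryUsers()
            .Where(u => u.FullName.StartsWith("G", StringComparison.Ordinal))
            .ToArray();

        // IReadOnlyCollection: Count is available without enumerating anything lazy.
        IReadOnlyCollection<UserDto> all = catalog.GetUsers();

        Console.WriteLine($"{filtered.Length} of {all.Count}"); // prints "1 of 2"
    }
}
```

Either way, the return type itself documents who is responsible for running the query, which is the whole point of not returning a bare IEnumerable here.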