EF Core Tricks for Bulk Reading Large Data Sets

Working with Entity Framework Core (EF Core) is fantastic for building modern applications, but when data sets grow large, performance bottlenecks quickly appear.

One common challenge? Filtering and reading thousands or even millions of records efficiently. The default LINQ Where + Contains approach works fine for smaller lists, but it struggles when scaled.

This is where Entity Framework Extensions (EF Extensions) shines. With powerful bulk reading methods such as WhereBulkContains, WhereBulkNotContains, and BulkRead, developers can handle massive queries with minimal overhead.

The Problem with Standard Where + Contains

Let’s start with a familiar example:

C#
using Microsoft.EntityFrameworkCore;

List<int> targetProductIds = [1, 2, 3 /* … thousands more */];

var products = await dbContext.Products
    .Where(p => targetProductIds.Contains(p.Id))
    .ToListAsync();

This is simple and effective – until productIds grows huge. At that point, EF Core generates a SQL IN clause with thousands of values, leading to:

  • Slow SQL execution
  • High memory usage
  • SQL parameter count limits
  • No support for composite keys

Installing Entity Framework Extensions

To install Entity Framework Extensions, you can add the package from NuGet:

You can install it using the .NET CLI:

dotnet add package Z.EntityFramework.Extensions.EFCore

Or from the Package Manager Console in Visual Studio:

Install-Package Z.EntityFramework.Extensions.EFCore

Once installed, you’ll unlock the optimized bulk methods we’re about to explore.

WhereBulkContains

The WhereBulkContains method retrieves entities from the database that exist in a given list. Instead of generating a massive IN clause, it works like an INNER JOIN under the hood .

When you pass a large collection, EF Extensions creates a temporary table and populates it using its highly optimized BulkInsert method. Then it performs an INNER JOIN between your main table and this temporary table.

This design allows you to query with an unlimited number of items, without hitting SQL parameter limits or suffering from slow execution plans.

C#
using Z.EntityFramework.Extensions;

 var promotionProductCodes = promotionCsvRows
     .Select(r => r.ProductCode)
     .ToList();

 var productsInPromotion = await dbContext.Products
     .WhereBulkContains(promotionProductCodes, p => p.Code)
     .ToListAsync();

With WhereBulkContains, you’re not limited to a single type of input. It can handle unlimited items and supports a wide range of list formats:

  • Basic types – e.g., List, List
  • Entity types – e.g., List
  • Anonymous types – e.g., new { ProductId = 1 }
  • Dynamic/Expando objects – for flexible key/value structures

On top of that, you can define custom joins with one or multiple properties, making it suitable for composite or surrogate keys.

For example, suppose your marketing team uploads a CSV file with 50,000 product codes that are eligible for an upcoming promotion. You need to quickly fetch those products from the database to apply discounts. With the default Contains, the SQL IN statement would become unwieldy and slow. With WhereBulkContains, EF Extensions offloads the list into a temporary table and runs an efficient join, allowing the query to scale effortlessly.

Is It Faster than Contains?

Not always. For small lists, Contains might be quicker since it uses a simple IN clause.

But WhereBulkContains wins when:

  • Lists contain thousands or millions of items.
  • You need flexibility with key mappings.
  • Queries involve complex scenarios like surrogate keys.

WhereBulkNotContains

The WhereBulkNotContains method retrieves entities from the database that do not exist in a given list. Under the hood, it works like a WHERE NOT EXISTS statement.

For example:

C#
using Z.EntityFramework.Extensions;

var excludedProductIds = externalSnapshot
      .Select(x => x.ProductId)
      .ToList();

var productsToArchive = await dbContext.Products
     .WhereBulkNotContains(excludedProductIds, p => p.Id)
     .ToListAsync();

This method is essentially the counterpart of WhereBulkContains, optimized for exclusion queries. It’s especially useful when dealing with large external lists and identifying entities not present in them.

BulkRead

Sometimes you don’t just want to filter, but need to load entities directly from the database using a provided list. That’s when BulkRead becomes handy:

C#
using Z.EntityFramework.Extensions;

List<Order> erpOrders = [
    new() {  Id = Guid.NewGuid() },
    new() { Id = Guid.NewGuid() },
    // … thousands from ERP export
  ];
  
 var existingOrders = await dbContext.BulkReadAsync(erpOrders);

Internally, it uses WhereBulkContains followed by ToListAsync, so it’s essentially a streamlined way to query based on a list of entities.

Imagine syncing orders from an external ERP system. The ERP exports a JSON file with thousands of order numbers that were just processed. Before updating them in your application, you want to fetch all matching orders from your EF Core database. By using BulkRead, you can pass the list of order numbers directly, and EF Extensions will efficiently retrieve all the corresponding records in a single optimized batch. This approach is much faster than repeatedly querying orders one by one or relying on Contains for large lists.

What’s the Difference Between These Methods?

Here’s a quick breakdown:

  • WhereBulkContains – Retrieve entities from the database that exist in a given list (works like an INNER JOIN).
  • WhereBulkNotContains – Retrieve entities from the database that do not exist in a given list (works like WHERE NOT EXISTS).
  • BulkRead – Calls WhereBulkContains internally and retrieves entities directly.

So, while they’re closely related, each serves a slightly different purpose depending on whether you want to include, exclude, or fetch entities directly from a provided list.

Limitations

Currently supported databases:

  • SQL Server
  • PostgreSQL

Not supported:

  • ExecuteUpdate and ExecuteDelete (already covered by UpdateFromQuery and DeleteFromQuery)
  • Inheritance mapping strategies like TPC, TPH, and TPT

Summary

The standard Where + Contains approach works for small data sets, but it quickly becomes inefficient when the data size grows. Entity Framework Extensions provides bulk reading methods that handle these challenges head-on.

  • WhereBulkContains – Retrieves matching records directly from the database based on values in a given list, and runs as an INNER JOIN behind the scenes.
  • WhereBulkNotContains – Retrieves records from the database that are not part of the given list, working like a NOT EXISTS condition in SQL.
  • BulkRead – Loads entities from the database in bulk by internally calling WhereBulkContains, returning all matching records in a single efficient query.

Takeaways

When you’re working with a small set of IDs, stick with the standard Contains. It’s simple and fast. But once you’re dealing with thousands of items, WhereBulkContains is the smarter option. It scales to any size, supports flexible joins, and avoids SQL parameter limits.

For situations where you need to exclude items, WhereBulkNotContains handles the logic efficiently without relying on fragile negated Contains clauses. And when you already have a batch of entities or keys, BulkRead is the cleanest way to hydrate them all in a single query.

Keep in mind that these methods are supported on SQL Server and PostgreSQL only. For large-scale applications on those providers, EF Extensions can drastically simplify your queries and improve performance.

Found this article useful? Share it with your network and spark a conversation.