The N+1 Query Problem in EF Core: Benchmarking Every Fix on PostgreSQL
Actual BenchmarkDotNet numbers for N+1 vs Include, split queries, projections, and compiled queries in EF Core 10 on PostgreSQL 16.
Every EF Core blog post about N+1 queries tells you the same thing: avoid it, use Include() or split queries, and for extra performance use compiled queries. That advice is mostly right, partly wrong, and completely misses the headline.
Here’s what the data actually says on EF Core 10 and PostgreSQL 16: the gap between N+1 and any fix is 48×–140×. The gap between the best and worst fix is about 3×. Once you’ve killed the N+1, which specific fix you pick barely matters for performance. And two pieces of the standard advice don’t survive contact with modern EF Core: split queries can be slower than plain Include() on realistic data shapes, and compiled queries save around 3% — not the dramatic win they were in EF Core 6.
We benchmarked six approaches to the same load-a-hundred-orders-with-items problem — lazy loading (N+1), Include(), split query, projection, compiled projection, and raw SQL — on PostgreSQL 16 via Testcontainers, on .NET 10, using BenchmarkDotNet for statistical rigor. The full project is open source so you can run it yourself.
What We Tested
The data model is a realistic e-commerce schema: 10,000 orders, each with an average of 5 order items, pointing to a catalog of 1,000 products across 50 categories. Every scenario loads the same 100 orders with their full item-and-product tree, so the amount of data returned is identical across approaches. The only variable is how EF Core assembles the result.
Lazy loading (N+1) — the anti-pattern. Proxies issue one query to load the 100 orders, then one query per navigation property access as your code walks the graph. On this data shape, that works out to roughly 500 SQL statements per operation.
Eager loading (Include) — .Include(o => o.Items).ThenInclude(i => i.Product). One query, one round-trip, JOINs pulling everything back in a single result set.
Split query — the same Include() graph with .AsSplitQuery() appended. EF Core issues multiple queries (one per collection) instead of a single cartesian JOIN. Standard advice says this avoids the duplication caused by JOIN multiplication.
Projection (Select) — .Select() into a flat DTO with only the fields you actually need. One optimized SQL query with no entity tracking.
Compiled projection — the same projection, pre-compiled with EF.CompileAsyncQuery(). Historically a meaningful optimization in EF Core 6; we measure whether that still holds.
Raw SQL (manual JOIN) — FromSqlRaw with a hand-written JOIN statement. The performance floor — as close as you can get to Npgsql without abandoning EF Core entirely.
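To make the first two shapes concrete, here is a minimal sketch of the anti-pattern and its most common fix. The context and entity names (db, Order, Items, Product) are assumptions standing in for the benchmark project's actual model.

```csharp
using Microsoft.EntityFrameworkCore;

decimal total = 0;

// N+1: one query for the orders, then one hidden query per navigation access.
var lazyOrders = await db.Orders.Take(100).ToListAsync();
foreach (var order in lazyOrders)
{
    // With lazy-loading proxies, the first access to order.Items
    // (and to each item.Product) fires its own SELECT.
    foreach (var item in order.Items)
        total += item.Product.Price * item.Quantity;
}

// Eager loading: the same graph in a single JOIN query.
var eagerOrders = await db.Orders
    .Include(o => o.Items)
    .ThenInclude(i => i.Product)
    .Take(100)
    .ToListAsync();
```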
Methodology
Same standard as Post 1 of this series — full transparency on how these numbers were produced.
Hardware (primary): Hetzner CPX42 — 8 vCPUs (AMD EPYC-Genoa), 16 GB RAM, 320 GB NVMe SSD, eu-central datacenter, Ubuntu 24.04.3 LTS. A €25.49/month shared-vCPU cloud instance, representative of the infrastructure many .NET teams actually deploy to.
Hardware (secondary): Apple M4 Pro, 14 cores, macOS 26.4.1, Arm64 RyuJIT. Bare-metal laptop for cross-platform validation.
Runtime: .NET SDK 10.0.104 (Linux) / 10.0.102 (macOS), BenchmarkDotNet v0.15.4, configured with FullConfig — 100 iterations, 3 launches, 15 warmup iterations per benchmark.
Database: PostgreSQL 16 via Testcontainers, seeded with 10,000 orders / 50,000 order items / 1,000 products / 50 categories. Each benchmark run gets a fresh container — no stale caches, no shared state between scenarios.
What we measure: Mean latency per operation, memory allocated, standard deviation, and SQL query count via an EF Core command interceptor. The query count is what makes the N+1 story visible — it’s the difference between 1 SQL statement and 500+.
A note on variance: The N+1 benchmark has wide variance on Linux — standard deviation around 28% of the mean. That’s not a measurement bug. Running 500+ sequential queries through a containerized network inside a shared-vCPU cloud VM amplifies jitter in a way that a single JOIN query doesn’t. This is exactly the behavior N+1 exhibits in production — unstable, unpredictable, tail-latency-hostile. The high variance is part of the finding.
The full benchmark project, including the seed data and the MonitoredDbContext that counts SQL statements, is on GitHub: dotnet-performance-guide/02-ef-core-n-plus-one. Clone it, run dotnet run -c Release, and get your own numbers.
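The query counting described above can be wired up with a standard DbCommandInterceptor. A minimal sketch follows, with the caveat that the repository's MonitoredDbContext may implement it differently:

```csharp
using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore.Diagnostics;

// Counts every SQL reader command EF Core executes (SELECTs). Reset the
// counter before an operation, read it after, and you have the query count.
public sealed class QueryCountingInterceptor : DbCommandInterceptor
{
    private int _count;
    public int Count => _count;
    public void Reset() => _count = 0;

    public override InterceptionResult<DbDataReader> ReaderExecuting(
        DbCommand command,
        CommandEventData eventData,
        InterceptionResult<DbDataReader> result)
    {
        Interlocked.Increment(ref _count);
        return result;
    }

    public override ValueTask<InterceptionResult<DbDataReader>> ReaderExecutingAsync(
        DbCommand command,
        CommandEventData eventData,
        InterceptionResult<DbDataReader> result,
        CancellationToken cancellationToken = default)
    {
        Interlocked.Increment(ref _count);
        return ValueTask.FromResult(result);
    }
}

// Registered once on the context:
// optionsBuilder.AddInterceptors(new QueryCountingInterceptor());
```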
The Results
Full BenchmarkDotNet output from the Linux x64 run:
| Approach | Mean | StdDev | SQL Queries | Allocated | Ratio vs N+1 |
|---|---|---|---|---|---|
| Lazy loading (N+1) | 336.9 ms | 95,134 μs | ~500 | 10,843 KB | 1.00× |
| Split query | 7.0 ms | 1,977 μs | 3 | 1,232 KB | 0.021× |
| Eager loading (Include) | 5.4 ms | 1,331 μs | 1 | 889 KB | 0.016× |
| Projection (Select) | 2.9 ms | 491 μs | 1 | 344 KB | 0.0085× |
| Compiled projection | 2.4 ms | 178 μs | 1 | 327 KB | 0.0072× |
| Raw SQL (manual JOIN) | 2.4 ms | 496 μs | 1 | 312 KB | 0.0072× |
Every row except N+1 completes in single-digit milliseconds. N+1 takes 337 milliseconds — more than a third of a second to load 100 orders. That’s the entire story in one table.
The N+1 gap dwarfs everything else
Look at the Ratio column. N+1 is 48× slower than the slowest fix (split query at 7.0 ms) and 140× slower than the fastest (compiled projection at 2.4 ms). There is no middle ground here — no fix is 10% better than N+1, or 2× better. They’re all one to two orders of magnitude better.
The allocation story is even worse. N+1 burns 10.8 MB of managed memory per 100-order load. Projection uses 344 KB. That’s a 31× difference in garbage collection pressure for the same work.
And the SQL count makes it visceral: approximately 500 round-trips to PostgreSQL for the N+1 case versus a single round-trip for five of the six fixes. Every one of those 500 queries is a network hop, a query parse, a query plan lookup, and a result serialization. They add up fast.
The practical implication: if you’re debating between Include() and projection because you want your EF Core queries to be faster, you’re optimizing the 2% that remains. If you haven’t yet fixed the N+1, you’re looking at the wrong problem. Fix the N+1 first; then argue about which fix.
Surprise #1: Split query is slower than Include here
Conventional EF Core advice: when you have multiple collection Include()s, use .AsSplitQuery() to avoid cartesian explosion. Multiplying a parent’s rows by all its child collections produces duplicated data and bloats the result set.
Our numbers say split query (7.0 ms) is about 30% slower than plain Include() (5.4 ms) on this workload.
Why? Cartesian explosion only matters when child collections are large relative to parents. Each order in our seed data has around 5 items, each pointing to a single product. The JOIN doesn’t produce a wasteful cross product — it produces a result set about 5× the size of the orders alone, which is perfectly reasonable. Meanwhile, split query pays for three separate round-trips to PostgreSQL (one per collection level) without any cartesian savings to offset the extra latency.
Split query is a specific optimization for specific data shapes. If your child collections average 50+ items per parent, or you’re Include-ing multiple sibling collections that would each multiply the result set, split query wins. On typical business-domain data — small-to-medium collections, single-depth hierarchies — Include is faster.
Practical guidance: don’t reflexively reach for .AsSplitQuery() because a blog post said to. Measure on your actual data shape. If you’re loading orders-with-items where each order has 3–10 items, Include is probably your answer.
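Switching between the two is a one-line change, which makes this easy to A/B test on your own data shape. Entity and context names here are illustrative:

```csharp
using Microsoft.EntityFrameworkCore;

// Single JOIN (the default):
var joined = await db.Orders
    .Include(o => o.Items).ThenInclude(i => i.Product)
    .Take(100)
    .ToListAsync();

// Split into one SQL statement per collection:
var split = await db.Orders
    .Include(o => o.Items).ThenInclude(i => i.Product)
    .AsSplitQuery()
    .Take(100)
    .ToListAsync();

// Or flip the default globally and opt out per-query with AsSingleQuery():
// optionsBuilder.UseNpgsql(connectionString,
//     o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
```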
Surprise #2: Compiled queries save 3%, not 30%
In EF Core 6 and earlier, EF.CompileQuery and EF.CompileAsyncQuery were significant optimizations. Compiling the LINQ expression tree once and reusing it avoided substantial per-execution overhead. Every performance article from that era told you to compile your hot-path queries.
In EF Core 10, the plan cache has become aggressive enough that the gap has collapsed. Our compiled projection runs at 2.43 ms versus 2.88 ms for the uncompiled version — a roughly 16% improvement on Linux, closer to 3% on macOS. The allocation savings are similarly modest (327 KB vs 344 KB, about 5% less).
But compiled queries still win one thing decisively: stability. Standard deviation for the compiled projection is 178 μs versus 491 μs for the uncompiled version. That’s a 2.8× reduction in variance, which matters for p99 latency much more than it matters for p50. If your service has latency SLOs on the tail — the slow requests real users actually feel — compiled queries can be worth it for predictability alone.
Practical guidance: if you’re migrating from EF Core 6 or 7 to .NET 10, don’t expect the historical compiled-query wins to show up in your throughput numbers. Use EF.CompileAsyncQuery when you care about query latency stability, not when you care about average-case speed.
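For completeness, a sketch of what a compiled projection looks like. The context, entity, and DTO names are assumptions, not the benchmark project's actual code:

```csharp
using Microsoft.EntityFrameworkCore;

public sealed record OrderDto(int Id, DateTime CreatedAt, decimal Total);

public static class CompiledQueries
{
    // Compiled once per process; every call reuses the translated query
    // instead of re-processing the LINQ expression tree.
    public static readonly Func<AppDbContext, int, IAsyncEnumerable<OrderDto>> RecentOrders =
        EF.CompileAsyncQuery((AppDbContext db, int count) =>
            db.Orders
              .OrderByDescending(o => o.CreatedAt)
              .Take(count)
              .Select(o => new OrderDto(
                  o.Id,
                  o.CreatedAt,
                  o.Items.Sum(i => i.Quantity * i.Product.Price))));
}

// Usage:
// await foreach (var dto in CompiledQueries.RecentOrders(db, 100)) { ... }
```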
Raw SQL is the floor — and EF projection gets 99% of the way there
Raw SQL matches compiled projection at 2.43 ms on Linux, and the allocation gap is ~5% (327 KB for compiled projection vs 312 KB for FromSqlRaw). Even uncompiled projection at 2.88 ms is within 18% of the raw SQL floor.
The implication runs against a common assumption that “EF Core is slow.” For projections specifically, on PostgreSQL, the ORM overhead is small enough to be nearly irrelevant. You keep the type safety, the parameterization, the LINQ composition, and the schema-aware migrations, and you pay a tax measured in microseconds.
This is the opposite story from what we’ll tell in an upcoming EF Core vs Dapper vs Raw SQL post, where bulk inserts and complex JOIN patterns show much bigger ORM gaps. For the specific case of N+1 fixes with projection, though, EF Core’s output is remarkably close to the theoretical floor.
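One way to express the raw-SQL variant in current EF Core is SqlQueryRaw against an unmapped result type. The benchmark project may structure this differently, and the table and column names below are assumptions:

```csharp
using Microsoft.EntityFrameworkCore;

// Unmapped result type; column aliases must match property names.
public sealed record OrderItemRow(int OrderId, string ProductName, int Quantity);

var rows = await db.Database
    .SqlQueryRaw<OrderItemRow>("""
        SELECT o."Id" AS "OrderId", p."Name" AS "ProductName", i."Quantity" AS "Quantity"
        FROM "Orders" o
        JOIN "OrderItems" i ON i."OrderId" = o."Id"
        JOIN "Products" p ON p."Id" = i."ProductId"
        WHERE o."Id" IN (SELECT "Id" FROM "Orders" ORDER BY "Id" LIMIT 100)
        """)
    .ToListAsync();
```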
The Headline Finding
Fix the N+1. Don’t agonize over which fix.
The gap between N+1 and any fix is 48×–140×. The gap between the best fix and the worst fix is roughly 3×. If you’re spending architecture meetings debating Include() vs split query vs projection while an N+1 runs in production, you’re rearranging deck chairs. Kill the N+1 first. The other decisions are refinements.
What the Allocation Numbers Mean in Production
10.8 MB of managed memory per 100-order load is catastrophic at any meaningful throughput. At that allocation rate, Gen0 collections fire constantly, and any objects that survive the length of a request get promoted into Gen1 and sometimes Gen2, which is where garbage collection gets expensive.
Consider a service handling 50 such operations per second — a modest load for any production API. With N+1, you’re allocating ~540 MB/s of garbage. The .NET runtime’s generational GC can handle that, but it will pause your process to do it, and those pauses show up as visible spikes in your p99 and p999 latency. You’ll see them in APM dashboards as unexplained latency walls, and you’ll see them in user complaints as “the app feels slow sometimes.”
Projection at 344 KB per operation keeps you firmly in Gen0 territory: the GC reclaims it in cheap, sub-millisecond collections whose pauses are too short to register in latency percentiles. Same throughput, ~17 MB/s of allocations instead of 540 MB/s. The GC barely notices.
The infrastructure cost implication is direct. Every client we’ve audited who said “we need a bigger cloud instance” turned out to be either burning GC time on over-allocation, thrashing connection pools, or both. N+1 queries are the single most common cause in the first bucket. Fixing them has directly enabled one Code Majesty Tech client to reduce cloud infrastructure from $2,000/month to $120/month — a 94% reduction — because the smaller, cheaper instance was no longer spending half its CPU on garbage collection.
That’s the real-world value of the allocation column in the table above. The mean latency column tells you how fast one operation is. The allocation column tells you how much your cloud bill is going to be.
Cross-Platform Validation: Apple M4 Pro
The same benchmarks on macOS — Apple M4 Pro, 14 cores, .NET SDK 10.0.102, Arm64 RyuJIT:
| Approach | Mean | StdDev | Allocated | Ratio vs N+1 |
|---|---|---|---|---|
| Lazy loading (N+1) | 79.0 ms | 2,431 μs | 10,833 KB | 1.00× |
| Split query | 2.8 ms | 495 μs | 1,234 KB | 0.035× |
| Eager loading (Include) | 1.4 ms | 70 μs | 890 KB | 0.017× |
| Projection (Select) | 1.1 ms | 95 μs | 346 KB | 0.013× |
| Compiled projection | 1.0 ms | 88 μs | 329 KB | 0.013× |
| Raw SQL (manual JOIN) | 0.8 ms | 71 μs | 313 KB | 0.010× |
Absolute numbers are roughly 3–4× lower than on Hetzner — Apple Silicon bare-metal versus a shared-vCPU Linux VM. That’s expected and not the point.
What matters is whether the pattern holds. It does, on every finding:
- N+1 is still catastrophic (roughly 28×–99× slower than the fixes on macOS, from 79.0 ms down to 2.8 ms at the slow end and 0.8 ms at the fast end).
- Split query is still slower than Include (2.8 ms vs 1.4 ms — actually a wider gap in relative terms than on Linux).
- Compiled projection still saves single-digit percent over uncompiled projection (1.03 ms vs 1.05 ms).
- Raw SQL is still at or near the floor.
One platform difference worth noting: on macOS, raw SQL beats compiled projection more decisively (0.82 ms vs 1.03 ms — a 20% gap), whereas on Linux they essentially tie. Not large enough to change the story — projection is still within 25% of raw SQL either way — but honest to mention.
The N+1 variance is also much tighter on macOS (3% StdDev vs 28% on Linux). Bare-metal I/O and a local loopback Docker network produce more consistent round-trip times than a shared-vCPU cloud VM. In production, expect your numbers to look more like the Linux run.
How to Detect N+1 Queries in Your Codebase
You can’t fix what you can’t see. Before writing any of the code in this post, you need to know whether your application actually has N+1 queries — and if so, where.
EF Core logging is the first line of defense. Configure the Microsoft.EntityFrameworkCore.Database.Command log category at Information level and every SQL statement EF Core executes will appear in your logs. If you see hundreds of near-identical SELECT * FROM "OrderItems" WHERE "OrderId" = @p0 statements per HTTP request, you have an N+1. This is the cheapest, most reliable detection method and it works in any environment where you can read logs.
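A sketch of that configuration done in code via LogTo; the same category can equally be enabled through your appsettings.json logging section:

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

optionsBuilder
    .UseNpgsql(connectionString)
    // Log only SQL commands, at Information level, to the console.
    .LogTo(Console.WriteLine,
           new[] { DbLoggerCategory.Database.Command.Name },
           LogLevel.Information);

// appsettings.json equivalent:
// "Logging": { "LogLevel": {
//     "Microsoft.EntityFrameworkCore.Database.Command": "Information" } }
```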
Third-party profilers make it visual. MiniProfiler integrates directly with EF Core and shows you query counts per request in the browser. For production environments, APM tools like Application Insights, Datadog, or New Relic will flag endpoints with abnormally high database query counts — a classic N+1 signature.
Don’t trust the debugger. Lazy loading triggers on property access, and stepping through code in the debugger often masks the problem — the debugger evaluates lazy navigation properties as part of displaying the object, which makes the queries execute before you “get to” the line where your code touches them. You’ll see a handful of queries in the log and conclude you’re fine, right up until you look at the logs from a real request running at full speed.
The quickest smoke test: pick one endpoint in your application, enable EF Core command logging locally, make one request, and count the SQL statements. If a single user-facing request triggers more than ~10 SQL queries, dig in. If it triggers more than 50, you almost certainly have at least one N+1 in the path.
So Which Fix Should You Pick?
Clear guidance, not “it depends.”
If you’re writing new queries: start with projection (.Select() into a DTO). It’s the fastest, the smallest, and it forces you to be explicit about what data you actually need. Projections are also easier to test than entity graphs, and they don’t carry change tracking overhead. This should be your default.
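A representative projection, with the DTO and property names as assumptions:

```csharp
using Microsoft.EntityFrameworkCore;

public sealed record OrderSummaryDto(int Id, DateTime CreatedAt, int ItemCount, decimal Total);

var summaries = await db.Orders
    .OrderByDescending(o => o.CreatedAt)
    .Take(100)
    .Select(o => new OrderSummaryDto(
        o.Id,
        o.CreatedAt,
        o.Items.Count,
        o.Items.Sum(i => i.Quantity * i.Product.Price)))
    .ToListAsync();

// No change tracking, no entity materialization: EF translates the whole
// expression to one SQL statement returning exactly these columns.
```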
If you already have .Include() in production and it’s working fine: leave it. The 2× allocation gap versus projection rarely matters when the absolute numbers are already in the low milliseconds. Refactoring working Include queries into projections for a few hundred microseconds is low-ROI work. Spend the engineering time on the next N+1 instead.
If you’re using split queries because a blog post told you to: measure first. On typical business-domain data — orders with a handful of items, invoices with a few line entries, posts with some tags — Include is faster. Split query is a specific optimization for cartesian-explosion scenarios with large sibling collections. Don’t apply it by default.
If you’re reaching for compiled queries for raw speed: don’t. EF Core 10’s plan cache closed most of that gap. Compiled queries are still useful for stability — lower p99 latency, tighter variance — but not for average-case throughput.
If you have an N+1 in production: fix it. Any fix. Today. The single biggest performance decision you can make in an EF Core codebase is to not have N+1 queries. Which fix you pick second is a detail.
N+1 queries are one of the two most common causes of “our .NET app is slow” in the production audits we’ve run at Code Majesty Tech. The other one — connection pool exhaustion — is a different post.
Frequently Asked Questions
What is the N+1 query problem in EF Core?
N+1 refers to the pattern where loading a list of N entities results in N+1 SQL queries instead of one: one query to load the parents, then one query per parent to load each child collection as your code walks the navigation properties. It happens most often with EF Core’s lazy-loading proxies, where accessing a navigation property such as order.Items triggers a hidden SQL call. In our benchmark, loading 100 orders with their items produces ~500 SQL queries in the N+1 case versus 1 query with Include() or projection.
How do I fix the N+1 problem in Entity Framework Core?
The short answer is: use projection (.Select() into a DTO). In our .NET 10 benchmarks on PostgreSQL 16, projection was over 100× faster than N+1, roughly 2× faster than Include(), and matched raw SQL within 20%. Eager loading with .Include().ThenInclude() is a good default if you need the full entity graph and don’t want to write DTOs. Both approaches fix the N+1; projection is strictly faster.
Is Include() faster than AsSplitQuery() in EF Core 10?
For typical business-domain data with small child collections, yes. In our benchmark — orders with ~5 items each — plain Include() ran at 5.4 ms versus 7.0 ms for split query on PostgreSQL 16. Split query pays for extra round-trips without cartesian savings when collections are small. Split query wins on wider data shapes (large child collections, multiple sibling collections that would multiply the result set), but don’t apply it by default. Measure on your actual data first.
Do compiled queries still matter in EF Core 10?
For raw throughput, barely. Compiled projection was only ~3% faster than uncompiled projection in our .NET 10 benchmark on macOS, and ~16% faster on Linux — far less than the historical 30%+ wins in EF Core 6. EF Core 10’s plan cache has become aggressive enough that manual compilation adds little. Compiled queries still reduce variance significantly (178 μs standard deviation vs 491 μs uncompiled), so they’re worth using when you care about p99 latency stability, not average-case speed.
How do I detect N+1 queries in my .NET application?
The most reliable method is EF Core’s built-in logging. Set the Microsoft.EntityFrameworkCore.Database.Command log category to Information and every SQL statement will appear in your logs — if one HTTP request produces hundreds of near-identical queries, you have an N+1. For visual detection in development, use MiniProfiler; for production monitoring, APM tools like Application Insights, Datadog, or New Relic will flag high-query-count endpoints. Avoid relying on the debugger — stepping through code masks lazy-loading triggers.
Got a sluggish .NET API? N+1 queries are the #1 cause we see in production audits. Book a 1-hour consulting session and we’ll identify your bottlenecks live — $80/hr, zero commitment.
Need hands-on help refactoring your EF Core data access layer? Our sprint-based development engagements start at $60/hr — no long-term contracts, pause or cancel anytime.
Related Posts
FastEndpoints vs Minimal API vs Controllers: .NET 10 Performance Benchmarks
BenchmarkDotNet results comparing FastEndpoints, Minimal APIs, and MVC Controllers on .NET 10 with EF Core + PostgreSQL. Real numbers, not theory.
Azure DevOps Deployment Best Practices for .NET and .NET Core Applications
Learn essential Azure DevOps deployment best practices for .NET and .NET Core applications, including build pipelines, release pipelines, automated testing, and configuration management.
Duende IdentityServer: OpenID Connect and OAuth 2.0 for .NET
An introduction to Duende IdentityServer, the enterprise-grade framework for implementing OpenID Connect and OAuth 2.0 in .NET applications, covering key concepts like tokens, clients, and authentication flows.