How to Detect and Fix N+1 Queries in Spring Boot Before Production

April 15, 2026 · 10 min read

Senior Java Architect | Expert in High-Load Systems & JVM Internals

TL;DR: The hardest part of an N+1 bug is not fixing it. The hardest part is noticing it before production. A Spring Boot endpoint can return the correct JSON, pass unit tests, and still execute 243 SQL queries instead of 1. This post shows how to detect N+1 patterns from real traces, choose the right fix in Spring Data JPA, and verify that the optimization did not change the API output.

Why N+1 Queries Keep Reaching Production

N+1 problems survive longer than they should because they rarely break correctness first. They break cost, latency, and scale.

A typical flow looks safe in code review:

load a list of parent entities
access a lazy collection or related entity inside a loop
serialize the result into a response

The endpoint still returns 200 OK. The data looks right. The unit tests stay green because repositories and external dependencies are mocked. But in production, the request does this:

1 query to load the parent list
N more queries to load each child or nested relation

At small volume, this looks like a non-event. Under real load, it becomes:

slow endpoints
high database CPU
connection pool pressure
noisy latency spikes
unexpected cloud cost

That is why N+1 is a runtime problem first. You need visibility into the actual SQL sequence, not just the Java code.

What an N+1 Query Looks Like in Spring Boot

The classic pattern appears in JPA/Hibernate-backed Spring Boot services.

Imagine a Student entity with a lazy association to courses:

@Entity
public class Student {

    @Id
    private Long id;

    private String firstName;
    private String lastName;

    @ManyToMany(fetch = FetchType.LAZY)
    @JoinTable(
        name = "enrollment",
        joinColumns = @JoinColumn(name = "student_id"),
        inverseJoinColumns = @JoinColumn(name = "course_id")
    )
    private List<Course> courses;
}

And a service like this:

public List<StudentDto> getStudents() {
    return studentRepository.findAll().stream()
            .map(studentMapper::toDto)
            .toList();
}

If studentMapper::toDto touches student.getCourses() and each course touches course.getTeacher(), Hibernate may execute:

1 query for all students
1 query per student for courses
1 query per course set for teacher data

The code looks harmless. The database says otherwise.

A Real Runtime Pattern

This is the kind of shape BitDive sees in traces:

SELECT * FROM student;

SELECT c.*, t.*
FROM enrollment e
JOIN course c ON c.id = e.course_id
LEFT JOIN teacher t ON t.id = c.teacher_id
WHERE e.student_id = ?;

SELECT c.*, t.*
FROM enrollment e
JOIN course c ON c.id = e.course_id
LEFT JOIN teacher t ON t.id = c.teacher_id
WHERE e.student_id = ?;

SELECT c.*, t.*
FROM enrollment e
JOIN course c ON c.id = e.course_id
LEFT JOIN teacher t ON t.id = c.teacher_id
WHERE e.student_id = ?;

Same query shape, different bind parameter. Repeated dozens or hundreds of times inside one request. That is the signature.

How to Detect N+1 Queries Before Production

1. Count SQL Queries Per Request, Not Per Test Suite

The right unit of analysis is one business request.

If GET /students returns 50 students and runs:

1 query for the root list
50 queries for child collections

you already have an N+1 pattern, even if the response time still "looks fine" on a laptop.

2. Look for Repeated Query Shapes with Different IDs

The giveaway is repetition:

same SELECT
same joins
same projection
only the WHERE ... = ? bind value changes

This is why raw logs are not enough. Grepping logs is slow and noisy. You want one request trace with the SQL sequence grouped in order.

3. Compare Query Count to the Result Cardinality

If one endpoint returns:

1 order and runs 2 queries, that may be normal
100 students and runs 243 queries, that is not normal

The important ratio is not only "slow or fast." It is "how many queries did this request need to produce one response?"

4. Compare Before and After, Not Just Before

Most teams detect N+1 only after the regression has already shipped. A safer pattern is:

capture the current request trace
apply the optimization
capture the same request again
compare SQL count, SQL shape, timings, and response payload

That way you prove both sides:

performance improved
business behavior stayed the same

Why Unit Tests Usually Miss N+1

A mock-heavy unit test cannot see:

real SQL count
lazy loading behavior
serialization side effects
transaction boundaries
mapper access to nested associations

It only sees the objects you fed into the mock.

This is why N+1 bugs often appear after an innocent request like:

"add courses to the response"
"include teacher name in the DTO"
"return full details instead of IDs"

The Java change is small. The SQL explosion is invisible unless you capture the runtime path.

The 4 Most Useful Fixes in Spring Boot

There is no single universal fix. The right choice depends on the endpoint shape.

1. `@EntityGraph` for Repository-Level Fetch Plans

If one repository method needs a richer graph than the default mapping, @EntityGraph is often the cleanest fix.

public interface StudentRepository extends JpaRepository<Student, Long> {

    @EntityGraph(attributePaths = {"courses", "courses.teacher"})
    @Query("SELECT DISTINCT s FROM Student s ORDER BY s.lastName ASC, s.firstName ASC")
    List<Student> findAllWithCoursesAndTeachersEntityGraph();
}

Then the service becomes:

public List<Student> findAll() {
    return studentRepository.findAllWithCoursesAndTeachersEntityGraph();
}

Use this when:

one endpoint needs the richer graph
the fetch plan is request-specific
you want the intention close to the repository method

2. `JOIN FETCH` for Explicit Query Control

If the query is highly specific and you want exact control in JPQL, a fetch join is straightforward.

@Query("""
    select distinct s
    from Student s
    left join fetch s.courses c
    left join fetch c.teacher
    order by s.lastName, s.firstName
""")
List<Student> findAllWithCoursesAndTeachers();

Use this when:

the endpoint is read-heavy
the fetch shape is fixed
you want the join strategy visible in the query itself

Be careful with very wide graphs. One big join can trade N+1 for row explosion.

3. DTO Projections for Read Models

If the endpoint is purely read-oriented and does not need managed entities, a projection is often better than loading a rich entity graph and serializing it.

Use a DTO projection when:

you control the exact response shape
you only need a subset of columns
the endpoint is query-driven, not domain-behavior-driven

This can be the cleanest fix for list endpoints and reporting APIs.

4. Batch Fetching for Some Association Patterns

In some cases, batch fetching can reduce the number of secondary selects without forcing one huge join. This is useful when one giant fetch join would create too many duplicate rows.

Use it carefully. Batch fetching can improve the pattern, but it does not make the query plan obvious to reviewers. You still need trace-level evidence that the request improved.

What Not to Do

Do Not Default Everything to `FetchType.EAGER`

EAGER is not a real N+1 strategy. It is a blunt mapping decision. It can:

hide performance problems in one path
create oversized object graphs in another
trigger unnecessary joins everywhere

Fix the request shape, not the whole domain model by default.

Do Not Trust a Faster Response Alone

If one request went from 240 queries to 4, that is good. But you still need to verify:

did the response body change?
did nested fields disappear?
did ordering change?
did headers or error handling drift?

Performance optimization that silently changes the API contract is still a regression.

A Safer N+1 Workflow with BitDive

The most reliable workflow is:

Step 1: Capture the Baseline Request

Trigger the real endpoint with BitDive attached and inspect:

SQL sequence
query count
timings
response body

Step 2: Identify the Repeated Query Pattern

BitDive shows the full execution tree and SQL sequence for one request, so you can see exactly where the repeated selects begin.

For a concrete walkthrough, see Cursor AI & BitDive: Diagnosing N+1 Queries with Runtime Context.

Step 3: Apply the Smallest Reasonable Fix

Typical fixes:

add @EntityGraph
replace findAll() with a fetch-specific repository method
switch the endpoint to a DTO projection
adjust the association access path

Step 4: Capture the Same Request Again

Run the same endpoint after the change and capture a second trace.

Step 5: Compare Before and After

This is the proof step. Compare:

SQL count
SQL shape
response payload
method return values
downstream behavior

For example:

Metric	Before	After
Response time	94ms	13ms
SQL queries	243	1
Response body	identical	identical

That is a safe optimization. Faster and behaviorally stable.

Step 6: Preserve It as Regression Memory

Once the optimized request is verified, turn that execution into replay-based protection so the N+1 pattern does not come back six weeks later.

That is where trace-based testing becomes part of performance engineering, not just functional testing.

Why This Matters for AI-Assisted Development

N+1 is a perfect example of where AI needs runtime context.

An agent reading source code can see:

studentRepository.findAll()
student.getCourses()
a mapper calling nested getters

What it cannot see from static code alone is:

the actual SQL sequence
how many times the query repeats
whether the "fix" changed the API output

That is why BitDive's runtime context for AI agents matters. The agent can inspect the trace, apply a targeted fix, then compare before and after traces to verify it improved the query plan without breaking the response.

Find N+1 Queries from Real Runtime Traces

BitDive captures SQL activity, method calls, and response payloads for one real request. Detect repeated query patterns, apply a focused fix, and verify that performance improved without changing the API contract.

Explore code-level observability

Why N+1 Queries Keep Reaching Production​

What an N+1 Query Looks Like in Spring Boot​

A Real Runtime Pattern​

How to Detect N+1 Queries Before Production​

1. Count SQL Queries Per Request, Not Per Test Suite​

2. Look for Repeated Query Shapes with Different IDs​

3. Compare Query Count to the Result Cardinality​

4. Compare Before and After, Not Just Before​

Why Unit Tests Usually Miss N+1​

The 4 Most Useful Fixes in Spring Boot​

1. @EntityGraph for Repository-Level Fetch Plans​

2. JOIN FETCH for Explicit Query Control​

3. DTO Projections for Read Models​

4. Batch Fetching for Some Association Patterns​

What Not to Do​

Do Not Default Everything to FetchType.EAGER​

Do Not Trust a Faster Response Alone​

A Safer N+1 Workflow with BitDive​

Step 1: Capture the Baseline Request​

Step 2: Identify the Repeated Query Pattern​

Step 3: Apply the Smallest Reasonable Fix​

Step 4: Capture the Same Request Again​

Step 5: Compare Before and After​

Step 6: Preserve It as Regression Memory​

Why This Matters for AI-Assisted Development​