When Hibernate Shouts at Me

Seriously, not only humans can shout at you — sometimes your code does too!

When? You may ask.

The Problem Link to heading

It happens when you do something so bad in code that even the logs beg you to look at them. When I first saw the log firstResult/maxResults specified with collection fetch; applying in memory!, I ignored it — many times, I must admit. Now I feel bad about it. But what is this log even about?

It happens in the part of the code responsible for fetching a collection from the database, where entities are linked through a One-to-Many relationship.

Look at the following example:

import jakarta.persistence.*

@Entity
class Author(
    @Id 
    @GeneratedValue 
    val id: Long? = null,

    val name: String,

    @OneToMany(mappedBy = "author", fetch = FetchType.LAZY, cascade = [CascadeType.ALL])
    val books: MutableList<Book> = mutableListOf()
)

@Entity
class Book(
    @Id 
    @GeneratedValue 
    val id: Long? = null,

    val title: String,

    @ManyToOne(fetch = FetchType.LAZY) 
    @JoinColumn(name = "author_id")
    val author: Author
)

Related repo:

import org.springframework.data.domain.Pageable
import org.springframework.data.jpa.repository.JpaRepository
import org.springframework.data.jpa.repository.Query
import org.springframework.stereotype.Repository

@Repository
interface AuthorRepository : JpaRepository<Author, Long> {

    @Query("select a from Author a left join fetch a.books order by a.id")
    fun findPageWithBooks(pageable: Pageable): List<Author>
}

Query:

authorRepository.findPageWithBooks(PageRequest.of(0, 20))

Executing query above will cause firstResult/maxResults specified with collection fetch; applying in memory!.

Why it happens? Link to heading

JOIN FETCH a.books creates a Cartesian product (one author × many books → many rows). Therefore, Hibernate cannot properly limit the result to 20 authors, because 20 rows != 20 authors. That’s why it loads everything into memory and post-processes the results afterwards.

How to fix it? Link to heading

Firstly let’s refactor our code that has one query method into two ones. It allows us to first fetch only the IDs of the authors.
Later, we would fetch the rest part we need - authors with books only for IDs loaded from the first step.

Fixed repo:

import org.springframework.data.jpa.repository.EntityGraph
import org.springframework.data.jpa.repository.JpaRepository
import org.springframework.data.jpa.repository.Query
import org.springframework.data.jpa.repository.QueryHints
import org.springframework.data.repository.query.Param
import jakarta.persistence.QueryHint

interface AuthorRepository : JpaRepository<Author, Long> {

    @Query("select a.id from Author a order by a.id")
    fun findPageOfIds(pageable: Pageable): Page<Long>

    @Query(
        """
        select a 
        from Author a 
        left join fetch a.books 
        where a.id in :ids 
        order by a.id
        """
    )
    fun fetchByIdsWithBooks(@Param("ids") ids: List<Long>): List<Author>
}

Now service needs to do glueling in order to provide complete result:

import org.springframework.data.domain.PageRequest
import org.springframework.stereotype.Service

@Service
class AuthorService(private val repo: AuthorRepository) {

    fun findPageWithBooks(page: Int, size: Int): List<Author> {
        val ids = repo.findPageOfIds(PageRequest.of(page, size)).content
        if (ids.isEmpty()) {
            return emptyList()
        }
        return repo.fetchByIdsWithBooks(ids)
    }
}

It works, because pagination happens on simple Author table and later fetching of related collection is constrained to small ID list. Since Hibernate 6, you do not longer need to hint Hibernate via @QueryHint annotation that it should not pass DISTINCT keyword to SQL due to reasons described here: https://github.com/hibernate/hibernate-orm/blob/6.0/migration-guide.adoc#distinct