fd Blog

Daniel Hilgarth on software development

Why the Repository Pattern Is Still Valid

Introduction - what is the Repository pattern?

The Repository pattern has been around for quite a while. Martin Fowler's catalog of Patterns of Enterprise Application Architecture describes it as follows:

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

The Repository pattern is a powerful pattern to separate the domain layer from the concrete implementation of the data access layer (DAL).

The data access layer uses code specific to the data storage and data access method being used. This can be any number of things:

  1. It can be a certain Object/Relational Mapper (ORM) like NHibernate or Entity Framework.
  2. It can be a bunch of SQL statements.
  3. It can use Web services or XML files instead of a database etc.

Depending on these, the DAL will have a specific implementation.
Now, if the domain layer used that implementation directly, it would contain implicit knowledge about the details of the data storage.
But that’s none of its business! Knowing the details of the data access is not the responsibility of the domain layer.

And that’s where the Repository pattern comes into play: It provides a clean interface to the actual DAL implementation. This enables the domain layer to use the DAL without needing to know the implementation details.

A classic repository would have an interface like this:

public interface IEmployeeRepository
{
    Employee GetById(int id);
    IEnumerable<Employee> GetTeamLeaders();
    IEnumerable<Employee> GetByName(string name);
    IEnumerable<Employee> GetByGender(Gender gender);
    // ...
}
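To make the separation concrete, here is a minimal sketch of one possible implementation behind that interface. The Gender and Employee types, their properties, and the InMemoryEmployeeRepository class are assumptions made for illustration; a real DAL would talk to an ORM, SQL, or a Web service at this point.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types, assumed for this sketch.
public enum Gender { Female, Male }

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Gender Gender { get; set; }
    public bool IsTeamLeader { get; set; }
}

public interface IEmployeeRepository
{
    Employee GetById(int id);
    IEnumerable<Employee> GetTeamLeaders();
    IEnumerable<Employee> GetByName(string name);
    IEnumerable<Employee> GetByGender(Gender gender);
}

// Illustrative in-memory implementation. The domain layer only ever sees
// IEmployeeRepository, so this class can be swapped for an ORM-backed one.
public class InMemoryEmployeeRepository : IEmployeeRepository
{
    private readonly List<Employee> _employees;

    public InMemoryEmployeeRepository(IEnumerable<Employee> employees)
    {
        _employees = employees.ToList();
    }

    // Returns null when no employee has the given id.
    public Employee GetById(int id) =>
        _employees.SingleOrDefault(e => e.Id == id);

    public IEnumerable<Employee> GetTeamLeaders() =>
        _employees.Where(e => e.IsTeamLeader).ToList();

    public IEnumerable<Employee> GetByName(string name) =>
        _employees.Where(e => e.Name == name).ToList();

    public IEnumerable<Employee> GetByGender(Gender gender) =>
        _employees.Where(e => e.Gender == gender).ToList();
}
```

A caller in the domain layer would receive an IEmployeeRepository and call, for example, GetTeamLeaders() without knowing which implementation sits behind it.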

The Repository pattern and LINQ

With the advent of LINQ and the closely related interface IQueryable<T>, many no longer saw the need for the Repository pattern.
You could just expose an IQueryable<T> instance, and the client (that is, the domain layer) could write its own LINQ queries.

The result would be code in the domain layer that looked like this:

session.Query<Employee>().Where(x => x.Gender == Gender.Female && x.IsTeamLeader)

Another thing people started doing was to “improve” the Repository pattern to the Generic Repository (anti-)pattern, which looks something like this:

public interface IRepository<T>
{
    IQueryable<T> Get();
    // and/or:
    IEnumerable<T> Find(Expression<Func<T, bool>> predicate);
    
    void Add(T entity);
    void Update(T entity);
    void Delete(T entity);
}

The Get method is basically the same as session.Query<T>.
The Find method is much the same: it is basically a shortcut for Get().Where(predicate).
Let’s put all three below each other to show how similar they are:

session.Query<Employee>().Where(x => x.Gender == Gender.Female && x.IsTeamLeader)
repository.Get().Where(x => x.Gender == Gender.Female && x.IsTeamLeader)
repository.Find(x => x.Gender == Gender.Female && x.IsTeamLeader)
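The equivalence of Get and Find can be seen in a small sketch. The InMemoryRepository<T> class and the Employee and Gender types below are assumptions made for illustration, backed by a plain list instead of a database; the point is only that Find adds nothing over Get().Where(predicate).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical domain types, assumed for this sketch.
public enum Gender { Female, Male }

public class Employee
{
    public Gender Gender { get; set; }
    public bool IsTeamLeader { get; set; }
}

public interface IRepository<T>
{
    IQueryable<T> Get();
    IEnumerable<T> Find(Expression<Func<T, bool>> predicate);
    void Add(T entity);
    void Update(T entity);
    void Delete(T entity);
}

// Illustrative list-backed implementation of the Generic Repository.
public class InMemoryRepository<T> : IRepository<T>
{
    private readonly List<T> _items = new List<T>();

    // Exposes the raw query surface, just like session.Query<T>().
    public IQueryable<T> Get() => _items.AsQueryable();

    // Nothing more than a shortcut for Get().Where(predicate).
    public IEnumerable<T> Find(Expression<Func<T, bool>> predicate) =>
        Get().Where(predicate).ToList();

    public void Add(T entity) => _items.Add(entity);

    public void Update(T entity)
    {
        // Nothing to do: the list already holds a reference to the entity.
    }

    public void Delete(T entity) => _items.Remove(entity);
}
```

In both cases the caller still has to write the predicate itself, so the query logic stays in the domain layer.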

Disadvantages when using IQueryable<T>

There are several disadvantages to all three of these equivalent approaches:

  1. The domain layer will be littered with these rather verbose queries.
  2. Reuse of the queries is only possible via helper or extension methods that encapsulate a certain query. It could then look like this:

    session.Query<Employee>().ThatAreFemaleTeamLeaders()
    

    That looks a lot like the Repository pattern, but it now puts the burden of implementing those methods on the domain layer instead of the DAL.

  3. Because the domain layer is aware of IQueryable<T>, it is very hard to optimize specific parts.
    Let’s assume that there is a query that is - for whatever reason - too slow when using the ORM. Using the Repository pattern, it would be very simple to implement parts of the query using stored procedures or plain old SQL.
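The third point can be sketched as follows. Everything here is hypothetical: the GetFemaleTeamLeaders method, the two repository classes, the TeamLeaderReport consumer, the runSql delegate that stands in for whatever raw SQL facility the DAL offers, and the table and column names in the SQL string. The sketch only shows that the domain class does not change when the query behind the interface is reimplemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types, assumed for this sketch.
public enum Gender { Female, Male }

public class Employee
{
    public Gender Gender { get; set; }
    public bool IsTeamLeader { get; set; }
}

public interface IEmployeeRepository
{
    IEnumerable<Employee> GetFemaleTeamLeaders();
}

// First version: the query is expressed through the ORM's LINQ provider.
public class LinqEmployeeRepository : IEmployeeRepository
{
    private readonly IQueryable<Employee> _employees; // e.g. session.Query<Employee>()

    public LinqEmployeeRepository(IQueryable<Employee> employees)
    {
        _employees = employees;
    }

    public IEnumerable<Employee> GetFemaleTeamLeaders() =>
        _employees.Where(x => x.Gender == Gender.Female && x.IsTeamLeader).ToList();
}

// Optimized version: the same method, now backed by hand-written SQL.
// runSql is an assumed hook; table and column names are made up.
public class SqlEmployeeRepository : IEmployeeRepository
{
    private readonly Func<string, IEnumerable<Employee>> _runSql;

    public SqlEmployeeRepository(Func<string, IEnumerable<Employee>> runSql)
    {
        _runSql = runSql;
    }

    public IEnumerable<Employee> GetFemaleTeamLeaders() =>
        _runSql("SELECT * FROM Employee WHERE Gender = 'F' AND IsTeamLeader = 1").ToList();
}

// Domain-layer consumer: identical code for both implementations.
public class TeamLeaderReport
{
    private readonly IEmployeeRepository _repository;

    public TeamLeaderReport(IEmployeeRepository repository)
    {
        _repository = repository;
    }

    public int CountFemaleTeamLeaders() => _repository.GetFemaleTeamLeaders().Count();
}
```

With IQueryable<T> exposed to the domain layer, there is no single method like GetFemaleTeamLeaders whose body could be swapped out in this way.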

Problems when using IQueryable<T>

The previous section showed the disadvantages. Now we come to the real problems:

  1. You effectively tie your domain layer to the specific implementation of the DAL by using IQueryable<T> - or a method that requires an expression tree like the Find method from the Generic Repository.
    It looks like you don’t do that, because you use the general IQueryable<T> interface. But in fact, none of the database accessing implementations of this interface support every possible scenario.
    Want an example? OK, what is wrong with the following query?

    session.Query<Employee>().Sum(x => x.Sales)
    

    Well, there is nothing wrong with it - unless you use NHibernate. NHibernate’s IQueryable<T> implementation doesn’t understand that overload of Sum. Because Sum executes the query immediately, you will get a runtime exception as soon as this line runs. You will have to write it like this:

    session.Query<Employee>().Select(x => x.Sales).Sum()
    

    And - boom - you just introduced code specific to the ORM to your domain layer.
    There is more. It looks like the following code should work. It certainly compiles:

    session.Query<Employee>().Where(x => MethodPerformingSomeCalculation(x.Sales) > 0)
    

    This won’t work if the DAL accesses a database, because MethodPerformingSomeCalculation can’t be translated to SQL - it has no known counterpart in SQL.
    You would have to find some other way to get the data you want.
    And again, you will have written code in the domain layer that accounts for details of the DAL.

    That’s not what I call a clean abstraction. In fact, it leaks a lot. If you are interested, Mark Seemann has a whole article about these problems.

  2. Another big problem is testability.
    Because LINQ queries are scattered all across your domain layer, you will have a hard time verifying that the domain classes query the correct things. You would need to mock IQueryable<T> and analyze the expression trees that are passed in. That’s non-trivial.
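For contrast, here is a minimal sketch of how a domain class could be tested against a classic repository. The TeamLeaderSalesReport class, the hand-written StubEmployeeRepository, and the decimal type of the Sales property are assumptions made for illustration; no mocking library and no expression tree analysis are involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain type; Sales is assumed to be a decimal.
public class Employee
{
    public decimal Sales { get; set; }
}

// Reduced to the one method this sketch needs.
public interface IEmployeeRepository
{
    IEnumerable<Employee> GetTeamLeaders();
}

// Domain class under test: it asks the repository a named question.
public class TeamLeaderSalesReport
{
    private readonly IEmployeeRepository _repository;

    public TeamLeaderSalesReport(IEmployeeRepository repository)
    {
        _repository = repository;
    }

    public decimal TotalSales() => _repository.GetTeamLeaders().Sum(e => e.Sales);
}

// Hand-written stub: returns canned data and records how often it was asked.
public class StubEmployeeRepository : IEmployeeRepository
{
    public int GetTeamLeadersCalls { get; private set; }
    public List<Employee> TeamLeaders { get; } = new List<Employee>();

    public IEnumerable<Employee> GetTeamLeaders()
    {
        GetTeamLeadersCalls++;
        return TeamLeaders;
    }
}
```

A test then fills TeamLeaders with a few employees, calls TotalSales(), and checks both the result and that GetTeamLeaders was called. Which query the domain class requested is visible from the method name alone.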

Conclusion

All those points present a strong case against using IQueryable<T> and/or the Generic Repository (anti-)pattern and for using the “classical” Repository pattern.

Even with LINQ, the Repository pattern is still the way to go for separating the domain layer from the data access layer.
