One of the projects I have at work is a web application, and it uses the repository pattern. Normally there’s no issue with that: the repository pattern is fundamentally solid. You encapsulate your data-access and update logic and expose only what the application needs.
However, as so often happens, in this case the repository pattern was misused and abused. We have a large, monolithic interface with 70 method definitions on it, doing everything from returning a couple of objects to updating others and saving side effects to yet another. On top of this, we’re using Entity Framework, which is already a repository in and of itself.
Listen folks, it’s 2022; we have better ways to implement these design patterns.
The ultimate problem
The reality is that the first time anyone does anything, it’s not as good as the second or third time, or as good as the work of someone who has been doing it for years or decades. I am no exception to this. This Repository/IRepository pair was designed by someone who had never designed one before, and it was expanded to the point where it is now a massive concoction. It happens. We all try new things and mess them up, and I’m not upset about it at all.
What I want to do is explore an alteration of the pattern that I, personally, like to use. It’s basically the repository pattern in much smaller pieces.
Generic repositories never work
On the whole, generic repositories never work. By that I mean repositories where the implementer tries to handle create/read/update/delete generically for every type. I’ve seen dozens of attempts, and most look well thought out, and some even work moderately well, but they always come with problems. Instead of going generic, the approach I enjoy the most is to develop a small repository for each object that has to be interacted with.
Where ours went wrong
Our repository started great: it was small and it did what was necessary. However, as more and more work became necessary, the repository grew to mammoth size. Some functions return IQueryable<T> objects, some return Task<T>, and some return T. Of course, the T can vary from a single entity to List<T> to even more monstrous objects. Generic-city, effectively.
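To make that mix of return types concrete, here is a minimal sketch of what a caller has to deal with. The interface IMixedRepository, its method names, and the in-memory implementation are all made up for illustration; they are not our real repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

IMixedRepository repository = new InMemoryMixedRepository();

// IQueryable<T>: deferred, so the caller keeps composing the query itself.
int openCount = repository.QueryDocuments().Count(d => d.OutstandingBalance > 0m);

// Task<T>: asynchronous, so the caller has to await it.
Document found = await repository.GetDocumentAsync(2);

// T (here a List<T>): synchronous and already materialized.
List<Document> all = repository.GetAllDocuments();

Console.WriteLine($"{openCount} open, found #{found.Id}, {all.Count} total");
// prints "1 open, found #2, 2 total"

// Cut-down stand-in for the real entity.
public class Document
{
    public long Id { get; set; }
    public decimal OutstandingBalance { get; set; }
}

// Hypothetical interface mixing the three return styles on one type.
public interface IMixedRepository
{
    IQueryable<Document> QueryDocuments();
    Task<Document> GetDocumentAsync(long id);
    List<Document> GetAllDocuments();
}

// In-memory implementation so the sketch runs without a database.
public class InMemoryMixedRepository : IMixedRepository
{
    private readonly List<Document> _documents = new()
    {
        new Document { Id = 1, OutstandingBalance = 0m },
        new Document { Id = 2, OutstandingBalance = 25m },
    };

    public IQueryable<Document> QueryDocuments() => _documents.AsQueryable();

    public Task<Document> GetDocumentAsync(long id) =>
        Task.FromResult(_documents.Single(d => d.Id == id));

    public List<Document> GetAllDocuments() => _documents.ToList();
}
```

Three methods, three calling conventions. Multiply that by 70 and every call site needs its own mental lookup of which style it is dealing with.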
The only answer in our case is to slowly refactor the code-base to split the massive repository up, but the million-dollar question is: how? What do we break it into? Do we make all the interfaces smaller and separate the big repository out into partial classes? That’s a mess but it works. Do we make each repository a sub-repository of the big one?
The answer, in our case, is the latter.
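For comparison, the partial-class option we passed on might look roughly like the sketch below. The small interface names match the ones used later in this post, but the file names in the comments and the placeholder method bodies are assumptions made purely so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// One object satisfies both small interfaces, because the partial
// declarations below are merged into a single Repository class.
var repository = new Repository();
IDocumentsRepository documents = repository;
IPaymentsRepository payments = repository;
Console.WriteLine(ReferenceEquals(documents, payments)); // prints "True"

// Cut-down stand-ins for the real entities.
public class Document { public long Id { get; set; } }
public class Payment { public decimal AppliedAmount { get; set; } }

public interface IDocumentsRepository
{
    Task<Document> GetDocumentAsync(long id);
}

public interface IPaymentsRepository
{
    Task<List<Payment>> GetPaymentsAsync(Document document);
}

// Would live in something like Repository.Documents.cs (hypothetical file name).
public partial class Repository : IDocumentsRepository
{
    public Task<Document> GetDocumentAsync(long id) =>
        Task.FromResult(new Document { Id = id }); // placeholder for the real query
}

// Would live in something like Repository.Payments.cs (hypothetical file name).
public partial class Repository : IPaymentsRepository
{
    public Task<List<Payment>> GetPaymentsAsync(Document document) =>
        Task.FromResult(new List<Payment>()); // placeholder for the real query
}
```

The files get smaller, but it is still one class with every method on it, which is why I call it a mess that works.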
How we split it up
Our answer to splitting the repository up was simple: create a repository for each object, and have the main repository create an instance of each.
Our main IRepository has functions for everything. I’ve outlined some contrived examples:
Task<Document> GetDocumentAsync(long id);
Task<List<Payment>> GetPaymentsAsync(Document document);
Task<List<Document>> GetOpenDocumentsAsync(Customer customer);
void RecalculateOutstandingBalance(Document document);
These four are enough to run through our example. In our case, we had a repository that looked something like this:
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class Repository : IRepository
{
    private DatabaseContext Context { get; set; }

    public Repository(DatabaseContext context)
    {
        Context = context;
    }

    // Loads a single document along with its payments and customer.
    public async Task<Document> GetDocumentAsync(long id) =>
        await Context.Documents
            .Include(x => x.Payments)
            .Include(x => x.Customer)
            .SingleOrDefaultAsync(x => x.Id == id);

    // Loads every payment applied to the given document.
    public async Task<List<Payment>> GetPaymentsAsync(Document document) =>
        await Context.Payments
            .Include(x => x.Document).ThenInclude(x => x.Payments)
            .Where(x => x.Document == document)
            .ToListAsync();

    // Loads the customer's documents that still have a balance owing.
    public async Task<List<Document>> GetOpenDocumentsAsync(Customer customer) =>
        await Context.Documents
            .Include(x => x.Payments)
            .Include(x => x.Customer)
            .Where(x => x.Customer == customer && x.OutstandingBalance > 0m)
            .ToListAsync();

    // Balance = amount less applied payments and applied credits.
    public void RecalculateOutstandingBalance(Document document)
    {
        document.OutstandingBalance = document.Amount
            - document.Payments.Sum(x => x.AppliedAmount)
            - document.Credits.Sum(x => x.AppliedAmount);
    }
}
These are much simplified and contrived, but we’ll use them as a trivial example.
First and foremost, we had to decide what to split the functions apart on. In our case, we split them on the main object they return. For a void function, it’s the object they do the work on.
As an example, we would have:
public interface IDocumentsRepository
{
    Task<Document> GetDocumentAsync(long id);
    Task<List<Document>> GetOpenDocumentsAsync(Customer customer);
    void RecalculateOutstandingBalance(Document document);
}

public interface IPaymentsRepository
{
    Task<List<Payment>> GetPaymentsAsync(Document document);
}

public interface IRepository
{
    IPaymentsRepository Payments { get; }
    IDocumentsRepository Documents { get; }
}
The trick here was to split the payments and documents into their own repository interfaces (and, by extension, their own classes), which let the file sizes shrink dramatically. The repository was modified quite a bit:
public class DocumentsRepository : IDocumentsRepository
{
    private DatabaseContext Context { get; }

    public DocumentsRepository(DatabaseContext context)
    {
        Context = context;
    }

    // Loads a single document along with its payments and customer.
    public async Task<Document> GetDocumentAsync(long id) =>
        await Context.Documents
            .Include(x => x.Payments)
            .Include(x => x.Customer)
            .SingleOrDefaultAsync(x => x.Id == id);

    // Loads the customer's documents that still have a balance owing.
    public async Task<List<Document>> GetOpenDocumentsAsync(Customer customer) =>
        await Context.Documents
            .Include(x => x.Payments)
            .Include(x => x.Customer)
            .Where(x => x.Customer == customer && x.OutstandingBalance > 0m)
            .ToListAsync();

    // Balance = amount less applied payments and applied credits.
    public void RecalculateOutstandingBalance(Document document)
    {
        document.OutstandingBalance = document.Amount
            - document.Payments.Sum(x => x.AppliedAmount)
            - document.Credits.Sum(x => x.AppliedAmount);
    }
}

public class PaymentsRepository : IPaymentsRepository
{
    private DatabaseContext Context { get; }

    public PaymentsRepository(DatabaseContext context)
    {
        Context = context;
    }

    // Loads every payment applied to the given document.
    public async Task<List<Payment>> GetPaymentsAsync(Document document) =>
        await Context.Payments
            .Include(x => x.Document).ThenInclude(x => x.Payments)
            .Where(x => x.Document == document)
            .ToListAsync();
}
public class Repository : IRepository
{
    public IDocumentsRepository Documents { get; }
    public IPaymentsRepository Payments { get; }

    // The main repository only wires up the sub-repositories,
    // all sharing the same DatabaseContext.
    public Repository(DatabaseContext context)
    {
        Documents = new DocumentsRepository(context);
        Payments = new PaymentsRepository(context);
    }
}
In this new version we reduce the amount of overhead the Repository class has, and in addition we allow unit testing to be performed much more smoothly. Instead of having to mock a massive IRepository interface, we mock the small ones when and where we need them.
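As a rough sketch of what that buys you in a test, here is a hand-written fake of IPaymentsRepository feeding a consumer. FakePaymentsRepository and PaidTotalCalculator are hypothetical names invented for this example, and the entity classes are cut down to only the members it needs; a mocking library would work just as well as the hand-rolled fake.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Exercise the consumer against the fake: no DatabaseContext, no database.
var document = new Document { Id = 1, Amount = 100m };
var fake = new FakePaymentsRepository(
    new Payment { Document = document, AppliedAmount = 30m },
    new Payment { Document = document, AppliedAmount = 20m });

var calculator = new PaidTotalCalculator(fake);
decimal total = await calculator.GetPaidTotalAsync(document);
Console.WriteLine(total == 50m); // prints "True" (30 + 20)

// Cut-down stand-ins for the real entities.
public class Document
{
    public long Id { get; set; }
    public decimal Amount { get; set; }
}

public class Payment
{
    public Document Document { get; set; }
    public decimal AppliedAmount { get; set; }
}

// Same shape as the small interface from the split above.
public interface IPaymentsRepository
{
    Task<List<Payment>> GetPaymentsAsync(Document document);
}

// Hypothetical fake: one method to implement instead of 70.
public class FakePaymentsRepository : IPaymentsRepository
{
    private readonly List<Payment> _payments;

    public FakePaymentsRepository(params Payment[] payments) =>
        _payments = payments.ToList();

    public Task<List<Payment>> GetPaymentsAsync(Document document) =>
        Task.FromResult(_payments.Where(p => p.Document == document).ToList());
}

// Hypothetical consumer that depends only on the small interface.
public class PaidTotalCalculator
{
    private readonly IPaymentsRepository _payments;

    public PaidTotalCalculator(IPaymentsRepository payments) =>
        _payments = payments;

    public async Task<decimal> GetPaidTotalAsync(Document document) =>
        (await _payments.GetPaymentsAsync(document)).Sum(p => p.AppliedAmount);
}
```

If the consumer takes the full IRepository instead, the same fake can sit behind its Payments property, and the test still only has to care about the one method it uses.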
Of course, if this starts looking familiar, that’s because this is exactly what Entity Framework is: each DbSet<T> acts as a sub-repository of the main database context. We’ve traded one repository pattern for another. That said, I don’t consider the repository pattern implemented by Entity Framework a “clean” repository, because it’s too specific to Entity Framework itself. This ties back to my point that ‘generic’ repositories never really work: Entity Framework is another example of this. While it’s very powerful and does a lot, I don’t find it as clean for widespread use. Whenever I use it, I end up at a point where I have to repeat myself all over the place to get the same data returned.
In the end
At the end of the day, the design pattern anyone uses is up to them and their team. Team agreement is critical: as long as everyone has the opportunity to express concerns and ideas, the system works.