Impedance Mismatch and System Evolution


Greg Young is talking about the Impedance Mismatch, replying to Stephen Forte's post about the Impedance Mismatch from a while back.

Greg and I have slightly different views regarding the details of many of the things he is talking about, but we are in agreement overall. You might say we take different approaches to the problem, but with the same overall direction.

What I would like to talk about in this post is specifically this statement from Stephen:

My first problem with ORMs in general is that they force you into a "objects first" box. Design your application and then click a button and magically all the data modeling and data access code will work itself out. This is wrong because it makes you very application centric and a lot of times a database model is going to support far more than your application.

Let me start by saying that I absolutely reject this statement:

[The] database model is going to support far more than your application

The database model is private to the application, and is never shared with the outside world. If you need access to my data, here is the service URL, have fun reading from it.

However, one point that is worth dealing with is the data model that results from an objects-first approach, and its implications from the point of view of good DB design.

I have stated many times before that the way I tend to work is by just writing my domain model in my language of choice, and then asking NHibernate to generate the database for me. This is a really good way to get an environment where changing the model has a very rapid feedback cycle, but it doesn't create an optimized data model.
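The actual workflow here is NHibernate generating the schema from C# classes; as a rough, stdlib-only sketch of the same objects-first idea, here is a toy Python example that derives a table from a domain class. The class, the type mapping, and the generated DDL are all illustrative, not NHibernate's output:

```python
import sqlite3
from dataclasses import dataclass, fields

# Illustrative mapping from Python annotations to SQL column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}

@dataclass
class Customer:        # the domain model comes first...
    id: int
    name: str
    balance: float

def create_table_sql(cls) -> str:
    """Derive a CREATE TABLE statement from the class definition."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__} ({cols})"

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(Customer))  # ...and the schema follows from it
```

The point is the direction of the dependency: the schema is derived from the model, so changing the model and regenerating is cheap, which is exactly what gives the rapid feedback cycle.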

It does tend to create good, reasonable data models, and I went to production with such models many times. But it is not the best model that you can have. There are cases where you can see that the one-to-one mapping this approach uses will generate a sub-optimal solution, and you need to take steps to deal with it.

This tends to split into a couple of refactoring options:

  • Non schema breaking changes - these include such things as adding indexes, jobs for updating statistics, and general maintenance and tweaking.
  • Schema breaking changes - these tend to include denormalization, splitting tables, merging tables and in general moving things around.

In most cases, the first option is a mandatory one. I don't add speculative indexes into my model. I only add them as a result of a performance test that shows an indication of a problem in a particular area. I would strongly suggest not going to production without doing this kind of review and applying the appropriate indexes.
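To make the first option concrete, here is a small sketch, using sqlite3 rather than the author's actual stack, of what "add an index only in response to an observed problem" looks like: the query plan shows a full scan, and adding the index changes it to an index search. The table, column, and index names are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 100, i * 1.5) for i in range(1000)])

def plan(sql):
    """Return the query plan detail text for a SQL statement."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)   # a full table scan: the symptom the performance test surfaces

# The fix is a non schema breaking change: the tables and columns are untouched.
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
after = plan(query)    # the same query now searches the index instead
```

Checking the plan before and after is the same discipline as the performance review described above: the index is justified by evidence, not added speculatively.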

The second option is a more interesting one, because it means that you have recognized a better way to model the data in a way that allows better read/write/query. This is also something that most people are extremely reluctant to do.

I have found that in most cases (by no means all, but a decided majority), I can deal with this issue in the mapping layer, and not touch the domain model itself.
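As a hedged illustration of handling it in the mapping layer (again in Python with sqlite3 standing in for NHibernate mappings, with invented table names), suppose the schema-breaking change split one table into two. The domain class is unchanged; only the mapping code knows about the split:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Customer:        # the domain model stays exactly as it was
    id: int
    name: str
    city: str

conn = sqlite3.connect(":memory:")
# The schema breaking change: the original single table was split in two.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_addresses (customer_id INTEGER, city TEXT);
    INSERT INTO customers VALUES (1, 'Northwind');
    INSERT INTO customer_addresses VALUES (1, 'Seattle');
""")

def load_customer(cid: int) -> Customer:
    """The mapping layer hides the split: one object, two tables."""
    row = conn.execute(
        "SELECT c.id, c.name, a.city "
        "FROM customers c JOIN customer_addresses a ON a.customer_id = c.id "
        "WHERE c.id = ?", (cid,)).fetchone()
    return Customer(*row)

c = load_customer(1)
```

The domain code keeps working against the same `Customer` shape, which is what makes this kind of data-model refactoring much less scary than it first appears.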

The end result is that I still get a good model in both the domain model and the data model, but my focus is on the functionality and the business requirements, which tends to speed things up significantly.