Ayende @ Rahien

filter by tags archive

architecture (614) rss
bugs (451) rss
challanges (123) rss
community (381) rss
databases (481) rss
design (896) rss
development (642) rss
hibernating-practices (71) rss
miscellaneous (592) rss
performance (397) rss
programming (1086) rss
raven (1455) rss
ravendb.net (539) rss
reviews (184) rss

2025
- July (5)
- June (7)
- May (10)
- April (10)
- March (10)
- February (7)
- January (12)
2024
- December (3)
- November (2)
- October (1)
- September (3)
- August (5)
- July (10)
- June (4)
- May (6)
- April (2)
- March (8)
- February (2)
- January (14)
2023
- December (4)
- October (4)
- September (6)
- August (12)
- July (5)
- June (15)
- May (3)
- April (11)
- March (5)
- February (5)
- January (8)
2022
- December (5)
- November (7)
- October (7)
- September (9)
- August (10)
- July (15)
- June (12)
- May (9)
- April (14)
- March (15)
- February (13)
- January (16)
2021
- December (23)
- November (20)
- October (16)
- September (6)
- August (16)
- July (11)
- June (16)
- May (4)
- April (10)
- March (11)
- February (15)
- January (14)
2020
- December (10)
- November (13)
- October (15)
- September (6)
- August (9)
- July (9)
- June (17)
- May (15)
- April (14)
- March (21)
- February (16)
- January (13)
2019
- December (17)
- November (14)
- October (16)
- September (10)
- August (8)
- July (16)
- June (11)
- May (13)
- April (18)
- March (12)
- February (19)
- January (23)
2018
- December (15)
- November (14)
- October (19)
- September (18)
- August (23)
- July (20)
- June (20)
- May (23)
- April (15)
- March (23)
- February (19)
- January (23)
2017
- December (21)
- November (24)
- October (22)
- September (21)
- August (23)
- July (21)
- June (24)
- May (21)
- April (21)
- March (23)
- February (20)
- January (23)
2016
- December (17)
- November (18)
- October (22)
- September (18)
- August (23)
- July (22)
- June (17)
- May (24)
- April (16)
- March (16)
- February (21)
- January (21)
2015
- December (5)
- November (10)
- October (9)
- September (17)
- August (20)
- July (17)
- June (4)
- May (12)
- April (9)
- March (8)
- February (25)
- January (17)
2014
- December (22)
- November (19)
- October (21)
- September (37)
- August (24)
- July (23)
- June (13)
- May (19)
- April (24)
- March (23)
- February (21)
- January (24)
2013
- December (23)
- November (29)
- October (27)
- September (26)
- August (24)
- July (24)
- June (23)
- May (25)
- April (26)
- March (24)
- February (24)
- January (21)
2012
- December (19)
- November (22)
- October (27)
- September (24)
- August (30)
- July (23)
- June (25)
- May (23)
- April (25)
- March (25)
- February (28)
- January (24)
2011
- December (17)
- November (14)
- October (24)
- September (28)
- August (27)
- July (30)
- June (19)
- May (16)
- April (30)
- March (23)
- February (11)
- January (26)
2010
- December (29)
- November (28)
- October (35)
- September (33)
- August (44)
- July (17)
- June (20)
- May (53)
- April (29)
- March (35)
- February (33)
- January (36)
2009
- December (37)
- November (35)
- October (53)
- September (60)
- August (66)
- July (29)
- June (24)
- May (52)
- April (63)
- March (35)
- February (53)
- January (50)
2008
- December (58)
- November (65)
- October (46)
- September (48)
- August (96)
- July (87)
- June (45)
- May (51)
- April (52)
- March (70)
- February (43)
- January (49)
2007
- December (100)
- November (52)
- October (109)
- September (68)
- August (80)
- July (56)
- June (150)
- May (115)
- April (73)
- March (124)
- February (102)
- January (68)
2006
- December (95)
- November (53)
- October (120)
- September (57)
- August (88)
- July (54)
- June (103)
- May (89)
- April (84)
- March (143)
- February (78)
- January (64)
2005
- December (70)
- November (97)
- October (91)
- September (61)
- August (74)
- July (92)
- June (100)
- May (53)
- April (42)
- March (41)
- February (84)
- January (31)
2004
- December (49)
- November (26)
- October (26)
- September (6)
- April (10)

Couchbase vs RavenDB Performance at Rakuten Kobo Whitepaper

Apr 18 2009

How Microsoft should release guidance?

time to read 3 min | 463 words

Tweet Share Share 9 comments

Tags:

Phil Haack has a post about code sample taxonomy, in which he asks how Microsoft can ship high quality sample apps:

Obviously, this [shipping high quality samples] is what we should be striving for, but what do we do in the meantime? Stop shipping samples? I hope not.

Again, I don’t claim to have the answers, but I think there are a few things that could help. One twitter response made a great point:

a reference app is going to be grilled. Even more if it comes from the mothership. Get the community involved *before* it gets pub

Getting the community involved is a great means of having your code reviewed to make sure you’re not doing anything obviously stupid. Of course, even in this, there’s a challenge. Jeremy Miller made this great point recently:

We don't have our own story straight yet. We're still advancing our craft. By no means have we reached some sort of omega point in our own development efforts.

In other words, even with community involvement, you’re probably going to piss someone off.

Um, no.

I feel that Jeremy’s point has been pulled out of context. While we may have disagreements about what constitute the future directions in software development, that isn’t actually what cause the community to reject the bad samples from Microsoft.

No one said a word about Kobe because it didn’t use NHibernate, or wasn’t built with messaging or using DDD.

What we reject in those sample app is that they are, put simply, bad coding. There is nothing amorphous about this, we aren’t talking about subjective feeling, we are talking about very concrete, objective and real metrics that we can apply.

Bad Code

Not some omega point.

Bad Code

Not lack of understanding of the platform because the developers writing that were new to ASP.Net MVC.

Bad Code

That would have been bad code if it was written 5 years ago or 15 years ago.

That is the problem.

And I think that there is quite a simple answer for that. Stop shipping official guidance package without involvement from the community.

Create a special CodePlex project where you are being explicit about putting things for review, not for publishing. After the guidance has been published, it is probably too late already.

Get some feedback from the community, then you can publish this in the usual places.

Apr 18 2009

The DAL should go all the way to UI

time to read 4 min | 675 words

Tweet Share Share 73 comments

Tags:

My recent post caused quite a few comments, many of them about two major topics. The first is the mockability of my approach and the second is regarding the separation of layers. This post is about the second concern.

A typical application architecture usually looks like this:

This venerable structure is almost sacred for many people. It is also, incidentally, wrong.

The main problem is that the data access concerns don’t end up in the business layer. There are presentation concerns that affect that as well. Let us take a look at a common example. I want to show the invoices for a user:

Given that requirement, I quickly build my interface, implement it and throw it over the wall to the UI team to deal with it*.

Here is the interface that I came up with:

GetInvoicesForUser(userId)

Great, isn’t it? It satisfy all the business requirements that I have.

Except, my UI can’t actually work with this. We have to have paging there as well, and the only way to do paging using this API is to do that on the client side, which is probably bad. I grumble a little bit but change the interface to:

GetInvoicesForUser(userId, pageNum, pageSize)

Done, right?

Well, not really. I have a new UI requirement now, the user want to be able to sort the grid by any column. Now I grumble even more, because this is harder, but I create the following interface:

GetInvoicesForUser(userId, pageNum, pageSize, orderByCol, orderByDesc)

And then they want to order by multiple columns, and then they…

Do you notice a theme here? A lot of the data access concerns that I have are not actually concerns that the layer above me has, they are one layer removed.

But there is a more important problem here, I am violating (repeatedly) the Open Closed Principle. As a reminder:

software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification

My data access is not open for extension, and it is certainly not closed for modification.

The problem is that I am trying to create a hard isolation between two pieces of the system that need to work very closely together, and as long as we are going to insist on this strict separation, we are going to have problems.

My current approach is quite different. I define queries, which contains the actual business logic for the query, but I pass that query onward to higher levels of my application. Any additional processing on the query (projections, paging, sorting, etc) can be done. In that way, my queries are closed for modification and open for extension, since they can be further manipulated by higher level code.

A good example of that would be:

public class LatePayingCustomerQuery
{
	public DateTime? DueDate {get;set;}
	public decimal? MinAmount {get;set;}
	public deciaml MaxAmount {get;set;}
	
	public IQueryable<Customer> GetQuery(ISession session)
	{
		var payments = 	from payment in s.Linq<Payment>()
			 		where payment.IsLate
					select payment;
	
		if(DueDate != null)
			payments = payments.Where( payment => payment.DueDate >= DueDate.Value );
			
		if(MinAmount != null)
			payments = payments.Where( payment => payment.Amount >= MinAmount.Value );
		
		if(MaxAmount != null)
			payments = payments.Where( payment => payment.Amount <= MaxAmount.Value );
		
		return 	from customer in s.Linq<Customer>
				where payments.Any( payment => customer == payment.Customer )
				select customer;
	}
}

I hope that this clears the air a bit about how I approach data access using NHibernate.

* it is a joke, nitpickers will be shot.

Apr 18 2009

Mocking NHibernate

time to read 2 min | 208 words

Tweet Share Share 15 comments

Tags:

My recent post caused quite a few comments, many of them about two major topics. The first is the mockability of my approach and the second is regarding the separation of layers.

This post is about the first concern. I don’t usually mock my database when using NHibernate. I use an in memory database and leave it at that. There are usually two common behaviors for loading data from the database. The first is when you need to load by primary key, usually using either Get or Load, the second is using a query.

If we are using Get or Load, this is extremely easy to mock, so I won’t touch that any further.

If we are using a query, I categorically don’t want to mock that. Querying is a business concern, and should be treated as such. Setting up an in memory database to be used with NHibernate is easy, you’ll have to wait until the 28th for the post about that to be published, but it is basically ten lines of code that you stick in a base class.

As such, I have no real motivation to try to abstract that away, I gain a lot, and lose nothing.

Apr 18 2009

Posts vs. Comments

time to read 6 min | 1018 words

Tweet Share Share 2 comments

Tags:

Blog

Just some interesting correlation:

Yesterday was the second most active day comments wise, eclipsed only by my single foray into politics.

Another interesting statistics:

# Posts / Day	1	2	3	4	5	6	7	8	9	10	11	12	13	14	17	22
# of Days	336	287	200	165	104	64	28	32	11	11	5	7	3	1	1	1

This table shows the relation between the number of posts per day and the number of times that I posted that number.

So, for example, you can see that my record is (once and never again!) 22 posts per day, and that I quite frequently post more than a single post per day.

Apr 17 2009

Repository is the new Singleton

time to read 5 min | 901 words

Tweet Share Share 76 comments

Tags:

I mentioned in passing that I don’t like the Repository pattern anymore much, and gotten a lot of responses to that. This is the answering post, and yes, the title was chosen to get a rise out of you.

There are actually two separate issues that needs to be handled here. One of them is my issues with the actual pattern and the second is the pattern usage. There most commonly used definition for Repository is defined in Patterns of Enterprise Application Architecture:

A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper, that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated. This becomes more important when there are a large number of domain classes or heavy querying. In these cases particularly, adding this layer helps minimize duplicate query logic.

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. Repository also supports the objective of achieving a clean separation and one-way dependency between the domain and data mapping layers.

Note that while the actual pattern description defined in PoEAA and DDD are very nearly identical, the actual reasoning behind it is different, and the DDD repository is limited to aggregate roots only.

So, what is the problem with that?

The problem with this pattern is that it totally ignores the existence of mature persistence technologies, such as NHibernate. NHibernate already provides an illusion of in memory access, in fact, that is its sole reason of existing. Declarative queries, check. OO view on the persistence store, check. One way dependency between the domain and the data store, check.

So, what do I gain by using the repository pattern when I already have NHibernate (or similar, most OR/M have matching capabilities by now)?

Not much, really, expect as additional abstraction. More than that, the details of persistence storage are:

Complex
Context sensitive
Important

Trying to hide that behind a repository interface usually lead us to a repository that has method like:

FindCustomer(id)
FindCustomerWithAddresses(id)
FindCustomerWith..

It get worse when you have complex search criteria and complex fetch plan. Then you are stuck either creating a method per each combination that you use or generalizing that. Generalizing that only means that you now have an additional abstraction that usually map pretty closely to the persistent storage that you use.

From my perspective, that is additional code that doesn’t have to be written.

Wait, I can hear you say, but repositories encapsulate queries, and removing query logic duplication is one of the reasons for them in the first place.

Well, yes, but encapsulation of queries should be done in the repository. Queries are complex, and you want to encapsulate them in their own object. In most cases, I have something like this:

GetQuery takes an ISession an return ICriteria, which mean that my code gets the chance to set paging, ordering, fetching strategies, etc. That is not the responsibility of the query object, and trying to hide it only add additional abstraction that doesn’t actually give me anything.

I mentioned that I have two problems with the repository pattern, the second being the way it is being used.

Quite frankly, and here I fully share the blame, the Repository pattern is popular. A lot of people use it, mostly because of the DDD association. I am currently in the opinion that DDD should be approached with caution, since if you don’t actually need it (and have the prerequisites for it, such as business expert to work closely with or an app that can actually benefit from it), it is probably going to be more painful to try using DDD than without.

More than that, the way that most people use a Repository more closely follows the DAO pattern, not the Repository pattern. But Repository sounds more cool, so they call it that.

My current approach for data access now is:

When using a database, use NHibernate’s ISession directly
Encapsulate complex queries into query objects that construct an ICriteria query that I can get and manipulate further
When using something other than a database, create a DAO for that, respecting the underlying storage implementation
Don’t try to protect developers

Let us see how many call for my lynching we get now…

Apr 17 2009

What do I expect from guidance?

time to read 2 min | 386 words

Tweet Share Share 7 comments

Tags:

Microsoft

Phil Haack has a post about code sample taxonomy. He suggests the following taxonomy:

Prototype
Demo
Sample
Production
Reference

I am not sure that I agree with his division. From my point of view, demo code need to focus on one single aspect, mostly because we are going to talk around that, and doing that means that I don’t have the time to deal with anything but the aspect that I need to talk about.

Sample code is a more fleshed out example.

I don’t make any distinction between reference and sample apps, maybe the size is different but that is about it.

What is important to follow is that for all of them, except maybe prototype (and I was bitten by that before), I have the same code quality expectations.

Note that I am saying code quality vs. production readiness. Most of the code that you see in demos or sample apps is not production ready. And that is fine.

Production readiness include such things as:

Parameter validation
Security
Error handling
Failover & recovery

And a lot more “taxes” that we need to deal with.

That does not include code quality. Code quality is a metric of its own, and it is one that you cannot give up. I would be personally ashamed to put out something that didn’t meet my own code quality metrics.

Sometimes, being production ready means that you have to give up on code quality, here is one example, that is understood and acceptable in certain circumstances.

What is not acceptable is to say that this is “Xyz” type of code sample, therefore code quality doesn’t matter.

And code quality isn’t really some amorphous thing. It is a very well defined metric that can be automatically checked. FxCop or NDepend are good tools to have, Simian will give you a lot of value all on its own.

When you put something out as an official Microsoft guidance release, the bar is higher, because whatever you do will be used by other people.

Apr 17 2009

Kobe – Architectural Overview

time to read 3 min | 449 words

Tweet Share Share 17 comments

Tags:

WTF?!

Sometimes, you have to make a distinction between the system architecture and the system implementation. It is possible to have a great architecture and have the implementers butcher it by not paying attention at some points.

Sometimes, the problem in the code implementation is so bad that it becomes an architectural problem. Kobe is one such example.

We can start with the pervasive Primitive obsession that is all over the map. The common use of “string thisIsReallyGuid”, Copy & paste programming, horrible Exception Handling and total lack of respect for the fundamentals of OO.

All of those tie back together to the lack of architectural oversight over the project. I have the feeling that at no point someone stopped and said: “there has got to be a better way of doing this.”

DRY is a true principle in development, it transcend programming languages and paradigms, about the only one that I can think of that does so. If you violate DRY, you are in for a world of trouble, sooner than later.

SOLID is a set of principles used to build good OO applications. If you are using OO language and paradigms and fail to pay attention to SOLID, it will hurt you at some point. Not as soon as if you have violated DRY, but soon enough.

8.5% of Kobe is copy & pasted code. And that is with the sensitivity dialed high, if we set the threshold to 3, which is what I commonly do, is goes up to 12.5%. A lot of the duplicated code relates to exception handling and caching. Those are classic cases for using an aspect to handle cross cutting concerns. That would reduce a lot of the duplicated code.

The other problems that I have with Kobe is its total lack of respect for SOLID principles. You know what, let us throw solid out the window, I am going to focus on a single principle, Single Responsibility Principle. Just about any method that you care to name does more than one thing, in many cases, they do a lot more than one thing.

This is absolutely wrong.

The “repositories” in Kobe has so many responsibilities that they are just shy of having a singularity around them.

Overall, Kobe feels very amaturish, as if it was written without true understanding of what is expected of quality code, and without really understanding the underlying platform, frameworks or the infrastructure that they are using.

I don’t like it at all, and it is a horrible sample application. In my considered opinioned, it should be pulled from the site, rewritten, and only then published again.

Apr 17 2009

Kobe – an example of exception handling done wrong

time to read 3 min | 426 words

Tweet Share Share 10 comments

Tags:

WTF?!

In Kobe, pretty much all exception handling is this:

catch (Exception ex)
{
    bool rethrow = ExceptionPolicy.HandleException(ex, "Service Policy");
    if (rethrow && WebOperationContext.Current == null)
    {
        throw;
    }
    return null;
}

These lines appear in just about any method in the application.

They are wrong. In fact, they are wrong on more than one level. Violating DRY is a big offence, but even ignoring that, they are still wrong.

Exception handling is not something that you should take lightly, and there is a strong tie between the way that you write the code and the way errors are handled. The whole purpose of this exception handling is to avoid throwing an exception by setting some configuration parameter.

The problem with this is that this purpose is wrong. This is a case of writing the code without actually trying to understand what the implications are. You cannot do this sort of thing, because the calling code never expects a null, it expect a valid result or an exception.

This approach bring us back to the bad old days of error codes. Not to mention that even if we actually checked for a null return value, what the hell is the error that actually caused this to happen. Oh, it was put in the log, so now we need to go and look at the log file and try to match timestamps (if we even can).

Exception handling should appear in exactly two places:

When an error is expected (making a web request call, for example) and there is some meaningful behavior to be done in the case of a failure (such as retrying after some delay)
On a system boundary, in which case you need to make a decision about how you are going to expose the error to the outside world.

It is important to note that I explicitly used the term system boundary, instead of service boundary, which is more commonly used. Exception handling within a system should not try to hide errors. Error should propagate all the way to the system boundary, because otherwise we lose a lot of important information.

System boundary is defined as whatever is visible to the actual user or other systems.

Oh, and exception handling in the system boundary is a system infrastructure problem, it is not something that you have to write code for all over the place.

Apr 17 2009

Kobe – When the documentation is the only delivery that matters

time to read 2 min | 251 words

Tweet Share Share 3 comments

Tags:

WTF?!

When comparing the Kobe code base to the Kobe documentation, I am starting to get a sinking feeling. The documentation is really nice, the code is anything but.

When you put out guidance, the documentation is nice, but it is the code that people are going to look it, learn and try to emulate. And when you put out something that doesn’t meet basic parameters for good software, that is a huge problem.

With Oxite, Microsoft had the excuse of it being slipped out without anyone noticing. This time it is official, it has passed the proper review procedure, and is explicitly marketed as “intended to guide you with the planning, architecting, and implementing of Web 2.0 applications and services.”

Sorry, that is entirely unacceptable.

Putting something like that out there as guidance is actively doing harm to the community as a whole. It means that people who would look at that and try to use what they see there are going to get bitten by bad practices, bad architecture and overly complex and redundant approaches.

Phil Haack said that: “I was part of a high-level review and I felt most of the app that I saw was reasonably put together.”

I submit that high level review of such guidance packages is absolutely not sufficient. You have to get people to do a full review cycle of that, which include the code, the documentation, the reasoning and anything else that is going out there.

Apr 17 2009

Kobe – Data Access done wrong

time to read 4 min | 650 words

Tweet Share Share 21 comments

Tags:

WTF?!

From the Kobe documentation:

The Repository pattern is applied to mediate access to the data store through a Repository provider.

From PoEAA:

A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper (165), that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated. This becomes more important when there are a large number of domain classes or heavy querying. In these cases particularly, adding this layer helps minimize duplicate query logic.

I think that Repository is a fad, which is why I don’t use it much anymore myself, and looking at the usage semantics of the “repositories” in Kobe, they certainly don’t fit the bill.

Repositories are here to encapsulate query logic, that is not the case with Kobe. Here is a partial view of some of the repositories that it have:

In general, looking at the any of the repositories in detail, I really want to cry. I mean, just take a look:

More to the point, let us look at GroupRepository.AcceptGroupInvite (I removed all the copy & paste crap to concentrate on the actual code):

GroupInvite groupInvite = _context.GroupInviteSet
                                 .Where(gi => gi.GroupInviteId == groupInviteId)
                                 .FirstOrDefault();

if (groupInvite == null)
    throw new DataException("Invalid Invite Id", null);

groupInvite.InvitationStatuses = _context.InvitationStatusSet
                                        .Where(s => s.InvitationStatusName == "Accepted")
                                        .FirstOrDefault(); ;

_context.SaveChanges(true);

There are so many things wrong with this approach that I don’t even know where to start.

You can see how the architects or developers resented having to use an ORM. If they couldn’t have stored procedures in the database, then by Jove and his keyboard, they will write stored procedures in code.

Kobe uses Entity Framework for data access, and they treat it as if it was a dataset. As an OR/M aficionado, I am insolated. This shows complete lack of understanding how Entity Framework work, doing a lot more work than you should, brute force & big hammer approach.

From an architectural perspective, most of what Kobe is doing is calling down to the repository to perform the data access details. Business logic, such as there is, is spread between the data access, the controllers and the service layer, in such a way that make is very hard to have a good idea about what is actually happening in the application.

Data access details are quite pervasive throughout the application, and the whole feeling of the application is of a circa 2001 sample app using stored procedures and hand rolled data access layers.

Oren Eini

Oren Eini

CEO of RavenDB

How Microsoft should release guidance?

The DAL should go all the way to UI

Mocking NHibernate

Posts vs. Comments

Repository is the new Singleton

What do I expect from guidance?

Kobe – Architectural Overview

Kobe – an example of exception handling done wrong

Kobe – When the documentation is the only delivery that matters

Kobe – Data Access done wrong

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed