Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,524
|
Comments: 51,151
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 93 words

I keep getting asked by people: “What is the configuration option to make NHibernate run faster?”

People sort of assume that NHibernate is configured to be slow by default because it amuses someone.

Well, while there isn’t a “Secret_incantation” = “chicken/sacrifice” option in NHibernate, there is this one:

image

And it pretty much does the same thing.

No, I won’t explain why. Go read the docs.

time to read 1 min | 165 words

So here is my near term schedule:

  • A 2 days NHibernate course in London on Feb 25 – those are intense 2 days where I am going to jump you from novice to master in NHibernate, including understanding what is going on and much more importantly, why?!
  • A 3 days RavenDB course in London on Feb 27 – in this course we are going to go over RavenDB 2.x from top to bottom, from how it works, to all the little tricks and hidden details that you wished you knew. We are going to cover architecture, design, implementation and a lot of goodies. Also, there is going to be a lot of fun, and I promise that you go back to work with a lot less load on your shoulders.
  • A 2 days condensed RavenDB course in Stockholm on Mar 4 – Like the previous one, but with a sharper focus, and the guarantee that you will go Wow and Oh! on a regular basis.
time to read 1 min | 156 words

I’ll be spending the last week on November in London, at the Skills Matter offices.

On the 26 Nov, I’ll be giving a 3 days NHibernate course.

And on the 29 Nov, I’ll be giving 2 full days of RavenDB awesomeness. This course is scheduled to run along the same time as the RavenDB 1.2 release, which leads me to the In The Brains session I’ll be giving along the way.

What is new in RavenDB 1.2

RavenDB 1.0 was exciting and fun. RavenDB 1.2 builds on top of that and adds a whole host of really nice features.
Come to hear about the new Changes API, or how you can use evil patching to make the database bow to your wished. Learn how you can add encryption and compression to your database in a few minutes, and watch how operational tasks became even simpler. In short, come and see all of the new stuff for RavenDB!

time to read 10 min | 1818 words

I was asked to comment on this topic:

image

Here is the link to the full article, but here is the gist:

Performance.
The culprit of v5 is our usage of NHibernate. This doesn't mean that NHibernate is evil, just that it has turned out to be the wrong fit for our project (in part due to our architecture and db schema, in part due to lack of proper NHibernate experience on the team). It's a badly tamed NHibernate that is caused the many N+1 SQL statements. The long term solution is to replace NHibernate with native, optimized SQL calls but it's a bigger operation that we'll look into over the summer. The short team solution is to continue with finding as many bottlenecks as we can as tune those, combined with introducing a Lucene based cache provider which will cache queries and results which will survive app pool restarts (much like the v4 XML cache). The architecture has been prepared for the latter and we'll have this in place by the end of May.


Another thing that causes massive performance issues is that we're seeing a trend where people are calling ToList() on razor queries to compensate for the lack of LINQ features in our implementation. Unfortunately, this causes most (if not all) of the database to be loaded into memory, naturally causing big performance issues (and an enormous number of N+1 SQL calls). This leads us to...

And a while later:

V5 couldn't be fixed. We needed to completely rewrite the entire business layer to ever get v5 into being the type of product people expect from Umbraco

I marked the parts that I intend to respond to in yellow, green highlight is for some additional thoughts later on, keep it in mind.

Now, to be fair, I looked at Umbraco a few times, but I never actually used it, and I never actually dug deep.

The entire problem can be summed in this page, from the Umbraco documentation site. This is the dynamic model that Umbraco exposes to the views to actually generate the HTML. From the docs:

DynamicModel is a dynamic object, this gives us a number of obvious benefits, mostly a much lighter syntax for accessing data, as well as dynamic queries, based on the names of your DocumentTypes and their properties, but at the same time, it does not provide intellisense.

Any time you provide the view with a way to generate queries, it is a BIG issue. For the purpose of discussion, I’ll focus on the following example,  also taken from Umbraco docs:

   1:  @inherits PartialViewMacroPage
   2:  @using Umbraco.Cms.Web
   3:  @using Umbraco.Cms.Web.Macros
   4:  @using Umbraco.Framework
   5:   
   6:  @{
   7:      @* Get the macro parameter and check it has a value - otherwise set to empty hive Id *@
   8:      var startNodeID = String.IsNullOrEmpty(Model.MacroParameters.startNodeID) ? HiveId.Empty.ToString() : Model.MacroParameters.startNodeID;
   9:  }
  10:   
  11:  @* Check that startNodeID is not an empty HiveID AND also check the string is a valid HiveID *@
  12:  @if (startNodeID != HiveId.Empty.ToString() && HiveId.TryParse(startNodeID).Success)
  13:  {
  14:      @* Get the start node as a dynamic node *@
  15:      var startNode = Umbraco.GetDynamicContentById(startNodeID);
  16:   
  17:      @* Check if the startNode has any child pages, where the property umbracoNaviHide is not True *@    
  18:      if (startNode.Children.Where("umbracoNaviHide != @0", "True").Any())
  19:      {
  20:          <ul>
  21:              @* For each child page under startNode, where the property umbracoNaviHide is not True *@
  22:              @foreach (var page in startNode.Children.Where("umbracoNaviHide != @0", "True"))
  23:              { 
  24:                  <li><a href="@page.Url">@page.Name</a></li>
  25:              }
  26:          </ul>
  27:      }    
  28:  }

This is a great example of why you should never do queries from the views. Put simply, you don’t have any idea what is going to happen. Let us start going through the onion, and see where we get.

Umbraco.GetDynamicContentById – Is actually UmbracoHelper.GetDynamicContentById, which looks like this:

image

I mean, I guess I could just stop right here, look at the first line, and it should tell you everything. But let us dig deeper. Here is what GetById looks like:

image

Note that with NHibernate, this is a worst practice, because it forces a query, rather than allow us to use the session cache and a more optimized code paths that Load or Get would use.

Also, note that this is generic on top of an IQueryable, so we have to go one step back and look at Cms().Content, which is implemented in:

image

I just love how we dispose of the Unit of Work that we just created, before it is used by any client. But let us dig a bit deeper, okay?

image

image

image

And we still haven’t found NHibernate! That is the point where I give up, I am pretty sure that Umbraco 5 is using NHibernate, I just can’t find how or where.

Note that in those conditions, it is pretty much settled that any optimizations that you can offer isn’t going to work, since you don’t have any good way to apply it, it is so hidden behind all those layers.

But I am pretty sure that it all end up with a Linq query on top of  a session.Query<Content>(), so I’ll assume that for now.

So we have this query in line 15, to load the object, then, on line 18, we issue what looks like a query:

if (startNode.Children.Where("umbracoNaviHide != @0", "True").Any())

We actually issue two queries here, one to load the Children collection, and the second in the Where(). It took a while to figure out what the Where was, but I am pretty sure it is:

image

Which translate the string to a Linq query, which then have to be translated to a string again. To be perfectly honest, I have no idea if this is executed against the database ( I assume so ) or against the in memory collection.

Assuming it is against the database, then we issue the same query again, two lines below that.

And this is in a small sample, think about the implications for a more complex page. When you have most of you data access so abstracted, it is very hard to see what is going on, let alone do any optimizations. And when you are doing data access in the view, the very last thing that the developer has in his mind is optimizing data access.

This is just setting up for failure. Remember what I said about: replace NHibernate with native, optimized SQL calls, this is where it gets interesting. Just about every best practice of NHibernate was violated here, but the major problem is not the NHibernate usage. The major problem is that there is very little correlation between what is happening and the database. And the number of queries that are likely to be required to generate each page are truly enormous.

You don’t issues queries from the view, that is the #1 reason for SELECT N+1, and when you create a system where that seems to be the primary mode of operation, that is always going to cause you issues.

Replace NHibernate with optimized SQL calls, but you are still making tons of direct calls to the DB from the views.

I actually agree with the Umbraco decision, it should be re-written. But the starting point isn’t just the architectural complexity. The starting point should be the actual interface that you intend to give to the end developer.

time to read 3 min | 497 words

We have been head down for a while, doing some really cool things with RavenDB (sharding, read striping, query intersection, indexing reliability and more). But that meant that for a while,things that are not about writing code for RavenDB has been more or less on auto-pilot.

So here are some things that we are planning. We will increase the pace of RavenDB courses and conference presentation. You can track it all in the RavenDB events page.

Conferences

RavenDB Courses

NHibernate Courses

Not finalized yet

  • August 2012.
    • User groups talks in Philadelphia & Washington DC by Itamar Syn-Hershko.
    • One day boot camp for moving from being familiar with RavenDB to being a master in Chicago.
  • September 2012.
    • RavenDB Course in Austin, Texas.

Consulting Opportunities

We are also available for on site consulting in the following locations and times. Please contact us directly if you would like to arrange for one of RavenDB core team to show up at your door step. Or if you want me to do architecture or NHibernate consulting.

  • Oren Eini – Malmo, June 26 – 27.
  • Oren Eini – Berlin, July 2 – 4.
  • Itamar Syn-Hershko – New York, Aug 23.
  • Itamar Syn-Hershko – Chicago, Aug 30 or Sep 3.
  • Itamar Syn-Hershko – Austin, Sep 3.
  • Itamar Syn-Hershko – Toronto, Sep 9 – 10.
  • Itamar Syn-Hershko – London, Sep 11.
time to read 3 min | 490 words

This is a review of the S#arp Lite project, the version from Nov 4, 2011.

This project is significantly better than the S#arp Arch project that I reviewed a while ago, but that doesn’t mean that it is good. There is a lot to like, but frankly, the insistence to again abstract the data access behind complex base classes and repositories makes things much harder in the longer run.

If you are writing an application and you find yourself writing abstractions on top of CUD operations, stop, you are doing it wrong.

I quite like S#arp approach for querying, though. You expose things directly, and if it is ugly, you just wrap it in a dedicated query object. That is how you should be handling things.

Finally, whenever possible, push things to the infrastructure, it is usually pretty good and that is the right level of handling things like persistence, validation, etc. And no, you don’t have to write that, it is already there.

A lot of the code in the sample project was simply to manage persistence and validation (in fact, there was an entire project for that) that could be safely deleted in favor of:

public class ValidationListener : NHibernate.Event.IPreUpdateEventListener, NHibernate.Event.IPreInsertEventListener
{
    public bool OnPreUpdate(PreUpdateEvent @event)
    {
        if (!DataAnnotationsValidator.TryValidate(@event.Entity)) 
            throw new InvalidOperationException("Updated entity is in an invalid state");

        return false;
    }

    public bool OnPreInsert(PreInsertEvent @event)
    {
        if (!DataAnnotationsValidator.TryValidate(@event.Entity))
            throw new InvalidOperationException("Updated entity is in an invalid state");

        return false;
    }
}

Register that with NHibernate, and it will do that validation work for you, for example. Don’t try too hard, it should be simple, if it ain’t, you are either doing something very strange or you are doing it wrong, and I am willing to bet on the later.

To be clear, the problems that I had with the codebase were mostly with regards to the data access portions. I didn’t have any issues with the rest of the architecture.

time to read 2 min | 286 words

This is a review of the S#arp Lite project, the version from Nov 4, 2011.

Okay, after going over all of the rest of the application, let us take a look a the parts that actually do something.

the following are from the CustomerController:

image

It is fairly straight forward, all in all. Of course, the problem is that is isn’t doing much. The moment that it does, we are going to run into problems. Let us move a different controller, ProductController and the Index action:

image

Seems fine, right? Except that in the view…

image

As you can see, we got a Select N+1 here. I’ll admit, I actually had to spend a moment or two to look for it (hint, look for @foreach  in the view, that is usually an indication of a place that requires attention).

The problem is that we really don’t have anything to do about it. If we want to resolve this, we would have to create our own query object to completely encapsulate the query. But all we need is to just add a FetchMany and we are done, except that there is that nasty OR/M abstraction that doesn’t do much except make our life harder.

time to read 4 min | 674 words

This is a review of the S#arp Lite project, the version from Nov 4, 2011.

So far, we have gone over the general structure and the domain. Next on the list is going over the tasks project. No idea what this is. I would expect that to be some sort of long running, background tasks, but haven’t checked it yet.

Unfortunately, this isn’t the case. Tasks seems to be about another name for DAL. And to make matter worse, this is a DAL on top or a Repository on top of an OR/M.

And as if that wasn’t enough to put my teeth on edge, we got some really strange things going on there. Let us see how it goes.

image

Basically, a CudTask seems to be all about translating from the data model (I intentionally don’t call it domain model) to the view model. I spoke about the issues with repositories many times before, so I’ll suffice with saying that this is still wasteful and serve no real purpose and be done with it.

This TransferFormValuesTo() is a strange beast, and it took me a while to figure out what is going on here. let us look in the parent class to figure out what is going on there.

image

Let me count the things that are wrong here.

First, we have this IsTransient() method, why do we have that? All we need to do is just to call SaveOrUpdate and it will do it for us. Then the rest of the method sunk in.

The way this system works, you are going to have two instances of every entity that you load (not really true, by the way, because you have leakage for references, which must cause some really interesting bugs). One instance is the one that is managed by NHibernate, dealing with lazy loading, change management, etc. The second is the value that isn’t managed by NHibernate. I assume that this is an instance that you get when you bind the entity view the action parameters.

NHibernate contains explicit support for handling that (session.Merge), and that support is there for bad applications. You shouldn’t be doing things this way. Extend the model binder so it would load the entity from NHibernate and bind to that instance directly. You wouldn’t have to worry about all of this code, it would just work.

For that matter, the same goes for validation as well, you can push that into NHibernate as a listener. So all of this code just goes away, poof!

And then there is the Delete method:

image

I mean, is there a rule that says “developers should discard error information as soon as possible, because it is useless.” I mean, the next step is to see C# code littered with things like:

catch(Exception e)
{
   delete e; // early release for the memory held by the exception
}

The one good thing that I can say about the CudTasks is that at least they are explicit about not handling reads, and that reads seems to be handled properly so far (but I haven’t looked at the actual code yet).

time to read 4 min | 640 words

This is a review of the S#arp Lite project, the version from Nov 4, 2011.

In my previous post, I looked at the general structure, but not much more. In this one, we are going to focus on the Domain project.

We start with the actual domain:

image

I have only few comments about this sort of model:

  • This is a pure CRUD model, which is good, since it is simple and easy to understand, but one does wonder where the actual business logic of the system is. It might be that there isn’t any (we are talking about a sample app, after all).
  • The few methods that are there are also about data (in this case, aggregation, and Order.GetTotal() will trigger a lazy loaded query when called, which might be a surprise to the caller.
  • Probably the worst point of this object model is that it is highly connected, which encourages people to try to walk the object graphs where they should issue a separate query instead.

Next, let us look at the queries. We have seen one example where NHibernate low level API was hidden behind an interface, but that was explicitly called out as rare. So how does this get handled on a regular basis?

image

Hm… I have some issues here with regards to the naming. I don’t like the “Find” vs. “Query” naming. I would use WhereXyz to add a filter and SelectXyz to add a transformation. It would read better when writing Linq queries, but that is about it for the domain.

One thing that I haven’t touched so far is the entities base class:

image

And its parent:

image

I strongly support the notion of ComparableObject, this is recommended when you use NHibernate. But what is it about GetTypeSpecificSignatureProperties? What it actually does is select all the properties that has the [DomainSignature] attribute. But what would you want something like that?

Looking at the code, the Customer.FirstName and Customer.LastName have this attribute, looking at the code, I really can’t understand what went on here. This seems to be selected specifically to create hard to understand and debug bugs.

Why do I say that? The ComparableObject uses properties marked with [DomainSignature] for the GetHashCode() calculation. What this means is that if you change the customer name you change its hash code value. This hash code value is used for, among other things, finding the entity in the unit of work, so changing the customer name can cause NHibernate to loose track of it and behave in some really strange ways.

This is also violating one of the core principals of entities:

A thing with distinct and independent existence.

In other words, an entity doesn’t exists because of the particular values that are there for the first and last names. If those change, the customer doesn’t change. It is the same as saying that by changing the shirt I wear, I becomes a completely different person.

Domain Signature is something that I am completely opposed, not only for the implementation problems, but because it has no meaning when you start to consider what an entity is.

Next, we are going to explore tasks…

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. Challenge (75):
    01 Jul 2024 - Efficient snapshotable state
  2. Recording (14):
    19 Jun 2024 - Building a Database Engine in C# & .NET
  3. re (33):
    28 May 2024 - Secure Drop protocol
  4. Meta Blog (2):
    23 Jan 2024 - I'm a JS Developer now
  5. Production Postmortem (51):
    12 Dec 2023 - The Spawn of Denial of Service
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}