Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 588 words

I recently got an estimate for a feature that I wanted to add to NH Prof. It was for two separate features, actually, but they were closely related.

That estimate was for 32 hours.

And it caused me a great deal of indigestion. The problem was, quite simply, that even granting the usual padding of estimates (which I expect), that estimate was off, way off. I knew what would be required for this feature, and it shouldn’t be anywhere near complicated enough to require 4 days of full time work. In fact, I estimated that it would take me a maximum of 6 hours, and more probably 3 hours, to get it done.

Now, to be fair, I know the codebase (well, actually that isn’t true, a lot of the code for NH Prof was written by Rob & Christopher, and after a few days of working with them, I stopped looking at the code, there wasn’t any need to do so). And I am well aware that most people consider me to be an above average developer.

I wouldn’t have batted an eye for an estimate of 8 – 14 hours, probably. Part of the reason that I have other people working on the code base is that even though I can do certain things faster, I can only do so many things, after all.

But a time estimate that was 5 – 10 times as large as my own was too annoying. I decided that I was going to do this feature on my own. And I decided that I wanted to do it on the clock.

The result is here:

image

This is actually the total time over three separate sittings, but the timing is close enough to what I thought it would be.

This includes everything, implementing the feature, unit testing it, wiring it up in the UI, etc.

The only thing remaining is to add the UI work for the other profilers (Entity Framework, Linq to SQL, Hibernate and the upcoming LLBLGen Profiler). Doing this now…

image

And we are done.

I have more features that I want to implement, but in general, if I pushed those changes out, they would be a new release that customers can use immediately.

Nitpicker corner: No, I am not being ripped off. And no, the people making the estimates aren't incompetent. To be perfectly honest, looking at the work that they did do and the time marked against it, they are good, and they deliver in a reasonable time frame. What I think is going on is that their estimates are highly conservative, because they don't want to get into a bind with "oh, we ran into a problem with XYZ and overran the time for the feature by 150%".

That also leads to a different problem. When you pay by the hour, you really want estimates that are more or less reasonably tied to your payments. But estimating in hours is too fine-grained to be really effective (a single bug can easily consume hours, totally throwing off the estimate, and it doesn't even have to be a complex one).

time to read 2 min | 361 words

An interesting thing happened recently. When I started to build the profiler, a lot of the features were what I call Core Features. Those were the things without which we wouldn’t have a product. Things like detecting SQL, merging it into sessions, providing reports, etc. What I find myself doing recently with the profiler is not so much building Core Features, but building UX features. In other words, now that we have this in place, let us see how we can make better use of it.

Case in point, the new features that were just released in build 713. They aren’t big, but they are there to improve how people are commonly using the products.

Renaming a session:

image

This is primarily useful if you are in a long profiling session and you want to mark a specific session with some notation:

image

Small feature, and individually not very useful. But you might have noticed that the sessions are marked with stars around them. They weren’t there in previous builds, so what are they?

image

They are a way to tell the profiler that you really like those sessions :-)

More to the point, such sessions will not be removed when you clear the current state. That lets you keep around the previous state of the application as a baseline while you work to improve it. Besides, it makes it much easier to locate them visually.

And finally, as a quicker way to do that, you can just ask the profiler to clear all but the selected sessions.

image

Not big features, but nice ones, I think.

time to read 9 min | 1691 words

I thought it would be a good idea to see what sort of data access behavior LightSwitch applications have. So I hooked it up with the Entity Framework Profiler and took it for a spin.

It is interesting to note that it seems that every operation that is running is running in the context of a distributed transaction:

image

There is a time & place for distributed transactions, but in general, you should avoid them until you really need them. I assume that this is actually being triggered by WCF behavior and is not intentional.

Now, let us look at what a simple search looks like:

image

This search results in:

image

That sound? Yes, the one that you just heard. That is the sound of a DBA somewhere expiring. The presentation about LightSwitch touted how you can search every field. And you certainly can. You can also swim across the English Channel, but I found that taking the train seems to be an easier way to go about doing this.

This sort of searching is going to:

  • Be very expensive once you have any reasonable amount of data.
  • Prevent the use of indexes to optimize performance.

In other words, this is an extremely brute force approach, and it is going to be pretty bad from a performance perspective.
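To make the index point concrete, here is a minimal in-memory sketch. It assumes the generated search is a contains-style match on each field (my reading of the screenshot, not something I verified in the LightSwitch source), and it uses a sorted array as a stand-in for a database index. `SearchCostDemo` and its methods are hypothetical names made up for this illustration.

```csharp
using System;
using System.Linq;

public static class SearchCostDemo
{
    // Prefix search over sorted names: binary search to the first candidate,
    // then walk forward while the prefix still matches. This is roughly what
    // an index seek buys you.
    public static int PrefixSearchComparisons(string[] sortedNames, string prefix)
    {
        int comparisons = 0;
        int lo = 0, hi = sortedNames.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            comparisons++;
            if (string.CompareOrdinal(sortedNames[mid], prefix) < 0) lo = mid + 1;
            else hi = mid;
        }
        while (lo < sortedNames.Length)
        {
            comparisons++;
            if (!sortedNames[lo].StartsWith(prefix, StringComparison.Ordinal)) break;
            lo++;
        }
        return comparisons;
    }

    // Contains search: the term can appear anywhere in the value, so the sort
    // order does not help and every row has to be inspected.
    public static int ContainsSearchComparisons(string[] names, string term, out int matches)
    {
        int comparisons = 0;
        matches = 0;
        foreach (var name in names)
        {
            comparisons++;
            if (name.Contains(term)) matches++;
        }
        return comparisons;
    }

    public static void Main()
    {
        // Zero padded, so the array is already in ordinal order.
        var names = Enumerable.Range(0, 100000)
            .Select(i => "animal" + i.ToString("D6"))
            .ToArray();

        int matches;
        Console.WriteLine(PrefixSearchComparisons(names, "animal099999"));
        Console.WriteLine(ContainsSearchComparisons(names, "99999", out matches)); // 100000
    }
}
```

The prefix search touches a handful of entries, while the contains search touches every row, and that is for a single column. Searching every field multiplies that cost by the number of searchable columns.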

Interestingly, it seems that LS is using optimistic concurrency by default.

image

I wonder why they use the slowest method possible for this, instead of using version numbers.
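For comparison, here is a minimal sketch of what version-number based optimistic concurrency looks like. It is an in-memory stand-in, not LightSwitch or Entity Framework code; `AnimalRow`, `VersionedStore` and `TryUpdateName` are hypothetical names, and the SQL in the comment is the shape such an update usually takes rather than anything the profiler captured.

```csharp
using System;
using System.Collections.Generic;

public class AnimalRow
{
    public int Id;
    public string Name;
    public int Version; // incremented on every successful update
}

public class VersionedStore
{
    private readonly Dictionary<int, AnimalRow> rows = new Dictionary<int, AnimalRow>();

    public void Insert(int id, string name)
    {
        rows[id] = new AnimalRow { Id = id, Name = name, Version = 1 };
    }

    public int CurrentVersion(int id)
    {
        return rows[id].Version;
    }

    // Roughly equivalent to:
    //   UPDATE Animals SET Name = @name, Version = Version + 1
    //   WHERE Id = @id AND Version = @expectedVersion
    // Returns false when someone else changed the row first.
    public bool TryUpdateName(int id, int expectedVersion, string newName)
    {
        AnimalRow row;
        if (!rows.TryGetValue(id, out row) || row.Version != expectedVersion)
            return false;
        row.Name = newName;
        row.Version++;
        return true;
    }
}

public static class VersionDemo
{
    public static void Main()
    {
        var store = new VersionedStore();
        store.Insert(1, "Rex");

        // Two users read the same row at the same version.
        int seenByFirstUser = store.CurrentVersion(1);
        int seenBySecondUser = store.CurrentVersion(1);

        Console.WriteLine(store.TryUpdateName(1, seenByFirstUser, "Rex II")); // True
        Console.WriteLine(store.TryUpdateName(1, seenBySecondUser, "Max"));   // False
    }
}
```

The check is a single integer comparison on top of the primary key lookup, regardless of how many columns the row has.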

Now, let us see how it handles references. I think that I ran into something which is a problem, consider:

image

Which generates:

image

This makes sense only if you think in terms of the underlying data model. It certainly seems backward to me.

I fixed that, and created four animals, each as the parent of the other:

image

Which is nice, except that here is the SQL required to generate this screen:

-- statement #1
SELECT [GroupBy1].[A1] AS [C1]
FROM   (SELECT COUNT(1) AS [A1]
        FROM   [dbo].[AnimalsSet] AS [Extent1]) AS [GroupBy1]

-- statement #2
SELECT   TOP ( 45 ) [Extent1].[Id]              AS [Id],
                    [Extent1].[Name]            AS [Name],
                    [Extent1].[DateOfBirth]     AS [DateOfBirth],
                    [Extent1].[Species]         AS [Species],
                    [Extent1].[Color]           AS [Color],
                    [Extent1].[Pic]             AS [Pic],
                    [Extent1].[Animals_Animals] AS [Animals_Animals]
FROM     (SELECT [Extent1].[Id]                      AS [Id],
                 [Extent1].[Name]                    AS [Name],
                 [Extent1].[DateOfBirth]             AS [DateOfBirth],
                 [Extent1].[Species]                 AS [Species],
                 [Extent1].[Color]                   AS [Color],
                 [Extent1].[Pic]                     AS [Pic],
                 [Extent1].[Animals_Animals]         AS [Animals_Animals],
                 row_number()
                   OVER(ORDER BY [Extent1].[Id] ASC) AS [row_number]
          FROM   [dbo].[AnimalsSet] AS [Extent1]) AS [Extent1]
WHERE    [Extent1].[row_number] > 0
ORDER BY [Extent1].[Id] ASC

-- statement #3
SELECT [Extent1].[Id]              AS [Id],
       [Extent1].[Name]            AS [Name],
       [Extent1].[DateOfBirth]     AS [DateOfBirth],
       [Extent1].[Species]         AS [Species],
       [Extent1].[Color]           AS [Color],
       [Extent1].[Pic]             AS [Pic],
       [Extent1].[Animals_Animals] AS [Animals_Animals]
FROM   [dbo].[AnimalsSet] AS [Extent1]
WHERE  1 = [Extent1].[Id]

-- statement #4
SELECT [Extent1].[Id]              AS [Id],
       [Extent1].[Name]            AS [Name],
       [Extent1].[DateOfBirth]     AS [DateOfBirth],
       [Extent1].[Species]         AS [Species],
       [Extent1].[Color]           AS [Color],
       [Extent1].[Pic]             AS [Pic],
       [Extent1].[Animals_Animals] AS [Animals_Animals]
FROM   [dbo].[AnimalsSet] AS [Extent1]
WHERE  2 = [Extent1].[Id]

-- statement #5
SELECT [Extent1].[Id]              AS [Id],
       [Extent1].[Name]            AS [Name],
       [Extent1].[DateOfBirth]     AS [DateOfBirth],
       [Extent1].[Species]         AS [Species],
       [Extent1].[Color]           AS [Color],
       [Extent1].[Pic]             AS [Pic],
       [Extent1].[Animals_Animals] AS [Animals_Animals]
FROM   [dbo].[AnimalsSet] AS [Extent1]
WHERE  3 = [Extent1].[Id]

I told you that there is a Select N+1 built into the product, now didn’t I?

Now, to make things just that much worse, it isn’t actually a Select N+1 that you’ll easily recognize, because this doesn’t happen on a single request. Instead, we have a multi tier Select N+1.

image

What is actually happening is that in this case, we make the first request to get the data, then we make an additional web request per returned result to get the data about the parent.

And I think that you’ll have to admit that a Parent->>Children association isn’t something that is out of the ordinary. In a typical system, where you may have many associations, this “feature” alone is going to slow the system to a crawl.
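The arithmetic is easy to reproduce without a database. The sketch below uses a fake data store that counts round trips; `FakeAnimalDb`, `LazyParents` and `BatchedParents` are hypothetical names for illustration only, and the sample data assumes four animals where each one after the first has the previous one as its parent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Animal
{
    public int Id;
    public string Name;
    public int? ParentId;
}

// Stand-in for the database (or the web service in front of it).
// Every method call counts as one round trip.
public class FakeAnimalDb
{
    private readonly List<Animal> rows;
    public int QueryCount { get; private set; }

    public FakeAnimalDb(List<Animal> rows) { this.rows = rows; }

    public List<Animal> LoadAll() { QueryCount++; return rows.ToList(); }

    public Animal LoadById(int id) { QueryCount++; return rows.Single(a => a.Id == id); }

    public List<Animal> LoadByIds(ICollection<int> ids)
    {
        QueryCount++;
        return rows.Where(a => ids.Contains(a.Id)).ToList();
    }
}

public static class SelectNPlusOneDemo
{
    public static List<Animal> SampleRows()
    {
        return new List<Animal>
        {
            new Animal { Id = 1, Name = "A", ParentId = null },
            new Animal { Id = 2, Name = "B", ParentId = 1 },
            new Animal { Id = 3, Name = "C", ParentId = 2 },
            new Animal { Id = 4, Name = "D", ParentId = 3 },
        };
    }

    // One query for the list, then one more per row that has a parent.
    public static int LazyParents(FakeAnimalDb db)
    {
        foreach (var animal in db.LoadAll())
        {
            if (animal.ParentId.HasValue)
                db.LoadById(animal.ParentId.Value);
        }
        return db.QueryCount;
    }

    // One query for the list, one query for all the parents at once.
    public static int BatchedParents(FakeAnimalDb db)
    {
        var animals = db.LoadAll();
        var parentIds = animals.Where(a => a.ParentId.HasValue)
                               .Select(a => a.ParentId.Value)
                               .Distinct()
                               .ToList();
        db.LoadByIds(parentIds);
        return db.QueryCount;
    }

    public static void Main()
    {
        Console.WriteLine(LazyParents(new FakeAnimalDb(SampleRows())));    // 4
        Console.WriteLine(BatchedParents(new FakeAnimalDb(SampleRows()))); // 2
    }
}
```

With four rows the difference is small. The lazy version grows with the number of rows on the screen, and in the multi tier case each of those extra queries is also an extra web request.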

time to read 2 min | 293 words

This post is to help everyone who wants to understand what LightSwitch is going to do under the covers. It allows you to see exactly what is going on with the database interaction using Entity Framework Profiler.

In your LightSwitch application, switch to file view:

image

In the server project, add a reference to HibernatingRhinos.Profiler.Appender.v4.0, which you can find in the EF Prof download.

image

Open the ApplicationDataService file inside the UserCode directory:

image

Add a static constructor with a call to initialize the entity framework profiler:

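public partial class ApplicationDataService
{
    // A static constructor runs once, before the type is first used,
    // so the profiler is hooked up before any query is issued.
    static ApplicationDataService()
    {
        HibernatingRhinos.Profiler.Appender.EntityFramework.EntityFrameworkProfiler.Initialize();
    }
}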

This is it!

You’re now able to work with the Entity Framework Profiler and see what sort of queries are being generated on your behalf.

image

time to read 7 min | 1268 words

After making EF Prof work with EF Code Only, I decided that I might take a look at how Code Only actually works from the perspective of the application developer. I am working on my own solution based on the following posts:

But since I don’t like to just read things, and I hate walkthroughs, I decided to take a slightly different path. In order to do that, I decided to set myself the following goal:

image

  • Create a ToDo application with the following entities:
    • User
    • Actions (inheritance)
      • ToDo
      • Reminder
  • Query for:
    • All actions for user
    • All reminders for today across all users

That isn’t really a complex system, but my intention is to get to grips with how things work. And see how much friction I encounter along the way.
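As a rough picture of that goal, here is a minimal sketch of the entities as plain classes, with the two queries run against in-memory lists instead of a database. Only `User.Username`, `Action.User` and `Reminder.Date` come from the queries later in this post; `Id`, `Title` and the `ToDoModelDemo` helper are assumptions added so the sketch can run on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Username { get; set; }
}

// Base class of the inheritance hierarchy. There is deliberately no
// "using System;" in this file, so the name cannot be confused with System.Action.
public abstract class Action
{
    public int Id { get; set; }        // assumed key
    public string Title { get; set; }  // assumed, for illustration only
    public User User { get; set; }
}

public class ToDo : Action
{
}

public class Reminder : Action
{
    public System.DateTime Date { get; set; }
}

public static class ToDoModelDemo
{
    // All actions for a user.
    public static List<Action> ActionsForUser(IEnumerable<Action> actions, string username)
    {
        return actions.Where(a => a.User.Username == username).ToList();
    }

    // All reminders for a given day, across all users.
    public static List<Reminder> RemindersFor(IEnumerable<Action> actions, System.DateTime day)
    {
        return actions.OfType<Reminder>()
                      .Where(r => r.Date.Date == day.Date)
                      .ToList();
    }

    public static void Main()
    {
        var ayende = new User { Username = "Ayende" };
        var other = new User { Username = "Other" };
        var today = System.DateTime.Today;

        var actions = new List<Action>
        {
            new ToDo { Id = 1, Title = "Write post", User = ayende },
            new Reminder { Id = 2, Title = "Call client", User = ayende, Date = today },
            new Reminder { Id = 3, Title = "Standup", User = other, Date = today },
            new Reminder { Id = 4, Title = "Release", User = other, Date = today.AddDays(1) },
        };

        System.Console.WriteLine(ActionsForUser(actions, "Ayende").Count); // 2
        System.Console.WriteLine(RemindersFor(actions, today).Count);      // 2
    }
}
```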

We start by referencing “Microsoft.Data.Entity.Ctp” & “System.Data.Entity”

There appears to be a wide range of options to define how entities should be mapped. These include building them using a fluent interface, creating map classes or auto mapping. All in all, the code shows a remarkable similarity to Fluent NHibernate, in spirit if not in actual API.

I don’t like some of the API:

  • HasRequired and HasKey, for example, seem awkwardly named to me, especially when they are used as part of a fluent sentence. I have long advocated avoiding the attempt to create real sentences in a fluent API (StructureMap was probably the worst in this regard). Dropping the Has prefix would be just as understandable, and look better, IMO.
  • Why do we have both IsRequired and HasRequired? The previous comment applies, with the addition that having two similarly named methods that appear to be doing the same thing is probably not a good idea.

But aside from that, it appears very nice.

ObjectContext vs. DbContext

I am not sure why there are two of them, but I have a very big dislike of ObjectContext. The amount of code that you have to write to make it work is just ridiculous when you compare it to the amount of code you have to write for DbContext.

I also strongly dislike the need to pass a DbConnection to the ObjectContext. The actual management of the connection is not within the scope of the application developer. That is within the scope of the infrastructure. Messing with DbConnection in application code should be left to very special circumstances and require swearing an oath of nonmaleficence. The DbContext doesn’t require that, so that is another thing that is in favor of it.

Using the DbContext is nice:

public class ToDoContext : DbContext
{
    private static readonly DbModel model;

    static ToDoContext()
    {
        var modelBuilder = new ModelBuilder();
        modelBuilder.DiscoverEntitiesFromContext(typeof(ToDoContext));
        modelBuilder.Entity<User>().HasKey(x => x.Username);
        model = modelBuilder.CreateModel();
    }

    public ToDoContext():base(model)
    {
        
    }

    public DbSet<Action> Actions { get; set; }

    public DbSet<User> Users { get; set; }
}

Note that we can mix & match the configuration styles, some are auto mapped, some are explicitly stated. It appears that if you fully follow the built-in conventions, you don’t even need ModelBuilder, as the model will be built for you automatically.

Let us try to run things:

using(var ctx = new ToDoContext())
{
    ctx.Users.ToList();
}

The connection string is specified in the app.config, by defining a connection string with the name of the context.

Then I just ran it, without creating a database. I expected it to fail, but it didn’t. Instead, it created the following schema:

image

That is a problem, DDL should never run as an implicit step. I couldn’t figure out how to disable that, though (but I didn’t look too hard). To be fair, this looks like it will only run if the database doesn’t exist (not merely if the tables aren’t there). But I would still make this an explicit step.

The result of running the code is:

image

Now the time came to try executing my queries:

var actionsForUser = 
    (
        from action in ctx.Actions
        where action.User.Username == "Ayende"
        select action
    )
    .ToList();

var remindersForToday =
    (
        from reminder in ctx.Actions.OfType<Reminder>()
        where reminder.Date == DateTime.Today
        select reminder
    )
    .ToList();

Which resulted in:

image

That has been a pretty brief overview of Entity Framework Code Only, but I am impressed, the whole process has been remarkably friction free, and the time to go from nothing to a working model has been extremely short.

time to read 1 min | 185 words

I just finished touching up a new feature for EF Prof, support for Entity Framework’s Code Only feature. What you see below is EF Prof tracking the Code Only Nerd Dinner example:

image

I tried to tackle the same thing in CTP3, but I was unable to resolve it. Using CTP4, it was about as easy as I could wish it.

Just for fun, the following screen shot looks like it contains a bug, but it doesn’t (at least, not to my knowledge). If you can spot what the bug is, I am going to hand you a 25% discount coupon for EF Prof. If you can tell me why it is not a bug, I would double that.

As an aside, am I the only one that is bothered by the use of @@IDENTITY by EF? I thought that we weren’t supposed to make use of that. Moreover, why write this complex statement when you can write SELECT @@IDENTITY?

time to read 1 min | 183 words

I was working with a client on a problem they had in integrating EF Prof into their application, when my eye caught the following code (anonymized, obviously):

public static class ContextHelper
{
    private static Acme.Entities.EntitiesObjectContext _context;

    public static Acme.Entities.EntitiesObjectContext CurrentContext
    {
        get { return _context ?? (_context = new Acme.Entities.EntitiesObjectContext()); }
    }
}

That caused me to stop everything and focus the client’s attention on the problems that this code can cause.

What were those problems?

time to read 1 min | 105 words

It has been a while since we had a new major feature for the profiler, but here it is:

image

The expensive queries report will look at all your queries and surface the most expensive ones across all the sessions. This can give you a good indication on where you need to optimize things.

Naturally, this feature is available across all the profiler profiles (NHibernate Profiler, Entity Framework Profiler, Linq to SQL Profiler and Hibernate Profiler).

time to read 9 min | 1626 words

I thought that it would be a good idea to take EF Prof for a spin on the sample MVC Music Store. It can illustrate some things about EF Prof and about the sample app.

The first step is to download EF Prof. Once that is completed, extract the files. There is no install necessary.

We need to reference the HibernatingRhinos.Profiler.Appender in the application:

image 

Then, from the Global.asax Application_Start, we initialize the profiler:

image

That is all the setup that you need to do :-)

Now, we go to http://localhost:1397/, and we can see the following in EF Prof:

image

There are several interesting things to see here.

  • We executed three queries to load the main page.
  • We actually opened four object contexts to handle this request.
  • We have a query that generates a warning.

Let us deal with each one in turn:

Multiple queries for the same request. This requires analysis because often we are querying too much. But in this instance there is no problem, we are querying different things and we are doing so efficiently.

Four object contexts to handle a single request is bad. We can see that each query was actually executed through a different object context (and one was idle). There are several problems with having multiple object contexts per request:

  • Each object context would open its own connection to the database. I am not sure, but I think that they do so lazily, which means that a single page request resulted in three connections to the database.
  • Each object context implements its own unit of work. You might get two different instances that represent the same row in the database.
  • You can’t aggregate all your operations into a single SaveChanges() call, you have to make multiple trips to the database.
  • You require using System.Transactions and distributed transactions if you want to ensure a transaction boundary around your code.

In short, there are a lot of good reasons to go with a request scoped object context, and you should do so.
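One common way to get a request scoped context is to keep a single instance in the per-request item bag and dispose it when the request ends. The sketch below is a self-contained illustration of that idea, not code from the Music Store sample. `FakeObjectContext` stands in for the real object context, `RequestScopedContext` is a hypothetical helper, and the `requestItems` dictionary plays the role that `HttpContext.Current.Items` would play in an ASP.NET application.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for an Entity Framework object context.
public class FakeObjectContext : IDisposable
{
    public static int Created;
    public bool Disposed { get; private set; }

    public FakeObjectContext() { Created++; }

    public void Dispose() { Disposed = true; }
}

public static class RequestScopedContext
{
    private const string Key = "RequestScopedContext.Current";

    // Returns the context for this request, creating it on first use.
    public static FakeObjectContext GetCurrent(IDictionary requestItems)
    {
        var ctx = requestItems[Key] as FakeObjectContext;
        if (ctx == null)
        {
            ctx = new FakeObjectContext();
            requestItems[Key] = ctx;
        }
        return ctx;
    }

    // Meant to be called once when the request ends.
    public static void DisposeCurrent(IDictionary requestItems)
    {
        var ctx = requestItems[Key] as FakeObjectContext;
        if (ctx != null)
        {
            ctx.Dispose();
            requestItems.Remove(Key);
        }
    }

    public static void Main()
    {
        var items = new Hashtable(); // one bag per simulated request

        var first = GetCurrent(items);
        var second = GetCurrent(items);
        Console.WriteLine(object.ReferenceEquals(first, second)); // True

        DisposeCurrent(items);
        Console.WriteLine(first.Disposed); // True
    }
}
```

Because the instance lives in the request's own bag, two concurrent requests never share a context, and everything inside one request shares a single unit of work.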

Now, let us look at that query with the warning:

image

EF Prof generated a warning about an unbounded result set for this query. What does this mean? It means that if you had 100,000 genres, this query would attempt to load all of them. I am not an expert on music, but even I think that 100,000 genres are unlikely. The problem with this sort of query is that it is likely that the number of genres will grow, and not adding a TOP or LIMIT clause to the query means that you are open to problems when the data does grow.
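Bounding a query in LINQ usually comes down to an explicit Take, typically paired with Skip for paging. The sketch below runs against an in-memory list so that it is self-contained; `BoundedQueryDemo` and `GenrePage` are hypothetical names. My understanding is that the same operators on an Entity Framework query are translated into a TOP clause or row_number() paging, but treat that as an assumption and check the generated SQL in the profiler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoundedQueryDemo
{
    // Returns one page of genres; page is zero-based.
    // The explicit ordering keeps the paging stable between calls.
    public static List<string> GenrePage(IEnumerable<string> genres, int page, int pageSize)
    {
        return genres.OrderBy(g => g, StringComparer.Ordinal)
                     .Skip(page * pageSize)
                     .Take(pageSize)
                     .ToList();
    }

    public static void Main()
    {
        var genres = Enumerable.Range(1, 1000)
            .Select(i => "Genre " + i.ToString("D4"))
            .ToList();

        var firstPage = GenrePage(genres, 0, 25);
        Console.WriteLine(firstPage.Count); // 25
        Console.WriteLine(firstPage[0]);    // Genre 0001
    }
}
```

No matter how large the table grows, the caller never gets more than pageSize rows back.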

And with that, let us see what happens when we look at a single album (http://127.0.0.1:1397/Store/Details/392):

image

This is pretty much the same as before, with 5 queries required to process this request, but let us dig just a tiny bit deeper:

image

What we see here is that we have queries that are being generated from rendering the views. That usually trips a warning flag with me, because queries that are generated from the view are likely to cause problems down the road. Data access patterns should rarely change because of view changes, and this is usually an indication that at some point, we will have a SELECT N+1 here.

Now, let us try to add a new item to the cart (http://127.0.0.1:1397/ShoppingCart/AddToCart/669):

image

We can see that adding an item to the cart is a two-step process: first, we go to /ShoppingCart/AddToCart/669, then we are redirected to /ShoppingCart. Overall, we require 8 queries in two requests.

Let us look at the actual queries required to process AddToCart (note, however, that we are talking about two different object contexts):

image

Look at the first query, I don’t see us doing anything with the data here. Let us look a bit more closely at where it is generated:

image

And AddToCart looks like this:

image image

It seems that the ShoppingCart.AddToCart requires an album instance, take a look at it, and see what it does with it. The only thing it does is to use the album id. But we already know the album id, we used that to get the album instance in the first place.

It seems that the first query is there solely to check that the value exists and throw an error from the Single() method. I think we can optimize that:

image image

We removed that query, but we preserved the same behavior (if the application tries to save a new cart item with a missing album id, a FK violation would be thrown). Querying the database is one of the most expensive things that we can do, so it pays to watch for places where we can save on queries.

Now, let us add a few more items to the shopping cart and see what happens :-) I added 6 albums to my cart, and then went to (http://127.0.0.1:1397/ShoppingCart):

image

Wow! We require 10 queries to process this request, and you can see that EF Prof is urgently requesting that you take a closer look at the last three.

image

EF Prof has detected that we have a SELECT N+1 error here, where we issue a query per album in the shopping cart. If we look at the stack trace, we will find a familiar sight:

image

We have queries being generated from the views, in this case, from this code:

image

You can see that on line 47, as shown in the profiler, we are accessing the title property of the album, forcing a lazy load.

Tracking it further, we can see that Model.CartItems is set here:

image

And that GetCartItems is defined as:

image

And this generates the following query:

image

Well, that is easy enough to fix :-)

image

Which results in this query:

image

We now bring all the related albums in a single query. Which means that viewing the shopping cart results in:

image

Just four queries, instead of 10!

Not too shabby for 15 minutes’ work with the profiler, even if I say so myself.

In general, when I am developing applications, the profiler is always running in the background, and I keep an eye on it to see if I am doing something that it warns me about, such as SELECT N+1, multiple contexts in the same request, too many calls to the database, etc.

time to read 1 min | 122 words

After trying it out on NH Prof, profiler subscriptions are now open for all the profilers.

A profiler subscription allows you to pay a small monthly fee (~$16) and get the full profiler capabilities along with the assurance of no upgrade cost when the next major version comes out.

In addition to the monthly subscription, I got requests for a yearly subscription. I am not sure that I quite follow the logic, but I am not going to make it harder for people to give me money, so that is available as well for all profilers.
