Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,582
|
Comments: 51,212
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 95 words

Well, it has been there for long enough :-)

In the tradition of finding meaningful dates for pushing things out of beta, I thought about making it the 1st of April, but I though better on it when I consider how it might be taken.

So on the 12th of April, EF Prof goes out of beta.

That means that the 30% beta discount will be over, which means that if you want it, you might want to hurry up :-)

time to read 2 min | 209 words

This is Josh’s feature, since we wrote most of the code for it together. Basically, it recognize a very common performance problem, queries that uses too many joins, such as this one:

image

Which would result in the following warning:

image

Queries with too many joins might be a performance problem. Each join requires the database to perform additional work, and the complexity and cost of the query grows rapidly with each additional join. While relational database are optimized for handling joins, it is often more efficient to perform several separate queries instead of a single query with several joins in it.

For OLTP systems, you should consider simplifying your queries or simplifying the data model. While I do not recommend avoiding joins completely, I strong discourage queries with large numbers of joins. Another issue to pay attention to is possible Cartesian products in queries contains joins, it is very easy to create such a thing and not notice it during development.

time to read 5 min | 827 words

This isn’t a new feature, because you can’t use it right now, but it is a really nice feature that we are working on, and I couldn’t resist showing it off hot “off the press”, so to speak.

Given the following query:

SELECT this_.id             AS id7_1_,
       this_.title          AS title7_1_,
       this_.subtitle       AS subtitle7_1_,
       this_.allowscomments AS allowsco4_7_1_,
       this_.createdat      AS createdat7_1_,
       posts2_.blogid       AS blogid3_,
       posts2_.id           AS id3_,
       posts2_.id           AS id0_0_,
       posts2_.title        AS title0_0_,
       posts2_.TEXT         AS text0_0_,
       posts2_.postedat     AS postedat0_0_,
       posts2_.blogid       AS blogid0_0_,
       posts2_.userid       AS userid0_0_
FROM   blogs this_
       LEFT OUTER JOIN posts posts2_
         ON this_.id = posts2_.blogid
WHERE  this_.id = 1 /* @p0 */

SELECT this_.id            AS id0_1_,
       this_.title         AS title0_1_,
       this_.TEXT          AS text0_1_,
       this_.postedat      AS postedat0_1_,
       this_.blogid        AS blogid0_1_,
       this_.userid        AS userid0_1_,
       comments2_.postid   AS postid3_,
       comments2_.id       AS id3_,
       comments2_.id       AS id2_0_,
       comments2_.name     AS name2_0_,
       comments2_.email    AS email2_0_,
       comments2_.homepage AS homepage2_0_,
       comments2_.ip       AS ip2_0_,
       comments2_.TEXT     AS text2_0_,
       comments2_.postid   AS postid2_0_
FROM   posts this_
       LEFT OUTER JOIN comments comments2_
         ON this_.id = comments2_.postid
WHERE  this_.blogid = 1 /* @p1 */

The profiler can show you the query plan using this UI:

image

And here is how the same query looks like using the query plan feature in Management Studio:

image

So, why implement it?

  • This isn’t limited to SQL Server, the profiler can display query plans for: SQL Server, Oracle, PostgreSQL and MySQL
  • This let you keep yourself in the flow, just hit a button to see the query plan, instead of copying the SQL, opening SSMS, displaying the query plan, etc.

Don’t discount the last one, making it easy is one of the core values of the profiler.

The idea is that if you make it easy enough, the barriers for using it goes away. If you can instantly see the query plan for a query, you are far more likely to look at it than if it takes 30 seconds to get that. At that point, you would only do it when you already have a performance problem.

time to read 2 min | 226 words

One of the things that makes working with the profiler easier is the fact that it gives you not just information, but information in context.

I was working with an app using Rhino Service Bus, and it really bothered me that I couldn’t immediately figure out what was the trigger for a session. When using ASP.Net or WCF, the profiler can show the URL that triggered the request, but when we are not using a url based mechanism, that turns out to be much harder.

So I set out to fix that, you can see the results below:

image

This session was generated by a message batch containing messages for MyBooks, MyQueue, etc.

The integration is composed of two parts, first, from the profiler perspective, you now have the ProfilerIntegration.CurrentSessionContext property, which allows you to customize how the profiler detects the current context.

The second part is the integration from the application framework itself, you can see how I did that for Rhino Service Bus, which will dynamically detect the presence of the profiler and fill the appropriate values. The result makes it a lot easier to track down what is going on.

time to read 1 min | 120 words

I got several requests for this, so I am making Uber Prof itself available for purchasing.

What is Uber Prof?

It is a short hand way of saying: All the OR/M profilers that we make.

An Uber Prof license gives you the ability to use:

And it will automatically give you the ability to use any additional profilers that we will create. And yes, there is an upgrade path if you already purchased a single profiler license and would like to upgrade to Uber Prof.

time to read 20 min | 3801 words

One of the things that I began doing since starting to work on multiple OR/M Profilers is to compare how all of them are handling a particular task. This is by no means a comparative analysis, but it is an interesting data point.

The scenario in question is loading a blog with all its posts and comments.

Let us start with NHibernate:

var blogs = s.CreateQuery(
    @"from Blog b 
        left join fetch b.Posts p 
        left join fetch p.Comments 
    where b.Id = :id")
    .SetParameter("id", 1)
    .List<Blog>();

Will generate the following SQL 

select blog0_.Id             as Id7_0_,
       posts1_.Id            as Id0_1_,
       comments2_.Id         as Id2_2_,
       blog0_.Title          as Title7_0_,
       blog0_.Subtitle       as Subtitle7_0_,
       blog0_.AllowsComments as AllowsCo4_7_0_,
       blog0_.CreatedAt      as CreatedAt7_0_,
       posts1_.Title         as Title0_1_,
       posts1_.Text          as Text0_1_,
       posts1_.PostedAt      as PostedAt0_1_,
       posts1_.BlogId        as BlogId0_1_,
       posts1_.UserId        as UserId0_1_,
       posts1_.BlogId        as BlogId0__,
       posts1_.Id            as Id0__,
       comments2_.Name       as Name2_2_,
       comments2_.Email      as Email2_2_,
       comments2_.HomePage   as HomePage2_2_,
       comments2_.Ip         as Ip2_2_,
       comments2_.Text       as Text2_2_,
       comments2_.PostId     as PostId2_2_,
       comments2_.PostId     as PostId1__,
       comments2_.Id         as Id1__
from   Blogs blog0_
       left outer join Posts posts1_
         on blog0_.Id = posts1_.BlogId
       left outer join Comments comments2_
         on posts1_.Id = comments2_.PostId
where  blog0_.Id = 1 /* @p0 */

This result in a fairly simple query plan:

image

However, you should note that this also result in a Cartesian product, which may not be what you wanted.

Linq to SQL doesn’t really provide a good way to express what I wanted, but it does get the job done:

var dataLoadOptions = new DataLoadOptions();
dataLoadOptions.LoadWith<Blog>(x => x.Posts);
dataLoadOptions.LoadWith<Post>(x => x.Comments);
using (var db = new BlogModelDataContext(conStr)
{
    LoadOptions =  dataLoadOptions
})
{
    db.Blogs.Where(x => x.Id == 1).ToList();
}

Interestingly enough, this does not generate a single query, but two queries:

-- statement #1
SELECT [t0].[Id],
       [t0].[Title],
       [t0].[Subtitle],
       [t0].[AllowsComments],
       [t0].[CreatedAt]
FROM   [dbo].[Blogs] AS [t0]
WHERE  [t0].[Id] = 1 /* @p0 */

-- statement #2
SELECT   [t0].[Id],
         [t0].[Title],
         [t0].[Text],
         [t0].[PostedAt],
         [t0].[BlogId],
         [t0].[UserId],
         [t1].[Id]       AS [Id2],
         [t1].[Name],
         [t1].[Email],
         [t1].[HomePage],
         [t1].[Ip],
         [t1].[Text]     AS [Text2],
         [t1].[PostId],
         (SELECT COUNT(* )
          FROM   [dbo].[Comments] AS [t2]
          WHERE  [t2].[PostId] = [t0].[Id]) AS [value]
FROM     [dbo].[Posts] AS [t0]
         LEFT OUTER JOIN [dbo].[Comments] AS [t1]
           ON [t1].[PostId] = [t0].[Id]
WHERE    [t0].[BlogId] = 1 /* @x1 */
ORDER BY [t0].[Id],
         [t1].[Id]

The interesting bit is that while there are two queries here, this method does not generate a Cartesian product, so I have to consider this a plus. What I would like to know is whatever this is intentionally so or just a result of the way Linq to SQL eager loading is structured.

The query plan for this is simple as well:

image

Finally, Entity Framework (both 3.5 and 4.0), using this code:

db.Blogs
    .Include("Posts")
    .Include("Posts.Comments")
    .Where(x => x.Id == 1)
    .ToList();

This code will generate:

SELECT   [Project2].[Id]             AS [Id],
         [Project2].[Title]          AS [Title],
         [Project2].[Subtitle]       AS [Subtitle],
         [Project2].[AllowsComments] AS [AllowsComments],
         [Project2].[CreatedAt]      AS [CreatedAt],
         [Project2].[C1]             AS [C1],
         [Project2].[C4]             AS [C2],
         [Project2].[Id1]            AS [Id1],
         [Project2].[Title1]         AS [Title1],
         [Project2].[Text]           AS [Text],
         [Project2].[PostedAt]       AS [PostedAt],
         [Project2].[BlogId]         AS [BlogId],
         [Project2].[UserId]         AS [UserId],
         [Project2].[C3]             AS [C3],
         [Project2].[C2]             AS [C4],
         [Project2].[Id2]            AS [Id2],
         [Project2].[Name]           AS [Name],
         [Project2].[Email]          AS [Email],
         [Project2].[HomePage]       AS [HomePage],
         [Project2].[Ip]             AS [Ip],
         [Project2].[Text1]          AS [Text1],
         [Project2].[PostId]         AS [PostId]
FROM     (SELECT [Extent1].[Id]             AS [Id],
                 [Extent1].[Title]          AS [Title],
                 [Extent1].[Subtitle]       AS [Subtitle],
                 [Extent1].[AllowsComments] AS [AllowsComments],
                 [Extent1].[CreatedAt]      AS [CreatedAt],
                 1                          AS [C1],
                 [Project1].[Id]            AS [Id1],
                 [Project1].[Title]         AS [Title1],
                 [Project1].[Text]          AS [Text],
                 [Project1].[PostedAt]      AS [PostedAt],
                 [Project1].[BlogId]        AS [BlogId],
                 [Project1].[UserId]        AS [UserId],
                 [Project1].[Id1]           AS [Id2],
                 [Project1].[Name]          AS [Name],
                 [Project1].[Email]         AS [Email],
                 [Project1].[HomePage]      AS [HomePage],
                 [Project1].[Ip]            AS [Ip],
                 [Project1].[Text1]         AS [Text1],
                 [Project1].[PostId]        AS [PostId],
                 CASE 
                   WHEN ([Project1].[C1] IS NULL) THEN CAST(NULL AS int)
                   ELSE CASE 
                          WHEN ([Project1].[Id1] IS NULL) THEN CAST(NULL AS int)
                          ELSE 1
                        END
                 END AS [C2],
                 CASE 
                   WHEN ([Project1].[C1] IS NULL) THEN CAST(NULL AS int)
                   ELSE CASE 
                          WHEN ([Project1].[Id1] IS NULL) THEN CAST(NULL AS int)
                          ELSE 1
                        END
                 END AS [C3],
                 [Project1].[C1]            AS [C4]
          FROM   [dbo].[Blogs] AS [Extent1]
                 LEFT OUTER JOIN (SELECT [Extent2].[Id]       AS [Id],
                                         [Extent2].[Title]    AS [Title],
                                         [Extent2].[Text]     AS [Text],
                                         [Extent2].[PostedAt] AS [PostedAt],
                                         [Extent2].[BlogId]   AS [BlogId],
                                         [Extent2].[UserId]   AS [UserId],
                                         [Extent3].[Id]       AS [Id1],
                                         [Extent3].[Name]     AS [Name],
                                         [Extent3].[Email]    AS [Email],
                                         [Extent3].[HomePage] AS [HomePage],
                                         [Extent3].[Ip]       AS [Ip],
                                         [Extent3].[Text]     AS [Text1],
                                         [Extent3].[PostId]   AS [PostId],
                                         1                    AS [C1]
                                  FROM   [dbo].[Posts] AS [Extent2]
                                         LEFT OUTER JOIN [dbo].[Comments] AS [Extent3]
                                           ON [Extent2].[Id] = [Extent3].[PostId]) AS [Project1]
                   ON [Extent1].[Id] = [Project1].[BlogId]
          WHERE  1 = [Extent1].[Id]) AS [Project2]
ORDER BY [Project2].[Id] ASC,
         [Project2].[C4] ASC,
         [Project2].[Id1] ASC,
         [Project2].[C3] ASC

The query plan for this seems overly complicated:

image

 

If you’ll look closely, you’ll see that it generate a join between Blogs, Posts and Comments, essentially creating a Cartesian product between all three.

I am not going to offer commentary on the results, but open a discussion on them.

time to read 2 min | 344 words

image This is a new feature available for NHibernate Profiler*, Linq to SQL Profiler and Entity Profiler. Basically, it detects when the same query is executed with different parameter sizes, which generate different query plan in the query cache.

Let us say that we issue two queries, to find users by name. (Note that I am using a syntax that will show you the size of the parameters, to demonstrate the problem).

We can do this using the following queries.

exec sp_executesql 
      N'SELECT * FROM Users WHERE Username = @username',
      N'@username nvarchar(3)',
      @username=N'bob'
exec sp_executesql 
      N'SELECT * FROM Users WHERE Username = @username',
      N'@username nvarchar(4)',
      @username=N'john'

This sort of code result in two query plans stored in the database query cache, because of the different parameter sizes. In fact, if we assume that the Username column has a length of 16, this single query may take up 16 places in the query cache.

Worse, if you have two parameters whose size change, such as username (length 16) and password (length 16), you may take up to 256 places in the query cache. Obviously, if you use more parameters, or if their length is higher, the number of places that a single query can take in the query cache goes up rapidly.

This can cause performance problems as the database need to keep track of more query plans (uses more memory) may need evict query plans from the cache, which would result in having to rebuild the query plan (increase server load and query time).

* Please note that detecting this in NHibernate requires the trunk version of NHibernate. And it is pretty useless there, since on the trunk, NHibernate will never generate this issue.

FUTURE POSTS

  1. fsync()-ing a directory on Linux (and not Windows) - one day from now

There are posts all the way to Jun 09, 2025

RECENT SERIES

  1. Webinar (7):
    05 Jun 2025 - Think inside the database
  2. Recording (16):
    29 May 2025 - RavenDB's Upcoming Optimizations Deep Dive
  3. RavenDB News (2):
    02 May 2025 - May 2025
  4. Production Postmortem (52):
    07 Apr 2025 - The race condition in the interlock
  5. RavenDB (13):
    02 Apr 2025 - .NET Aspire integration
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}