Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 533 words

Dror from TypeMock has managed to capture the essence of my post about unit tests vs. integration tests quite well:

Unit tests run faster but integration tests are easier to write.

Unfortunately, he draws an incorrect conclusion from that:

There is, however, another solution instead of declaring unit testing as hard to write: use an isolation framework, which makes writing unit tests much easier.

And the answer to that is… no, you can’t do that. Doing something like that puts you back in the onerous position of unit tests, where you actually have to understand exactly what is going on and deal with it. With an integration test, you can assert directly on the end result, which is completely different from what I would have to do if I wanted to mock out a part of the system. A typical integration test looks something like this:

using System;
using NHibernate;

// Scenario that deliberately triggers a SELECT N+1: one query for the blog,
// one for its posts, then one query per post to load its comments.
public class SelectNPlusOne : IScenario
{
    public void Execute(ISessionFactory factory)
    {
        using (var session = factory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            var blog = session.CreateCriteria(typeof(Blog))
                .SetMaxResults(1)
                .UniqueResult<Blog>();// 1
            foreach (var post in blog.Posts)// 2
            {
                Console.WriteLine(post.Comments.Count);// SELECT N
            }
            tx.Commit();
        }
    }
}


[Fact]
public void AllCallsToLoadCommentsAreIdentical()
{
    // Run the scenario in a separate AppDomain and let the profiler's model
    // capture the NHibernate session statements it produced.
    ExecuteScenarioInDifferentAppDomain<SelectNPlusOne>();
    var array = model.Sessions[0]
        .SessionStatements
        .ExcludeTransactions()
        .Skip(2) // skip the blog query (1) and the posts query (2)
        .ToArray();
    Assert.Equal(10, array.Length);
    // Each of the N comment-loading statements should be the exact same SQL.
    foreach (StatementModel statementModel in array)
    {
        Assert.Equal(statementModel.RawSql, array[0].RawSql);
    }
}

This shows how we can assert on the actual end model of the system, without really trying to deal with what is going on internally. You cannot introduce any mocking into the mix without significantly hurting the clarity of the code.
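To make the contrast concrete, here is a rough sketch of what a mock-based unit test for the same scenario might look like. This is purely illustrative: Moq stands in for whatever isolation framework you prefer, and the Blog/Post entity shapes are assumed. Note how much the test has to know about the data access calls, and how little it says about the N+1 SELECT behavior we actually care about:

using System.Collections.Generic;
using Moq;
using NHibernate;
using Xunit;

public class SelectNPlusOneUnitTest
{
    [Fact]
    public void LoadsASingleBlogThroughCriteria()
    {
        // Entity shapes assumed for illustration.
        var blog = new Blog { Posts = new List<Post>() };

        var criteria = new Mock<ICriteria>();
        criteria.Setup(c => c.SetMaxResults(1)).Returns(criteria.Object);
        criteria.Setup(c => c.UniqueResult<Blog>()).Returns(blog);

        var session = new Mock<ISession>();
        session.Setup(s => s.CreateCriteria(typeof(Blog))).Returns(criteria.Object);
        session.Setup(s => s.BeginTransaction()).Returns(new Mock<ITransaction>().Object);

        var factory = new Mock<ISessionFactory>();
        factory.Setup(f => f.OpenSession()).Returns(session.Object);

        new SelectNPlusOne().Execute(factory.Object);

        // All we can verify is which API calls were made; the SELECT N+1
        // behavior never shows up anywhere in this test.
        session.Verify(s => s.CreateCriteria(typeof(Blog)), Times.Once());
    }
}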

time to read 2 min | 358 words

C# in Depth has a very different focus from most “Learn language X” books. Starting from the premise that you are already familiar with the basic language syntax (from 1.0, or maybe because you are a Java or C++ programmer), it focuses entirely on the new additions to the language and platform.

Its stated goal is to take C# 1.0 developers and give them all the changes that happened to the language in the C# 2.0 and 3.0 versions. And it most certainly deserves the “in Depth” part of the name.

I consider myself a fairly proficient developer, and I believe that I have adequate knowledge of both C# 2.0 and 3.0, but I still found myself learning new things. More to the point, as someone who does know much of the material in the book, I was quite impressed with the quality of the material, the depth of the discussion, and the level at which it is presented.

I think that Jon has managed to capture a lot of the complexities of the language in a way that is approachable, easy to understand and complete.

I have been recommending the book to clients ever since I read it, and only recently realized that I have never actually posted about it. I kept intending to, but intending doesn’t seem to put words on the blog, unfortunately (otherwise I would blog even more).

The complexity of C# is a personal worry of mine, mostly because I see how hard it is for people to bridge the gap when moving to the newer versions of the language and having to face the explosion of possibilities. I think that this book is a big step toward closing that gap.

Perhaps the best compliment that I can give to the book is that I fully intend to use the 2nd edition as the text to read to get into C# 4.0 when it is out. No reason not to let Jon do all the hard work :-)

time to read 3 min | 418 words

I have been talking a lot lately about the technical aspects of working on NH Prof (and there are more posts scheduled), but not really talking about the new features at all. I firmly believe in the lynched-by-the-users model, and working for so long without new features is anathema to me.

I am going to try to leak them to the blog as they go fully online (with screenshot-able UI), although I do mean to keep several surprises up my sleeve (and in my backpack, too, come to think of it; I need space for some of them ;-) ).

Anyway, I don’t think that this screenshot requires any additional explanation:

[screenshot of the new NH Prof report]

Unlike my previous posts about NH Prof’s new features, this one is not yet available for download. We are currently planning to sync everything up and show you what we have by the end of the week, so this is truly just a sneak peek right now.

The really amazing part of this feature? I did it all on my own. The reason I am surprised is that this feature actually has UI in it, and my facility with UI work is decidedly lacking. Nevertheless, this is all my work; the UI team didn’t have to get involved. I mentioned before how impressed I am by the work Christopher and Rob did on NH Prof. What I didn’t think got mentioned was the UI architecture that they built. Working separately, Christopher & Rob and I arrived at the same general type of architecture, based on concepts and features.

This report is a new feature based on an existing concept. The work to make that happen was already done when the concept was introduced. Creating a new feature is a piece of cake now, and doesn’t require any special knowledge or any UI talent.
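I obviously can’t show NH Prof’s actual internals here, but a minimal sketch of the kind of concept/feature split I am describing might look like the following (all of the names are hypothetical; StatementModel and RawSql are borrowed from the test code shown earlier, purely for illustration). The UI team implements the concept once, and a new feature is just another implementation fed with different data:

using System.Collections.Generic;

// Hypothetical "concept": a tabular report the UI already knows how to render.
public interface IReportConcept
{
    string Title { get; }
    IEnumerable<IDictionary<string, object>> Rows { get; }
}

// Hypothetical "feature": a new report that reuses the existing concept,
// so it needs model code only and no new UI work.
public class UniqueQueriesReport : IReportConcept
{
    private readonly IEnumerable<StatementModel> statements;

    public UniqueQueriesReport(IEnumerable<StatementModel> statements)
    {
        this.statements = statements;
    }

    public string Title
    {
        get { return "Unique Queries"; }
    }

    public IEnumerable<IDictionary<string, object>> Rows
    {
        get
        {
            foreach (var statement in statements)
            {
                yield return new Dictionary<string, object>
                {
                    { "SQL", statement.RawSql }
                };
            }
        }
    }
}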

I am loving it.

Oh, and just to give you an idea of the time frame: I started coding the UI about 45 minutes ago, because I sat down to ensure that NH Prof was ready to show at my NHibernate workshops. I got sidetracked a bit and wired the feature into the UI. It is actually a bit more impressive than that, because those 45 minutes also include the time it took to write this blog post :-)

time to read 1 min | 173 words

Can you make this test pass?

var expected = @"Cached query: 
SELECT this_.Id             as Id5_0_,
       this_.Title          as Title5_0_,
       this_.Subtitle       as Subtitle5_0_,
       this_.AllowsComments as AllowsCo4_5_0_,
       this_.CreatedAt      as CreatedAt5_0_
FROM   Blogs this_
WHERE  this_.Title = 'The lazy blog' /* @p0 */
       and this_.Id = 1 /* @p1 */

";
Assert.True(Regex.IsMatch(expected, expected.Replace("5_", @"\d+_")));

I really don’t know what to think about this anymore….

time to read 3 min | 590 words

As someone who so firmly believes that persistence is a solved problem, I keep tripping over it. The issue is quite simple: each scenario has radically different requirements and usually requires a different solution.

In NH Prof’s case, just using an RDBMS is not really a good solution, but before arriving at that conclusion, I need to explain what the requirements are. For NH Prof, there are several reasons to want to be able to persist things:

  • Creating an offline dump of the profiling session, to be analyzed later.
    • This is actually a critical feature from my perspective, since it allows me to troubleshoot user issues quite easily. The user sends me the dump file, I load it in NH Prof, and I can see exactly what their problem is.
  • Saving a profiling session to be analyzed at a later date (File > Save / Load).
  • In addition to the first two, a persistence format is basically the format of a stream, and we can also use a stream as a communication mechanism.

Right now, NH Prof actually has three different ways of handling those tasks (an XML log, binary serialization, and remoting). Obviously I would like to avoid this, if only because more code that does the same thing for different purposes tends to create triple the amount of work. There is also the problem that each of those methods gives a different set of data to the application, which makes my life quite a bit harder.

There is also another consideration, which will make sense to you when we release NH Prof v1.0, but I don’t want to talk about that reason just yet.

So, what is the solution? Can we make it work?

The answer actually lies in the architecture that NH Prof utilizes. At its core, NH Prof is a sophisticated analysis engine with a fancy UI on top, and what it analyzes is the event stream from NHibernate. That can actually cause some interesting problems. When we save to a file, what should we save? The event stream? The result of the analysis? There are arguments for both approaches.

My decision was based on several factors; simplicity and “how much pain do I have to deal with” were chief among them. The end result is that I decided to make use of Protocol Buffers, the serialization format that Google put out. It has some interesting properties, such as being fast to serialize and deserialize, lightweight, and cross platform. After some time struggling with the various options, I settled on Jon Skeet’s C# implementation, and so far it looks very good. Maybe I should join the fan club? :-)

Anyway, it means that all three separate persistence options are going to move over to a Protocol Buffers implementation. There are still some issues that I have to deal with, mostly around the reliability of the network connection and retry attempts, but I feel certain that I can make this happen. The end result is a pretty significant simplification in the way I work with the codebase, and it resolves a few other problems as well (mostly related to my misuse of remoting).
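To give a rough idea of the direction (and only that), here is a minimal sketch of what a single wire format for all three paths could look like. ProfilerEvent is an assumed protobuf-generated message, not NH Prof’s actual schema, and the builder / length-prefixed stream calls assume the builder-style API of the C# port (the exact method names may differ):

using System.IO;

public static class EventStreamPersistence
{
    // Append one event to any stream: a dump file, a saved session, or a socket.
    public static void WriteEvent(Stream output, string sessionId, string rawSql)
    {
        var evt = ProfilerEvent.CreateBuilder()   // assumed generated builder API
            .SetSessionId(sessionId)
            .SetRawSql(rawSql)
            .Build();

        // Length-prefixed, so events can be written and read one at a time.
        evt.WriteDelimitedTo(output);
    }

    public static ProfilerEvent ReadEvent(Stream input)
    {
        return ProfilerEvent.ParseDelimitedFrom(input);
    }
}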

All in all, I think NH Prof is rapidly moving toward a functional release status.

time to read 1 min | 128 words

I can sum it up pretty easily:

Unit tests:
65 passed, 0 failed, 5 skipped, took 7.49 seconds.

Integration tests:
45 passed, 0 failed, 0 skipped, took 284.43 seconds.

Except there is one thing that you aren’t seeing here. Writing an integration test is about as straightforward as it comes: execute the user scenario, assert on the result. There is no real knowledge required to write the test, and such tests are very easy to read and understand.

Unit tests, however, often require you to understand how the software works in much finer detail, and they often make no sense to someone who isn’t familiar with the software.

time to read 2 min | 332 words

The question came up on the alt.net mailing list, and I started replying there before deciding that it would make a great post. I actually want to talk specifically about the notion that came up of avoiding explicit state management in favor of out-of-process session state (on top of a database or memcached).

The problem is that those two are not interchangeable. Using a session makes it easy to preserve the illusion of statefulness on the web, but it is an illusion. In general, you have to give me pretty good reasons before I will go with an abstraction layer. Session state makes a lot of sense, but it carries with it its own set of problems.

On the face of it, explicit state management (making direct calls to memcached or the database) seems to share the same problem, with the additional issue that you don’t get the session abstraction to manage it for you.

That is true, but there are distinct advantages to being explicit:

  • Different pieces of state usually have different longevity, refresh rates, and scope. With a session, you get a single (usually predetermined) option for all of them.
  • You can pick and choose what you want to read, resulting in less data moving over the network. With the session, you have to get all of it or none at all.
  • Reduced writes, because you only write on changes; the session has to flush itself (or do non-trivial change tracking) on every request.

There is also the issue of concurrency. With a session, in order to maintain the sequential facade, you take a lock on the session for the duration of the request. That tends to create contention if you have concurrent requests from the same user (which are quite frequent now, thanks to Ajax and multiple tabs); I know that we had to write code around that several times.
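To make the difference concrete, here is a minimal sketch of explicit state management, assuming a hypothetical ICache abstraction standing in for a memcached client or a database table (the key names and shapes are made up for illustration):

using System;

public interface ICache
{
    T Get<T>(string key);
    void Set<T>(string key, T value, TimeSpan expiry);
}

public class ShoppingCartService
{
    private readonly ICache cache;

    public ShoppingCartService(ICache cache)
    {
        this.cache = cache;
    }

    public void AddItem(string userId, string itemId)
    {
        // Read only the piece of state this request needs, not the whole session blob.
        var cart = cache.Get<string[]>("cart/" + userId) ?? new string[0];

        var updated = new string[cart.Length + 1];
        cart.CopyTo(updated, 0);
        updated[cart.Length] = itemId;

        // Write only because something changed, with an expiry chosen for this
        // particular piece of state; other state lives under its own keys.
        cache.Set("cart/" + userId, updated, TimeSpan.FromHours(1));
    }
}

No per-request lock is needed, so concurrent Ajax requests from the same user only contend on the individual keys they actually touch.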

The BIG Merge

time to read 3 min | 413 words

Merging is one of my least favorite activities, and this is especially true when we are talking about a big merge. You might have noticed that I have been talking lately about some big changes that we are making to NH Prof. Now the time has come to merge it all back together.

Well, big changes is quite an understatement, to be fair. What we did is rip apart the entire model the application used to work with. We moved from a push model to a pull model, and that had quite a lot of implications throughout the code base.

Of course, we also did some work on the trunk while we worked on that, so before we can even think about reintegrating the branch, we have to do a merge from the trunk to the branch, which resulted in:

[screenshot of the resulting merge conflicts]

Now, the problem is that this is totally stupid.

Nitpicker corner:
   And yes, I am using Subversion, and before the Git fanboys totally jump on me, I am seriously considering moving to Git to ease this sort of pain.
   And yes, I should have done reverse merges to the branch all along, so before the Subversion fanboys totally jump on me, I know that.

It is stupid because some changes have been made in parallel in both branches, and some changes that involve deleting or renaming files are simply not being merged. And yes, I am using SVN 1.5. So, after resolving all the conflicts, I have to do a manual check over everything, to make sure that we didn’t miss a merge because of that. I am at the point of so much argh! that I can’t really keep it inside, hence this post.

A common example that I know is going to hit me is something like FormattedStatement, which is a pretty important class. In the branch, that was changed to SqlStatement, and I renamed the file as well. Subversion doesn’t merge changes across that rename. And yes, I used svn rename for that (via Visual SVN).

And, to add insult to injury, doing this manual checking means a lot of going over the network, which means that it is slow.
