Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 419 words

I want to generate API documentation for RavenDB. I decided to take the plunge and generate XML documentation, and to keep myself honest, I marked the “Warnings as Errors” checkbox.

864 compiler errors later (I spiked this with a single project), I had a compiling build and a pretty good initial XML documentation. The next step was to decide how to go from simple XML documentation to a human readable documentation.
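For context, once XML documentation output is enabled, every public member without a doc comment raises warning CS1591, which “Warnings as Errors” turns into a build failure. A hypothetical example (this class is purely illustrative, not RavenDB code) of the kind of comment each of those 864 errors demanded:

```csharp
/// <summary>
/// A hypothetical store class, purely to illustrate the doc comments
/// that CS1591 (missing XML comment) forces you to write.
/// </summary>
public class DocumentStore
{
    /// <summary>
    /// Loads the document with the given key.
    /// </summary>
    /// <param name="key">The document key, e.g. "users/1".</param>
    /// <returns>The document contents, or null if the key is null.</returns>
    public string Get(string key)
    {
        return key == null ? null : "{}"; // placeholder body for the sketch
    }
}
```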

Almost by default, I checked Sandcastle; unfortunately, I couldn’t figure out how to get it working at all. Sandcastle Help File Builder seems to be the way to go, though. Using that GUI, I had a working project in a minute or two. There are a lot of options, but I just went with the defaults and hit build. While it was building, I had the following thoughts:

  • I didn’t like that it required installation.
  • I didn’t know whether you could easily run it from a build script.

The problem was that generating the documentation (a CHM) took over three and a half minutes.

I then tried Docu. It didn’t work, saying that my assembly was in the wrong format. It took me a while to figure out exactly why, but eventually it dawned on me that Raven is a .NET 4.0 project, while Docu targets 3.5. I tried recompiling it for 4.0, but that gave me an error as well, something about a duplicate mscorlib, at which point I decided to drop it. It also didn’t help that I didn’t really like the format of the API docs it generated, but that is beside the point.

I then tried doxygen, which I have used in the past. The problem here is the sheer number of options you have. Luckily, it is very simple to set up using Doxywizard, and the generated documentation looks pretty nice as well. It also takes a while, but it is significantly faster than Sandcastle.

Next on the list was MonoDoc, where I could generate the initial set of XML files, but was unable to run the actual mono doc browser. I am not quite sure why that happened, but it kept complaining that the result was too large.

I also checked VSDocman, which is pretty nice.

All in all, I think that I’ll go with doxygen, it being the simplest by far and generating (OOTB) an HTML format that I really like.

On Silverlight

time to read 2 min | 375 words

I read Davy Brion’s post about the future direction of Silverlight with a great deal of interest. But I think that Davy is missing a significant point.

First, I agree with Davy about a few things. Silverlight for public facing applications is going to be… problematic. Support for mobile devices, something that is becoming increasingly important, isn’t really there. And most companies seem quite reluctant to “bet the farm” on Silverlight for public facing web sites. A good example of a public facing Silverlight application is Justin’s blog, which is really nice, but has the following issues:

  • It has a long loading time compared to most websites.
  • The scrolling inside the Silverlight application feels… wrong, compared to the one in the browser. It works, but I think it uses a different scroll size than the browser.
  • I can’t zoom with the keyboard (Ctrl -, Ctrl +, Ctrl 0).

All of that doesn’t really matter, to be perfectly frank. Silverlight isn’t really meant for websites, and blogs are probably one of the most common examples of web sites. Silverlight’s main purpose is applications. Cast your mind a few years back, to the rise of web applications. Why did it happen? Mostly because the cost of deploying desktop software to all the machines in the organization was so high. Building a web application drastically reduced deployment costs.

But web applications are still pretty hard to build, compared to desktop applications, if you want a similar UX. And the number one problem that keeps recurring is that you have to manually manage the state. Silverlight gives you the same near-zero cost of deployment, but with a lot of the advantages of building desktop applications. Mostly, again, because of the state. Yes, I am well aware that there is a large body of knowledge on how to build complex web applications in pure HTML/JS. But that is still harder than building a Silverlight application.

And that is where I see Silverlight being used most often. It isn’t replacing the company’s site, it is replacing all the internal applications and systems that used to be HTML applications.

time to read 1 min | 198 words

I got a bug report about the following in the admin UI for RavenDB.

image

As you can imagine, this is certainly something that we would like to avoid, but there is a catch: how the hell do you find the problem?

I mean, obviously we are encoding the value when we present it to the user, since I can see it in the UI. But it is still running, so I am doing something bad somewhere. And I don’t feel like traversing a mountain of JavaScript to find out exactly where this is happening. Luckily, we don’t have to; we can use the XSS itself to help us localize it:

image

And given that, we can get directly to the actual fault:

image

And fixing that is a snap.

time to read 3 min | 406 words

The following features apply to NHProf, EFProf, HProf, L2SProf.

The first feature is something that was frequently requested, but we kept deferring. Not because it was hard, but because it was tedious and we had cooler features to implement: Sorting.

image

Yep. Plain old sorting for all the grids in the application.

Not an exciting feature, I’ll admit, but an important one.

The feature that gets me excited is Go To Session. Let us take the Expensive Queries report as a great example of this feature:

image

As you can see, we have a very expensive query. Let us ignore the reason it is expensive, and assume that we aren’t sure about that.

The problem with the reports feature in the profiler is that while it exposes a lot of information (expensive queries, most common queries, etc.), it also loses the context of where each query is running. That is why you can, in any of the reports, right click on a statement and go directly to the session where it originated:

image

image

We bring the context back to the intelligence that we provide.

What happens if we have a statement that appears in several sessions?

image

You can select each session that this statement appears in, getting back the context of the statement and finding out a lot more about it.

I am very happy about this feature, because I think that it closes a circle with regards to the reports. The reports allow you to pull out a lot of data across your entire application, and the Go To Session feature allows you to connect the interesting pieces of that data back to the originating session, showing you where and why each statement was issued.

time to read 3 min | 597 words

public static class GuidExtensions
{
    // Inverse of TransformToValueForEsentSorting: given the 16 bytes in the
    // order Esent compares them, rebuild the original Guid.
    public static Guid TransfromToGuidWithProperSorting(this byte[] bytes)
    {
        return new Guid(new[]
        {
            bytes[10], bytes[11], bytes[12], bytes[13], bytes[14], bytes[15],
            bytes[8], bytes[9],
            bytes[6], bytes[7],
            bytes[4], bytes[5],
            bytes[0], bytes[1], bytes[2], bytes[3],
        });
    }

    // Reorder the mixed-endian layout that Guid.ToByteArray() produces (the
    // first three Guid fields are stored little-endian) into a byte sequence
    // whose byte-by-byte comparison in Esent matches the intended sort order.
    public static byte[] TransformToValueForEsentSorting(this Guid guid)
    {
        var bytes = guid.ToByteArray();
        return new[]
        {
            bytes[12], bytes[13], bytes[14], bytes[15],
            bytes[10], bytes[11],
            bytes[8], bytes[9],
            bytes[6], bytes[7],
            bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5],
        };
    }
}

 

Just to note, this is a silly micro-optimization trick: it takes 0.00008 ms to execute right now, which is plenty fast enough, but it is fun to play with.
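Incidentally, the two methods are exact inverses of each other, which you can verify without touching Guid at all by composing their two index tables:

```csharp
using System;

// The index tables from the two extension methods above. A round trip runs
// TransformToValueForEsentSorting first and TransfromToGuidWithProperSorting
// second, so result[i] = original[toEsent[toGuid[i]]]; the composition must
// be the identity permutation for the round trip to be lossless.
int[] toEsent = { 12, 13, 14, 15, 10, 11, 8, 9, 6, 7, 0, 1, 2, 3, 4, 5 };
int[] toGuid  = { 10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3 };

for (int i = 0; i < 16; i++)
{
    Console.Write(toEsent[toGuid[i]] + " "); // prints 0 1 2 ... 15
}
```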

time to read 2 min | 268 words

One of the worst possible things that can happen is a test that fails, sometimes. Not always, and never under the debugger, but it fails.

It tells you that there is a bug, but doesn’t give you the tools to find it:

image

Usually, this is an indication of a problem with the code that is exposed through multi threading. I found the following approach pretty useful in digging those bastards out:

static void Main()
{
    // Run the flaky test in a tight loop, outside the test runner,
    // until the race finally reproduces.
    for (int i = 0; i < 100; i++)
    {
        using (var x = new Raven.Tests.Indexes.QueryingOnDefaultIndex())
        {
            x.CanPageOverDefaultIndex();
            Console.Write(".");
        }
    }
}

Yes, it is trivial, but you would be surprised how often I see people throwing their hands up in despair over issues like this.

time to read 1 min | 183 words

I recently got a bug report about NH Prof in a multi monitor environment. Now, I know that NH Prof works well in multi monitor environments, because I frequently run in one myself.

The problem turned out to be not multiple monitors in and of themselves, but rather how NH Prof handles the removal of a monitor. It turns out that NH Prof has a nice little feature that remembers the last position the window was at, and returns to it on start. When the monitor NH Prof was located on was removed, NH Prof would put itself off-screen on start.

That led to me having to figure out how to find the available monitor space, so I could detect if the saved positions were valid or not. What I found interesting in this is that what seemed to be a very trivial feature (save two numbers) turned out to be somewhat more complex, and I am pretty sure that there are other scenarios that I am missing (in the very same feature).
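A minimal sketch of the kind of check involved (the names and overlap thresholds here are my own invention, not NH Prof’s code): a saved position is only reusable if it overlaps the working area of some attached screen enough to leave the title bar grabbable. In a real WinForms app the screen rectangles would come from `Screen.AllScreens[i].WorkingArea`.

```csharp
using System.Drawing;

// Hypothetical sketch of validating a remembered window position against
// the monitors that are currently attached.
public static class WindowPlacement
{
    public static Rectangle Restore(Rectangle saved, Rectangle[] screens, Rectangle fallback)
    {
        foreach (var screen in screens)
        {
            // Require a meaningful overlap, not a 1px touch, so the
            // user can still grab and move the window.
            var overlap = Rectangle.Intersect(screen, saved);
            if (overlap.Width >= 50 && overlap.Height >= 20)
                return saved;
        }
        return fallback; // the monitor is gone: use a default position
    }
}
```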

time to read 2 min | 281 words

We run into an annoying problem in Raven regarding the generation of sequential guids. Those are used internally to represent the etag of a document.

For a while, we used the Win32 method CreateSequentialUuid() to generate it. But we ran into a severe issue with it: it creates sequential guids only as long as the machine is up. After a reboot, the new guids are no longer sorted after the old ones. That is bad, but it also means that two systems calling this API can get drastically different results (duh! that is pretty much the point, isn’t it?). That wouldn’t bother me, except that we use etags to calculate the freshness of an index, so we have to have an always incrementing number.

How would you implement this method?

public static Guid CreateSequentialUuid()

A few things to note:

  • We do care about uniqueness here, but only inside a single process, not globally.
  • The results must always be incrementing.
  • The always incrementing must be consistent across machine restarts and between different machines.

Yes, I am fully aware of NHibernate’s guid.comb implementation, which creates sequential guids. It isn’t applicable here, since it doesn’t create truly sequential guids, only guids that sort near one another.
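For what it is worth, here is one possible sketch (my own, not what RavenDB actually shipped): a base of UTC ticks captured once at startup, followed by an in-process counter, both laid out big-endian so that comparing the 16 bytes left-to-right matches generation order. Cross-restart and cross-machine ordering leans entirely on the wall clock moving forward, which is exactly the weak spot worth poking at.

```csharp
using System;
using System.Threading;

// A sketch only: 16-byte values that keep incrementing across restarts
// (assuming the clock moves forward) and are unique within a single process.
// To hand these out as Guids that Esent sorts correctly, the bytes would
// still need the byte reordering from the earlier post.
public static class SequentialUuid
{
    private static readonly long baseTicks = DateTime.UtcNow.Ticks;
    private static long counter;

    public static byte[] NextValue()
    {
        var bytes = new byte[16];
        WriteBigEndian(bytes, 0, baseTicks);
        WriteBigEndian(bytes, 8, Interlocked.Increment(ref counter));
        return bytes;
    }

    private static void WriteBigEndian(byte[] target, int offset, long value)
    {
        // Most significant byte first, so byte-wise comparison
        // agrees with numeric comparison.
        for (int i = 7; i >= 0; i--)
        {
            target[offset + i] = (byte)value;
            value >>= 8;
        }
    }
}
```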

time to read 3 min | 588 words

I recently got an estimate for a feature that I wanted to add to NH Prof. It was for two separate features, actually, but they were closely related.

That estimate was for 32 hours.

And it caused me a great deal of indigestion. The problem was, quite simply, that even granting the usual padding of estimates (which I expect), that estimate was off, way off. I knew what would be required for this feature, and it shouldn’t be anywhere near complicated enough to require 4 days of full time work. In fact, I estimated that it would take me a maximum of 6 hours, and probably closer to 3, to get it done.

Now, to be fair, I know the codebase (well, actually that isn’t true, a lot of the code for NH Prof was written by Rob & Christopher, and after a few days of working with them, I stopped looking at the code, there wasn’t any need to do so). And I am well aware that most people consider me to be an above average developer.

I wouldn’t have batted an eye at an estimate of 8 – 14 hours, probably. Part of the reason that I have other people working on the code base is that even though I can do certain things faster, I can only do so many things, after all.

But a time estimate that was 5 – 10 times as large as my own was too annoying. I decided that I was going to do this feature myself. And I decided that I wanted to do it on the clock.

The result is here:

image

This is actually the total time over three separate sittings, but the timing is close enough to what I thought it would be.

This includes everything, implementing the feature, unit testing it, wiring it up in the UI, etc.

The only thing remaining is to do the UI work for the other profilers (Entity Framework, Linq to SQL, Hibernate and the upcoming LLBLGen Profiler). Doing this now…

image

And we are done.

I have more features that I want to implement, but in general, if I pushed those changes out, they would form a new release that customers could use immediately.

Nitpicker corner: No, I am not being ripped off. And no, the people making the estimates aren't incompetent. To be perfectly honest, looking at the work that they did and the time marked against it, they are good, and they deliver in a reasonable time frame. What I think is going on is that their estimates are highly conservative, because they don't want to get into a bind with "oh, we ran into a problem with XYZ and overran the time for the feature by 150%".

That also leads to a different problem: when you pay by the hour, you really want estimates that are more or less reasonably tied to your payments. But estimating in hours has too much granularity to be really effective (a single bug can easily consume hours, totally throwing off the estimate, and it doesn't even have to be a complex one).
