Ayende @ Rahien

Dec 19 2017

PR ReviewThe simple stuff will trip you

time to read 2 min | 285 words

Tags:

reviews

In a recent PR, I run into this code, which is used in query generation to decide if we need to quote a particular alias. The code itself is pretty straightforward and easy to follow:

It also have two distinct issues. First, there is the allocation because of the ToUpper call and second, we are doing O(N) search on the alias array every single time.

I asked for a change, to use HashSet and to use the OrdinalIgnoreCase comparer.

Here is the change I got back:

This is exactly what I asked for, and it is very subtly wrong. We are now saving an allocation, which is great, but the problem is with the Contains method.

This looks okay, but this is not HashSet.Contains, instead, this is an extension method from Enumerable.Contains, which iterates over the set and compare each value.

The fix is also simple:

And now we don’t have O(N) anymore.

Although I’ll admit that for such small size, it probably doesn’t matter.

Nov 08 2017

PR ReviewEncapsulation stops at the assembly boundary

time to read 2 min | 247 words

Tweet Share Share 20 comments

Tags:

The following set of issues all fall into code that is used within the scope of a single assembly, and that is important. I’m writing this blog post before I got the chance to talk to the dev in question, so I’m guessing about intent.

This change is likely motivated by the fact that callers are not expected to make a modification to the resulting dictionary.

That said, this is used between different components in the same assembly, and is never exposed outside. That means that we have a much higher trust between the components, and reading IReadOnlyDictionary means that we need to spend more cycles trying to figure out who you are trying to protect from.

Equally important, in this case, the Dictionary methods can be called without any virtual call overhead, while the IReadOnlyDictionary needs interface dispatch to work.

This is a case that is a bit more subtle. The existingData is a variable that is passed to a method. The problem is that in this case, no one is ever going to send null, and sending a null is actually an error.

In this case, if we did get a null, I would rather that the code would immediately crash with “what just happened?” rather than limp along with bad data.

Oct 25 2017

PR ReviewIt’s the error handling, again

time to read 1 min | 90 words

Tweet Share Share 8 comments

Tags:

This is part of a PR related to making sure that disposing once works. It contains this code:

This loses critically important information. Namely, the stack trace of the original exception. That leaves aside the issue that an aggregate exception may contain multiple exceptions as well.

In general, and I know this is old hat, whenever you see “throw e;” or “throw e.InnerException;” of any kind, you should always treat it as a bug.

Oct 23 2017

PR ReviewBeware the things you can’t see

time to read 1 min | 110 words

Tweet Share Share 0 comments

Tags:

reviews

I had to reject the following change in a recent PR. IN this context, the flags and conflicted.Flags are the same, and that wasn’t the problem. Can you spot the issue?

The problem is that the second version does an allocation. It does this silently, and you need to know about this issue to know that this happens. There is good discussion on this in this StackOverflow question.

It looks like this has been fixed in the JIT for CoreCLR and will be part of the 2.1 release when it is out.

Oct 20 2017

PR ReviewCode has cost, justify it

time to read 3 min | 453 words

Tweet Share Share 5 comments

Tags:

development

There is a reason why people talk about idiomatic code. Code that is idiomatic to the language matches what it expect and it generally faster / easier to work with for both developers and the compiler / runtime.

During a PR review, I run into this code:

The idiomatic manner for writing this code would have been any of:

“@id” == property
Constants.Documents.Metadata.Id == property
property.Equals(Constants.Documents.Metadata.Id)
Constants.Documents.Metadata.Id.Equals(property)

I can argue that the second option is the most idiomatic, and that the 3rd option can fail with Null Reference Exception if the property is null, but all of them are pretty clear.

Now, RavenDB has a lot on non idiomatic code, usually when we need to get more performance. For example:

This is code that is doing very much what is done above, but it does this on the raw byte buffer, and it knows that it is accessing UTF8 characters, so we can do some nice optimizations there to compare by just doing two instructions.

Indeed, when queried, the developer answered:

Most of the time its going to be false and comparing ints is cheaper than strings

There are several problems with this. First, this particular piece of code isn’t in a part of the code that is extremely performance sensitive. The string buffer work above is for processing requests from the network, a piece of code that can be called tens and hundreds of thousands of times per second. Performance there matters, a lot. This code is meant to be called as part of streaming results to the user, so it is likely to handle very large volume of data. Performance there matters, for sure, but we need to consider how much it matters.

Second, let us peek into what will actually happen if we drop the property.Length check. The call will end up calling to the native string routines in the CLR, and the relevant portion is:

In other words, this check is already going to happen, we didn’t really save anything from making it.

Third, and the most subtle of them all. This check is using a check against a constant, whose value is “@id”. It also check that the property .Length is equal to 3. The whole point of using a constant is that we need to replace it in just one location. But in this case, we will likely change the constant value, not realize that there is a hardcoded length elsewhere in the code and fail miserably with hard to explain behavior.

Aug 10 2017

PR ReviewErrors, errors and more errors

time to read 1 min | 49 words

Tweet Share Share 6 comments

Tags:

reviews

I’m harping on this, but this is really annoying. Here is the latest batch.

And this:

Jul 21 2017

PR ReviewIs your error handling required?

time to read 1 min | 129 words

Tweet Share Share 2 comments

Tags:

reviews

During reviewing a PR I run into what seemed like a strange thing. Take a look at this change:

This came with its own exception class, and left me pretty confused. Why would I want to have something like that?

Here we have some error handling code that doesn’t seem to add any additional value. Everything in the error here can be deducted from the details of the exception that will be thrown if we did nothing.

The fact we throw a specialized exception might be meaningful, but looking at the code, this isn’t actually used for anything.

Like all code, error handling needs to justify itself, and this one doesn’t pass the bar.

Jun 23 2017

PR Reviewavoid too many parameters

time to read 2 min | 228 words

Tweet Share Share 0 comments

Tags:

reviews

During code review I run into these two sections, which raised a flag. Can you tell why?

The problem with this type of code is two fold. First, we add optional parameters, to reduce the number of breaking changes we have. The problem with that is that we already have parameters on the call, and eventually you’ll get to something like this:

Which is the queen of optional parameters method, and you can probably guess how it looks internally.

In the first case, we can add the new optional parameter to the… options variable that we are already sending this method. This way, we don’t have to worry about breaking changes, and we already have a way to setup options, determine defaults, etc.

In the second case, we are passing two bools to the method, and there isn’t a preexisting parameters object. Instead of creating one, we can use a Flags enum, whose bits we can set to determine what exactly the behavior of this method should be. That is generally much easier to maintain in the long run.

Jun 21 2017

PR Reviewthe errors should be nurtured

time to read 2 min | 202 words

Tweet Share Share 4 comments

Tags:

reviews

I run into the following in a PR I recently reviewed. Take a look and try to figure out why I’m pointing this bits out:

Those are bad errors. In the sense that they are hiding information. Sure, in 90% of the cases the user just put it the backup location or the restore location, so they know what the application is referring to. But the other 10% is looking at the exception from the logs, having “I got an error” support call or all sort of weird stuff.

In particular, one of the most annoying things is when the user and the application disagree on a relative path. If the user believes that the relative path is based on one location and the application another, hilarity ensues. But not in a fun way.

A better way to prevent all of those would be to just include the actual locations (fully resolved, mind) in the error. It doesn’t prevent support calls, but it does make them a lot easier to solve.

Oren Eini

Oren Eini

CEO of RavenDB

PR ReviewThe simple stuff will trip you

PR ReviewEncapsulation stops at the assembly boundary

PR ReviewIt’s the error handling, again

PR ReviewBeware the things you can’t see

PR ReviewCode has cost, justify it

PR ReviewErrors, errors and more errors

PR ReviewIs your error handling required?

PR Reviewavoid too many parameters

PR Reviewthe errors should be nurtured

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed