Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

RavenDB, Victory

time to read 2 min | 299 words

Jeremy Miller’s post “Would I use RavenDB again” has been making the rounds. It is a good post, and multiple people asked me to comment on it.

I wanted to comment very briefly on some of the issues that were brought up:

  • Memory consumption – this is probably mostly related to long term session usage; we expect sessions to be much more short lived.
    • The 2nd level cache is mostly there to speed things up when you have relatively small documents. If you have very large documents, or routinely have requests that return many documents, that can be a memory hog. That said, the 2nd level cache is limited to 2,048 items by default, so that shouldn’t really be a big issue. And you can change that (or even turn it off) with ease.
  • Don’t abstract RavenDB too much – yeah, that has pretty much been our recommendation for a while.
    • I don’t see this as a problem. You have just the same issue if you are using any OR/M against an RDBMS.
  • Bulk Insert – the issue has already been fixed. In fact, IIRC, it was fixed within a day or two of the issue being brought up.
  • Eventual Consistency – Yes, you need to decide how to handle that. As Jeremy said, there are several ways of handling that, from using natural keys with no query latency associated with them to calling WaitForNonStaleResultsAsOfNow();
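The staleness trade-off in that last bullet can be illustrated with a toy model. This is a minimal Python sketch, not RavenDB code: `StaleIndexStore`, its delay, and the polling loop are all hypothetical, but the two strategies it demonstrates are the ones mentioned above (loading by a known key is never stale, while a query must either tolerate staleness or wait it out, like `WaitForNonStaleResultsAsOfNow()` does).

```python
import time

class StaleIndexStore:
    """Toy store: writes are visible by id immediately, but the query
    index only catches up after a delay (eventual consistency)."""

    def __init__(self, index_lag=0.05):
        self.docs = {}
        self.index = {}
        self.pending = []          # (ready_at, doc_id) not yet indexed
        self.index_lag = index_lag

    def put(self, doc_id, doc):
        self.docs[doc_id] = doc    # the write itself is durable at once
        self.pending.append((time.time() + self.index_lag, doc_id))

    def load(self, doc_id):
        return self.docs.get(doc_id)   # by-id reads never go stale

    def _refresh(self):
        now = time.time()
        still_pending = []
        for ready_at, doc_id in self.pending:
            if ready_at <= now:
                self.index[doc_id] = self.docs[doc_id]
            else:
                still_pending.append((ready_at, doc_id))
        self.pending = still_pending

    def query(self, predicate):
        self._refresh()
        results = [d for d in self.index.values() if predicate(d)]
        return results, bool(self.pending)   # second value: is the index stale?

    def query_non_stale(self, predicate, timeout=1.0):
        """Poll until the index catches up -- the moral equivalent of
        WaitForNonStaleResultsAsOfNow()."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            results, stale = self.query(predicate)
            if not stale:
                return results
            time.sleep(0.01)
        raise TimeoutError("index still stale")

store = StaleIndexStore()
store.put("users/1", {"name": "jeremy"})
assert store.load("users/1")["name"] == "jeremy"   # natural key: no query latency
users = store.query_non_stale(lambda d: d["name"] == "jeremy")
assert len(users) == 1
```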

Truthfully, the thing that really caught my eye wasn’t Jeremy’s post, but one of the comments:

[image: a reader’s comment on Jeremy’s post]

Thank you, we spent a lot of time on that!

time to read 2 min | 325 words

There is a big problem in the RavenDB build process. To be rather more exact, there is a… long problem in the RavenDB build process.

[image: the build duration]

As you can imagine, when the build process runs for that long, it doesn’t get run too often. We already had several rounds of “let us optimize the build”. But… the actual reason for the tests taking this long is a bit sneaky.

[image: test run statistics]

To save you from having to do the math, this means an average of 1.15 seconds per test.

In most tests, we actually have to create a RavenDB instance. That doesn’t take too long, but it does take some time. And we have a lot of tests that use the network, because we need to test how RavenDB works on the wire.

From that perspective, it means that we don’t seem to have any real options. Even if we cut the average cost of running the tests by half, it would still be a 30 minute build process.
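The arithmetic behind that conclusion can be checked back-of-the-envelope. The test count here is an assumption inferred from the stated numbers (1.15 seconds per test on average, and a total that halving would bring down to about 30 minutes, i.e. roughly an hour); the screenshot with the exact figures is not reproduced.

```python
avg_seconds_per_test = 1.15          # stated average cost per test
total_minutes = 60                   # halving the average "would still be 30 minutes"

# Inferred, not stated: how many tests that total implies.
test_count = round(total_minutes * 60 / avg_seconds_per_test)
print(test_count)                    # roughly 3,130 tests

# Sanity check: the inferred count reproduces the stated total...
assert abs(test_count * avg_seconds_per_test / 60 - total_minutes) < 1
# ...and halving the per-test cost only halves the wall-clock time.
assert abs(test_count * (avg_seconds_per_test / 2) / 60 - total_minutes / 2) < 1
```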

Instead, we are going to create a layered approach. We are going to freeze all of our existing tests, move them to an Integration Tests project. We will create a small suite of tests that cover just core stuff with RavenDB, and use that. Over time, we will be adding tests to the new test project. When that becomes too slow, we will have another migration.

What about the integration tests? Well, those will be run solely by our build server, and we will set things up so we can automatically test when running from our own forks, not the main code line.

time to read 2 min | 320 words

This seems to be a pretty common issue, with people getting relevancy and ordering confused. As an example, let us take the users in Stack Overflow:

[image: Stack Overflow users ordered by reputation]

Here, we want to get the users in order: all the users, in descending order of reputation.

But what happens when we want to do an actual search, for example, we want to get users by tag. Perhaps we want to get someone that knows some ravendb.

Here is the data that we have to work with:

[image: the users and their tags]

Now, when searching, we want to be able to do the following: find users that match the tags we specified, rank them by relevance, and have them show up in reputation order.

And that is where it kills us. Relevancy & order are pretty much exclusive. Before we can explain that, we need to understand that order is absolute, but relevancy is not. If I have 10,000 tags, there is very little meaning to me having a tag or not. But if I have 10 tags, me having a tag or not is a lot more important. You want to talk with an expert in a specific field, not just someone who is a jack of all trades.

Now, it might be that you want to apply some boost factor to users with high reputation, because there are people who are jack of all trades and master of most. That is the difference between boosting and ordering.

Ordering is absolute, while boosting is a factor applied against the relative relevancy of the current query.
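The distinction can be made concrete with a small Python sketch. The users, the relevance formula, and the logarithmic boost factor here are all illustrative assumptions, not how any particular search engine scores; the point is only the shape of the difference: ordering by reputation discards relevancy entirely, while boosting multiplies the relative relevancy score by a reputation-derived factor.

```python
import math

def relevance(user_tags, query_tags):
    """A tag match is worth more for a focused user (10 tags) than for a
    jack of all trades (10,000 tags) -- relevancy is relative."""
    matched = len(set(user_tags) & set(query_tags))
    return matched / len(user_tags) if user_tags else 0.0

specialist = {"name": "ann", "rep": 500,   "tags": ["ravendb", "nosql"]}
generalist = {"name": "bob", "rep": 90000,
              "tags": ["ravendb"] + [f"tag{i}" for i in range(99)]}
users = [specialist, generalist]
query = ["ravendb"]

# Ordering is absolute: reputation alone decides, relevancy is ignored.
by_reputation = sorted(users, key=lambda u: -u["rep"])
assert by_reputation[0]["name"] == "bob"

# Boosting scales the relative relevancy score by a reputation factor,
# so the focused expert can still beat the high-reputation generalist.
def boosted_score(u):
    return relevance(u["tags"], query) * (1 + math.log10(u["rep"]))

by_boost = sorted(users, key=boosted_score, reverse=True)
assert by_boost[0]["name"] == "ann"
```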

time to read 3 min | 480 words

Because RavenDB replication is async in nature, there is a period of time between when a write is committed on the master and when it is visible to clients reading from the secondaries.

A user requested that we provide a low latency solution to that. The idea was that the master server would report to the secondaries that a write happened, and the secondaries would then mark all reads for those documents as dirty, until replication caught up.

Implementation wise, all the pieces are already in place. We have the Changes API, which is an easy way to get change notifications from a database, and we have the ability to return a 204 Non Authoritative response, so it looks easy.

In theory, it sounds reasonable, but this idea just doesn’t hold water. Let us talk about normal operations. Even with the “low latency” notifications (and replication is about as low latency as it already gets), we have to deal with a window of time between the write completing on the master and the notification arriving on the secondaries. In fact, it is the exact same window as with replication. Sure, if you have a high replication load, that might be different, but those tend to be rare (high write load, very big documents, etc).

But let us assume that this is really the case. What about failures?

Let us assume servers A & B and client C. Client C makes a write to A, A notifies B, and when C reads from B, it gets a 204 response until A replicates to B. All nice & dandy. But what happens when A can’t talk to B? Remember, a server being down is the easiest scenario; the hard part is when both A & B are operational, but can’t talk to one another. RavenDB is designed to gracefully handle network splits and merges, so what would happen in this case?

Client C writes to A, but A can’t notify B or replicate to it. Client C reads from B, but since B got no notification about a change, it returns a 200 OK response, which claims that this is the latest version. Problem.
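The failure mode can be walked through with a toy simulation. This is a Python sketch, not RavenDB internals: `Node`, its `dirty` set, and the string status codes are hypothetical, but the scenario is exactly the one above: notifications travel the same network as replication, so when the link is down, the secondary has no idea it is stale.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.docs = {}       # doc_id -> version number
        self.dirty = set()   # doc_ids flagged stale by change notifications

    def write(self, doc_id, version):
        self.docs[doc_id] = version

    def notify(self, peer, doc_id, link_up):
        # The notification rides the same network as replication:
        # if the link is down, it simply never arrives.
        if link_up:
            peer.dirty.add(doc_id)

    def read(self, doc_id):
        # 204-style "non authoritative" only when we *know* of a newer version.
        if doc_id in self.dirty:
            return ("204 Non-Authoritative", self.docs.get(doc_id))
        return ("200 OK", self.docs.get(doc_id))

a, b = Node("A"), Node("B")
b.docs["users/1"] = 1        # both nodes start with version 1

# Healthy link: C writes to A, and B correctly reports its copy as stale.
a.write("users/1", 2)
a.notify(b, "users/1", link_up=True)
assert b.read("users/1")[0] == "204 Non-Authoritative"

# Network split: A can neither replicate nor notify, so B confidently lies.
b.dirty.clear()
a.write("users/1", 3)
a.notify(b, "users/1", link_up=False)
status, version = b.read("users/1")
assert status == "200 OK" and version == 1   # claims authoritative, but is stale
```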

In this case, this is actually a bigger problem than you might consider. If we support the notifications under the standard scenario, users will make assumptions about this. They will have separate code paths for non authoritative responses, for example. But as we have seen, we have a window of time where the reply would say it is authoritative even though it isn’t (a very short one, sure, but still), and under failure scenarios we will outright lie.

It is better not to have this “feature” at all, and let the user handle that on his own (and there are ways to handle that, reading from the master for important stuff, for example).

time to read 3 min | 563 words

RavenDB handles replication in an async manner. Let us say that you have 5 nodes in your cluster, set to use master/master replication.

That means that you call SaveChanges(), the value is saved to a node, and then replicated to the other nodes. But what happens when you have safety requirements? What happens if a node goes down after the call to SaveChanges() was completed, but before it replicates the information out?

In other systems, you have the ability to specify a W factor: to how many nodes this value must be written before it is considered “safe”. In RavenDB, we decided to go a similar route. Here is the code:

await session.StoreAsync(user);
await session.SaveChangesAsync(); // save to one of the nodes

var userEtag = session.Advanced.GetEtagFor(user);

var replicas = await store.Replication.WaitAsync(etag: userEtag, replicas: 1);

As you can see, we now have a way to actually wait until replication is completed. We will ping all of the replicas, waiting to see that replication has matched or exceeded the etag that we just wrote.  You can specify the number of replicas that are required for this to complete.

Practically speaking, you can specify a timeout, and if the nodes aren’t reachable, you will get an error about that.

This gives you the ability to handle write assurances very easily. And you can choose how to handle this, on a case by case basis (you care to wait for users to be created, but not for new comments, for example) or globally.
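The shape of that wait can be sketched in a few lines of Python. This is not the RavenDB client implementation; `wait_for_replication`, the callables standing in for pinging each replica, and the polling interval are all illustrative assumptions. It shows the contract described above: block until at least W replicas report an etag matching or exceeding the one we wrote, or raise on timeout if the nodes aren’t reachable or caught up in time.

```python
import time

def wait_for_replication(replicas, etag, min_replicas=1, timeout=2.0, poll=0.01):
    """Block until at least `min_replicas` report an etag >= the one we wrote.

    `replicas` is a list of zero-argument callables returning each node's
    last-replicated etag (a stand-in for pinging the actual servers)."""
    deadline = time.time() + timeout
    while True:
        caught_up = sum(1 for get_etag in replicas if get_etag() >= etag)
        if caught_up >= min_replicas:
            return caught_up
        if time.time() > deadline:
            raise TimeoutError(
                f"only {caught_up} of {len(replicas)} replicas reached etag {etag}")
        time.sleep(poll)

# Simulated cluster state: replica B has caught up past etag 7, C has not.
node_etags = {"B": 10, "C": 4}
replicas = [lambda: node_etags["B"], lambda: node_etags["C"]]

assert wait_for_replication(replicas, etag=7, min_replicas=1) == 1

node_etags["C"] = 9          # C catches up, so W=2 is now satisfiable
assert wait_for_replication(replicas, etag=7, min_replicas=2) == 2

try:                          # an unreachable etag times out with an error
    wait_for_replication(replicas, etag=99, min_replicas=1, timeout=0.05)
    assert False, "should have timed out"
except TimeoutError:
    pass
```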

time to read 2 min | 338 words

One of the things that we keep thinking about with RavenDB is how to make it easier for you to run in production.

To that end, we introduce a new feature in 2.5, Index Locking. This looks like this:

[image: the index locking options in the studio]

But what does this mean, to lock an index?

Well, let us consider a production system, in which you have the following index:

from u in docs.Users
select new
{
   Query = new[] { u.Name, u.Email, u.Email.Split('@') }
}

After you go to production, you realize that you actually needed to also include the FullName in the search queries. You can, obviously, do a full deployment from scratch, but it is generally so much easier to just fix the index definition on the production server, update the index definition in the codebase, and wait for the next deploy for them to match.

This works, except that in many cases, RavenDB applications call IndexCreation.CreateIndexes() on startup. Which means that on the next startup of your application, the change you just made will be reverted. This feature allows you to lock an index against changes, either by silently ignoring changes to the index, or by raising an error when someone tries to modify it.

It is important to note that this is not a security feature; you can unlock the index at any time. This is there to help operations, that is all.
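The two lock modes can be sketched with a toy model. This is a Python sketch under assumptions, not RavenDB code: `IndexStore`, `put_index`, and the mode names are hypothetical stand-ins for the behavior described above, where startup index creation either silently skips a locked index or fails loudly.

```python
UNLOCK, LOCKED_IGNORE, LOCKED_ERROR = "Unlock", "LockedIgnore", "LockedError"

class IndexStore:
    def __init__(self):
        self.indexes = {}    # name -> (definition, lock_mode)

    def put_index(self, name, definition):
        """Roughly what a startup-time CreateIndexes() call would do."""
        if name in self.indexes:
            current, mode = self.indexes[name]
            if mode == LOCKED_IGNORE:
                return current           # silently keep the production fix
            if mode == LOCKED_ERROR:
                raise RuntimeError(f"index '{name}' is locked")
        self.indexes[name] = (definition, UNLOCK)
        return definition

    def set_lock(self, name, mode):
        # Not a security feature: anyone can unlock at any time.
        definition, _ = self.indexes[name]
        self.indexes[name] = (definition, mode)

store = IndexStore()
store.put_index("Users/Search", "from u in docs.Users select new { u.Name }")

# Ops fixes the index directly in production, then locks it:
store.indexes["Users/Search"] = ("... also FullName ...", UNLOCK)
store.set_lock("Users/Search", LOCKED_IGNORE)

# The next deploy re-runs index creation with the stale definition -- ignored:
assert store.put_index("Users/Search", "old definition") == "... also FullName ..."

# In the stricter mode, the same attempt raises instead:
store.set_lock("Users/Search", LOCKED_ERROR)
try:
    store.put_index("Users/Search", "old definition")
    assert False, "should have raised"
except RuntimeError:
    pass
```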

time to read 2 min | 329 words

A while ago we introduced the ability to send js scripts to RavenDB for server side execution. And we have just recently completed a nice improvement on that feature, the ability to create new documents from existing ones.

Here is how it works:

store.DatabaseCommands.UpdateByIndex("TestIndex",
                                     new IndexQuery {Query = "Exported:false"},
                                     new ScriptedPatchRequest { Script = script }
  ).WaitForCompletion();

Where the script looks like this:

for(var i = 0; i < this.Comments.length; i++ ) {
   PutDocument('comments/', {
    Title: this.Comments[i].Title,
    User: this.Comments[i].User.Name,
    By: this.Comments[i].User.Id
  });
}

this.Exported = true;

This will create a new document for each of the embedded comments, and then flag the parent document as exported so the query doesn’t pick it up again.
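For clarity, here is the same transformation as a minimal Python sketch. It is not the server-side patching machinery: `extract_comments` and the `created` list are hypothetical, the `"comments/"` prefix stands in for PutDocument’s server-assigned key suffix, and the `Exported` flag name is assumed to match the field the query above filters on.

```python
def extract_comments(post, created):
    """Mirror of the patch script above: emit one new document per
    embedded comment, then flag the source document as exported."""
    for comment in post.get("Comments", []):
        created.append({
            "Id": "comments/",               # server would assign the suffix
            "Title": comment["Title"],
            "User": comment["User"]["Name"],
            "By": comment["User"]["Id"],
        })
    post["Exported"] = True
    return post

created = []
post = {"Comments": [
    {"Title": "Nice", "User": {"Name": "ann", "Id": "users/1"}},
    {"Title": "+1",   "User": {"Name": "bob", "Id": "users/2"}},
]}
extract_comments(post, created)

assert len(created) == 2 and created[0]["User"] == "ann"
assert post["Exported"] is True      # won't match Exported:false again
```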

time to read 2 min | 216 words

So I was diagnosing a customer problem, which required me to write the following:

[image: the map/reduce index used for the diagnosis]

This is working on a data set of about half a million records.

I took a peek at the stats and I saw this:

[image: the indexing statistics]

You can ignore everything before 03:23; that is the previous index run. I reset it to make sure that I have a clean test.

What you can see is that we start out with mapping & reducing values, and initially this is quite expensive. But very quickly we recognize that we are reducing a single value, and we switch strategies to a more efficient method, and suddenly we have very little cost involved here. In fact, you can see that the entire process took about 3 minutes from start to finish, and very quickly we got to the point where our bottleneck was actually the maps pushing data our way.

That is pretty cool.

time to read 1 min | 127 words

I was asked to comment on the current state of Rhino Mocks. The current codebase is located here: https://github.com/hibernating-rhinos/rhino-mocks

The last commit was 2 years ago. And I am no longer actively / passively monitoring the mailing list.

From my perspective, Rhino Mocks is done. Done in the sense that I don’t have any interest in extending it, done in the sense that I don’t really use mocking any longer.

If there is anyone in the community that wants to step in and take charge as the Rhino Mocks project leader, I would love that. Failing that, the code is there, it works quite nicely, but that is all I am going to be doing with this for the time being and the foreseeable future.
