time to read 5 min | 860 words

RavenDB’s core philosophy is that It Just Works, which means that we try very hard to get things right. The flip side is that we also try to make it hard to do the wrong thing. Basically, we want to push you hard into the pit of success.

Part of that approach is what we call the governors: a set of features that will detect and abort known bad behavioral patterns. I have already talked about Unbounded Result Sets, and I recently ran into this post, which shows how nasty a problem that can be, and how invisible.

Another governor we have in place is the session’s maximum request limit. A session is meant to be a scope: it has a very short duration and is typically used for a single request / processing a single message, etc. It is supposed to live as long as the business transaction. Because the session is scoped, we can reason that a single session that is making a lot of database operations is probably doing something pretty bad.

For example, it might be calling the database in a loop. Those kinds of issues can be truly insidious. Let us look at the following code (taken from here):

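The original snippet was shown as an image; the gist of it is something along these lines (a sketch, with made up model and property names):

    using (var session = store.OpenSession())
    {
        var orders = session.Query<Order>()
            .Where(o => o.Company == companyId)
            .ToList();                                   // request #1

        foreach (var order in orders)
        {
            // Each Load() of a document not already in the session
            // is another round trip to the server.
            var employee = session.Load<Employee>(order.Employee);
            Console.WriteLine(employee.Name);
        }
    }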

This kind of thing is a silent performance killer. No one is likely to see that this is happening, and it will silently increase the number of database operations that your application makes, leading to increased DB load, higher page load times and all sorts of problems associated with it.

In one particular case, I saw a single page load generate 17,000 queries to the database. The software in question grew over time, and people assumed that this was just what it took to run it. Their database server was a true monster (this was about a decade ago), with dedicated RAM disks, a high CPU count and a truly ridiculous amount of memory and storage.

This kind of beast can do 500K IOPS (I’m drooling just thinking about it), but it is damn expensive. Just to put things in perspective, I spent several weeks at that company working on this particular problem, and the cost of those weeks of work didn’t even cover the cost of the drive on that machine.

And on that monster, we were seeing page load times in the tens of seconds, and extremely high system load. I was able to bring it down to about 70 queries per page load, and their database server has pretty much idled ever since (IIRC, they turned that machine into a VM host for the rest of their software, actually).

This is something that can bite.

To avoid that, we have the maximum number of requests per session, which will abort excessive database chatter. This has two important effects:

  • It follows the “better to let one bad request die rather than take down the entire application” principle.
  • It puts a budget on the number of calls that you can make.

Now, that budget is actually really interesting. Because we have it, we need to think about how we can reduce the number of database calls needed to process a request. That led to a whole bunch of features around that: lazy requests, includes and transformers, to name just a few.
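Here is roughly how the first two look in the client API (a sketch; the entity names are made up, and the exact method names may vary between client versions):

    using (var session = store.OpenSession())
    {
        // Includes: fetch the order and its related customer in one request.
        var order = session.Include<Order>(o => o.CustomerId)
                           .Load("orders/1");
        var customer = session.Load<Customer>(order.CustomerId); // already in the session, no extra call

        // Lazy: register several operations, then execute them in a single round trip.
        var lazyOrder = session.Advanced.Lazily.Load<Order>("orders/2");
        var lazyCustomer = session.Advanced.Lazily.Load<Customer>("customers/1");
        session.Advanced.Eagerly.ExecuteAllPendingLazyOperations(); // one request for both
    }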

That had a positive unintended consequence. RavenDB is fast, really fast, but it is also typically deployed as a network database; that means that each database call actually goes over the network, and we all remember our fallacies, right?

[Image: the fallacies of distributed computing, starting with “the network is reliable”, “latency is zero” and “bandwidth is infinite”]

In our profiling, we found that most often, the real cost in a RavenDB application was the back & forth chatter with the database. Reducing the number of requests we make to the server has an immediate benefit. And RavenDB allows you to do that by pipelining requests with Lazy, predicting requests with Includes or running the whole thing on the server side with Transformers.

And, like all governors, you can control it: RavenDB allows you to decide what the limit should be (on a particular session or globally), based on your actual needs and environment.
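In code, that looks something like this (a sketch; the property names follow the client conventions, so check your client version):

    // Global default, for every session opened from this store:
    store.Conventions.MaxNumberOfRequestsPerSession = 50;

    using (var session = store.OpenSession())
    {
        // Or override it for one particular session:
        session.Advanced.MaxNumberOfRequestsPerSession = 100;
    }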

time to read 3 min | 492 words

RavenDB doesn’t provide any way for queries to do table scans*.

* That isn’t actually true; we have Data Exploration, which does just that. But we don’t provide an explicit API for it, and it is more of a DBA driven feature (“I wanna get this report with a minimum of fuss, without regard to how much it is going to cost me”) than an API that is exposed.

What this means is that the cost of query operations in RavenDB is always going to be O(logN), instead of O(N). How does this relate to the topic of RavenDB retrospectives?

One of the things that I kept seeing over and over as a database consultant was that databases are complex, and that it is easy to write a query that works perfectly fine for a period of time, then falls over completely as the size of the data goes over a certain threshold. In particular, queries that use table scans are especially vulnerable to this issue.

One of the design goals for RavenDB was to avoid that, completely. We did it by simply forbidding any query that doesn’t have an index. Initially, that was a pretty annoying requirement, because every time that you needed a new query, you had to go ahead and create an index. But early on we got the Auto Indexes feature.

Basically, it means that you can query RavenDB without specifying which index you want to use, at which point the query optimizer will inspect the query and decide which index can serve it. The most interesting point here is that if there isn’t an index that can serve this query, the query optimizer is going to create one on the fly. See the previous post about BASE indexes and how we can afford to do that.

The fun part here is that the query optimizer is actually learning over time, and it will shape its indexes to best fit the kind of queries you are doing. It also makes RavenDB much more robust against New Version Degradation effects. NVD is what happens when you push a new version out which has slightly different queries, making previously used indexes ineffective and forcing all your queries to become full table scans. Here is an example of the kind of subtle issues that this can cause. With RavenDB, when you use auto indexes (in other words, when you don’t explicitly state which index to use), the query optimizer will take care of that, and it will create all the appropriate indexes (and retire the unused ones) for you.
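For example (a sketch, with a hypothetical model), a dynamic query looks like this:

    // No index is specified, so this is a dynamic query:
    var employees = session.Query<Employee>()
        .Where(e => e.LastName == "King")
        .ToList();

    // The query optimizer will serve it from an existing index if one fits,
    // or create something like Auto/Employees/ByLastName on the fly.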

This in particular is a feature that I’m really proud of; it requires very little from the user, and it gets the Right Thing Done.

time to read 6 min | 1170 words

Today I started out like most recent days, I was working on improving performance and running benchmarks. I made a small change in how we handle file allocation and mapping inside Voron. This is the kind of change that should have no observable effects. And indeed, except for making us run faster, everything worked.

Except that later today I merged some stuff from a colleague and suddenly I started getting invalid memory accesses. After quickly blaming my colleague for the issue, we eventually figured out that it was my change that caused it.

Unfortunately, the problem was a lot more serious than it immediately appeared. It wasn’t just that I needed to fix my code, what was happening there was that I made a certain situation (a new file mapping, and thus, exercising the cleanup routine) a lot more frequent. Which is all well and good, except that this is something that will happen routinely in Voron anyway, it just means that this is now much more likely.

And the problem is that we couldn’t for the life of us figure out why it was failing. Oh, we quickly figured out that we are accessing memory that has been unmapped, but how? The Voron codebase is really careful about such things, and we have quite enough production usage to know that this doesn’t really happen. But again, it might just be a sleeper…

The real problem wasn’t actually with the access violation, that was pretty obvious and would have come to our immediate attention. The problem was that the error looked like the Page Translation Table had a race condition. In this case, because we are much more eager about cleanup, it was obvious that we are accessing old information, but without this to trigger our attention, the fear was that we are actually racy, and that the Page Translation Table will serve incorrect information.

That means that Voron would violate its consistency rule, we’ll effectively be returning random garbage to the user and… Bad Things to Follow.

At various points during the day, we had five different people working on it across three continents, because it is that kind of bug. And we couldn’t figure it out. We traced the code that did that every which way. It is old code, that has been worked upon repeatedly, and it has been stable for years. And none of us could figure out what was going on. Theories ranging from cosmic rays to the wrath of Murphy have been thrown out.

Something was very rotten in Voron. Okay, after all of this exposition, let me explain what was going on. We started out with a Page Table that looked like this:

[Image: the Page Table, e.g. Transaction #3: Page #0 → Page #238 on scratch #1; Transaction #4: Page #0 → Page #482 on scratch #2]

So the first number is the page number, and the second is the page number inside the scratch file (#1 or #2, above).

Basically, this means that when Transaction #3 asks for Page #0, it will actually get Page #238 from scratch #1. And when Transaction #4 asks for the same page, it will get it from Page #482 on scratch #2. If you got lost with the numbers, don’t worry, we did too.
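To make the mechanics a bit more concrete, here is a toy model of such a table. This is emphatically not the Voron code, just the idea: each page may have several versions living in the scratch files, and a transaction must see the newest version that existed when it started.

    using System.Collections.Generic;
    using System.Linq;

    class PagePosition
    {
        public long TransactionId;     // the transaction that wrote this version
        public int ScratchNumber;      // which scratch file holds it
        public long PositionInScratch; // where in that scratch file
    }

    class PageTranslationTable
    {
        private readonly Dictionary<long, List<PagePosition>> _pages =
            new Dictionary<long, List<PagePosition>>();

        public PagePosition LookUp(long txId, long pageNumber)
        {
            List<PagePosition> versions;
            if (_pages.TryGetValue(pageNumber, out versions) == false)
                return null; // not in any scratch file, read it from the data file

            // Each transaction sees the newest version written before it started.
            return versions.LastOrDefault(v => v.TransactionId <= txId);
        }
    }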

The problem was… the failing issue was in Transaction #5, and the problem was that it was accessing Page #412 on scratch #1. And due to my change, we had actually closed scratch #1. The problem is that we couldn’t figure out how this thing could actually happen. Crazy stuff. We tried reproducing this in all sorts of crazy ways, but it would only fail on the most trivial of tests, and very unpredictably. And then we finally figured it out. Basically by tracing everything, putting locks on everything that moved or looked like it would move, plain old head-against-the-wall, rinse & repeat.

Eventually we focused on what happened around the location of the error. It always happened during a query, that was very consistent, when it happened. And finally we figured it out. We now use Lucene indexes stored inside Voron, and Lucene has some funky ideas about how it should be able to access the data. So we have to put a Voron transaction around the whole thing. And we have to flow the same transaction across multiple Lucene index input instances. So we put the transaction inside a thread local variable. And the query method is async.

I think that you can figure out what happened from here, right? When the async machinery jumped us between threads, we would end up with a totally foreign transaction, our old transaction would be gone, and all of the carefully thought out premises that we had for transaction scope went out the window. Much cursing was to be heard.

So we did a quick fix and changed the ThreadLocal&lt;Transaction&gt; to an AsyncLocal&lt;Transaction&gt;, so it would flow through the async calls. And then we ran the tests, confident it would solve the issue. But it didn’t; in fact, we got the exact same error, in the exact same place, and we went back to head butting the wall to see who is smarter.
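If you want to see the difference in isolation, here is a minimal sketch (not the actual Voron code):

    using System.Threading;
    using System.Threading.Tasks;

    class TxContext
    {
        // Bound to the physical thread: lost whenever the async machinery
        // resumes us on a different thread pool thread.
        static readonly ThreadLocal<string> CurrentTx = new ThreadLocal<string>();

        // Flows with the async execution context, across awaits.
        static readonly AsyncLocal<string> CurrentTxAsync = new AsyncLocal<string>();

        public static async Task QueryAsync()
        {
            CurrentTx.Value = "tx-1";
            CurrentTxAsync.Value = "tx-1";

            await Task.Delay(10); // may resume on another thread

            // CurrentTx.Value may now be null, or worse, another request's
            // transaction; CurrentTxAsync.Value is still "tx-1".
        }
    }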

And then I realized that we were doing something else there. Lucene has the notion of cloning an input, which allows for multi-threaded usage of an input. When we do that, we check if we are in the old transaction scope and can reuse previous work, or if we need to do the initialization for a new transaction. The problem was that we were doing this check by id.

Now, two transactions with the same id will always show the same data, period. However, how they do it is very different. Let us take a look at the Page Table diagram above. It shows that Page #412 is located on scratch #2 in position #8327. Now, we have a flushing background process that will take the data from the scratch file and move it to the data file. So the new Page Table will look like this:

[Image: the Page Table after the flush; Page #412 now points to the data file rather than scratch #2, position #8327]

Note that because the data in the data file at position #412 and on scratch #2 at position #8327 is the same, that doesn’t actually matter. Except when you have started reading from scratch #2 in one transaction, then were bumped to another thread, and now are trying to keep on reading from the same place, only to end up blowing up entirely.

Once we have fixed this problem as well, all was well with the world. The sky wasn’t falling, and I was writing a blog post at midnight for relaxation.

time to read 7 min | 1257 words

RavenDB was designed from the get go with an ACID document store and BASE indexes. ACID stands for Atomic, Consistent, Isolated, Durable, and BASE stands for Basically Available, Soft state, Eventually consistent.

That design was conceived from twin competing needs. First, and obviously, a database should never lose data. Second, we want to ensure that the system remains responsive even under load. It is quite common to have spikes in production traffic, and we wanted to be able to handle them with aplomb.

In particular, the kind of promises that are made by RavenDB queries allow us to perform quite a few performance optimizations. In databases that require all indexes to be up to date on transaction commit, you’ll find that there is a very high cost to adding indexes to the system, because each additional index means additional work on every write. It also makes things such as aggregating indexes (map/reduce, in RavenDB terms) a lot harder to build.

By having BASE indexes, we gain the ability to batch multiple writes into a single index update operation. It also allows us to defer writing the indexes to disk, avoiding costly I/O operations. But most importantly, by changing the kind of promise that we give to users, we are able to avoid a lot of locks, complexity and hardship inside RavenDB. This may seem like a small thing, but it is actually quite important. Take a look at this study:

[Image: excerpt from a study on the overhead of locking in database systems]

In fact, there are a lot of studies on the overhead of locking in database systems, and that has been a hot research topic for many years. By choosing a different architecture, we can avoid a lot of those costs and complexities.

So far, that was the explanation from the point of view of the database creator. What about the users?

Here the tradeoff is more nuanced. On the one hand, there is a certain level of complexity that people have to deal with: the notion that queries on just-inserted data might not include it (stale queries). On the other hand, it means that queries are consistently faster and we can handle spikes in traffic and load much more easily and consistently.

But it is a mental model that can be hard to follow, even when you are familiar with it. Probably the most common issue with RavenDB’s BASE indexes is the case of Post / Redirect / Get. Let us look at how this may play out:
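Something like this, the problematic variant sketched with hypothetical MVC-style handlers (the store field and the model are assumptions for the illustration):

    public class OrdersController : Controller
    {
        private readonly IDocumentStore store; // assumed to be initialized elsewhere

        public ActionResult Create(Order order)
        {
            using (var session = store.OpenSession())
            {
                session.Store(order);
                session.SaveChanges(); // the document itself is safely stored (ACID)
            }
            return RedirectToAction("List", new { customerId = order.CustomerId });
        }

        public ActionResult List(string customerId)
        {
            using (var session = store.OpenSession())
            {
                // This goes through a BASE index, which may not have caught up
                // with the write from the previous request yet.
                var orders = session.Query<Order>()
                    .Where(o => o.CustomerId == customerId)
                    .ToList();
                return View(orders);
            }
        }
    }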

In here, we actually have two requests, one that adds a new order to the system, and another that fetches the details. If you redirected to the new order page, everything is going to work as expected, and you won’t notice anything even if the indexes are stale at the time of the request. But a pretty common scenario is to add the new order, then go and look at the list of orders for this customer, and if the index didn’t have a chance to update between those two requests (which typically happens very quickly), the customer will not see the new order.

That particular scenario is responsible for the vast majority of the pain we have seen from our users around BASE indexes.

Now, one of the great things about BASE indexes is that the user gets to choose whether they want to wait for up to date results or take whatever is there right now. And we have had mechanisms to control this at a very granular level (including options for personal consistency control, so different customers will have different waits depending on their own previous behavior). But we have found that this puts a lot of responsibility on the developer to control the flow of their users through their applications.

So in RavenDB 3.5 we have changed things a bit. Now, instead of processing the write requests as soon as possible, you can ask the server to wait until the relevant indexes have processed the changes:

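The call itself looks roughly like this (a sketch; the exact signature may differ between client versions):

    using (var session = store.OpenSession())
    {
        // Ask SaveChanges to block until the relevant indexes have
        // caught up with this session's writes:
        session.Advanced.WaitForIndexesAfterSaveChanges();

        session.Store(order);
        session.SaveChanges();
    }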

In other words, when you call SaveChanges, it will wait until the indexes have been updated, so when you return from the call, you can be certain that the results of any future queries will include all the changes in that transaction. This moves the responsibility to the write side and makes such scenarios much easier to handle.

Given all of that, and our experience with RavenDB over the past 8 years or so, we spiked what it would look like with ACID indexes, at least for certain things. The problem is that this pretty much takes out of the equation a lot of the power and flexibility that we get from Lucene (more on why you can’t do that in Lucene in a bit) and forces us to offer what are essentially B+Tree indexes. Those are so limited that we would have to offer:

  • B+Tree indexes – ACID (simple property / range queries). With different indexes needed for different queries and ordering options.
  • Lucene indexes – BASE, full text, spatial, facets, etc queries. Much more flexible and easy to use.
  • Map/reduce indexes – BASE (because you aren’t going to run the full map/reduce during the original transaction).

The problem is that we would then have the continuous burden of explaining when to use which index type, and how to deal with the different limitations. It would also make things much more complex if you have a query that could use multiple indexes, and there are problems associated with creating new ACID indexes on live systems. So it would generate a lot of confusion and complexity for users, for a fairly small benefit that we can already address with the “wait on save” option.

As for why we can’t do it all via Lucene anyway, the problem is that this wouldn’t be sustainable. Lucene isn’t really meant for individual operations; it shines when you push large amounts of data through it. It also doesn’t really have the facilities to be transactional; we have actually solved that particular problem in RavenDB 4.0, but it was neither pretty nor easy, and it doesn’t alleviate the issue of “we do best in large batches”. RavenDB’s BASE indexes are actually designed to take advantage of that particular aspect. Because under load, we’ll process bigger batches and reap the performance benefits that they bring.

BASE indexes also make for much simpler operations. You can define a new index without fearing locking the database, and it enables scenarios such as side by side indexing to update index definitions without impacting the running system.

Finally, a truly massive benefit of BASE indexes is that they allow us to change the following statement: more indexes mean faster reads but slower writes; fewer indexes mean slower reads but faster writes. By moving the actual indexing work to a background task, we let the writes go through as fast as they possibly can.

Indexes still have a cost, and the more indexes you have, the higher the cost (we still have to do some work here). But in the vast majority of cases, we can squeeze this kind of work in between writes, at times when the database would otherwise be idling.

What that means is that you can have more indexes at the same cost, and that your queries are going to be using those indexes and are going to be fast.


time to read 3 min | 536 words

There are times when you write clean, easy to understand code, and there are times when you see 50% of your performance going into DateTime parsing, at which point you’ll need to throw the nice code out the window, put on some protective gear and go hunt down that performance hit.

Note to the readers: this isn’t something that I recommend you do unless you have considered it carefully, you have gathered evidence in the form of actual profiler results that show that this is justified, and you have covered it with good enough tests. The only reason I was able to do anything here is that I know so much about the situation: the dates are strictly formatted, the values are stored as UTF8 and there are no cultures to consider.

With that said, it means that we are back to C-style number parsing and processing:
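The actual code was shown as an image; here is a sketch of the approach, assuming the fixed "yyyy-MM-ddTHH:mm:ss" prefix of the format (the real routine also handles the fractional seconds):

    using System;

    static class FixedDateParser
    {
        public static bool TryParse(byte[] buf, out DateTime dt)
        {
            dt = default(DateTime);
            // Upfront validation of the fixed separators.
            if (buf.Length < 19 || buf[4] != (byte)'-' || buf[7] != (byte)'-' ||
                buf[10] != (byte)'T' || buf[13] != (byte)':' || buf[16] != (byte)':')
                return false;

            int year  = Digits(buf, 0, 4);
            int month = Digits(buf, 5, 2);
            int day   = Digits(buf, 8, 2);
            int hour  = Digits(buf, 11, 2);
            int min   = Digits(buf, 14, 2);
            int sec   = Digits(buf, 17, 2);
            if ((year | month | day | hour | min | sec) < 0 ||    // any non-digit?
                month < 1 || month > 12 || day < 1 || day > 31 ||
                hour > 23 || min > 59 || sec > 59)
                return false;

            // Note: the DateTime constructor still validates days-per-month.
            dt = new DateTime(year, month, day, hour, min, sec, DateTimeKind.Utc);
            return true;
        }

        private static int Digits(byte[] buf, int start, int count)
        {
            int val = 0;
            for (int i = start; i < start + count; i++)
            {
                int d = buf[i] - '0';
                if (d < 0 || d > 9) return -1; // not a digit
                val = val * 10 + d;
            }
            return val;
        }
    }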

Note that the code is pretty strange: we do upfront validation of the string, then parse all those numbers, then plug it all together.

The tests we ran compare this routine against the standard CLR parsing.

Note that I’ve actually realized that I’ve been forcing the standard CLR parsing to go through a conversion from byte array to string on each call. This is actually what we need to do in RavenDB to support this scenario, but I decided to test it without the allocations as well.

All the timings here are in nanoseconds.

[Image: benchmark results for the standard CLR parsing]

Note that the StdDev for those tests is around 70 ns, and this usually takes about 2,400 ns to run.

Without allocations, things are better, but not by much. StdDev goes down to 50 ns, and the performance is around 2,340 ns, so there is a small gain from not doing the allocations.

Here are the final results of the three methods:

[Image: final benchmark results for the three methods]

Note that my method is about as fast as the StdDev of the alternatives, with an average of 90 ns or so and a StdDev of 4 ns. Surprisingly, LegacyJit on X64 was the fastest of them all, coming in at almost 60% of the LegacyJit on X86, and 20% faster than RyuJit on X64. Not sure why, and dumping the assembly at this point is quibbling, honestly. Our perf cost just went down from 2,400 ns to 90 ns. In other words, we are now able to do the same work at 3.66% of the cost. Figuring out how to push it further down to 2.95% seems like an insult to the 96% perf that we already gained.

And anyway, that does leave us with some spare performance on the table if this ever becomes a hotspot again*.

* Actually, the guys on the performance team are going to read this post, and I’m sure they won’t be able to resist improving it further. :-)

time to read 2 min | 344 words

This is a small part from a larger benchmark that we ran:

[Image: profiler output showing the time spent in DateTime parsing]

The index in question is using a DateTime field, and as you can see, quite a lot of time is spent in translating that. 50% of our time, in fact. That is… not so nice.

The question now is why we do it. Well, let us look at the code:

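The code was shown as an image; its shape was roughly this (a sketch, not the exact RavenDB source; the "o" round trip format stands in for the actual format used):

    // assuming: using System.Globalization; using System.Text;
    static bool TryIndexAsDate(byte[] buffer, int size, out DateTime dt)
    {
        var str = Encoding.UTF8.GetString(buffer, 0, size); // an allocation per value...
        return DateTime.TryParseExact(str, "o", CultureInfo.InvariantCulture,
                                      DateTimeStyles.RoundtripKind, out dt);
    }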

Here we can see several things. First, there is the small issue of us allocating the string to check if it is a date, but that isn’t where the money is. That is located in the TryParseExact.

This method is actually quite impressive. Given a pattern, it parses the pattern, then it parses the provided string. And if we weren’t calling it hundreds of thousands of times, I’m sure that it wouldn’t be an issue. But we are, so we are left with writing our own routine to do this in a hard coded manner.

I built the following benchmark to test this out:

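The benchmark itself was an image; a BenchmarkDotNet version of it would look something like this (which is also what produces the Median / StdDev table below):

    using System;
    using System.Globalization;
    using BenchmarkDotNet.Attributes;

    public class DateTimeParsing
    {
        private readonly string _input = DateTime.UtcNow.ToString("o");

        [Benchmark]
        public DateTime ParseDateTime()
        {
            DateTime dt;
            DateTime.TryParseExact(_input, "o", CultureInfo.InvariantCulture,
                                   DateTimeStyles.RoundtripKind, out dt);
            return dt;
        }
    }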

As you can see, this is pretty much identical to our code, and should tell us how good we are. Here are the benchmark results:

Method        | Platform | Jit       | Toolchain | Runtime | Median        | StdDev
------------- | -------- | --------- | --------- | ------- | ------------- | -----------
ParseDateTime | X64      | RyuJit    | Host      | Host    | 2,458.2915 ns | 102.7071 ns
ParseDateTime | X86      | Host      | Clr       | Clr     | 2,506.7353 ns | 142.7946 ns
ParseDateTime | X86      | LegacyJit | Host      | Host    | 2,443.4806 ns | 51.4903 ns

In my next post, I’ll show what I came up with that can beat this.

time to read 4 min | 617 words

System administrators like to see graphs with server utilizations sitting at the very low end of the scale. That means that they don’t need to worry about spikes, capacity or anything much, they are way over provisioned, and that means less waking up at night.

That works very well, until you hit a real spike, hit some sort of limit, and then have to scramble to upgrade your system while under fire, so to speak. [I have plenty of issues with the production team behavior as described in this post, but that isn’t the topic for this post.]

So one of the things that we regularly test is a system being asked to do something that is beyond its limits. For example, having 6 indexes running at the same time, indexing at different speeds, over a dataset that is 57GB in size.

The idea is that we will force the system to do a lot of work that it could typically avoid by having everything in memory. Instead, we’ll be forced to page data, and we need to see how we behave under such a scenario.

Here is what this looks like from the global system behavior.

[Image: global system metrics, with memory usage close to the ceiling]

If you show this to most admins, they will feel faint. It usually means Something Is About To Break Badly.

But things are slightly better when we look at the details:

[Image: per-process details, about 67 GB of mapped files, 8.5 GB of allocated RAM, roughly 4.5 GB actively used]

So what do we have here? We have a process that (at the time of running) has mapped about 67 GB of files and allocated 8.5 GB of RAM. However, only about 4.5 GB of that is actively used; the rest of the working set is actually the memory mapped files. That leads to an interesting observation: if most of your work is local and transient (you scan through sections of the file, like we do during indexing), the operating system will load those pages from disk and keep them around until there is memory pressure, at which point it will look at all of those nice pages that are just sitting there, unmodified and with a source on disk.

That means that the operating system can immediately discard them without having to page them out, which makes them very cheap. Oh, we’ll still need to load the data from disk into them, but we’ll have to do that anyway, since we can’t fit the entire dataset into memory.

So that means that our allocation strategy basically goes something like this:

  • Ignore the actual free space the operating system reports.
  • Instead, take into account the private working set and compare it to the actual working set.

The private working set is what goes into the page file; it mostly consists of managed memory and whatever unmanaged allocations we have to do during the indexing. So by comparing the two, we can tell how much of the used memory is actually memory mapped files. We are careful to ensure that we leave about 25% of the system memory to the memory mapped files (otherwise we’ll do a lot of paging), but that still gives us leeway to use quite a lot of memory to speed things up, and we can negotiate between the threads to see who is faster (and thus deserves more memory).
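In pseudo-C#, the accounting looks something like this (names and the 25% constant are illustrative only; GetPrivateWorkingSet and GetTotalPhysicalMemory are stand-ins for the platform-specific queries):

    var process = System.Diagnostics.Process.GetCurrentProcess();
    long workingSet = process.WorkingSet64;              // everything resident, mapped files included
    long privateWorkingSet = GetPrivateWorkingSet();     // hypothetical helper: what would hit the page file
    long totalPhysicalMemory = GetTotalPhysicalMemory(); // hypothetical helper

    // The difference is (mostly) the memory mapped files: cheap, discardable pages.
    long mappedFileMemory = workingSet - privateWorkingSet;

    // Reserve ~25% of physical memory for the mapped files; what remains after
    // the private working set is the budget we can hand to the indexing threads.
    long reservedForMappedFiles = totalPhysicalMemory / 4;
    long availableForIndexing = totalPhysicalMemory - reservedForMappedFiles - privateWorkingSet;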

time to read 5 min | 841 words

In my previous post, I mentioned that removing artificial batch limits caused us to double our performance. But what are those artificial batch limits?

Well, anything that doesn’t involve actual system resources. For example, limiting batch size by time or by document count is artificial. We used to have to do that as a correlation to the amount of managed memory we use, and because it allowed us to parallelize I/O and computation work. Now, each index is actually working on its own, so if one index is stalling because it needs to fetch data, other indexes will use the available cores, and everyone will be happy.

Effectively, an indexing batch stopped being a global database event that we had to fetch data for specifically and became something much smaller. That fact alone gave us leeway to remove drastic amounts of code to handle things like prefetching, I/O / memory / time / CPU balancing and a whole bunch of really crazy stuff that we had to do.

So all of that went away, and we learned that anything that artificially reduces a batch size is bad, and that we should make the batch size as big as possible to benefit from economy of scale effects.

But wait, what about non-artificial limits? For example, running an indexing batch takes some memory. We can now track it much better, and most of it is in unmanaged memory anyway, so we don’t worry about keeping it around for a long time. We do worry about running out of it, though.

Imagine we have six indexes all running at the same time, each trying to use as much of the system resources as it possibly can. Of course, if we actually let them do that, they would allocate enough memory to push us into the page file, resulting in all our beautiful code spending all its time just paging in and out from disk, and our performance looking like it was hit in the face repeatedly with the hard disk needle.

So we have a budget. In fact, we have a pretty complete heuristics system in place.

  • Start by giving each index 16 MB to run.
  • Whenever the index exceeds that budget, allow it to complete the current operation (typically a single document, so pretty small).
  • Check if there is enough memory available* that we can still use, and if so, increase the budget by another 16 MB.

* Enough memory available is actually a really complex idea, enough so that I’ll dedicate the next post to it.

So that leads to all indexes competing with one another to get more memory, until we hit the predefined limit (which is supposed to leave us memory to do other work as well). At that point, we hit a real limit, and we stop the batch, complete our work and carry on. After the batch is completed, we could release all of that memory and start from scratch, but that would probably be a waste; we already know that we haven’t gone too badly over budget, so why release all that precious memory just to immediately require it again?
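Put together, the per-index loop looks roughly like this (a sketch; the helper names are invented for the illustration):

    const long BudgetStep = 16 * 1024 * 1024; // start with, and grow by, 16 MB
    long budget = BudgetStep;

    while (TryIndexNextDocument()) // complete the current operation, typically small
    {
        if (AllocatedInCurrentBatch() <= budget)
            continue;

        if (EnoughMemoryAvailable())   // the complex part, see the next post
            budget += BudgetStep;      // grow the budget and keep going
        else
            break;                     // hit a real limit: complete the batch
    }
    // Keep the memory around for the next batch instead of releasing it.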

So that is what we did, and we ran our benchmarks again. And the performance was not nice to us.

It took a while to figure out what happened, but you can see this on the following graph.

[Image: per-index memory allocations over time, climbing until they hit the configured ceiling]

We started allocating memory, and as you can see, we have some indexes that have high memory requirements. At some point, we hit the memory ceiling we specified, and started completing batches so we wouldn’t use too much memory.

All well and good. Except that the act of completing the batch will also (sometimes) release memory. This is typically done because we have found the ideal sizes we need for processing, so we discard everything that is too small. But the allocator is free to release memory if it thinks that this is the best for the system.

Unfortunately, we didn’t adjust the budget in this case. Consider the case of indexes C & F, both of which released significant amounts of memory after the batch was completed. Index B, which was forced to make do with whatever memory it managed to grab, suddenly finds itself in a position to grab more memory, and it will slowly increase its budget and allocations.

At the same time, indexes C & F are also going to allocate more memory; after all, they are well within their budget, since we didn’t account for the released memory that was gobbled up by index B. The fact that this starts happening only about 45 minutes into the run, and actually shows up as higher memory utilization about 4 hours after that, is really quite annoying when you need to debug it.

time to read 4 min | 739 words

We’ve been running all sorts of benchmarks on RavenDB, ranging from micro optimizations of a single heavily used routine (re-ordering the instructions so the CPU can execute them in parallel) to large scale load testing and data processing.

This tale is about the latter. We defined 6 indexes over a dataset of over 18 million documents with a total size of 57GB. We then ran it on a system that had 16GB of RAM (and typically had a lot less actually available).

When we started, we very quickly ran out of memory; in fact, our allocations exceeded the database size (so we allocated over 60 GB) before we decided to focus on fixing that. The problem was the scope of the batch: it was too long, and we didn’t reuse the memory inside the same batch. Once that was fixed, we were able to run the performance test successfully.

Total time to run it? Over 9 hours. Granted, this is talking about 6 indexes, each needing to go over the entire dataset, so we expected it to take a while, and it is much faster than in previous versions, but that is still far too much.

I should probably explain that the whole point of doing something like this is to see the interference effects. What happens when you have a bunch of competing indexes over the same resources (CPU, memory, disk)?

Turns out, there is a lot going on there, and you can’t really get good profiling results from this kind of run (to start with, the profiling overhead would push this into a multi-day effort). And while we captured some shorter profiling runs, we weren’t really able to pinpoint the blame.

So we added the tooling we needed to figure it out, and then put that in the studio. The end result was… not so good.

[Screenshot: the studio choking while trying to display the debug information for the benchmark]

Yes, that is the studio trying to display the debug information for that benchmark. It… did not go so well for us. In other words, before we could fix the benchmark, we had to optimize the code that would tell us where we were spending all that time in the benchmark. :-)

As it turned out, the problem was that our code was optimizing for problems we no longer had. Basically, in RavenDB 3.x we have to take into account our memory usage, and pretty much all of it is managed memory. That leads to some interesting problems; to wit, we don’t know how much memory we use. The fact that a document is such and such a size when serialized to disk means exactly squat with regards to how much it is actually going to take in memory. And because too many allocations lead to higher GC costs down the road, we put limits on the size of the indexing batches we’ll process.

Effectively, we will take a certain number of documents to process in a batch, and then complete the batch. All the memory we used in the batch will go away, and hopefully we didn’t push too much of it into the higher generations. But the real costs that we have in RavenDB 4.x are very different. To start with, we use a lot fewer managed operations, and we use Voron for pretty much everything. That means that our costs have now shifted: instead of worrying about releasing the memory as quickly as possible, we need to worry about reducing the number of times we go to disk.

As it turns out, artificially reducing the batch size results in us processing more batches, which requires us to hit the disk a lot more. The same logic applies to RavenDB 3.x (and we have users who have configured RavenDB to have very long batches for exactly that reason), but that comes at a GC cost that simply does not exist in RavenDB 4.0.

The immediate solution was to remove all the batch limits and see what would happen. Overall performance doubled. We were able to process the same amount of information in about half the time. And that is before we did a deep dive with a profiler to seek out inefficiencies.
