It is the little things that trip you: Reducing RavenDB’s Map/Reduce cost


After finally managing to get to Inbox Zero, I had the chance to tackle some problems that were on the back burner. One of the more acute ones was RavenDB performance for map/reduce indexes.

Standard indexes are actually very fast, especially since I just gave them an additional boost, but map/reduce indexes had two problems:

  • The reduce operation is currently single threaded (for all indexes).
  • The reduce operation blocks standard indexing.

In order to avoid that, we split the work so that reduce runs on its own thread, separate from the standard indexing operations (a rough sketch of the idea is below). With that done, I started to take a look at the actual cost of map/reduce operations.
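
The split itself is conceptually simple: indexing enqueues reduce work, and a dedicated thread drains the queue, so indexing never waits on a reduce. Here is a minimal sketch of that shape, using a hypothetical ReduceWorker class rather than RavenDB's actual implementation:

using System;
using System.Collections.Concurrent;
using System.Threading;

public class ReduceWorker : IDisposable
{
    // Hypothetical illustration only; RavenDB's real code is organized differently.
    private readonly BlockingCollection<Action> reduceTasks = new BlockingCollection<Action>();
    private readonly Thread reduceThread;

    public ReduceWorker()
    {
        reduceThread = new Thread(() =>
        {
            // Blocks until work arrives, or until CompleteAdding is called.
            foreach (var reduceTask in reduceTasks.GetConsumingEnumerable())
                reduceTask();
        })
        {
            IsBackground = true,
            Name = "Reduce"
        };
        reduceThread.Start();
    }

    // Called from the indexing side; returns immediately, so standard
    // indexing is never blocked by a pending reduce.
    public void Enqueue(Action reduceTask)
    {
        reduceTasks.Add(reduceTask);
    }

    public void Dispose()
    {
        reduceTasks.CompleteAdding();
        reduceThread.Join();
    }
}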

It quickly became apparent that while the map part was pretty fast, it was the reduce operation that was killing us.

After some narrowing down, I figured out that this was the code at fault:

public override void Execute(WorkContext context)
{
    if (ReduceKeys.Length == 0)
        return;

    var viewGenerator = context.IndexDefinitionStorage.GetViewGenerator(Index);
    if (viewGenerator == null)
        return; // deleted view?

    context.TransactionalStorage.Batch(actions =>
    {
        IEnumerable<object> mappedResults = null;
        foreach (var reduceKey in ReduceKeys)
        {
            IEnumerable<object> enumerable = actions.MappedResults
                .GetMappedResults(Index, reduceKey, MapReduceIndex.ComputeHash(Index, reduceKey))
                .Select(JsonToExpando.Convert);

            if (mappedResults == null)
                mappedResults = enumerable;
            else
                mappedResults = mappedResults.Concat(enumerable);
        }

        context.IndexStorage.Reduce(Index, viewGenerator, mappedResults, context, actions, ReduceKeys);
    });
}

Can you see the problem?

My first thought was that we had a problem with the code inside the foreach, since it effectively generates something like:

select * from MappedResults where Index = "MapReduceTestIndex" and ReduceKey = "Oren"
select * from MappedResults where Index = "MapReduceTestIndex" and ReduceKey = "Ayende"
select * from MappedResults where Index = "MapReduceTestIndex" and ReduceKey = "Arava"

And usually you’ll have about 2,500 of those.

To check that, I modified the code to look like this:

public override void Execute(WorkContext context)
{
    if (ReduceKeys.Length == 0)
        return;

    var viewGenerator = context.IndexDefinitionStorage.GetViewGenerator(Index);
    if (viewGenerator == null)
        return; // deleted view?

    context.TransactionalStorage.Batch(actions =>
    {
        IEnumerable<object> mappedResults = new object[0];
        foreach (var reduceKey in ReduceKeys)
        {
            IEnumerable<object> enumerable = actions.MappedResults
                .GetMappedResults(Index, reduceKey, MapReduceIndex.ComputeHash(Index, reduceKey))
                .Select(JsonToExpando.Convert);

            mappedResults = mappedResults.Concat(enumerable);
        }

        var sp = Stopwatch.StartNew();
        Console.WriteLine("Starting to read {0} reduce keys", ReduceKeys.Length);

        // force the lazy Concat chain to execute so the read cost can be measured
        var results = mappedResults.ToArray();

        Console.WriteLine("Read {0} reduce keys in {1} with {2} results", ReduceKeys.Length, sp.Elapsed, results.Length);

        context.IndexStorage.Reduce(Index, viewGenerator, results, context, actions, ReduceKeys);
    });
}

And got the following:


Starting to read 2470 reduce keys
Read 2470 reduce keys in 00:57:57.5292856 with 2499 results

Yes, for 2,470 reduce keys (2,499 mapped results), that took almost an hour!

I started planning how to fix this by moving to what is effectively an “IN” approach (a single query fetching the mapped results for all ~2,500 reduce keys at once, instead of one query per key), when I realized that I had skipped a very important step: I hadn’t run this through the profiler. And as we know, when we are talking about performance work, if it hasn’t been through the profiler, it isn’t real.

And the profiler led me to this method:

public IEnumerable<JObject> GetMappedResults(string view, string reduceKey, byte[] viewAndReduceKeyHashed)
{
    Api.JetSetCurrentIndex(session, MappedResults, "by_reduce_key_and_view_hashed");
    Api.MakeKey(session, MappedResults, viewAndReduceKeyHashed, MakeKeyGrbit.NewKey);
    if (Api.TrySeek(session, MappedResults, SeekGrbit.SeekEQ) == false)
        yield break;

    Api.MakeKey(session, MappedResults, viewAndReduceKeyHashed, MakeKeyGrbit.NewKey);
    Api.JetSetIndexRange(session, MappedResults, SetIndexRangeGrbit.RangeUpperLimit | SetIndexRangeGrbit.RangeInclusive);
    if (Api.TryMoveFirst(session, MappedResults) == false)
        yield break;
    do
    {
        // we need to check that we don't have hash collisions
        var currentReduceKey = Api.RetrieveColumnAsString(session, MappedResults, tableColumnsCache.MappedResultsColumns["reduce_key"]);
        if (currentReduceKey != reduceKey)
            continue;
        var currentView = Api.RetrieveColumnAsString(session, MappedResults, tableColumnsCache.MappedResultsColumns["view"]);
        if (currentView != view)
            continue;
        yield return Api.RetrieveColumn(session, MappedResults, tableColumnsCache.MappedResultsColumns["data"]).ToJObject();
    } while (Api.TryMoveNext(session, MappedResults));
}

You might have noticed the part that stands out: yes, we actually had an O(n) algorithm here (effectively a table scan) that was absolutely meaningless and didn’t need to be there. I am pretty sure that it was test code that I wrote (yes, I ran git blame on it and ordered some crow for tomorrow’s lunch).
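
To my reading, the suspect is the Api.TryMoveFirst call: TrySeek has already positioned the cursor on the first record matching the hashed key, and JetSetIndexRange bounds the walk, but moving to the first record of the index afterwards throws that away and turns every call into a scan over the whole MappedResults table. Here is a sketch of the same method with that call dropped; this is my reconstruction under that assumption, not necessarily the exact change that went in:

public IEnumerable<JObject> GetMappedResults(string view, string reduceKey, byte[] viewAndReduceKeyHashed)
{
    Api.JetSetCurrentIndex(session, MappedResults, "by_reduce_key_and_view_hashed");
    Api.MakeKey(session, MappedResults, viewAndReduceKeyHashed, MakeKeyGrbit.NewKey);
    if (Api.TrySeek(session, MappedResults, SeekGrbit.SeekEQ) == false)
        yield break;

    // bound the walk to the records sharing the hashed key; the cursor is
    // already on the first match, so no repositioning is needed
    Api.MakeKey(session, MappedResults, viewAndReduceKeyHashed, MakeKeyGrbit.NewKey);
    Api.JetSetIndexRange(session, MappedResults, SetIndexRangeGrbit.RangeUpperLimit | SetIndexRangeGrbit.RangeInclusive);

    do
    {
        // we need to check that we don't have hash collisions
        var currentReduceKey = Api.RetrieveColumnAsString(session, MappedResults, tableColumnsCache.MappedResultsColumns["reduce_key"]);
        if (currentReduceKey != reduceKey)
            continue;
        var currentView = Api.RetrieveColumnAsString(session, MappedResults, tableColumnsCache.MappedResultsColumns["view"]);
        if (currentView != view)
            continue;
        yield return Api.RetrieveColumn(session, MappedResults, tableColumnsCache.MappedResultsColumns["data"]).ToJObject();
    } while (Api.TryMoveNext(session, MappedResults));
}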

Once I removed that, things changed for the better. To give you an idea of how much, take a look at the new results:

Starting to read 2470 reduce keys
Read 2470 reduce keys in 00:00:00.4907501 with 2499 results
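
To put the two runs side by side (my arithmetic, not part of the original output): 00:57:57.53 is roughly 3,477 seconds, versus about 0.49 seconds now, which works out to a speedup of roughly 7,000x.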

And just for fun, I also tested how long it takes to reduce each batch of results:

Indexed 2470 reduce keys in 00:00:01.0533390 with 2499 results

And that is without doing any actual optimizations; that is just from removing the brain-dead code that had no business being there.