Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 23 min | 4458 words

In the previous post, I introduced the PropertySphere sample application (you can also watch the video introducing it here). In this post, I want to go over how we build a Telegram bot for this application, so Renters can communicate with the application, check their status, raise issues, and even pay their bills.

I’m using Telegram here because the process of creating a new bot is trivial, the API is really fun to work with, and it takes very little effort.

Compare that to something like WhatsApp, where just the process for creating a bot is a PITA.

Without further ado, let’s look at what the Telegram bot looks like:

There are a bunch of interesting things you can see in the screenshot. We communicate with the bot using natural text. There aren't a lot of screens or options to go through; it is just a natural conversation.

The process is pretty streamlined from the perspective of the user. What does that look like from the implementation perspective? A lot of the time, that kind of interface involves… a large amount of complexity in the backend.

Here is what I usually think when I consider those demos:

In our example, we can implement all of this in about 250 lines of code. The magic behind it is the fact that we can rely on RavenDB’s AI Agents feature to do most of the heavy lifting for us.

Inside RavenDB, this is defined as follows:

For this post, however, we’ll look at how we actually built this AI-powered Telegram bot. The full code is here if you want to browse through it.

What model is used here?

It’s worth mentioning that I’m not using anything fancy; the agent uses the baseline gpt-4.1-mini model for the demo. There is no need for training or customization; the way we create the agent already takes care of that.

Here is the overall agent definition:


store.AI.CreateAgent(
    new AiAgentConfiguration
    {
        Name = "Property Assistant",
        Identifier = "property-agent",
        ConnectionStringName = "Property Management AI Model",
        SystemPrompt = """
            You are a property management assistant for renters.
            ... redacted ...
            Do NOT discuss non-property topics. 
            """,
        Parameters = [
            // Visible to the model:
            new AiAgentParameter("currentDate",
                "Current date in yyyy-MM-dd format"),
            // Agent scope only, not visible to the model directly
            new AiAgentParameter("renterId",
                "Renter ID; answer only for this renter", sendToModel: false),
            new AiAgentParameter("renterUnits",
                "List of unit IDs occupied by the renter", sendToModel: false),
        ],
        SampleObject = JsonConvert.SerializeObject(new Reply
        {
            Answer = "Detailed answer to query (markdown syntax)",
            Followups = ["Likely follow-ups"],
        }),
        // redacted
    });

The code above will create an agent with the given prompt. It turns out that a lot of work actually goes into that prompt to explain to the AI model exactly what its role is, what it is meant to do, etc.

I reproduced the entire prompt below so you can read it more easily, but take into account that you’ll likely tweak it a lot, and that it is usually much longer than what we have here (although what we have below is quite functional, as you can see from the screenshots).

The agent’s prompt

You are a property management assistant for renters.

Provide information about rent, utilities, debts, service requests, and property details.

Be professional, helpful, and responsive to renters’ needs.

You can answer in Markdown format. Make sure to use ticks (`) whenever you discuss identifiers.

Do not suggest actions that are not explicitly allowed by the tools available to you.

Do NOT discuss non-property topics. Answer only for the current renter.

When discussing amounts, always format them as currency with 2 decimal places.

When RavenDB defines an AI Agent, we specify two very important aspects of it. First, we have the parameters, which define the scope of the system. In this case, you can see that we pass the currentDate, as well as provide the renterId and renterUnits that this agent is going to deal with.

We expose the current date to the model, but not the renter ID or the units that define the scope (we’ll touch on that in a bit). The model needs the current date so it will understand when it is running and have context for things like “last month”. But we don’t need to give it the IDs, they have no meaning and are instead used to define the scope of a particular conversation with the model.

The sample object we use defines the structure of the reply that we require the model to give us. In this case, we want to get a textual message from the model in Markdown format, as well as a separate array of likely follow-ups that we can provide to the user.
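For reference, the Reply type serialized into the sample object would be a simple DTO along these lines (a sketch inferred from the sample object above, not necessarily the exact class from the repo):

```csharp
// Shape inferred from the SampleObject above (assumption).
public class Reply
{
    public string Answer { get; set; }       // Markdown-formatted answer
    public string[] Followups { get; set; }  // Likely follow-up questions
}
```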

In order to do its job, the agent needs to be able to access the system. RavenDB handles that by letting you define queries that the model can ask the agent to execute when it needs more information. Here are some of them:


Queries = [
    new AiAgentToolQuery
    {
        Name = "GetRenterInfo",
        Description = "Retrieve renter profile details",
        Query = "from Renters where id() = $renterId",
        ParametersSampleObject = "{}",
        Options = new AiAgentToolQueryOptions
        {
            AllowModelQueries = false,
            AddToInitialContext = true
        }
    },
    new AiAgentToolQuery
    {
        Name = "GetOutstandingDebts",
        Description = "Retrieve renter's outstanding debts (unpaid balances)",
        Query = """
            from index 'DebtItems/Outstanding'
            where RenterIds in ($renterId) and AmountOutstanding > 0
            order by DueDate asc
            limit 10
            """,
        ParametersSampleObject = "{}"
    },
    new AiAgentToolQuery
    {
        Name = "GetUtilityUsage",
        Description = """
            Retrieve utility usage for renter's unit within a date
            range (for calculating bills)
            """,
        Query = """
            from 'Units'
            where id() in ($renterUnits)
            select
                timeseries(from 'Power'
                    between $startDate and $endDate
                    group by 1d
                    select sum()),
                timeseries(from 'Water'
                    between $startDate and $endDate
                    group by 1d
                    select sum())
            """,
        ParametersSampleObject = """
            {
                "startDate": "yyyy-MM-dd",
                "endDate": "yyyy-MM-dd"
            }
            """
    },
],

The first query in the previous snippet, GetRenterInfo, is interesting. You can see that it is marked as: AllowModelQueries = false, AddToInitialContext = true. What does that mean?

It means that as part of creating a new conversation with the model, we are going to run the query to get all the renter’s details and add that to the initial context we send to the model. That allows us to provide the model with the information it will likely need upfront.

Note that we use the $renterId and $renterUnits parameters in the queries. While they aren’t exposed directly to the model, they affect what information the model can see. This is a good thing, since it means we place guardrails very early on. The model simply cannot see any information that is out of scope for it.

The model can ask for additional information when it needs to…

An important observation about the design of AI agents with RavenDB: note that we provided the model with a bunch of potential queries that it can run. GetRenterInfo is run at the beginning, since it gives us the initial context, but the rest are left for the judgment of the model.

The model can decide what queries it needs to run in order to answer the user’s questions, and it does so of its own accord. This decision means that once you have defined the set of queries and operations that the model can run, you are mostly done. The AI is smart enough to figure out what to do and then act according to your data.

Here is an example of what this looks like from the backend:

Here you can see that the user asked about their utilities, the model then ran the appropriate query and formulated an answer for the user.

The follow-ups UX pattern

You might have noticed that we asked the model for follow-up questions that the user may want to ask. This is a subtle way to guide the user toward the set of operations that the model supports.

The model will generate the follow-ups based on its own capabilities (the queries and actions it knows it can run), so this is a pretty simple way to “tell” the user what the bot can do without being obnoxious about it.

Let’s look at how things work when we actually use this to build the bot, then come back to the rest of the agent’s definition.

Plugging the model into Telegram

We have looked at the agent’s definition so far; let’s see how we actually use it. The Telegram API is really nice, basically boiling down to:


_botClient = new TelegramBotClient(botSecretToken);
_botClient.StartReceiving(
    HandleUpdateAsync,
    HandleErrorAsync,
    new ReceiverOptions
    {
        AllowedUpdates = [
            UpdateType.Message, 
            UpdateType.CallbackQuery 
            ]
    },
    _cts.Token
);


async Task HandleUpdateAsync(ITelegramBotClient botClient,
    Update update, CancellationToken cancellationToken)
{
    switch (update)
    {
        case { Message: { Text: { } messageText } message }:
            await ProcessMessageAsync(botClient,
                message.Chat.Id.ToString(),
                messageText,
                cancellationToken);
            break;
    }
}

And then the Telegram API will call the HandleUpdateAsync method when there is a new message to the bot. Note that you may actually get multiple concurrent messages, possibly from different chats, at the same time.

We’ll focus on the process message function, where we start by checking exactly who we are talking to:


async Task ProcessMessageAsync(ITelegramBotClient botClient,
    string chatId, string messageText, CancellationToken cancellationToken)
{
    using var session = _documentStore.OpenAsyncSession();

    var renter = await session.Query<Renter>()
        .FirstOrDefaultAsync(r => r.TelegramChatId == chatId,
            cancellationToken);

    if (renter == null)
    {
        await botClient.SendMessage(chatId,
            "Sorry, your Telegram account is not linked to a renter profile.",
            cancellationToken: cancellationToken
        );
        return;
    }
    var conversationId = $"chats/{chatId}/{DateTime.Today:yyyy-MM-dd}";
    // more code in the next snippet
}

Telegram uses the term chat ID in their API, but it is what I would call the renter’s ID. When we register renters, we also record their Telegram chat ID, which means that when we get a message from a user, we can check whether they are a valid renter in our system. If not, we fail early and are done.

If they are, this is where things start to get interesting. Look at the conversation ID that we generated in the last line. RavenDB uses the notion of a conversation with the agent to hold state. The conversation we create here means that the bot will use the same conversation with the user for the same day.

Another way to do that would be to keep the same conversation ID open for the same user. Since RavenDB will automatically handle summarizing and trimming the conversation, either option is fine and mostly depends on your scenario.

The next stage is to create the actual conversation. To do that, we need to provide the model with the right context it is looking for:


var renterUnits = await session.Query<Lease>()
    .Where(l => l.RenterIds.Contains(renter.Id!))
    .Select(l => l.UnitId)
    .ToListAsync(cts);


var conversation = _documentStore.AI.Conversation("property-agent",
    conversationId,
    new AiConversationCreationOptions
    {
        Parameters = new Dictionary<string, object?>
        {
            ["renterId"] = renter.Id,
            ["renterUnits"] = renterUnits,
            ["currentDate"] = DateTime.Today.ToString("yyyy-MM-dd")
        }
    });

You can see that we pass the renter ID and the relevant units for the renter to the model. Those form the creation parameters for the conversation and cannot be changed. That is one of the reasons why you may want to have a different conversation per day, to get the updated values if they changed.

With that done, we can send the results back to the model and then to the user, like so:


var result = await conversation.RunAsync<PropertyAgent.Reply>(cts);


var replyMarkup = new ReplyKeyboardMarkup(result.Answer.Followups
    .Select(text => new KeyboardButton(text))
    .ToArray())
    {
        ResizeKeyboard = true,
        OneTimeKeyboard = true
    };


await botClient.SendMessage(
    chatId,
    result.Answer.Answer,
    replyMarkup: replyMarkup,
    cancellationToken: cts);

The RunAsync() method handles the entire interaction with the model, and most of the code is just dealing with the reply markup for Telegram.

If you look closely at the chat screenshot above, you can see that we aren’t just asking the model questions, we get the bot to perform actions. For example, paying the rent. Here is what this looks like:

How does this work?

Paying the rent through the bot

When we looked at the agent, we saw that we exposed some queries that the agent can run. But that isn’t the complete picture, we also give the model the ability to run actions. Here is what this looks like from the agent’s definition side:


Actions = [
    new AiAgentToolAction
    {
        Name = "ChargeCard",
        Description = """
            Record a payment for one or more outstanding debts. The
            renter can pay multiple debt items in a single transaction.
            Can pay using any stored card on file.
            """,
        ParametersSampleObject = JsonConvert.SerializeObject(new ChargeCardArgs
        {
            DebtItemIds = ["debtitems/1-A", "debtitems/2-A"],
            PaymentMethod = "Card",
            Card = "Last 4 digits of the card"
        })
    }
]

The idea here is that we expose to the model the kinds of actions it can request, and we specify what parameters it should pass to them, etc. What we are not doing here is giving the model control over actually running any code or modifying any data.

Instead, when the model needs to charge a card, it will have to call your code and go through validation, business logic, and authorization. Here is what this looks like on the other side. When we create a conversation, we specify handlers for all the actions we need to take, like so:


conversation.Handle<PropertyAgent.ChargeCardArgs>("ChargeCard", async args =>
{
    using var paySession = _documentStore.OpenAsyncSession();

    var renterWithCard = await paySession.LoadAsync<Renter>(renter.Id!, cts);
    var card = renterWithCard?.CreditCards
        .FirstOrDefault(c => c.Last4Digits == args.Card);

    if (card == null)
    {
        throw new InvalidOperationException(
            $"Card ending in {args.Card} not found in your profile.");
    }

    var totalPaid = await PaymentService.CreatePaymentForDebtsWithCardAsync(
        paySession,
        renter.Id!,
        args.DebtItemIds,
        card,
        args.PaymentMethod,
        cts);

    return $"Charged {totalPaid:C2} to {card.Type}" +
        $" ending in {card.Last4Digits}.";
});

Note that we do some basic validation, then we call the CreatePaymentForDebtsWithCardAsync() method to perform the actual operation. It is also fun that we can just return a message string to give the model an idea of what the result of the action was.

Inside CreatePaymentForDebtsWithCardAsync(), we also verify that the debts we are asked to pay are associated with the current renter; we may have to apply additional logic, etc. The concept is that we assume the model is not to be trusted, so we need to carefully validate the input and use our code to verify that everything is fine.

Summary

This post has gone on for quite a while, so I think we’ll stop here. As a reminder, the PropertySphere sample application code is available. And if you are one of those who prefer videos to text, you can watch the video here.

In the next post, I’m going to show you how we can make the bot even smarter by adding visual recognition to the mix.

time to read 11 min | 2077 words

This post introduces the PropertySphere sample application. I’m going to talk about some aspects of the sample application in this post, then in the next one, we will introduce AI into the mix.

You can also watch me walk through the entire application in this video.

This is based on a real-world scenario from a customer. One of the nicest things about AI being so easy to use is that I can generate throwaway code for a conversation with a customer that is actually a full-blown application.

The full code for the sample application is available on GitHub.

Here is the application dashboard, so you can get some idea about what this is all about:

The idea is that you have Properties (apartment buildings), which have Units (apartments), which you then Lease to Renters. Note the capitalized words in the last sentence, those are the key domain entities that we work with.
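To make the domain concrete, here is a rough sketch of those entities in C# (shapes inferred from how they are queried in the code in this series; treat them as assumptions rather than the repo’s exact classes):

```csharp
// Hypothetical domain shapes, inferred from the queries in this series.
public class Property { public string Id; public string Name; public double Latitude, Longitude; }
public class Unit     { public string Id; public string PropertyId; public string UnitNumber; }
public class Lease    { public string Id; public string UnitId; public List<string> RenterIds; }
public class Renter   { public string Id; public string TelegramChatId; }
```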

Note that this dashboard shows quite a lot of data from many different places in the system. The map defines which properties we are looking at. It’s not just a static map, it is interactive. You can zoom in on a region to apply a spatial filter to the data in the dashboard.

Let’s take a closer look at what we are doing here. I’m primarily a backend guy, so I’m ignoring what the front end is doing to focus on the actual behavior of the system.

Here is what a typical endpoint will return for the dashboard:


[HttpGet("status/{status}")]
public IActionResult GetByStatus(string status, [FromQuery] string? boundsWkt)
{
    var docQuery = _session
        .Query<ServiceRequests_ByStatusAndLocation.Result,
               ServiceRequests_ByStatusAndLocation>()
        .Where(x => x.Status == status)
        .OrderByDescending(x => x.OpenedAt)
        .Take(10);

    if (!string.IsNullOrWhiteSpace(boundsWkt))
    {
        docQuery = docQuery.Spatial(
            x => x.Location, spatial => spatial.Within(boundsWkt));
    }

    var results = docQuery.Select(x => new
    {
        x.Id,
        x.Status,
        x.OpenedAt,
        x.UnitId,
        x.PropertyId,
        x.Type,
        x.Description,
        PropertyName = RavenQuery.Load<Property>(x.PropertyId).Name,
        UnitNumber = RavenQuery.Load<Unit>(x.UnitId).UnitNumber
    }).ToList();

    return Ok(results);
}

We use a static index (we’ll see exactly why in a bit) to query for all the service requests by status and location, and then we project data from the document, including related document properties (the last two lines in the Select call).

A ServiceRequest doesn’t have a location, it gets that from its associated Property, so during indexing, we pull that from the relevant Property, like so:


Map = requests =>
    from sr in requests
    let property = LoadDocument<Property>(sr.PropertyId)
    select new Result
    {
        Id = sr.Id,
        Status = sr.Status,
        OpenedAt = sr.OpenedAt,
        UnitId = sr.UnitId,
        PropertyId = sr.PropertyId,
        Type = sr.Type,
        Description = sr.Description,
        Location = CreateSpatialField(property.Latitude, property.Longitude),
    };

You can see how we load the related Property and then index its location for the spatial query (last line).

You can see more interesting features when you drill down to the Unit page, where both its current status and its utility usage are displayed. That is handled using RavenDB’s time series feature, and then projected to a nice view on the frontend:

In the backend, this is handled using the following action call:


[HttpGet("unit/{*unitId}")]
public IActionResult GetUtilityUsage(string unitId,
    [FromQuery] DateTime? from, [FromQuery] DateTime? to)
{
    var unit = _session.Load<Unit>(unitId);
    if (unit == null)
        return NotFound("Unit not found");

    var fromDate = from ?? DateTime.Today.AddMonths(-3);
    var toDate = to ?? DateTime.Today;

    var result = _session.Query<Unit>()
        .Where(u => u.Id == unitId)
        .Select(u => new
        {
            PowerUsage = RavenQuery.TimeSeries(u, "Power")
                .Where(ts => ts.Timestamp >= fromDate && ts.Timestamp <= toDate)
                .GroupBy(g => g.Hours(1))
                .Select(g => g.Sum())
                .ToList(),
            WaterUsage = RavenQuery.TimeSeries(u, "Water")
                .Where(ts => ts.Timestamp >= fromDate && ts.Timestamp <= toDate)
                .GroupBy(g => g.Hours(1))
                .Select(g => g.Sum())
                .ToList()
        })
        .FirstOrDefault();

    return Ok(new
    {
        UnitId = unitId,
        UnitNumber = unit.UnitNumber,
        From = fromDate,
        To = toDate,
        PowerUsage = result?.PowerUsage?.Results?
            .Select(r => new UsageDataPoint
            {
                Timestamp = r.From,
                Value = r.Sum[0],
            }).ToList() ?? new List<UsageDataPoint>(),
        WaterUsage = result?.WaterUsage?.Results?
            .Select(r => new UsageDataPoint
            {
                Timestamp = r.From,
                Value = r.Sum[0],
            }).ToList() ?? new List<UsageDataPoint>()
    });
}

As you can see, we run a single query to fetch data from multiple time series, which allows us to render this page.

By now, I think you have a pretty good grasp of what the application is about. So get ready for the next post, where I will talk about how to add AI capabilities to the mix.

time to read 3 min | 414 words

I’m happy to announce the official release of the RavenDB Kubernetes Operator.

As organizations use Kubernetes for more and more parts of their infrastructure, the complexity of deploying databases in such an environment is quite a challenge. For RavenDB, you need to handle certificates, persistence, and upgrades, and it is easy for that to become a bottleneck. This release bridges the gap between RavenDB’s ease of use and the declarative power of Kubernetes.

If you are new to the concept, think of an operator as software that acts like a Site Reliability Engineer (SRE). Kubernetes is excellent at managing stateless applications, but databases require specific knowledge to manage correctly (e.g., "Don't upgrade all nodes at once" or "Ensure the leader is stable before restarting").

The RavenDB Operator extends the Kubernetes API. It allows you to define what you want your cluster to look like (the "Manifest"), and the Operator works tirelessly in the background to make sure your infrastructure matches that state.

Why This Matters

Previously, deploying a secure, clustered RavenDB instance on K8s required manual configuration of StatefulSets, Services, and complex TLS certificate chains.

With the RavenDB Kubernetes Operator, everything is driven by a single custom resource: RavenDBCluster. You provide the specs, and the Operator handles the heavy lifting, ensuring your deployments are fully reproducible, secure, and declarative.
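The post doesn’t show the resource’s actual schema, but to give a feel for the declarative model, here is an illustrative manifest. Every field name below is an assumption for the sake of the example, not the Operator’s documented API:

```yaml
# Illustrative only -- field names are assumptions, not the Operator's actual schema.
apiVersion: ravendb.net/v1        # hypothetical API group/version
kind: RavenDBCluster              # the custom resource mentioned above
metadata:
  name: my-cluster
spec:
  nodes: 3                        # desired cluster size
  version: "7.0"                  # RavenDB version to run
  certificates:
    mode: LetsEncrypt             # or a self-signed setup
  storage:
    data:
      size: 100Gi
```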

Here is what the Operator brings to the table:

  • Automated Security & Certificate Management: Whether you are using Let’s Encrypt or self-signed certificates, the Operator handles bootstrapping, distribution, and rotation. For the Operator’s own webhook certificate, the Operator uses cert-manager behind the scenes, since that is not exposed externally.
  • Safe Rolling Upgrades: Database upgrades can be scary. The Operator orchestrates upgrades node-by-node, using safety gates to ensure the cluster is healthy and data is safe before moving to the next node. If a gate fails, the upgrade stops automatically.
  • Flexible External Access: Exposing a database outside K8s is often a networking headache. We’ve added dedicated support for AWS NLB, Azure Load Balancer, HAProxy, Traefik, and NGINX, giving you production-ready access strategies out of the box.
  • Storage Orchestration: Declarative control over your data, logs, and audit volumes, supporting local paths, AWS EBS, and Azure Disks.
  • One-Click Deploy: Using our official Helm chart, you can spin up a fully operational cluster in minutes.

The Operator is available now via Helm and works on EKS, AKS, Kind, Minikube, and Kubeadm clusters.

We look forward to seeing what you build with RavenDB on Kubernetes!

time to read 4 min | 763 words

We are a database company, and many of our customers and users are running in the cloud. Fairly often, we field questions about the recommended deployment pattern for RavenDB.

Given the… rich landscape of DevOps options, RavenDB supports all sorts of deployment models:

  • Embedded in your application
  • Physical hardware (from a Raspberry Pi to massive servers)
  • Virtual machines in the cloud
  • Docker
  • AWS / Azure marketplaces
  • Kubernetes
  • Ansible
  • Terraform

As well as some pretty fancy permutations of the above in every shape and form.

With so many choices, the question is: what do you recommend? In particular, we were recently asked about deployment to a “naked machine” in the cloud versus using Kubernetes. The core requirements are to ensure high performance and high availability.

Our short answer is almost always: Best to go with direct VMs and skip Kubernetes for RavenDB.

While Kubernetes has revolutionized the deployment of stateless microservices, deploying stateful applications, particularly databases, on K8s introduces significant complexities that often outweigh the benefits, especially when performance and operational simplicity are paramount.

A great quote in the DevOps world is “cattle, not pets”, in reference to how you should manage your servers. That works great if you are dealing with stateless services. But when it comes to data management, your databases are cherished pets, and you should treat them as such.

The Operational Complexity of Kubernetes for Stateful Systems

Using an orchestration layer like Kubernetes complicates the operational management of persistent state. While K8s provides tools for stateful workloads, they require a deep understanding of storage classes, Persistent Volumes (PVs), and Persistent Volume Claims (PVCs).

Consider a common, simple maintenance task: Changing a VM's disk type or size.

On a VM, this is typically a very easy operation and can be done with no downtime. The process is straightforward, well-documented, and often takes minutes.

For K8s, this becomes a significantly more complex task. You have to go deep into Kubernetes storage primitives to figure out how to properly migrate the data to a new disk specification.

There is an allowVolumeExpansion: true option that should make it work, but the details matter, and for databases, that is usually something DBAs are really careful about.
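For reference, expansion in Kubernetes hinges on the StorageClass opting in; a minimal sketch (the class name and provisioner are illustrative):

```yaml
# A StorageClass must opt in before its volumes can be expanded.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                # illustrative name
provisioner: ebs.csi.aws.com    # illustrative provisioner
allowVolumeExpansion: true
```

With that in place, you grow a volume by editing the PVC’s `spec.resources.requests.storage`. Changing the disk *type*, however, is not covered by expansion at all; that still means provisioning a new volume and migrating the data.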

Databases tend to care about their disk. So what happens if we don’t want to just change the size of the disk, but also its type? Such as moving from Standard to Premium. Doing that using VMs is as simple as changing the size. You may need to detach, change, and reattach the disk, but that is a well-trodden path.

In Kubernetes, you need to run a migration, delete the StatefulSets, make the configuration change, and reapply (crossing your fingers and hoping everything works).

Database nodes are not homogeneous

Databases running in a cluster configuration often require granular control over node upgrades and maintenance. I may want to designate a node as “this one is doing backups”, so it needs a bigger disk. Easy to do if each node is a dedicated VM, but much harder in practice inside K8s.

A recent example we ran into is controlling the upgrade process of a cluster. As any database administrator can tell you, upgrades are something you approach cautiously. RavenDB has great support for running cross-version clusters.

In other words, take a node in your cluster, upgrade that to an updated version (including across major versions!), and it will just work. That allows you to dip your toes into the waters with a single node, instead of doing a hard switch to the new version.

In a VM environment: Upgrading a single node in a RavenDB cluster is a simple, controlled process. You stop the database on the VM, perform the upgrade (often just replacing binaries), start the database, and allow the cluster to heal and synchronize. This allows you to manage the cluster's rolling upgrades with precision.

In K8s: Performing a targeted upgrade on just one node of the cluster is hard. The K8s deployment model (StatefulSets) is designed to manage homogeneous replicas. While you can use features like the "on delete" update strategy, blue/green deployments, or canary releases, they add layers of abstraction and complexity that make sense for stateless services but are actively harmful for stateful systems.

Summary

For mission-critical database infrastructure where high performance, high availability, and operational simplicity are non-negotiable, the added layer of abstraction introduced by Kubernetes for managing persistence often introduces more friction than value.

While Kubernetes is an excellent platform for stateless services, we strongly recommend deploying RavenDB directly on dedicated Virtual Machines. This provides a cleaner operational surface, simpler maintenance procedures, and more direct control over the underlying resources—all critical factors for a stateful, high-performance database cluster.

Remember, your database nodes are cherished pets, don’t make them sleep in the barn with the cattle.

time to read 7 min | 1362 words

A really interesting problem for developers building agentic systems is moving away from chatting with the AI model. For example, consider the following conversation:

This is a pretty simple scenario where we need to actually step out of the chat and do something else. This seems like an obvious request, right? But it turns out to be a bit complex to build.

The reason for that is simple. AI models don’t actually behave like you would expect them to if your usage is primarily as a chat interface. Here is a typical invocation of a model in code:


from typing import Callable, List, NamedTuple


class MessageTuple(NamedTuple):
    role: str
    content: str


def call_model(
    message_history: List[MessageTuple],
    tools: List[Callable] = None
):
    pass  # redacted

In other words, it is the responsibility of the caller to keep track of the conversation and send the entire conversation to the agent on each round. Here is what this looks like in code:


conversation_history = [
    {
        "role": "user",
        "content": "When do I get my anniversary gift?"
    },
    {
        "role": "agent",
        "content": "Based on our records, your two-year anniversary is in three days. This milestone means you're eligible for a gift card as part of our company's recognition program.\nOur policy awards a $100 gift card for each year of service. Since you've completed two years, a $200 gift card will be sent to you via SMS on October 1, 2025."
    },
    {
        "role": "user",
        "content": "Remind me to double check I got that in a week"
    }
]

Let’s assume that we have a tool call for setting up reminders for users. In RavenDB, this looks like the screenshot below (more on agentic actions in RavenDB here):

And in the backend, we have the following code:


conversation.Handle<CreateReminderArgs>("CreateReminder", async (args) =>
{
    using var session = _documentStore.OpenAsyncSession();
    var at = DateTime.Parse(args.at);
    var reminder = new Reminder
    {
        EmployeeId = request.EmployeeId,
        ConversationId = conversation.Id,
        Message = args.msg,
    };
    await session.StoreAsync(reminder);
    session.Advanced.GetMetadataFor(reminder)["@refresh"] = at;
    await session.SaveChangesAsync();


    return $"Reminder set for {at} {reminder.Id}";
});

This code uses several of RavenDB’s features to perform its task. First we have the conversation handler, which is the backend handling for the tool call we just saw. Next we have the use of the @refresh feature of RavenDB. I recently posted about how you can use this feature for scheduling.

In short, we set up a RavenDB Subscription Task to be called when those reminders should be raised. Here is what the subscription looks like:


from Reminders as r
where r.'@metadata'.'@refresh' != null

And here is the client code to actually handle it:


async Task HandleReminder(Reminder reminder)
{
    var conversation = _documentStore.AI.Conversation(
        agentId: "smartest-agent",
        reminder.ConversationId,
        creationOptions: null);
    conversation.AddArtificialActionWithResponse(
        "GetRaisedReminders", reminder);
    var result = await conversation.RunAsync();
    await MessageUser(conversation, result);
}

The question now is, what should we do with the reminder?

Going back to the top of this post, we know that we need to add the reminder to the conversation. The problem is that this isn’t part of the actual model of the conversation. This is neither a user prompt nor a model answer. How do we deal with this?

We use a really elegant approach here: we inject an artificial tool call into the conversation history. This makes the model think that it checked for reminders and received one in return, even though this happened outside the chat. This lets the agent respond naturally, as if the reminder were part of the ongoing conversation, preserving the full context.

Finally, since we’re not actively chatting with the user at this point, we need to send a message prompting them to check back on the conversation with the model.
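In terms of the dict-based conversation history from earlier, injecting an artificial tool call amounts to appending a tool-call / tool-result pair that the model never actually requested. Here is a rough sketch; the message shapes are illustrative only, not RavenDB’s actual storage format:

```python
import json


def inject_artificial_action(history, action_name, result):
    """Append a tool call and its result as if the model had asked for it."""
    # The model appears to have invoked the tool...
    history.append({"role": "agent", "tool_call": action_name})
    # ...and the tool appears to have answered, so the next model run
    # sees the reminder as part of the normal conversation flow.
    history.append({
        "role": "tool",
        "name": action_name,
        "content": json.dumps(result),
    })


history = [
    {"role": "user", "content": "Remind me to double check I got that in a week"},
    {"role": "agent", "content": "Sure, I'll remind you on October 8."},
]
inject_artificial_action(
    history, "GetRaisedReminders",
    {"message": "Double check the gift card arrived"},
)
# The agent can now be run again and will respond as if it had just
# checked for reminders and found this one.
```

From the model’s point of view, nothing unusual happened; it simply sees a tool result waiting for it in the conversation.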

Summary

This is a high-level post, meant to give you some ideas about how you can take your agentic systems beyond a simple chat with the model. The reminder example is straightforward, but it is a truly powerful one. It transforms a simple chat into a much richer interaction model with the AI.

RavenDB’s unique approach of "inserting" a tool call back into the conversation history effectively tells the AI model, "I've checked for reminders and found a reminder for this user." This allows the agent to handle the reminder within the context of the original conversation, rather than initiating a new one. It also allows the agent to maintain a single, coherent conversational thread with the user, even when the system needs to perform background tasks and re-engage with them later.

You can also use the same infrastructure to create a new conversation, if that makes sense in your domain, and use the previous conversation as “background material”, so to speak. There is a wide variety of options available to fit your exact scenario.

time to read 1 min | 88 words

I gave the following talk at Microsoft Ignite 2025:

Connecting LLMs to your secure, operational database involves complexity, security risks, and hallucinations. This session shows how to build context-aware AI agents directly on your existing data, going from live database to production-ready, secure AI agent in hours. You'll see how to ship personalized experiences that will define the next generation of software. RavenDB's CEO will demonstrate this approach.

time to read 1 min | 76 words

Want to see how modern applications handle complexity, scale, and cutting-edge features without becoming unmanageable? In this deep-dive webinar, we move From CRUD to AI Agents, showcasing how RavenDB, a high-performance document database, simplifies the development of a complex Property Management application.

time to read 3 min | 441 words

When building AI Agents, one of the challenges you have to deal with is the sheer amount of data that the agent may need to go through. A natural way to deal with that is not to hand the information directly to the model, but rather allow it to query for the information as it sees fit.

For example, in the case of a human resource assistant, we may want to expose the employer’s policies to the agent, so it can answer questions such as “What is the required holiday request time?”.

We can do that easily enough using the following agent-query mechanism:

If the agent needs to answer a question about a policy, it can use this tool to get the policies and find out what the answer is.

That works if you are a mom & pop shop, but what happens if you are a big organization, with policies on everything from requesting time off to bringing your own device to modern slavery prohibition? Is calling this tool going to hand all of those policies to the model?

That is going to be incredibly expensive, since you have to burn through a lot of tokens that are simply not relevant to the problem at hand.

The next step is to avoid returning all of the policies, and instead filter them. We can do that using vector search, utilizing the model’s understanding of the data to help us find exactly what we want.

That is much better, but a search for “confidentiality contract” will get you the Non-Disclosure Agreement as well as the processes for hiring a new employee when their current employer isn’t aware they are looking, etc.

That can still be a lot of text to go through. It isn’t everything, but it is still pretty heavyweight.

A nice alternative to this is to break it into two separate operations, as you can see below:

The model will first run the FindPolicies query to get the list of potential policies. It can then decide, based on their titles, which ones it is actually interested in reading the full text of.

You need to perform two tool calls in this case, but it ends up being both faster and cheaper.
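The two-step pattern can be sketched with a toy in-memory policy store. The tool names mirror the FindPolicies query mentioned above, but the store, the naive title matching (a stand-in for the real vector search), and the GetPolicy helper are all hypothetical, for illustration only:

```python
# First tool call returns only ids and titles (cheap in tokens);
# second tool call fetches the full text for the few the model picked.
POLICIES = {
    "policies/nda": ("Non-Disclosure Agreement", "Full NDA text ..."),
    "policies/pto": ("Requesting Time Off", "Full PTO policy text ..."),
    "policies/byod": ("Bring Your Own Device", "Full BYOD policy text ..."),
}


def find_policies(query: str):
    """Step 1: narrow down - only ids and titles go back to the model."""
    # Stand-in for a vector search; here we just match on the title.
    return [
        {"id": pid, "title": title}
        for pid, (title, _text) in POLICIES.items()
        if query.lower() in title.lower()
    ]


def get_policy(policy_id: str) -> str:
    """Step 2: the full text, only for the chosen policy."""
    return POLICIES[policy_id][1]


candidates = find_policies("time off")
full_text = get_policy(candidates[0]["id"])
```

The model burns a handful of tokens on titles, decides which documents actually matter, and only then pays for the full text of those few.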

This is a surprisingly elegant solution, because it matches roughly how people think. No one is going to read a dozen books cover to cover to answer a question. We continuously narrow our scope until we find enough information to answer.

This approach gives your AI model the same capability to narrowly target the information it needs to answer the user’s query efficiently and quickly.

time to read 3 min | 445 words

When using an AI model, one of the things that you need to pay attention to is the number of tokens you send to the model. They literally cost you money, so you have to balance the amount of data you send to the model against how much of it is relevant to what you want it to do.

That is especially important when you are building generic agents, which may be assigned a bunch of different tasks. The classic example is the human resources assistant, which may be tasked with checking your vacation days balance or called upon to get the current number of overtime hours that an employee has worked this month.

Let’s assume that we want to provide the model with a bit of context. We want to give the model all the recent HR tickets by the current employee. These can range from onboarding tasks to filling out the yearly evaluation, etc.

That sounds like it can give the model a big hand in understanding the state of the employee and what they want. Of course, that assumes the user is going to ask a question related to those issues.

What if they ask about the date of the next bank holiday? If we just unconditionally fed all the data to the model preemptively, that would be:

  • Quite confusing to the model, since it will have to sift through a lot of irrelevant data.
  • Pretty expensive, since we’re going to send a lot of data (and pay for it) to the model, which then has to ignore it.
  • A compounding effect as the user and the model keep the conversation going, with all of this unneeded information weighing everything down.

A nice trick that can really help is to not expose the data directly, but rather provide it to the model as a set of actions it can invoke. In other words, when defining the agent, I don’t bother providing it with all the data it needs.

Rather, I provide the model a way to access the data. Here is what this looks like in RavenDB:

The agent is provided with a bunch of queries that it can call to find out various interesting details about the current employee. The end result is that the model will invoke those queries to get just the information it wants.

The overall number of tokens that we are going to consume will be greatly reduced, while the ability of the model to actually access relevant information is enhanced. We don’t need to go through stuff we don’t care about, after all.
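The idea can be sketched as a small tool registry: instead of serializing the whole employee record into the prompt, we register a query per piece of data and fetch only what the model asks for. The employee record and tool names here are hypothetical, for illustration only:

```python
# A toy sketch of exposing data as callable queries instead of
# preloading it into the prompt.
EMPLOYEE = {
    "vacation_days_left": 12,
    "overtime_hours_this_month": 7,
    "open_hr_tickets": ["Yearly evaluation form"],
}

TOOLS = {
    "GetVacationBalance": lambda: EMPLOYEE["vacation_days_left"],
    "GetOvertimeHours": lambda: EMPLOYEE["overtime_hours_this_month"],
    "GetOpenTickets": lambda: EMPLOYEE["open_hr_tickets"],
}


def handle_tool_call(name: str):
    """The model asks only for what it needs; we fetch just that."""
    return TOOLS[name]()


# For "how many vacation days do I have left?", the model invokes a
# single query instead of receiving the whole record upfront.
balance = handle_tool_call("GetVacationBalance")
```

A question about bank holidays would then invoke none of these tools, and the irrelevant HR tickets never enter the conversation at all.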

This approach gives you a very focused model for the task at hand, and it is easy to extend the agent with additional information-retrieval capabilities.

time to read 3 min | 504 words

Building an AI Agent in RavenDB is very much like defining a class: you define all the things that it can do, the initial prompt to the AI model, and the parameters the agent requires. Like a class, you can create an instance of an AI agent by starting a new conversation with it. Each conversation is a separate instance of the agent, with its own parameters, initial user prompt, and history.

Here is a simple example of a non-trivial agent. For the purpose of this post, I want to focus on the parameters that we pass to the model.


var agent = new AiAgentConfiguration(
    "shopping assistant",
    config.ConnectionStringName,
    "You are an AI agent of an online shop...")
{
    Parameters =
    [
        new AiAgentParameter("lang",
            "The language the model should respond with."),
        new AiAgentParameter("currency",
            "Preferred currency for the user"),
        new AiAgentParameter("customerId", null, sendToModel: false),
    ],
    Queries = [ /* redacted... */ ],
    Actions = [ /* redacted... */ ],
};

As you can see in the configuration, we define the lang and currency parameters as standard agent parameters. These are defined with a description for the model and are passed to the model when we create a new conversation.

But what about the customerId parameter? It is marked as sendToModel: false. What is the point of that? To understand this, you need to know a bit more about how RavenDB deals with the model, conversations, and memory.

Each conversation with the model is recorded using a conversation document, and part of this includes the parameters you pass to the conversation when you create it. In this case, we don’t need to pass the customerId parameter to the model; it doesn’t hold any meaning for the model and would just waste tokens.

The key is that you can query based on those parameters. For example, if you want to get all the conversations for a particular customer (to show them their conversation history), you can use the following query:


from "@conversations" 
where Parameters.customerId = $customerId

This is also very useful when you have data that you genuinely don’t want to expose to the model but still want to attach to the conversation. You can set up a query that the model may call to get the most recent orders for a customer, and RavenDB will do that (using customerId) without letting the model actually see that value.
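Conceptually, the conversation documents carry their parameters alongside the transcript, and the customerId filter is an ordinary query over them. The sketch below illustrates the idea with hypothetical documents; the shape is illustrative only, not RavenDB’s exact storage format:

```python
# Querying conversation documents by a parameter that was never sent
# to the model.
conversations = [
    {"id": "conversations/1",
     "Parameters": {"lang": "en", "currency": "USD", "customerId": "customers/7"}},
    {"id": "conversations/2",
     "Parameters": {"lang": "he", "currency": "ILS", "customerId": "customers/3"}},
    {"id": "conversations/3",
     "Parameters": {"lang": "en", "currency": "EUR", "customerId": "customers/7"}},
]


def conversations_for(customer_id: str):
    # Equivalent in spirit to:
    #   from "@conversations" where Parameters.customerId = $customerId
    return [c for c in conversations
            if c["Parameters"]["customerId"] == customer_id]


history = conversations_for("customers/7")
```

The customerId never cost a single token, yet it is right there whenever the application needs to slice the conversation history by customer.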
