I had some really interesting discussions while I was at CodeMash, and a few of them touched on modeling concerns with non-trivial architectures. In particular, I was asked about my opinion on the role of OR/M in systems that mostly do CQRS, event processing, etc.
This is a deep question, because at first glance, your requirements from the database are pretty much just:
INSERT INTO Events(EventId, AggregateId, Time, EventJson) VALUES (…)
There isn’t really a need to do anything more interesting than that. The other side of that is a set of processes that operate on top of these event streams and produce read models that are very simple to consume as well. There isn’t any complexity in the data architecture at all, and joy to the world, etc, etc.
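To make that "simple" side concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table mirrors the INSERT above, but the event types, the JSON payload shape, the auto-generated EventId and the shifts_per_nurse read model are all assumptions made up for illustration, not something taken from a real system.

```python
import json
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events ("
    "EventId INTEGER PRIMARY KEY, AggregateId TEXT, Time TEXT, EventJson TEXT)"
)

def append_event(aggregate_id, time, payload):
    # The write side: a single append-only INSERT, nothing more.
    conn.execute(
        "INSERT INTO Events (AggregateId, Time, EventJson) VALUES (?, ?, ?)",
        (aggregate_id, time, json.dumps(payload)),
    )

def shifts_per_nurse():
    # A read model: fold over the event stream in insertion order.
    counts = Counter()
    rows = conn.execute("SELECT EventJson FROM Events ORDER BY EventId")
    for (event_json,) in rows:
        event = json.loads(event_json)
        if event["type"] == "NurseScheduled":
            counts[event["nurse"]] += 1
        elif event["type"] == "NurseUnscheduled":
            counts[event["nurse"]] -= 1
    return dict(counts)

# Hypothetical events for a single schedule aggregate.
append_event("schedule-1", "2024-01-02T08:00", {"type": "NurseScheduled", "nurse": "alice"})
append_event("schedule-1", "2024-01-02T09:00", {"type": "NurseScheduled", "nurse": "bob"})
append_event("schedule-1", "2024-01-03T08:00", {"type": "NurseScheduled", "nurse": "alice"})
append_event("schedule-1", "2024-01-03T10:00", {"type": "NurseUnscheduled", "nurse": "alice"})
```

Neither function contains any business rules. Both just move data around, which is exactly why this side of the system looks so easy.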
This is true, to an extent. But it is only true because you have moved a critical component of your system, the beating heart of your business, somewhere else: the logic, the rules, the things that make a system more than just a dumb repository of strings and numbers.
But first, let me make sure that we are on roughly the same page. In such a system, we have:
- Commands – that cannot return a value (but will synchronously fail if invalid). These mutate the state of the system in some manner.
- Events – represent something that has (already) happened. Cannot be rejected by the system, even if they represent invalid state. The state of the system can be completely rebuilt from replaying these events.
- Queries – that cannot mutate the state
I’m mixing two separate architectures here, Command Query Responsibility Segregation and Event Sourcing. They aren’t the same, but they often go hand in hand, and it makes sense to talk about them together.
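The three concepts can be sketched in a few lines of Python. Everything here (the Schedule class, the ShiftAssigned event, the "shift already taken" rule) is a hypothetical stand-in chosen for brevity, not the design of any particular system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ShiftAssigned:
    # Event: a fact that has already happened.
    nurse: str
    shift: str

class Schedule:
    def __init__(self):
        self.events = []       # the event stream
        self.assignments = {}  # current state: shift -> nurse

    def apply(self, event):
        # Events are never rejected; they only update state.
        self.assignments[event.shift] = event.nurse

    def assign(self, nurse, shift):
        # Command: returns nothing, fails synchronously if invalid.
        if shift in self.assignments:
            raise ValueError(f"{shift} is already taken")
        event = ShiftAssigned(nurse, shift)
        self.events.append(event)
        self.apply(event)

    def who_works(self, shift):
        # Query: reads state and never mutates it.
        return self.assignments.get(shift)

    @classmethod
    def replay(cls, events):
        # Rebuild the full state from nothing but the events.
        schedule = cls()
        for event in events:
            schedule.events.append(event)
            schedule.apply(event)
        return schedule
```

Note that validation lives only in the command. Replaying goes straight through apply, so an event is accepted even if today's rules would have rejected it.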
And because it is always easier for me to talk in concrete, rather than abstract, terms, I want to discuss a system I worked on over a decade ago. That system was basically a clinic management system, and the part that I want to talk about today was the staff scheduling option.
Scheduling shifts is a huge deal, even before we get to the part where it directly impacts how much money you get at the end of the month. There are a lot of rules, regulations, union contracts, agreements and a bunch of other stuff that relate to it. So this is a pretty complex area, and when you approach it, you need to do so with the due consideration that it deserves. When we want to apply CQRS/ES to it, we can consider the following factors:
The aggregates that we have are:
- The open schedule for two months from now. This is mutable, being worked on by the head nurse, and constantly changes.
- The proposed schedule for next month. This one is closed and changes only rarely, usually because of big stuff (someone being fired, etc).
- The planned schedule for the current month, frozen, cannot be changed.
- The actual schedule for the current month. This is changed if someone doesn’t show up for their shift, is sick, etc.
You can think of the first three as various stages of a PlannedSchedule, but the ActualSchedule is something different entirely. There are rules around how much divergence you can have between the planned and actual schedules, which impact compensation for the people involved, for example.
Speaking of which, we haven’t yet talked about:
- Nurses / doctors / staff – which are being assigned to shifts.
- Clinics – a nurse may work in several different locations at different times.
There is a lot of other stuff that I’m ignoring here, because it would complicate the picture even further, but that is enough for now. For example, regardless of the shifts that a person was assigned to and showed up for, they may have worked more hours (had to come to a meeting, drove to a client). That complicates payroll, but it doesn’t matter for the scheduling.
I want to focus on two actions in this domain. First, the act of the head nurse scheduling a staff member to a particular shift. And second, the ClockedOut event which happens when a staff member completes a shift.
The ScheduleAt command places a nurse at a given shift in the schedule, which seems fairly simple on its face. However, the act of processing the command is actually really complex. Here are some of the things that you have to do:
- Ensure that this nurse isn’t scheduled for another shift, either concurrently or too close to another shift at a different address.
- Ensure that the nurse doesn’t work with X (because issues).
- Ensure that the role the nurse has matches the required parameters for the schedule.
- Ensure that the number of double shifts in a time period is limited.
The last one, in particular, is a sinkhole of time. Because at the same time, another business rule says that we must give each nurse N number of shifts in a time period, and yet another dictates how to deal with competing preferences, etc.
So at this point, we have ScheduleAtCommand.Execute(), and we need to apply logic: complex, changing, business-critical logic.
And at this point, for that particular part of the system, I want to have a full domain, abstracted persistence and be able to just put my head down and focus on solving the business problem.
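Here is a rough sketch of what a few of the ScheduleAt checks might look like when they run against plain in-memory objects. The Nurse and Shift classes, the 8 hour minimum gap and the rule messages are all assumptions for illustration. The real rules came from regulations and contracts and were far messier, and the double-shift limit is left out entirely.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MIN_GAP_BETWEEN_CLINICS = timedelta(hours=8)  # assumed value, not a real regulation

@dataclass(frozen=True)
class Shift:
    clinic: str
    role: str
    start: datetime
    end: datetime

@dataclass
class Nurse:
    name: str
    role: str
    avoid: set = field(default_factory=set)    # names this nurse must not work with
    shifts: list = field(default_factory=list)

class RuleViolation(Exception):
    pass

def schedule_at(nurse, shift, coworkers=()):
    """Command handler: returns nothing, raises RuleViolation if invalid."""
    if nurse.role != shift.role:
        raise RuleViolation("role mismatch")
    for other in nurse.shifts:
        if shift.start < other.end and other.start < shift.end:
            raise RuleViolation("overlapping shift")
        # For non-overlapping shifts exactly one difference is non-negative.
        gap = max(shift.start - other.end, other.start - shift.end)
        if other.clinic != shift.clinic and gap < MIN_GAP_BETWEEN_CLINICS:
            raise RuleViolation("too close to a shift at another clinic")
    if nurse.avoid & {c.name for c in coworkers}:
        raise RuleViolation("scheduled with someone to avoid")
    nurse.shifts.append(shift)
```

The point is not these specific checks but what they need: the nurse's other shifts, her role, the clinic, and the people already on the shift, all loaded and navigable as objects. That is the job of the domain model and whatever persistence sits behind it.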
The same applies for the ClockedOut event. Part of processing it means that we have to look at the nurse’s employment contract, count the amount of overtime worked, compute total number of hours worked in a pay period, etc. Apply rules from the clinic to the time worked, apply clauses from the employment contract to the work, etc. Again, this gets very complex very fast. For example, if you have a shift from 10PM – 6 AM, how do you compute overtime? For that matter, if this is on the last day of the month, when do you compute overtime? And what pay period do you apply it to?
Here, too, I want to have a fully fleshed out model, which can operate in the problem space freely.
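To show why even the "simple" part of the ClockedOut processing needs care, here is a sketch of one possible answer to the 10PM – 6AM question: split the shift at each midnight and attribute the hours to the calendar month they fall in. Treating a calendar month as the pay period is an assumption for illustration. A real contract might instead assign the whole shift to the day it started, and this sketch ignores time zones and daylight saving changes.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hours_by_pay_period(start, end):
    """Split a shift at each midnight; return hours keyed by (year, month)."""
    totals = defaultdict(float)
    cursor = start
    while cursor < end:
        # Midnight at the start of the day after the cursor.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        segment_end = min(next_midnight, end)
        hours = (segment_end - cursor).total_seconds() / 3600
        totals[(cursor.year, cursor.month)] += hours
        cursor = segment_end
    return dict(totals)

# A night shift on the last day of January (hypothetical dates).
split = hours_by_pay_period(datetime(2024, 1, 31, 22), datetime(2024, 2, 1, 6))
```

For that shift, two hours land in January and six in February. Whether any of them count as overtime still depends on the contract, which is exactly the kind of rule that belongs in the domain model.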
In other words, a CQRS/ES architecture is going to have the domain model (and some sort of OR/M) in the middle, doing the most interesting things and tackling the heart of the complexity.