A vision of enterprise platform: Security Infrastructure


I have been asked how I would design a security infrastructure for my vision of an enterprise platform, and here is an initial draft of the ideas.

As with everything in this series, no actual code was written to build this. What I am doing is going through the steps that I would usually go through before I actually sit down and implement something.

While most systems go for the Users & Roles metaphor, I have found that this is rarely a valid approach in real enterprise scenarios. You often want to do more than just users & roles, such as granting and revoking permissions for individuals, business-logic-based permissions, etc.

What are the requirements for this kind of an infrastructure?

  • Performant
  • Human understandable
  • Flexible
  • Ability to specify permissions using the following scheme:
    • On a
      • Group
      • Individual users
    • Based on
      • Entity Type
      • Specific Entity
      • Entity group

Let us give a few scenarios and then go over how we are going to solve them, shall we?

  1. A helpdesk representative can view account data, cannot edit it. The helpdesk representative also cannot view the account's projected revenue.
  2. Only managers can handle accounts marked as "Special Care".
  3. A team leader can handle all the cases handled by members in the team, team members can handle only their own cases.

The security infrastructure revolves around this interface:

[Image: the security service interface, with the IsAllowed, AddPermissionsToQuery and Why methods]
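Since the diagram itself is not reproduced here, this is only a rough sketch of what such an interface could look like in C#. The three method names come from this post; the interface name ISecurityService, the IUser and IEntity marker types, the parameter lists and the return types are my assumptions for illustration.

```csharp
using System.Linq;

// Hypothetical marker types; the post does not define them.
public interface IUser { }
public interface IEntity { }

// Sketch only: everything except the three method names is an assumption.
public interface ISecurityService
{
    // "Does 'user' have 'operation' on 'entity'?"
    bool IsAllowed(string operation, IUser user, IEntity entity);

    // "Does 'user' have 'operation'?" (feature-based operations)
    bool IsAllowed(string operation, IUser user);

    // Returns the query narrowed to entities the user may perform 'operation' on.
    IQueryable<TEntity> AddPermissionsToQuery<TEntity>(
        IQueryable<TEntity> query, string operation, IUser user)
        where TEntity : IEntity;

    // Human readable explanation of how the decision was reached.
    string Why(string operation, IUser user, IEntity entity);
}
```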

The IsAllowed purpose should be clear, I believe, but let us talk a bit about the AddPermissionsToQuery part, shall we?

Once upon a time, I built a system that had a Security Service, a separate system running on a different machine. That meant that in order to find out if the user had permission to perform some action, I had to send the security service the entity type, the id and the requested operation. This worked, but it was problematic when we wanted to show the user more than a single entity at a time. Because the system was external, we couldn't involve it in the query directly, which meant that we had to send the entire result set to the external service for filtering. Beyond the performance issue, there was another big problem: we had no way to reliably perform paged queries. The service could decide to chop 50% of the returned results, and we would need to compensate for that somehow. That wasn't fun, let me tell you that.

So, in the next application that I built, I used a different approach. Instead of an external security service, I had an internal one, and I could send all my queries through it. The security service would enhance the query so that permissions would be observed, and everything just worked. It was a pleasure to watch. In that case, we had a lot of methods that did it, because we had a custom security infrastructure. In this case, I think we can get away with a single AddPermissionsToQuery method, since the security infrastructure in place is standardized.
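To make the paging problem concrete, here is a minimal sketch that contrasts the two approaches. It is not the real implementation: the Case record, the "owner only" rule and the in-memory IQueryable are stand-ins I made up for illustration, and the real method would translate actual permission rules into the query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public record Case(int Id, string Owner);

public static class PagingSketch
{
    // Stand-in for AddPermissionsToQuery: the rule becomes part of the query itself.
    public static IQueryable<Case> AddPermissionsToQuery(IQueryable<Case> query, string user)
    {
        return query.Where(c => c.Owner == user);
    }

    // Internal approach: filter first, then page. Pages stay full.
    public static List<Case> FilterThenPage(IQueryable<Case> query, string user, int page, int pageSize)
    {
        return AddPermissionsToQuery(query, user)
            .OrderBy(c => c.Id)
            .Skip(page * pageSize)
            .Take(pageSize)
            .ToList();
    }

    // External approach: page first, then let the "service" drop rows afterwards.
    public static List<Case> PageThenFilter(IQueryable<Case> query, string user, int page, int pageSize)
    {
        return query
            .OrderBy(c => c.Id)
            .Skip(page * pageSize)
            .Take(pageSize)
            .ToList()                      // the page leaves the database here
            .Where(c => c.Owner == user)   // filtering happens too late
            .ToList();
    }
}
```

With ten cases alternating between two owners and a page size of four, FilterThenPage returns four cases for the first page, while PageThenFilter returns only two, which is exactly the kind of short page we had to compensate for.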

Now, why do we have a Why method there? Pretty strange method, that one, no?

Well, yes, it is. But this is also something that came up through painful experience. In any security system of significant complexity, you will have to ask yourself questions such as "Why does this user see this information?" and "Why can't I see this information?"

I remember once getting a Priority Bug saying that some users were not seeing information that they should see. I sat there and looked at it, and couldn't figure out how they got to that point. After we gave up on understanding it on our own, we started debugging it, and we reproduced the "error" on our machines. After stepping through it ten or twenty times, it suddenly hit me: the system was doing exactly what it was supposed to do. I stepped over the line that did it each and every time that I debugged it, but I never noticed it.

You really want transparency in such a system, because "Access Denied" is about the second most annoying error to debug if the system gives you no further information.

Now, I am going to show you the table structure. This is not set in stone, and don't try to read too much into seeing a table model here. It simply makes it easier to follow the connections than a class diagram would.

[Image: table structure for users, groups, operations and permissions]

Let us go over some of the concepts that we have here, shall we?

Users & Groups should be immediately obvious, so let us focus for a moment on Operations and Permissions. What is an operation? An operation is an action that can happen in the application. Examples of operations are:

  • Account.View
  • Account.Edit
  • Account.ProjectedRevenue.View
  • Account.ProjectedRevenue.Edit
  • Account.Assign
  • Account.SendEmail

As you can see, we have a fairly simple convention here: [Entity].[Action] and [Entity].[Field].[Action]. This allows me to specify granular permissions in a very easy to grok fashion. The operations above are entity-based operations; they operate on a single entity instance at a time. We also have feature-based operations, such as:

  • Features.HelpDesk
  • Features.CustomerPortal

Those operate without an object to verify against, and are a way to turn permissions on or off for an entire section of the application. Since some operations are naturally grouped together, we also have relations between operations, so we will have the "Account" operation, which includes "Account.Edit" and "Account.View" as children. If you are granted the Account operation on an entity, you automatically get "Account.Edit" and "Account.View" on the entity as well.

This makes the design somewhat more awkward, because now we need to go through two levels of operations to find the correct one, but it is not a big deal, since we are going to be smart about how we do it.
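One way of being smart about it is to expand the requested operation into itself plus its parents before hitting the permissions, so a grant on "Account" is found when checking "Account.Edit". The sketch below assumes that the parent can be derived from the dotted name; the post only says that operations are related, so in a real system the relation might come from an Operations table instead. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;

public static class OperationNames
{
    // Returns the operation itself followed by each ancestor,
    // e.g. "Account.Edit" yields "Account.Edit" and then "Account".
    // Assumption: the hierarchy mirrors the dots in the name.
    public static IReadOnlyList<string> WithParents(string operation)
    {
        var result = new List<string>();
        string current = operation;
        while (current.Length > 0)
        {
            result.Add(current);
            int lastDot = current.LastIndexOf('.');
            current = lastDot < 0 ? string.Empty : current.Substring(0, lastDot);
        }
        return result;
    }
}
```

The permission lookup can then ask for any of the returned names in a single query, instead of walking the hierarchy one level at a time.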

A permission allows or revokes an operation on an EntitySecurityKey (explained shortly), and is associated with a Group, a User or an EntityGroup.

A simple example may be something like:

  • For User "Ayende", Allow "Account" on the "Account Entity" EntitySecurityKey, Importance 1
  • For Group "Managers", Revoke "Case.Edit" on "Case Entity" EntitySecurityKey, Importance 1
  • For Group "Users", Revoke "Account.Edit" on "Important Accounts Entity Group" EntitySecurityKey, Importance 1
  • For Group "Managers", Allow "Account.Edit" on "Important Accounts Entity Group" EntitySecurityKey, Importance 10
  • For User "Bob from Northwind", Revoke "Account" on "Northwind Account"  EntitySecurityKey, Importance 1

The algorithm for IsAllowed(account, "Account.Edit", user) is something like this: get all the operations relevant to the current entity, default to deny access, then check the operations. A Revoke operation gets a +1, so it is more important than an Allow operation at the same level. Or in pseudo code (i.e., it doesn't really handle all the complexity involved):

bool isAllowed = false; // default to deny access
int isAllowedImportance = 0;
foreach(Operation operation in GetAllOperationsForUser(user, operationName, entity.EntitySecurityKey))
{
	int importance = operation.Importance;
	if(operation.Allow == false)
		importance += 1; // a revoke outranks an allow at the same level
	if ( isAllowedImportance < importance )
	{
		isAllowed = operation.Allow;
		isAllowedImportance = operation.Importance;
	}
}
return isAllowed;
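For what it is worth, here is a self-contained version of the same loop that can actually be run against the example permissions listed earlier. The Permission and Decision records and the PermissionEvaluator class are names I made up; the input is assumed to be the rows that already matched the user, the operation and the EntitySecurityKey. Returning the winning rule alongside the answer is one possible way to feed the Why method.

```csharp
using System.Collections.Generic;

// Hypothetical shapes: one matching permission row, and the outcome of a check.
public record Permission(string Subject, string Operation, bool Allow, int Importance);
public record Decision(bool Allowed, string Reason);

public static class PermissionEvaluator
{
    public static Decision Decide(IEnumerable<Permission> matching)
    {
        bool isAllowed = false;                 // default to deny access
        int isAllowedImportance = 0;
        string reason = "no matching permission, denied by default";

        foreach (Permission permission in matching)
        {
            int importance = permission.Importance;
            if (!permission.Allow)
                importance += 1;                // a revoke outranks an allow at the same level

            if (isAllowedImportance < importance)
            {
                isAllowed = permission.Allow;
                isAllowedImportance = permission.Importance;
                string verb = permission.Allow ? "Allow" : "Revoke";
                reason = $"{verb} {permission.Operation} for {permission.Subject}, importance {permission.Importance}";
            }
        }
        return new Decision(isAllowed, reason);
    }
}
```

For a manager who is also in "Users", checking "Account.Edit" on an important account sees the Revoke with importance 1 and the Allow with importance 10, so the Allow wins. A plain user only sees the Revoke and is denied.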

As you have probably noticed already, we have the notion of an Entity Security Key. What is that?

Well, when you define an entity you also need to define its default security. This way, you can specify who can view and edit it. Then, when we create an entity, its EntitySecurityKey is copied from the default one. If we want to set special permissions on a specific entity, we will create a copy of all the current permissions on the entity type, and then edit that, under a different EntitySecurityKey, which is related to its parent.

All the operations in the child EntitySecurityKey are automatically more important than the ones in the parent EntitySecurityKey, regardless of the importance score that the parent operations have.

In addition to all of that, we also have the concept of an EntityGroup to consider. Permissions can be granted and revoked on an Entity Group, and those are applicable to all the entities that are members of this group. This way, business logic that touches permissions doesn't need to be scattered all over the place. When a state change affects the permissions on an entity, the entity is added to or removed from an entity group, which has well known operations defined on it.

Now that you probably understand the overall idea, let us talk about what problems we have with this approach.
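As a small illustration of that last point, take the "Special Care" scenario from the beginning. The sketch below is only an assumption about how it could look: the EntityGroup and Account classes and their members are hypothetical, and the group membership is kept in memory purely to keep the example short.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory entity group; a real one would live in the database.
public class EntityGroup
{
    private readonly HashSet<int> memberIds = new HashSet<int>();

    public string Name { get; }

    public EntityGroup(string name) { Name = name; }

    public bool Contains(int entityId) => memberIds.Contains(entityId);
    public void Add(int entityId) => memberIds.Add(entityId);
    public void Remove(int entityId) => memberIds.Remove(entityId);
}

public class Account
{
    private readonly EntityGroup specialCareGroup;

    public int Id { get; }
    public bool SpecialCare { get; private set; }

    public Account(int id, EntityGroup specialCareGroup)
    {
        Id = id;
        this.specialCareGroup = specialCareGroup;
    }

    // The state change only moves the account in or out of the group;
    // the permissions themselves stay defined on the group.
    public void SetSpecialCare(bool value)
    {
        SpecialCare = value;
        if (value)
            specialCareGroup.Add(Id);
        else
            specialCareGroup.Remove(Id);
    }
}
```

The rule "only managers can handle Special Care accounts" then lives in one place, as permissions on the group, and the business logic never touches a permission directly.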

Performance

The security scheme is complex, and off the top of my head, given all the variables, I can't really think of a single query that will answer it for me. The solution for that, like in all things, is to not solve the complex problem, but to break it down into easier problems.

The first thing that we want to consider is what kind of questions we are asking the security system. Right now, I am thinking that the IsAllowed method should have the following signatures:

public bool IsAllowed(Operation, User, Entity);
public bool IsAllowed(Operation, User);

This means that the questions that we will always ask are "Does 'User' have 'Operation' on 'Entity'?" and "Does 'User' have 'Operation'?". The latter is applicable to feature-based operations only, of course.

So, given that this is the question we have, how can we answer this efficiently? Let us try to take the above mentioned table structure and de-normalize it to make queries more efficient. My first attempt is this:

[Image: the denormalized permissions table]

This allows you to very easily query by the above semantics, and get all the required information in a single go.

A lot of the rules that I have previously mentioned will already be calculated in advance when we write to this table, so we have a far simpler scenario when we come to check the actual permissions.

For instance, the EntitySecurityKey that we send is always the one on the Entity, so the DenormalizedPermissions table will always have the permissions from the parent EntitySecurityKey copied with precalculated values.

Since everything is based around the EntitySecurityKey, we also have a very simple time when it comes to updating this table.

All we need to do is rebuild the permissions for this particular EntitySecurityKey.

This makes things much easier, all around.
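A rebuild for a single EntitySecurityKey could look roughly like the sketch below. The way I precalculate "child beats parent" here, by shifting the child importance above the highest parent importance, is purely my assumption for illustration, as are the PermissionRule and DenormalizedRow shapes; the post does not fix the exact scheme.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for the normalized rules and the denormalized rows.
public record PermissionRule(string Subject, string Operation, bool Allow, int Importance);
public record DenormalizedRow(string EntitySecurityKey, string Subject, string Operation, bool Allow, int Importance);

public static class Denormalizer
{
    // Rebuilds every row for one key: parent rules are copied as they are,
    // child rules are shifted above the highest parent importance.
    public static List<DenormalizedRow> Rebuild(
        string entitySecurityKey,
        IReadOnlyCollection<PermissionRule> parentRules,
        IReadOnlyCollection<PermissionRule> childRules)
    {
        // The extra 1 leaves room for the revoke bonus a parent rule gets at check time.
        int shift = parentRules.Count == 0 ? 0 : parentRules.Max(r => r.Importance) + 1;

        var rows = new List<DenormalizedRow>();
        foreach (PermissionRule rule in parentRules)
            rows.Add(new DenormalizedRow(entitySecurityKey, rule.Subject, rule.Operation, rule.Allow, rule.Importance));
        foreach (PermissionRule rule in childRules)
            rows.Add(new DenormalizedRow(entitySecurityKey, rule.Subject, rule.Operation, rule.Allow, rule.Importance + shift));
        return rows;
    }
}
```

Replacing the old rows for that key with the returned list is then a delete followed by a bulk insert, and no other key is touched.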

 

Querying

What this means, in turn, is that we have the following query to issue when we come to check permissions:

SELECT dp.Allow, dp.Importance FROM DenormalizedPermission dp
WHERE       dp.EntitySecurityKey = :EntitySecurityKey
AND         dp.Operation = :Operation
AND         (dp.User = :User OR dp.Group IN (:UserGroups)
                  OR dp.EntityGroup IN (:EntityGroups) )

All we need to do before the query is to find out all the groups that the user belongs to, directly or indirectly, and all the Entity Groups that the entity belongs to.
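Finding the groups that a user belongs to "directly or indirectly" is a plain graph walk. Here is a minimal sketch, assuming the membership is available as a map from a user or group name to the groups it is directly a member of; the GroupResolver name and the map shape are hypothetical.

```csharp
using System.Collections.Generic;

public static class GroupResolver
{
    // directParents maps a user or group name to the groups it is a direct member of.
    public static HashSet<string> AllGroupsFor(
        string user, IReadOnlyDictionary<string, string[]> directParents)
    {
        var found = new HashSet<string>();
        var pending = new Queue<string>();
        pending.Enqueue(user);

        while (pending.Count > 0)
        {
            string current = pending.Dequeue();
            if (!directParents.TryGetValue(current, out var parents))
                continue;

            foreach (string parent in parents)
            {
                // Add returns false for groups already seen, which also guards against cycles.
                if (found.Add(parent))
                    pending.Enqueue(parent);
            }
        }
        return found;
    }
}
```

The resulting set is what gets passed as the user groups parameter of the query; the entity groups of the entity can be collected the same way.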

When it comes down to checking a feature-based operation, we can issue the same query, sans the EntitySecurityKey, and we are done.

Another important consideration is the ability to cache this sort of query. We will probably make a lot of those, so caching is important, but we will probably also want changes in security to take effect immediately. A write-through caching layer can do wonders for both.
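To show what I mean by write-through, here is a deliberately simplified sketch. It caches the final answer per (EntitySecurityKey, operation, user) and treats a write as replacing one such answer; a real system would more likely cache the permission rows and refresh them when an EntitySecurityKey is rebuilt. The class name and the two store delegates are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

public class WriteThroughPermissionCache
{
    private readonly ConcurrentDictionary<(string Key, string Operation, string User), bool> cache =
        new ConcurrentDictionary<(string Key, string Operation, string User), bool>();
    private readonly Func<string, string, string, bool> loadFromStore;
    private readonly Action<string, string, string, bool> saveToStore;

    public WriteThroughPermissionCache(
        Func<string, string, string, bool> loadFromStore,
        Action<string, string, string, bool> saveToStore)
    {
        this.loadFromStore = loadFromStore;
        this.saveToStore = saveToStore;
    }

    // Reads hit the store only on a cache miss.
    public bool IsAllowed(string entitySecurityKey, string operation, string user)
    {
        return cache.GetOrAdd(
            (entitySecurityKey, operation, user),
            k => loadFromStore(k.Key, k.Operation, k.User));
    }

    // Writes go to the store and to the cache together, so the next read sees the change.
    public void Set(string entitySecurityKey, string operation, string user, bool allowed)
    {
        saveToStore(entitySecurityKey, operation, user, allowed);
        cache[(entitySecurityKey, operation, user)] = allowed;
    }
}
```

Because every write passes through the cache, there is no window in which a stale "allowed" answer is served after a revoke, as long as all writes go through this layer.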

What is missing

Just to note: this is not complete. I can think of several scenarios that this has no answer for, from "the owner can do things that others cannot" to supporting permissions when the organization unit is identical for the entity and the user. However, adding those is fairly easy within the system. All we need to do is define an action that adds the owner's permissions explicitly to the entity, and removes them when the owner changes. The same can be done for entities in an organization unit: you would have the group of users in Organization Unit Foo and the Entity Group of entities in Organization Unit Foo, and that entity group would have a permission set for that group of users.

Final thoughts

This turned out to be quite a bit longer than anticipated. I am waiting expectantly for you, dear reader, to tell me how grossly off I am.

Next topics:

  • Hot deployments and distributed deployments
  • A database that doesn't make you cry
  • Supporting upgrades
  • Platform I/O - integration with the rest of the enterprise

More posts in "A vision of enterprise platform" series:

  1. (29 Nov 2007) A database that you don't hide in the attic
  2. (24 Nov 2007) Hot & Distributed Deployment
  3. (17 Nov 2007) Security Infrastructure