NHibernate Generics:
Q: Where can I get NHibernate.Generics?
A: Just go to the downloads page, or click here.
Q: What is the use of NHibernate.Generics?
A: Well, it's pretty simple, you get a generic smart collection that you can work with, along with all the benefits of working with NHibernate.
Q: What do you mean, a generic smart collection?
A: I mean that you tell it to do what you want, and it figures out how to do it. This means that after configuring it once, you don't need to worry about it anymore. You get the benefits of OOP and the comfortable syntax while still getting the reliable backend of a database.
Q: Okay, enough marketing speak, just show me what you mean!
A: With pleasure...
The problem:
You are working in .Net 2, and wish to use NHibernate for the data layer. All is well and good: you start coding, and pretty soon you've got yourself business objects and mappings for NHibernate. You're happy as a lamb until you start using it, and you notice that you get IList and ISet and the like all over the place. Your spider sense starts tingling: you're on .Net 2, so why are you using un-typed collections?
NHibernate doesn't support generic collections (yet), which is a bit of a bummer, but you go on, ignoring that slight tingling on the back of your neck. Then you notice that sometimes your relationships are not persisted to the database. You scratch your head for a bit, and then quickly find the problem. You had code like this:
    // Load() returns object, so the result needs a cast.
    Blog blog = (Blog)session.Load(typeof(Blog), 1);
    Post post = new Post("Brilliant Post By Ayende");
    blog.Posts.Add(post);
    session.Save(post);
    session.Flush();
And of course NHibernate doesn't persist the relationship to the database, since you created it on the wrong side of the connection (think about the mapping in the database and you'll see the problem). You should've written it like this:
    Blog blog = (Blog)session.Load(typeof(Blog), 1);
    Post post = new Post("Brilliant Post By Ayende");
    // Set the reference on the Post side, which is the side that gets persisted.
    post.Blog = blog;
    blog.Posts.Add(post);
    session.Save(post);
    session.Flush();
Now it works, but it's another thing to remember. You can always do the recommended thing and add an AddPost() method to the Blog class, and a SetBlog() method to the Post class. But it's a pain to do so for each collection, for each class (n^2, anyone? I don't like this type of algorithm). And it's very easy to forget to use those methods and use the natural blog.Posts.Add() instead.
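For comparison, the hand-written approach could look roughly like the sketch below. It is self-contained and uses a plain List<Post> instead of an NHibernate collection. The method bodies, the RemovePostInternal() helper and the Demo class are assumptions made for illustration; they are not code from NHibernate or NHibernate.Generics.

```csharp
using System;
using System.Collections.Generic;

public class Blog
{
    private readonly List<Post> posts = new List<Post>();

    // Read-only view, so callers have to go through AddPost().
    public IList<Post> Posts
    {
        get { return posts.AsReadOnly(); }
    }

    public void AddPost(Post post)
    {
        if (post == null) throw new ArgumentNullException("post");
        if (posts.Contains(post)) return; // already linked, nothing to do
        posts.Add(post);
        post.SetBlog(this);               // keep the other side in sync
    }

    // Hypothetical helper, called by Post.SetBlog() when a post changes blogs.
    internal void RemovePostInternal(Post post)
    {
        posts.Remove(post);
    }
}

public class Post
{
    private Blog blog;
    private readonly string title;

    public Post(string title) { this.title = title; }

    public string Title { get { return title; } }
    public Blog Blog { get { return blog; } }

    public void SetBlog(Blog newBlog)
    {
        if (object.ReferenceEquals(blog, newBlog)) return; // stops the ping-pong
        if (blog != null) blog.RemovePostInternal(this);
        blog = newBlog;
        if (newBlog != null) newBlog.AddPost(this);
    }
}

public static class Demo
{
    public static void Main()
    {
        Blog blog = new Blog();
        Post post = new Post("Hello");
        blog.AddPost(post);
        Console.WriteLine(object.ReferenceEquals(post.Blog, blog)); // True
        Console.WriteLine(blog.Posts.Count);                        // 1
    }
}
```

That is two methods plus guard logic for a single relationship, and the same boilerplate has to be repeated for every collection on every class.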
Q: Okay, I understand the problem. It seems reasonable that this is the way it is. The problem is that objects and rows don't behave the same way. What snake oil do you have to convince me that you've solved it?
A: Well, generics and anonymous delegates are a good solution for a lot of things, and you can plug into NHibernate very easily... So, without further ado, let's see the code.
Q: Finally.
The solution:
First, the introductions to our newest team mates:
- EntityRef<T> - Represents the One side of a OneToMany relationship. It is assumed that the relationship is maintained on this side.
- EntitySet<T> - Represents the Many side of a ManyToOne relationship. It is assumed that the relationship is maintained on the other side.
For our example, we will take the Blog -> Posts example, where a single blog may have many posts. Here is the code for the Blog class:
1: public class Blog
2: {
3: EntitySet<Post> _posts;
4: int blog_id;
5:
6: public int BlogID
7: {
8: get { return blog_id; }
9: set { blog_id = value; }
10: }
11: string blog_name;
12:
13: public string BlogName
14: {
15: get { return blog_name; }
16: set { blog_name = value; }
17: }
18:
19: public virtual ICollection<Post> Posts
20: {
21: get { return _posts; }
22: }
23:
24: public Blog()
25: {
26: _posts = new EntitySet<Post>(
27: delegate(Post p) { p.Blog = this; },
28: delegate(Post p) { p.Blog = null; }
29: );
30: }
31:
32: public Blog(string name) : this()
33: {
34: this.blog_name = name;
35: }
36: }
You can see that we define an EntitySet of Post (line #3), and that we expose it via a getter property as an ICollection of Post (line #19), which is what you would expect on .Net 2. You may wonder about the constructor (line #26); it's pretty peculiar, isn't it? The EntitySet constructor takes two delegates, which specify the action that should occur when an item is added to or removed from the collection. This allows the EntitySet to manage the relationship without your intervention all the time.
And here is how the other side looks:
1: public class Post
2: {
3: int post_id;
4:
5: public int PostId
6: {
7: get { return post_id; }
8: set { post_id = value; }
9: }
10: string post_title;
11:
12: public string PostTitle
13: {
14: get { return post_title; }
15: set { post_title = value; }
16: }
17:
18: EntityRef<Blog> _blog;
19:
20: public Blog Blog
21: {
22: get { return _blog.Value; }
23: set { _blog.Value = value; }
24: }
25:
26: public Post()
27: {
28: _blog = new EntityRef<Blog>(
29: delegate(Tests.Blog b) { b.Posts.Add(this); },
30: delegate(Tests.Blog b) { b.Posts.Remove(this); }
31: );
32: }
33:
34: public Post(string title):this()
35: {
36: this.post_title = title;
37: }
38: }
You can see that we are using an EntityRef of Blog (line #18) and we expose it in a property as Blog (line #20), which means that client code doesn't need to know what is going on there. Again, we have to set up the relationship (line #28) and tell the EntityRef what to do when a Blog is set on or cleared from the property.
Q: That can't be all there is to it! NHibernate would replace the Posts collection with what it loaded from the database, and you would lose the relationships. And the same goes for the Blog.
A: Indeed, it's not all, but it's nearly so. Let's first take a look at the mapping and then we'll talk about how it's all implemented.
1: <?xml version='1.0' encoding='utf-8'?>
2: <hibernate-mapping
3: xmlns:xsd='http://www.w3.org/2001/XMLSchema'
4: xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
5: xmlns='urn:nhibernate-mapping-2.0'>
6: <class
7: name='NHibernate.Generics.Tests.Blog, NHibernate.Generics.Tests'
8: table='Blogs'>
9: <id
10: name='BlogID'
11: column='blog_id'
12: unsaved-value='0'>
13: <generator
14: class='native' />
15: </id>
16: <property
17: name='BlogName'
18: column='blog_name' />
19: <set
20: lazy='true'
21: inverse='true'
22: name='Posts'
23: access='NHibernate.Generics.GenericAccessor, NHibernate.Generics'>
24: <key
25: column='post_blogid' />
26: <one-to-many
27: class='NHibernate.Generics.Tests.Post, NHibernate.Generics.Tests' />
28: </set>
29: </class>
30: <class
31: table='Posts'
32: name='NHibernate.Generics.Tests.Post, NHibernate.Generics.Tests'>
33: <id
34: name='PostId'
35: column='post_id'
36: unsaved-value='0'>
37: <generator
38: class='native' />
39: </id>
40: <many-to-one
41: name='Blog'
42: access='NHibernate.Generics.GenericAccessor, NHibernate.Generics'
43: class='NHibernate.Generics.Tests.Blog, NHibernate.Generics.Tests'
44: column='post_blogid' />
45: </class>
46: </hibernate-mapping>
You can see that there isn't anything special in the mapping except on lines #23 and #42, where we define a custom access strategy for the Posts collection and the Blog property. This is done in order to preserve the relationship declarations that you provided in the constructor. And that is it. Nothing more to do. You can start using the objects and they will automatically keep the relationships between themselves. The users of the class get the comfortable syntax that they know and love, and the author of the class doesn't need to bother with maintaining anything, since it's fire & forget. Neat, isn't it?
Q: How does it work?
A: Well, here are the details.
Implementation:
EntitySet<> is a wrapper around a regular ISet (which is why it needs the GenericAccessor access strategy). When you add to or remove from the set, it makes sure to execute the correct action. The GenericAccessor makes sure that NHibernate manages the inner collection without disturbing the settings you made on the EntitySet<>. EntityRef<> works in a similar manner, but there is no need to use custom access strategy since you can just use the
Q: So it's not really strongly typed? Wouldn't it cause problems in performance?
A: It's strongly typed for you and your users; internally it uses an un-typed ISet to hold the values. There should be no negative impact on performance, since there is no boxing involved (because there are no value types).
Q: Okay, I'll admit that this looks good. But what are the guidelines for using it?
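To make the idea of a typed facade over un-typed storage concrete, here is a minimal sketch. CallbackCollection<T> and Demo are hypothetical names, and the code is an illustration of the general technique, not the actual EntitySet<T> implementation. It stores items in a non-generic ArrayList and runs a delegate whenever an item is really added or removed.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the idea behind EntitySet<T>; not the library's code.
public class CallbackCollection<T> : ICollection<T> where T : class
{
    private readonly IList inner = new ArrayList(); // un-typed storage
    private readonly Action<T> onAdd;
    private readonly Action<T> onRemove;

    public CallbackCollection(Action<T> onAdd, Action<T> onRemove)
    {
        this.onAdd = onAdd;
        this.onRemove = onRemove;
    }

    public void Add(T item)
    {
        if (item == null || inner.Contains(item)) return; // set semantics
        inner.Add(item);
        if (onAdd != null) onAdd(item);
    }

    public bool Remove(T item)
    {
        if (item == null || !inner.Contains(item)) return false;
        inner.Remove(item);
        if (onRemove != null) onRemove(item);
        return true;
    }

    public void Clear()
    {
        // Iterate over a copy, because Remove() mutates the inner list.
        foreach (T item in new ArrayList(inner)) Remove(item);
    }

    public bool Contains(T item) { return inner.Contains(item); }
    public int Count { get { return inner.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void CopyTo(T[] array, int index) { inner.CopyTo(array, index); }

    public IEnumerator<T> GetEnumerator()
    {
        // Reference-type cast only, so no boxing takes place.
        foreach (object o in inner) yield return (T)o;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Demo
{
    public static void Main()
    {
        List<string> log = new List<string>();
        CallbackCollection<string> names = new CallbackCollection<string>(
            delegate(string s) { log.Add("added " + s); },
            delegate(string s) { log.Add("removed " + s); });

        names.Add("a");
        names.Add("a");    // duplicate, ignored, no callback
        names.Remove("a");
        Console.WriteLine(string.Join(", ", log.ToArray())); // added a, removed a
    }
}
```

The `where T : class` constraint mirrors the point made above: with reference types only, the casts in and out of the un-typed storage never box.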
A: Thanks, here are the guidelines.
Guidelines:
Don't expose a setter for the EntitySet<T>. This is generally sound advice anywhere you're exposing collections to the outside world.
Don't expose an EntityRef<T> directly; use its Value property inside the property and let it manage the connections for you.
You need to modify the access strategy for your collection. The default is "NHibernate.Generics.GenericAccessor, NHibernate.Generics" (camel casing with an underscore prefix), but if your naming convention calls for just camel casing, you can specify that by using "NHibernate.Generics.GenericAccessor+CamelCase, NHibernate.Generics". All of NHibernate's access strategies are supported (just replace CamelCase with your favorite naming strategy, like "NHibernate.Generics.GenericAccessor+PascalCaseUnderscore, NHibernate.Generics").
Be aware of what is going on. Having a nicer syntax doesn't mean you don't need to understand how things work!
If you use EntitySet<> for a many-to-many relationship, be sure to pass InitializeOnLazy.Always to the instance on the side that is responsible for the relation (inverse=false in the mapping).
Questions:
Q: Wait, don't you have infinite recursion here? You add a post to a blog, which sets the post's blog, which adds it to the blog, which...
A: No, this is taken care of by EntitySet<> and EntityRef<>, they work together to ensure no double adding/removing, etc.
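One way such a defense could work is sketched below. SimpleRef<T>, SimpleSet<T> and Demo are hypothetical names, and this is a guess at the general technique rather than the library's actual code. Each side only fires its callback when something really changes, so the round trip stops after one bounce, and null is never handed to a delegate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference holder: callbacks fire only when the value really changes.
public class SimpleRef<T> where T : class
{
    private T current;
    private readonly Action<T> onSet;
    private readonly Action<T> onCleared;

    public SimpleRef(Action<T> onSet, Action<T> onCleared)
    {
        this.onSet = onSet;
        this.onCleared = onCleared;
    }

    public T Value
    {
        get { return current; }
        set
        {
            if (object.ReferenceEquals(current, value)) return; // guard: no change, no callbacks
            T old = current;
            current = value;                 // update state before notifying anyone
            if (old != null) onCleared(old); // delegates never receive null
            if (value != null) onSet(value);
        }
    }
}

// Hypothetical set: duplicate adds and removals of missing items are ignored.
public class SimpleSet<T> where T : class
{
    private readonly List<T> items = new List<T>();
    private readonly Action<T> onAdd;
    private readonly Action<T> onRemove;

    public SimpleSet(Action<T> onAdd, Action<T> onRemove)
    {
        this.onAdd = onAdd;
        this.onRemove = onRemove;
    }

    public int Count { get { return items.Count; } }
    public bool Contains(T item) { return items.Contains(item); }

    public void Add(T item)
    {
        if (item == null || items.Contains(item)) return; // guard: already present
        items.Add(item);
        onAdd(item);
    }

    public void Remove(T item)
    {
        if (item == null || !items.Remove(item)) return;  // guard: was not present
        onRemove(item);
    }
}

public class Blog
{
    public readonly SimpleSet<Post> Posts;

    public Blog()
    {
        Posts = new SimpleSet<Post>(
            delegate(Post p) { p.Blog = this; },
            delegate(Post p) { p.Blog = null; });
    }
}

public class Post
{
    private readonly SimpleRef<Blog> blog;

    public Post()
    {
        blog = new SimpleRef<Blog>(
            delegate(Blog b) { b.Posts.Add(this); },
            delegate(Blog b) { b.Posts.Remove(this); });
    }

    public Blog Blog
    {
        get { return blog.Value; }
        set { blog.Value = value; }
    }
}

public static class Demo
{
    public static void Main()
    {
        Blog blog = new Blog();
        Post first = new Post();
        Post second = new Post();
        blog.Posts.Add(first);  // start from the collection side
        second.Blog = blog;     // start from the reference side
        Console.WriteLine(blog.Posts.Count);     // 2
        blog.Posts.Remove(first);
        Console.WriteLine(first.Blog == null);   // True
        Console.WriteLine(blog.Posts.Count);     // 1
    }
}
```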
Q: Do I need to worry about null handling in the delegates that I pass to EntitySet<> and EntityRef<>?
A: No, they will take care of it by themselves and never pass a null reference to the delegates.
Q: Can I use EntitySet<> and EntityRef<> in multi-threaded scenarios?
A: No, they are not safe for use in multi-threaded scenarios (and neither are most of the other collections). You can modify them easily to support it, though.
Q: What are the gotchas? There has got to be at least one.
A: Well, there is one...
Gotchas:
One thing that you need to be aware of is the behavior of EntitySet when it holds a lazy collection that wasn't initialized. If you add an item to the collection, and the collection is not loaded, EntitySet doesn't load the collection. This is done because EntitySet assumes that the relationship is maintained on the other side, so it is not important to load the collection just to add one item.
Q: Okay, seems reasonable, but where is the gotcha?
A: It's an edge case, but here is the code that might surprise you.
1: [Test]
2: public void CantGetAddedItemsFromLazyLoadCollectionIfNotSavedToDB()
3: {
4: Blog blog = new Blog("My Blog");
5: session.Save(blog);
6: session.Flush();
7: session.Dispose();
8:
9: //Needs to create a new session so I wouldn't get the same instance
10: session = factory.OpenSession();
11:
12: Blog blogFromDB = (Blog)session.Load(typeof(Blog), blog.BlogID);
13: Post newPost = new Post("Second Post");
14: blogFromDB.Posts.Add(newPost);
15: Assert.IsFalse(blogFromDB.Posts.Contains(newPost));
16: }
Check out line #15. Even though we just added the newPost to the blog, it's not in the collection. WTF?!
What goes on here is simple:
- You added a post to a blog whose Posts collection is lazy loaded.
- EntitySet detected this and didn't load the collection from the database just to add a single item. It relies on the relationship being maintained on the other side, so it did nothing more than notify the other side (newPost.Blog) that the connection was made.
- You check whether newPost is in the Posts collection, which causes the following:
  - NHibernate loads the collection from the database.
  - The newPost relationship was not saved to the database yet, so it wasn't loaded (because there is nothing to load).
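The sequence above can be reproduced without a database. In this sketch, LazySet<T> and Demo are hypothetical names, and a loader delegate stands in for NHibernate reading the rows. It is a simulation of the behavior described, not the EntitySet<T> code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lazy set: the loader delegate plays the role of the database query.
public class LazySet<T>
{
    private readonly Func<IEnumerable<T>> loader;
    private readonly Action<T> onAdd;
    private List<T> items; // null until the set is loaded

    public LazySet(Func<IEnumerable<T>> loader, Action<T> onAdd)
    {
        this.loader = loader;
        this.onAdd = onAdd;
    }

    public bool IsInitialized { get { return items != null; } }

    public void Load()
    {
        if (items == null) items = new List<T>(loader());
    }

    public void Add(T item)
    {
        onAdd(item); // always notify the other side
        // Only touch the in-memory list if it has already been loaded.
        if (items != null && !items.Contains(item)) items.Add(item);
    }

    public bool Contains(T item)
    {
        Load(); // reading forces the load
        return items.Contains(item);
    }
}

public static class Demo
{
    public static void Main()
    {
        List<string> savedRows = new List<string>(); // pretend database table
        savedRows.Add("First Post");
        List<string> notified = new List<string>();

        LazySet<string> posts = new LazySet<string>(
            delegate() { return savedRows; },
            delegate(string p) { notified.Add(p); });

        posts.Add("Second Post");                         // not loaded: only the callback runs
        Console.WriteLine(posts.IsInitialized);           // False
        Console.WriteLine(posts.Contains("Second Post")); // False, it was never saved

        LazySet<string> loadedFirst = new LazySet<string>(
            delegate() { return savedRows; },
            delegate(string p) { notified.Add(p); });

        loadedFirst.Load();                               // load explicitly before adding
        loadedFirst.Add("Second Post");
        Console.WriteLine(loadedFirst.Contains("Second Post")); // True
    }
}
```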
Q: How do I solve this issue? It seems like it could be a real problem.
A: Not really, it's a true edge case. It can happen if and only if you have an unloaded lazy collection, you add an item to it, and you access the collection before you've saved the item to the database.
Q: Edge case or not, I can see places where it would happen in my code, what do I do?
A: You save the item to the database before using the collection, or you use the IsInitialized property and the Load() method on the EntitySet<T> before adding to it. Here are the examples:
1: [Test]
2: public void CanGetNewlyAddedItemInLazyCollectionIfSavedToDB()
3: {
4: Blog blog = new Blog("My Blog");
5: session.Save(blog);
6: session.Flush();
7: session.Dispose();
8:
9: //Needs to create a new session so I wouldn't get the same instance
10: session = factory.OpenSession();
11:
12: Blog blogFromDB = (Blog)session.Load(typeof(Blog), blog.BlogID);
13: Post newPost = new Post("Second Post");
14: blogFromDB.Posts.Add(newPost);
15: session.Save(newPost);
16: Assert.IsTrue(blogFromDB.Posts.Contains(newPost));
17: }
You can see that we save newPost (line #15) right after adding it to the collection (line #14). Now, when we access the Posts collection (line #16), NHibernate initializes it and loads the newly added item as well, since it's in the database.
Another way to do that is to explicitly load the collection before adding:
1: [Test]
2: public void LoadCollectionAndThenAddingToItAddsToInMemoryCollection()
3: {
4: Blog blog = new Blog("My Blog");
5: session.Save(blog);
6: session.Flush();
7: session.Dispose();
8:
9: //Needs to create a new session so I wouldn't get the same instance
10: session = factory.OpenSession();
11:
12: Blog blogFromDB = (Blog)session.Load(typeof(Blog), blog.BlogID);
13: EntitySet<Post> posts = (EntitySet<Post>)blogFromDB.Posts;
14: posts.Load();//Load the lazy collection.
15: Assert.IsTrue(posts.IsInitialized);
16: Post newPost = new Post("Second Post");
17: blogFromDB.Posts.Add(newPost);
18: Assert.IsTrue(blogFromDB.Posts.Contains(newPost));
19: }
This demonstrates loading the collection explicitly. First we need to cast it (line #13) to an EntitySet<Post>, since we're exposing it as ICollection<Post> (usually this would happen inside the containing class, so you wouldn't need to do this), and then we call the Load() method (line #14), which loads the data from the database. Now when we add the newPost, it's added to the Posts collection. :-)
Q: But what about when I have a many-to-many association, and then I hit this edge case? This can cause more serious trouble.
A: This is true, and this is why all the constructors for EntitySet<T> have an overload that accepts an InitializeOnLazy enum. Just pass InitializeOnLazy.Always in the constructor, and you are set. You shouldn't use this everywhere, though, since there are good reasons to avoid it beyond performance. For a start, NHibernate assumes that all collections are stupid, and doesn't really like you messing around with the objects while it is loading them. See below for more about many-to-many associations.
Q: I want to update the collection from the delegate being executed, but it doesn't work. What is wrong?
A: Well, you hit the collection recursion defense. Are you sure you need this? If you aren't sure, then think carefully about your design. Here is how you do it:
1: [Test]
2: public void ChangeCollectionWhileInsideAction()
3: {
4: //Require a class member variable, can't use a local
5: //variable, because of delegates rules.
6: _posts = new EntitySet<Post>(
7: delegate(Post p)
8: {
9: using (_posts.AllowModifications)
10: {
11: _posts.Add(new Post("Duplicate"));
12: }
13: }, null);
14:
15: _posts.Add(new Post("Original"));
16:
17: Assert.AreEqual(2, _posts.Count);
18:
19: }
Pay attention to line #9: we are accessing AllowModifications within a using block. This makes sure that when we exit the block, the set will be in a consistent state. As you can see, this allows you to modify the collection while a modification is in progress. It is also a single chance only, so we don't get into recursion when we add the duplicate post to the posts collection.
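For readers wondering how a "single chance" scope like this can be built, here is one possible shape. GuardedList<T>, its Scope class and Demo are hypothetical, and the real EntitySet<T> may work quite differently. A flag blocks nested adds while a callback is running, and the IDisposable returned by AllowModifications lifts that block for exactly one add.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection with a recursion defense and a one-shot override.
public class GuardedList<T>
{
    private readonly List<T> items = new List<T>();
    private readonly Action<T> onAdd;
    private bool inCallback; // true while onAdd is running
    private bool allowOnce;  // set by AllowModifications, consumed by one Add()

    public GuardedList(Action<T> onAdd) { this.onAdd = onAdd; }

    public int Count { get { return items.Count; } }

    public IDisposable AllowModifications
    {
        get
        {
            allowOnce = true;
            return new Scope(this);
        }
    }

    public void Add(T item)
    {
        if (inCallback)
        {
            if (!allowOnce) return; // recursion defense: nested adds are ignored
            allowOnce = false;      // single chance only
            items.Add(item);        // nested add does not fire the callback again
            return;
        }

        items.Add(item);
        inCallback = true;
        try
        {
            if (onAdd != null) onAdd(item);
        }
        finally
        {
            inCallback = false;
        }
    }

    private sealed class Scope : IDisposable
    {
        private readonly GuardedList<T> owner;
        public Scope(GuardedList<T> owner) { this.owner = owner; }

        // Leaving the using block restores the defense even if nothing was added.
        public void Dispose() { owner.allowOnce = false; }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Declared before assignment so the delegate can refer to it.
        GuardedList<string> posts = null;
        posts = new GuardedList<string>(
            delegate(string p)
            {
                posts.Add("Ignored");        // blocked by the defense
                using (posts.AllowModifications)
                {
                    posts.Add("Duplicate");  // allowed exactly once
                    posts.Add("Ignored too");
                }
            });

        posts.Add("Original");
        Console.WriteLine(posts.Count); // 2
    }
}
```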
Q: Can I use this for many to many associations?
A: Yes, but you need to be aware of which side holds the connection, that is, which side NHibernate considers the authoritative source (which just means inverse=false in the NHibernate mapping). This side needs to be passed InitializeOnLazy.Always in the constructor. You can see it clearly in the two tests below. But first, some background: the test state has two users and two blogs, with a many-to-many association between users and blogs. The side that is responsible for maintaining the link is the Blogs property on the User class, so its EntitySet is defined like this:
    _blogs = new EntitySet<Blog>(
        delegate(Blog b) { b.Users.Add(this); },
        delegate(Blog b) { b.Users.Remove(this); },
        InitializeOnLazy.Always);
And here is the test:
    [Test]
    public void LazyLoadingWhenTheSetsAreNotInitialized_AnInitOnLazySetToAlways()
    {
        User ayendeFromDb = (User)session.Load(typeof(User), ayende.UserId);

        EntitySet<Blog> blogs = (EntitySet<Blog>)ayendeFromDb.Blogs;
        Assert.IsFalse(blogs.IsInitialized);

        // Adding on the authoritative side forces the lazy collection to load.
        ayendeFromDb.Blogs.Add(new Blog());

        Assert.IsTrue(blogs.IsInitialized);
        Assert.AreEqual(3, ayendeFromDb.Blogs.Count);
    }
And now on the other side, the Users collection on the Blog class is not the authoritative source, so we define it as usual, like this:
    _users = new EntitySet<User>(
        delegate(User u) { u.Blogs.Add(this); },
        delegate(User u) { u.Blogs.Remove(this); });
And here is the test that shows what happens if we try it on the other side:
    //Notice that here the famous edge-case appears,
    //since the blog side doesn't hold the
    [Test]
    public void LazyLoadingWhenTheSetsAreNotInitialized()
    {
        Blog techFromDb = (Blog)session.Load(typeof(Blog), tech.BlogID);

        EntitySet<User> users = (EntitySet<User>)techFromDb.Users;
        Assert.IsFalse(users.IsInitialized);

        techFromDb.Users.Add(new User());

        Assert.IsFalse(users.IsInitialized);
        Assert.AreEqual(2, techFromDb.Users.Count);
    }
As you can see, the collection was not loaded, and when it was loaded, the new user wasn't added to it (and wouldn't be until we save the user and reload the collection).
I don't recommend using InitializeOnLazy.Always on both sides, since this is just wasteful and, again, may interfere with the way NHibernate works. One side should do it.
For reference, here is the table structure. As you can see, there is nothing special here:
    CREATE TABLE [dbo].[Blogs] (
        [blog_id] [int] IDENTITY (1, 1) NOT NULL,
        [blog_name] [varchar] (50) NULL
    ) ON [PRIMARY]

    CREATE TABLE [dbo].[Posts] (
        [post_id] [int] IDENTITY (1, 1) NOT NULL,
        [post_title] [varchar] (50) NULL,
        [post_blogid] [int] NULL
    ) ON [PRIMARY]