Random NHibernate tweets
Those crossed my desk recently, and I thought they were interesting.
I am seeing more and more movement behind NHibernate.
Comments
@Jesuslover037 - NHibernate is winning over the Christian demographic.
Working on a big LinqToSql -> NHibernate refactoring. Very interesting mapping an existing big and complex domain model that isn't built with ORM mapping in mind. LinqToSql was only used as a data layer; there was a VERY big translation layer that mapped LinqToSql objects to domain objects, and this translation layer was replaced by mapping domain objects directly to the database via NHibernate.
The amount of code to handle save/update/cascade scenarios and the translation from LinqToSql to the domain model was huge, and all of that complex and error-prone code is being removed as we move more parts to NHibernate.
I have worked with NHibernate for many years, but this is the first project I have used NHProf on, and it is something of a game changer. It puts NHibernate in another league compared to other ORM frameworks because of the transparency and good guidance it gives.
We have had no show stoppers; NHibernate is so flexible and extensible. We have created custom IUserTypes to map to existing value types and custom ID generators, used almost every inheritance scheme, and, when needed, used <sql-query> to map complex queries/projections to objects.
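For a feel of what that looks like, here is a rough sketch of a custom IUserType (not our actual code; Money is a made-up immutable value type stored as a single decimal column, and the member signatures follow the NHibernate 2.x interface, so newer versions will differ slightly):

    using System;
    using System.Data;
    using NHibernate.SqlTypes;
    using NHibernate.UserTypes;

    // Made-up immutable value type, used only for this sketch.
    public class Money
    {
        public Money(decimal amount) { Amount = amount; }
        public decimal Amount { get; private set; }
        public override bool Equals(object obj)
        {
            var other = obj as Money;
            return other != null && other.Amount == Amount;
        }
        public override int GetHashCode() { return Amount.GetHashCode(); }
    }

    public class MoneyUserType : IUserType
    {
        // Persisted as a single decimal column.
        public SqlType[] SqlTypes { get { return new[] { new SqlType(DbType.Decimal) }; } }
        public Type ReturnedType { get { return typeof(Money); } }
        public bool IsMutable { get { return false; } }

        public object NullSafeGet(IDataReader rs, string[] names, object owner)
        {
            int ordinal = rs.GetOrdinal(names[0]);
            return rs.IsDBNull(ordinal) ? null : new Money(rs.GetDecimal(ordinal));
        }

        public void NullSafeSet(IDbCommand cmd, object value, int index)
        {
            var parameter = (IDataParameter)cmd.Parameters[index];
            parameter.Value = value == null ? (object)DBNull.Value : ((Money)value).Amount;
        }

        public new bool Equals(object x, object y) { return object.Equals(x, y); }
        public int GetHashCode(object x) { return x == null ? 0 : x.GetHashCode(); }

        // Money is immutable, so copying and caching are pass-through.
        public object DeepCopy(object value) { return value; }
        public object Replace(object original, object target, object owner) { return original; }
        public object Assemble(object cached, object owner) { return cached; }
        public object Disassemble(object value) { return value; }
    }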
The only issue we have now is that you can't join to another table using a column other than the id column.
groups.google.com/.../99cd63bbd487aec4
NHibernate -> LightSpeed here.
Used NH on a legacy ERP database and it was able to handle plenty of absurdity. Love NH.
Changed jobs and am working with a custom data access layer. Well ... "layer" is being kind -- just custom data access.
I miss NH a lot.
Not sure how to be the new guy telling them they should change how they do things :)
Personally I'm the opposite. Whilst I love NHibernate, I've recently started to think it's trying to attack a huge problem space and accomplish something that just isn't realistic; there are lots of places to trip up and have to drop down to the underlying system (SQL).
Been experimenting more recently with dumb conceptual models, lots of cheap binding / transfer models and lots of intelligent automated data mapping.
Seems to fit better with what I do anyway. I'm sure gigantic apps would still prefer to 'patch over' the gaps that ORMs implicitly have.
I find it amusing when people cite the ability to swap out different databases as a good reason to abstract the data layer. It is never a swap.
FWIW,
I think it is possible to do so. It would require some digging, but I am pretty sure that you can.
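If memory serves, the many-to-one mapping's property-ref attribute is the relevant bit: it lets the association join on a property other than the id. Roughly something like this (an untested sketch; Order, Customer and Code are made-up names):

    <class name="Order" table="Orders">
      <id name="Id" column="Id">
        <generator class="native" />
      </id>
      <!-- Joins Orders.CustomerCode to Customer.Code instead of Customer.Id -->
      <many-to-one name="Customer" column="CustomerCode" property-ref="Code" />
    </class>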
Yeah same, for simplicity and max-performance reasons I ended up switching to writing my own lightweight ORM wrapper around the System.Data interfaces. C# 3.0 actually makes this really easy to do; I was able to knock up something that's pretty feature complete (for flat POCOs) in a day. Anyone is welcome to the source:
code.google.com/.../ServiceStack.OrmLite.Sqlite
Using a convention-based approach, the code ended up a lot cleaner and performs better, and I get access to raw SQL so I can control exactly how many calls my application makes and ensure there are no full-table scans and that queries only hit indexes.
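The core idea is tiny; this isn't the OrmLite code from the link, just a rough sketch of the convention (column name == property name) over the plain System.Data interfaces:

    using System;
    using System.Collections.Generic;
    using System.Data;

    public static class SimplePocoMapper
    {
        // Materializes flat POCOs from any IDbConnection by matching
        // column names to property names (the convention). No mapping files.
        public static List<T> Select<T>(IDbConnection db, string sql) where T : new()
        {
            var results = new List<T>();
            using (var cmd = db.CreateCommand())
            {
                cmd.CommandText = sql;
                using (var reader = cmd.ExecuteReader())
                {
                    var properties = typeof(T).GetProperties();
                    while (reader.Read())
                    {
                        var row = new T();
                        foreach (var property in properties)
                        {
                            // Convention: every selected column has a matching property.
                            int ordinal = reader.GetOrdinal(property.Name);
                            if (!reader.IsDBNull(ordinal))
                                property.SetValue(row, reader.GetValue(ordinal), null);
                        }
                        results.Add(row);
                    }
                }
            }
            return results;
        }
    }

Usage is then just SimplePocoMapper.Select<Customer>(db, "SELECT Id, Name FROM Customer") against a hypothetical Customer POCO, and you keep full control of the SQL that runs.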
I've evaluated all 3, I've used NHibernate, and it was a no brainer. L2S and EF are just very lacking and too rough around the edges to really consider at this point.
Demis, how are you handling your caching, unit of work, session management, optimistic concurrency, lazy/eager loading, hi/lo id generation, criteria and specifications in your system?
"Demis, how are you handling your caching, unit of work, session management, optimistic concurrency, lazy load/earger loading, hi/lo id generation, criteria and specifications in your system ?"
I suspect ignoring it makes the problem go away.
Steve,
For caching I take advantage of my existing cache provider implementations (memcached and in-memory) and have an intelligent persistence cache that works on top of OrmLite (i.e. OrmLite itself has no dependencies on the cache providers).
The cache supports Store (insert/update) and GetByIds requests, all working through the cache. A simple rule that every POCO must have an 'Id' property, which is its primary key, simplifies this greatly.
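As a rough illustration of the read-through idea (not the actual API: the Dictionary stands in for the real cache provider and the loadFromDb delegate stands in for the OrmLite call):

    using System;
    using System.Collections.Generic;

    public static class ReadThroughCache
    {
        // Convention: every POCO has an Id that is its primary key, so the
        // cache can always be keyed by Id. Serve hits from the cache, load
        // only the misses from the db, then backfill the cache.
        public static List<T> GetByIds<T>(
            IDictionary<int, T> cache,
            IEnumerable<int> ids,
            Func<List<int>, List<T>> loadFromDb,
            Func<T, int> getId)
        {
            var results = new List<T>();
            var misses = new List<int>();

            foreach (var id in ids)
            {
                T cached;
                if (cache.TryGetValue(id, out cached))
                    results.Add(cached);
                else
                    misses.Add(id);
            }

            if (misses.Count > 0)
            {
                foreach (var loaded in loadFromDb(misses))
                {
                    cache[getId(loaded)] = loaded;   // backfill for the next request
                    results.Add(loaded);
                }
            }

            return results;
        }
    }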
I support flat POCOs (that can handle text blobs) if needed, so there is no need for lazy/eager loading as you only need one db call to fill your model.
Again, I use this to achieve maximum perf for databases with flat structures. When I need to maintain a rich object graph, I do away with ORMs completely and use db4o instead, as it performs faster (for deep object graphs), results in a cleaner object model, and is infinitely more productive to program with.
I am a pragmatic programmer, not a religious one; my solutions are based on maximum performance and developer productivity.
I'm a Linq2Sql -> NHibernate -> Entity Framework -> NHibernate guy here. Why?
I was following what I worked with at the day job for a while, but now I've converted to using NHibernate for everything. NHProf makes the decision easier, and it's just much more battle-tested.
@Demis, you're crazy, all ORMs function in the "who cares" performance range.
Maybe it's just me, but I think there's too much confusion. I mean, it took me hours to finally understand that the LINQ provider for NHibernate is not ready yet (i.e. it does not support joins). It took me hours to realize that I can't integrate NHibernate with Lucene.NET (at least not the way I wanted to). And still, I'm not sure about those revelations.
NHibernate->DataObjects.NET (just kidding)
I am currently using Linq2Sql on side projects and seriously considering moving to NHibernate, but I can't get over the initial overhead related to switching.
I have been using ADO.NET for many years and have written several frameworks with it for several projects, and I have more confidence (experience) with it than with any ORM, but those days are numbered ;)
I am glad to see NHibernate progress so far from when I originally started following it in 2004. It's making it easier to get over my reluctance to change and possibly go use it in an upcoming project.
"@Demis, you're crazy, all ORMs function in the "who cares" performance range."
Not all ORMs :) It depends on your application. With internal applications I agree you shouldn't care about performance; maintainability and integrity should be your top priorities. But for public-facing Internet apps with >5m records and a potential of millions of users, your top priority needs to be scalability and performance in order to provide the best customer experience possible. In these cases you do notice the overhead that ORMs add.
I'm not against ORMs completely; at work one of our resident guru devs has built an in-house ORM (with C# 3.0 interfaces) with native support for LINQ / System.Transactions. It's actually the most full-featured LINQ provider I've seen, with the cleanest object model/API and full support for joins, aggregate functions, multiple db providers (Postgres, MySQL and SQLite), etc. Unfortunately it's not open-source yet, and as it's not our company's flagship product they're not interested in selling/licensing it at the moment; we're going to revisit open-sourcing it next year after we ship.
@demis: so your company paid a developer to write a LINQ provider that's so super-duper good it's better than anything out there, and it's also not used in a product to earn back the money spent on having that developer write it?
Yeah, right.
Writing a row materializer for some database isn't that hard. It's the fluff around it that makes an O/R mapper hard, a lot of code, and very complex. It's not saving 1 entity that makes it hard; it's saving 100 entities in the right order, synchronizing FKs with PKs, making sure inheritance is done right, etc., that makes it hard.
In your situation, you re-invented the wheel because you thought what was out there sucked and you could do it better. Well, the problem for you is that whatever you wrote, YOU will have to maintain it till the end of days.
I've seen a lot of in-house pseudo O/R mappers; often they were presented as a great 'framework' while in practice they sucked big time, simply because they could do just 1 trick, and every application built with them needed more than just that 1 trick, causing workaround code using direct access etc.
As a person who has written a large O/R mapper, I can tell you it's not something you can complete in a couple of months, or even a year. It takes many years to get everything right, get every feature in, and make it work in every situation (as there's no room for the user to use a 'workaround', as that would defeat the purpose of using the framework to begin with).
About performance: if your website is required to feed millions of users, the last thing you want to do is go to the db for every request. You cache page parts, or even whole pages, for several seconds, as the user won't notice and it saves you many redundant trips. No cache will help you with queries like "Get all customers from Germany", unless you built an in-memory db as well (which I'm sure you didn't).
With applications which have to work with many millions of rows, you want a reliable framework which has proven itself in these situations. A homebrewed framework doesn't fall into that category. For example, does your framework do efficient paging over those millions of rows? Can it efficiently filter on related data (so not with subqueries over millions of rows but with IN queries, for example)? In all possible forms the developer might need?
The devil is in the details. As I said, an object materializer around a datareader, I think that's 1 day of work. The problem starts with the features on top of that, and the quest to cover every need a developer might run into. You haven't covered that at all, so your framework requires workarounds in many scenarios, or worse: hard-coded string-based queries for many situations.
All due respect, Frans,
I'm not sure if you understood my post, but my 1-day ORM is NOT the LINQ provider we use at work; it's a lightweight wrapper around the System.Data interfaces I wrote to suit my needs of hitting the database directly but still dealing with objects. A simple and complete CRUD example of which is here: code.google.com/.../SimpleUseCase.cs
You should reserve judgement until after it's been open-sourced and you've had a chance to evaluate it. The ORM didn't appear from thin air; it came from an expert developer with years of experience in developing ORMs. From experience, he was able to have a fresh start (with best practices in mind) and design it knowing exactly how to architect an ORM; there was no R&D required for this iteration, and what was new was parsing LINQ expressions and mapping them to SQL. Make no mistake, writing a LINQ provider is hard, and I know very few people who will be able to do it properly (Ayende being one of them); it requires expert knowledge of compiler design, CPU architecture, IL, the CLR, expression trees, etc.
Despite opinion, we did evaluate multiple frameworks (I've had personal experience developing projects with both Linq2Sql and Entity Framework), but unfortunately 18 months ago LINQ in NHibernate did not exist (and it's still immature today). Linq2Sql promoted compromised data models and is/was for SQL Server only. As we went for a scale-out architecture we needed an ORM that can work on databases that run on *nix servers. We had the developer resources and talent to write our own, so we did. After evaluating the best LINQ providers available at the time, I can say that we do have a superior LINQ provider now; it just works and all known bugs have been resolved. Whenever we want a feature or fix a bug, it's done within minutes as we have the resources in-house and on-hand to do it. E.g. it took hardly any time to add 'sharding support' to the ORM; if we hadn't built our own this would've been much harder to do, and we would've had to build a sharded solution on top of an existing ORM we didn't control.
NO, I'm saying in this case you don't want a framework at all! You want to remove all the abstraction layers/cruft that cater to every other use case and hit the database directly, so you have complete control over every SQL statement you run against that database.
---- response over 2 comments to bypass comment size limit ----
Yes it does. When you page over a dataset like this (I do this when creating the Search Index on the searchable parts of the dataset, which is a 24/7, 5+ day long-running process), you can't/don't want to hold the entire dataset in memory, so you have to use a datareader. When you have a process that takes days, then you DO want to optimize the resources it takes to process a single row, as any overhead can cost you hours. I've run the SearchIndex process multiple times and it's never failed.
Yes, the devil is in the details, which is why we control/approve all access to a dataset of this size. We employ front-level caching (i.e. gzip'd XML responses) and multiple levels of intelligent caching that work across multiple app servers. When the database is modified, all relevant caches (and only those) are invalidated. We don't cache for seconds; we cache until the data is no longer valid (which is a lot longer than seconds). Most queries never even make it to the database. But you can't cache 5m+ rows, so you will hit the db eventually.
There are never any random queries like "Get all customers from Germany". There are no full-table scans, ever; that is the rule. There are no searches (we have a Search Index for that). Everything hits an index.
No competent developer will even attempt this; most of the time that query will time out before it completes. Ideally you should never use a large IN query; if you need to, use a join on a virtual table / tabular function instead.
@Frans,
Excuse my unawareness - I didn't know that you were the author of LLBLGen Pro, so you're obviously very capable of developing a LINQ provider yourself (as you have done), and you would also make the short-list of people able to develop a LINQ provider :).
I think we evaluated LLBLGen Pro a long time ago (before our project started), and it looked good for generating data models from an existing schema. Although, as a Greenfield project, we had different requirements; we wanted to go the other way and have attribute/convention-based POCOs from which we generate our database schema. Also, we wanted to embrace C# 3.0 features and use LINQ (strongly typed) as our native query model, which also needed to support MySQL or PostgreSQL (so we can scale out).
@Demis what I simply don't get is why your management agreed to shell out the time (== cash) to develop an O/R mapper in-house, with so many O/R mappers already out there. If NHibernate or LLBLGen Pro didn't appeal to you, and neither did L2S or EF, why not use EUSS? (It's French ;)) It even has a LINQ provider and is open source (and implements almost all NH features).
It's simply not worth the time to do this in-house: the time is better spent on writing code for the customer who pays the bills, OR, if you ship a product, on the product itself.
It's similar to writing your own GUI framework because you don't like how the current ones work (yes, that's the same thing).
About paging: what I meant was generating a paging query in SQL, not paging through a datareader by skipping.
About the IN query: it means filtering on FK fields on the 'm' side of m:1 related entities instead of a join with the related entity (which might result in millions of rows).
About the LINQ provider: the devil is in the details. I'm sure your colleague is very clever and better than me, but it took me over 9 months of full-time work and there are still some edge cases which are tough to deal with. It might be that your colleague's LINQ provider deals with the cases you'll run into, but not e.g. eager loading, inheritance, nested queries, Contains etc.; it still would be a waste of time and effort: there are a couple of rather mature LINQ providers out there now, and releasing one yourself for a new framework won't get much traction at all. Even if it's open source.
For example, if you had used LLBLGen Pro or NH, you might well have been done by now: all the time spent on writing your own framework would then have been spent on what you are actually paid to do: writing code for either a customer or the product you're shipping (I don't know what you do for work).
Look, I see you're happy with your code, and your colleague is happy with his/her LINQ provider that no one but you and your colleagues can use; that's all great :). The point is that your company did waste a lot of time and thus money on irrelevant stuff your competition might have avoided.
@Frans
Welcome to a Microsoft shop. Microsoft was down on OSS for so long that most of its customers adopted that attitude. You have to remember that most of the people making key business decisions about what technologies to use know practically nothing about them. They generally choose whoever gave them the nicest mug at a convention. That, or Demis and co. didn't tell management that ORMs even exist; my boss didn't know they existed until I told him last week.
Which leads me to...
While I think NHibernate is a great product, its biggest hindrance to adoption by the masses is its glaring lack of documentation and working examples. Hell, even Fluent NHibernate's "Examples.FirstProject" doesn't work.
If NHForge or wherever had something like 4 sample projects up (MVC, WPF, a WCF "Service" and WinForms, off the top of my head), all built "right" but obviously scaled back to only having a few entities, I'm sure you'd see adoption go through the roof. There's Chinook out there (I've only looked at the WinForms one), but it's not built "right" inasmuch as it's all just code-behind and doesn't use an MVP/MVPC/MVFlavorOfTheWeek pattern.
The barrier to entry to NHibernate is very high at the moment, at least much higher than it needs to be.
I actually agree with you; developing your own LINQ provider would not make a lot of fiscal sense for a lot of companies unless you're in the business of selling ORMs :)
We actually started development with NHibernate, but unfortunately discovered some perf problems early on. Also, the NHibernate workflow, i.e. the configuration and the ISession/ICriteria query model, hindered the speed of development/refactoring in our initial, frequently changing data model. This also happens to be a multi-year Greenfield project with a very strong emphasis on back-end data systems. The choice of ORM technology was actually a very important decision, one that would be at the heart of all our services.
None of the available options (that we evaluated at the time, anyway) had a LINQ provider available, encouraged a clean domain model (i.e. POCO-first), took advantage of current best practices (e.g. System.Transactions, etc.) and C# language features, used LINQ as its native query language, and could run on *NIX databases (i.e. PostgreSQL or MySQL). They also didn't seem optimized for Greenfield development.
Our 'dev guru' happens to be our 'Technical Director', who has the luxury of deciding what to work on. After evaluating what was available, he thought he could develop a cleaner ORM that was more optimized for new development. Also, being technology-driven with an interest in LINQ/Expression technology helped in the decision.
We continued development of our back-end systems in parallel using db4o for our persistence. db4o, btw, is actually the most productive persistence technology I've ever used (since we don't need an ORM at all). So much so that the question 'Do we need to store this in an RDBMS?' is now factored into my future decisions when building new services. Anyways, after 3-4 months (of hard days) we had a functional POCO-driven LINQ provider that was very easy to port to, since we didn't have to make any compromises on our domain model (thanks to db4o).
Overall I think we've actually saved time in the end, as everything is strongly typed to our domain model, and refactoring and database schema generation are a non-issue for us. We also save time by not needing to 'learn how to use someone else's API' and by having the ability to add features whenever we need them (i.e. text/binary blobs, sharding, etc).
@Andrew...
"If NHForge or whereever had something like 4 sample projects up (MVC, WPF, WCF "Service" and Winforms off the top of my head) all built "right", but obviously scaled back to only having a few Entities, I'm sure you'd see adoption go through the roof. There's Chinook out there (I've only looked at the Winforms one), but it's not built "right" in so much that it's all just code behind and not using MVP/MVPC/MVFlavorOfTheWeek pattern.
The barrier to entry to NHibernate is very high at the moment, at least much higher than it needs to be.:
AMEN BROTHER!!! Seriously, I love NHib (I use it on almost all of my projects) but you have hit the nail on the head, totally.
Demis,
What is the workflow for your O/RM? How does it compare to NHibernate or other O/RM frameworks? It's good you point out the importance of workflow. It's an under-discussed issue.
Frans,
Sometimes you are forced into rolling your own in-house infrastructure in order to accommodate poorly designed legacy software. We wrote a DI/IoC-based O/RM in VB 6 over 12 years ago, way before these buzzwords became fashionable. As you say, it is much closer to an object<->row materializer than a traditional ORM. But it also contains a lot of dynamic features traditional ORMs don't solve. It is very similar to what the Background Motion project does, except more model-driven.
The major flaw in this handrolled ORM today is not that it is in-house. It is that nobody ever refactored the query object model to accommodate the dynamic features, and instead just added guards to the ORM's internal state machine. As you probably know from OO theory, adding guards to any state machine will be the downfall of just about any object model. It is THE source of architectural decay in state machines; it turns a simple Moore state machine into a complex black box that depends on several layers of static and dynamic configuration data.
P.S. You know a lot about this stuff. Rather than repeat it constantly on blogs that people lose track of, put it in a wiki. I've even read some interesting stuff on the LLBLGen Pro forums in the past, but it is never consolidated into a book. You're better off structuring your blog comments this way:
(1) Create a tiddly wiki for yourself that serves as your soapbox on O/RM technology.
(2) Create articles on this soapbox, and refine them over time.
(3) Actively seek out feedback, and create placeholder wiki articles to describe customer scenarios - talk people out of rolling things from scratch
(4) Actively encourage your wiki readership to explain what you could POSSIBLY be missing that FORCES them to roll their own
In sales, there is the slogan "ABC - Always Be Closing... if you want the knife set". However, this slogan is wrong and does not match reality. Salespeople should be connectors, not closers. They need to establish a connection with their prospect. Each sale is just a series of decisions, and most sales work out well when you can get both sides to agree on three or fewer value items.
Peace,
Z-Bo
@John,
Our workflow for creating new models/tables is now significantly reduced. We have POCO-driven data models where everything is convention-based and/or attributed. Except for connection strings, no other configuration is required. Adding a new table involves adding a class (for your table definition) and modifying the database class (i.e. LINQ source) to include a reference to the new table. Restarting IIS creates all the new tables that don't already exist.
What is new is the utilization of TransactionScopes. We no longer have to 'explicitly save', as the ORM keeps track of all changes that happen to the DataModels within the TransactionScope and upon commit will update all properties that have been modified and insert/delete all objects as needed. Coupled with LINQ for data access, it results in less, but more intuitive, code than would otherwise be required. Unfortunately I'm under an NDA, otherwise I would've provided links to examples.
LINQ is an impressive technology; it's essentially a strongly typed DSL for accessing data that is baked right into the language. I have yet to come across a db query that couldn't be expressed in LINQ. My opinion is that all 'query APIs' that precede it are tainted by deficiencies in the programming languages used to access them, and that an ORM designed today would look a lot different from what is currently available.
The ORM is only suitable for new development (i.e. there are no sophisticated mapping files to map to an existing db schema), which is something that NHibernate, LLBLGen Pro, and perhaps Linq2Sql/EF if you're using SQL Server (although I don't like generated code), excel at. It does fill a niche by having a 'clean API' (untainted by backwards compatibility or legacy APIs) that takes advantage of C# 3.0 language features, which results in less effort and more readable code than other solutions.
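As an illustration of what I mean by a strongly typed DSL (this is plain LINQ-to-objects on a made-up Customer type, purely to show the query shape; against the ORM the same expression is translated to SQL instead of run in memory):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Customer
    {
        public int Id { get; set; }
        public string Country { get; set; }
        public decimal Balance { get; set; }
    }

    public static class LinqShapeDemo
    {
        public static void Main()
        {
            // In-memory stand-in for a table.
            var customers = new List<Customer>
            {
                new Customer { Id = 1, Country = "DE", Balance = 120m },
                new Customer { Id = 2, Country = "NL", Balance = -40m },
            };

            // Strongly typed query: filter, group, project. Renaming a
            // property is a compile error here, not a runtime surprise.
            var overdrawnByCountry =
                from c in customers
                where c.Balance < 0
                group c by c.Country into g
                select new { Country = g.Key, Count = g.Count() };

            foreach (var row in overdrawnByCountry)
                Console.WriteLine("{0}: {1}", row.Country, row.Count);
        }
    }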
It would seem that NHibernate just hit critical mass in the UK - look at this job trend graph for roles requiring it:
http://www.itjobswatch.co.uk/jobs/uk/nhibernate.do
This is the biggest trend change for any technology on the site, btw.
@Demis
@Our workflow for creating new models/tables is now significantly reduced. We have POCO-driven data models where everything is convention-based and/or attributed. Except for connection strings, no other configuration is required. Adding a new table involves adding a class (for your table definition) and modifying the database class (i.e. LINQ source) to include a reference to the new table. Restarting IIS creates all the new tables that don't already exist.
I am not sure why your Technical Director has chosen to tightly couple your "migrations" and "model validations" API to IIS. To me, this is not separation of concerns. What if you want to move to Mono eventually? Also, why would you want to synchronize IIS with your model? It doesn't make sense. In a distributed system, you would never be able to guarantee uniqueness. In fact, you have a race condition. If a model on one server is holding a connection string to the same catalog as a model on another server, you now have a dynamic content distribution problem. In effect, what your Technical Director has done is create a non-distributed system that, when put in a distributed environment, will be forced to become a "distributed shared memory" model.
Thus, I suspect nobody will be interested in it as open source, and you'd be best off keeping it in-house and not adding Yet Another Technical Solution, because all it does is confuse the average lead developer/architect about what the real problems to solve are.
But maybe I misunderstood you.
Cheers,
Z-Bo
Yeah, I think you misunderstood me. I was only explaining the workflow (the above example shows what effort is required to add a table, i.e. 1. Add a class, 2. Modify another to add a ref, 3. Rebuild). A 'Create Databases' one-liner is in the global.asax on AppDomain startup; you can call that function from anywhere and it's not coupled to IIS at all, it's just a normal C# lib that can be run from anywhere (we have it in console apps and Windows services, etc). That was just one example (typical when building web services) - anything that restarts the AppDomain will do, i.e. touch the web.config - we typically do a rebuild, but whatever works.
Unfortunately LINQ providers really stress Mono's implementation; there are some things actually causing compiler errors, so the port to Mono will require a few tweaks - which is on the cards when we have the time.
Okay, even still, live migrations are non-trivial.
Once you get to large table sizes you can't just:
alter table crm.Customer add widget_fk int not null default(0)
This could block other, more critical DBMS processes.
@anything that restarts the AppDomain will do
Then all I can say is: never delete anything or modify anything, including rename refactorings. Only add columns. That's the best you can do. Anything else can result in race conditions. The AppDomain is not safe against race conditions in your database, b/c you will eventually need more than one.
Yeah we wouldn't attempt a live migration without taking the site offline. PostgreSQL won't even allow you to change the db schema while there are still open connections to the database.
Running services in parallel while we make the old one read-only is a solution. Otherwise, if we really wanted 100% uptime, we could always do a staged user migration, where we would migrate inactive users so the next time they log in they connect to the new service, etc.
Ultimately it's up to the customer; cost/benefit ratio and all that.