NHibernate

In spring this year, I began work on a moderately sized business application. The time from learning what the program should do to the first prototype was less than two weeks. To pull it off, I basically ditched every sane development practice and started throwing code at the problem as fast as I could.

The application required a client/server model, and the server had to store its data in a database. At that point, I knew about ORMs but had never used one. Hoping for the best, I picked Microsoft’s Entity Framework because it had a visual designer and looked like I could just drag & drop my data model together and then figure out how to perform queries.

The NHibernate logo, a rolled-up (hibernating) squirrel

It fell flat when I noticed that I could only choose between TPT (Table Per Type) and TPH (Table Per Hierarchy) globally, for the whole freakin’ model. So I tried NHibernate and quickly bounced off its learning curve, resorting to writing SQL queries myself and loading them into objects with a very basic class mapper I had written for another, much smaller project some time ago.

It was tedious work, but at least hacking away for two hours got you two hours further, not one step forward and two steps back figuring out the ORM. And so I managed to get the prototype done with two all-nighters. In autumn, when the heat was off, I decided to do the database properly, using NHibernate…

Back to NHibernate

After creating a .NET 4.0 build of NHibernate 3.1, I began to set up my mappings, using the NHibernate criteria API to retrieve my data. This API allows you to build complex queries in code (no domain-specific query language and no LINQ expression tree dissection – you directly communicate your intent to NHibernate). It was primarily designed to facilitate queries that depend on user input – for example, a complex search form on a website – but I liked it because it was a fast and type-safe approach (and because NHibernate’s LINQ support was always rated as so-so in every review I read).

As I learned, the criteria API also does a very good job obfuscating your queries. For example, the following query counts all diagnoses that are visible to a certain client (diagnoses are mapped to a client via an intermediate many-to-many table):

// Subquery: the client with the given ID, correlated with the
// outer query's "diagnosisClient" alias
DetachedCriteria visibleToClient = DetachedCriteria.For<Client>("client")
  .SetProjection(Projections.Id())
  .Add(Expression.Eq("Id", clientId))
  .Add(Property.ForName("client.Id").EqProperty("diagnosisClient.Id"));

// Count all diagnoses joined to that client through the
// many-to-many "Clients" collection
return Session.CreateCriteria<Diagnosis>()
  .CreateAlias("Clients", "diagnosisClient")
  .Add(Subqueries.Exists(visibleToClient))
  .SetProjection(Projections.Count(Projections.Id()))
  .UniqueResult<int>();

You certainly won’t understand the intent behind that code from a cursory glance. And fixing or changing the query without first learning a good deal of NHibernate is out of the question.
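For comparison, here is roughly the SQL the criteria query above boils down to – a sketch only, since the actual table and column names depend on the mappings – and the intent is much clearer:

```sql
-- Approximate equivalent of the criteria query
-- (table/column names are assumptions based on the mappings)
SELECT COUNT(d.Id)
FROM Diagnosis d
INNER JOIN DiagnosisClient dc ON dc.DiagnosisId = d.Id
WHERE EXISTS (
    SELECT c.Id
    FROM Client c
    WHERE c.Id = @clientId
      AND c.Id = dc.ClientId
);
```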

Here’s also a small issue that irks me: I normally write my programs to avoid exceptions in normal program flow. If something is wrong – throw right away. But for expected situations, say an optional parameter stored in a dictionary, I’m going to check if it’s there, not attempt to retrieve it and catch the KeyNotFoundException.
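In plain C#, the difference looks like this (`parameters` and `ApplySetting` are hypothetical names for illustration):

```csharp
// Preferred: an optional parameter is an expected situation,
// so check for it instead of provoking an exception.
string value;
if (parameters.TryGetValue("optionalSetting", out value))
{
    ApplySetting(value);
}

// The exception-driven style I try to avoid:
try
{
    ApplySetting(parameters["optionalSetting"]);
}
catch (KeyNotFoundException)
{
    // swallow – the parameter was optional anyway
}
```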

NHibernate seems to think differently. Whenever I use a query with an alias (as in the case above), I see a "first-chance exception" roll by in the debug window. NHibernate first tries to retrieve a column with the specified name and – only if that throws an exception – checks whether there is an alias with the same name.

But that fun was just getting started.

Using Cascades is Horribly Inefficient …and Goes Wrong

I’m using GUIDs for my IDs, so I configured all my keys with generator = "assigned" to prevent a useless roundtrip to the database. That works. But when I enable cascades to insert, delete and update entire object hierarchies at once, NHibernate performs a simple insert operation like this:

  1. Insert all of the child entities (correct parent ID already assigned by me)
  2. Insert all of the parent entities
  3. Update all of the child entities, setting the parent ID

If NHibernate inserted the parents first, no update would be required – even if the database generated the keys – but apparently that’s just not how it works. Inefficient, but at least it’s working. Or is it? Let’s see the procedure for deleting one of the parents:

  1. Update all child entities, setting their parent ID to null
  2. Delete all parent entities
  3. Delete all child entities

That would still work, had NHibernate not conveniently missed the fact that the parent ID column is defined as NOT NULL. So deleting something results in an exception from the constraint violation. Oops.
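For what it’s worth, the usual remedy I found is to hand responsibility for the foreign key to the child side by marking the collection as `inverse`. A sketch with hypothetical entity names – the cascade style and column names are assumptions, not my exact mappings:

```xml
<!-- Parent side: inverse="true" makes the child's own many-to-one
     mapping write the foreign key, so NHibernate neither issues the
     extra UPDATE on insert nor tries to null the FK on delete.
     (Hypothetical names; cascade style depends on your needs.) -->
<class name="Parent">
  <id name="Id">
    <generator class="assigned" />
  </id>
  <bag name="Children" inverse="true" cascade="all-delete-orphan">
    <key column="ParentId" not-null="true" />
    <one-to-many class="Child" />
  </bag>
</class>
```

With `inverse="true"` and `not-null="true"` on the key, NHibernate is told up front that the column cannot be nulled, which sidesteps the constraint violation described above.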

SchemaExport Generates Funny Schemas

With the SchemaExport class, you supposedly can turn your NHibernate mappings into a complete database schema. That’s pretty cool, because this way you can initialize a temporary database (like an in-memory SQLite database) and use it in your unit tests without having to maintain a second database schema by hand.

Except that the schema it generates won’t necessarily be valid for your mappings…

Illustration of what NHibernate's SchemaExport class does

Assume you had two entities: a Person, and an Employee derived from Person. Both are stored in a TPH table that I imaginatively named TPH.

An Employee’s EmployeeNumber property must not be null. But since the TPH table stores both Employees and Persons, the corresponding column must still be nullable – otherwise you couldn’t store any Persons in the table.

Guess what SchemaExport gets wrong? Yep. I had to mess up my mappings just to get my unit tests to work.
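In mapping XML, the setup looks roughly like this (a sketch with hypothetical property names):

```xml
<!-- Both classes share the TPH table, distinguished by a discriminator.
     (Hypothetical names; a sketch of the scenario, not my exact mapping.) -->
<class name="Person" table="TPH">
  <id name="Id">
    <generator class="assigned" />
  </id>
  <discriminator column="PersonType" />
  <property name="Name" not-null="true" />

  <subclass name="Employee" discriminator-value="Employee">
    <!-- Required for Employees, but the shared column has to stay
         nullable in the schema so plain Persons can live in the same
         table. SchemaExport emits it as NOT NULL regardless. -->
    <property name="EmployeeNumber" not-null="true" />
  </subclass>
</class>
```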

Sometimes… you Lose a Table or Two

It’s the scenario out of a programmer’s nightmare. You just added this big, complicated ORM, the inner workings of which are a mystery to you, but you’re happy that it somehow gets your complex queries translated into efficient SQL that your database understands. But suddenly it gets the idea to do a DELETE FROM instead of an UPDATE, and now you can try to reconstruct several tables. Aaargh! Thankfully, it was just a fever dream you woke up from. A delete is completely different from an update. Surely the code paths don’t even touch. Such a thing could never happen. Or could it?

In my business application, I designed a typical permissions system:

Database diagram of a typical permissions system

If this was some kind of forum software, you’d have Roles, like "Member", "Moderator" and "Administrator" that are associated with certain Permissions like "CreateThread", "ReplyThread", "CloseThread" or "BanUser". The beauty in this is that Users can simply be assigned a Role and you don’t have to individually manage the permissions for each User.

Now I didn’t want to always retrieve all the Roles and all their Permissions each time I had a UserId somewhere. To stay with the forum software example, if you want to display a list of threads, you only need the Users so you can display their names, not their entire two levels deep roles and permissions tree.

That’s why I used the seemingly innocent access="noop" attribute which tells NHibernate that there is a mapping between users and roles, but I do not want it to be visible in code. I’ll query for it explicitly when I need it.
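In mapping XML, that looks roughly like this (names are hypothetical, but the `access="noop"` attribute is the one I used):

```xml
<!-- Roles are mapped for querying only; access="noop" means no C#
     property backs the collection. (Hypothetical names – a sketch
     of the mapping, not my exact one.) -->
<class name="User">
  <id name="Id">
    <generator class="assigned" />
  </id>
  <set name="Roles" table="UserRole" access="noop">
    <key column="UserId" />
    <many-to-many class="Role" column="RoleId" />
  </set>
</class>
```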

Well, the next time I queried for an entity that had a User in it, all Roles assigned to that User were deleted from the database.

I searched and searched because I couldn’t imagine that something like this could happen with an ORM. I looked at the SQL statements being sent. In the end, I had reduced the code to a simple query that fetched a single record and then didn’t touch the result. NHibernate was issuing a DELETE during the query.

Conclusion

There never was any danger to my production data, because I checked every new operation I added to the data layer first through my unit tests (massaging the code until the query did the right thing as efficiently as possible), then did integration tests against my development database (which I can restore in five seconds), and only then released to a test installation for the client, which is taken over into the production installation only when both my client and I give our okay.

It may very well be that I’m doing something dumb. I’ve heard several times on the NHibernate mailing list that many-to-many associations are exotic and that it’s better to manage the intermediate table (some call it a "reference table") yourself. I don’t see why – I’m using an ORM precisely so I don’t have to do that stuff.

There are also some lingering issues, like there being no way to disconnect an entity returned by NHibernate from its ISession. I should not use my entities as business objects, I’m told. I could use AutoMapper to copy data from my entities to my business objects. But why duplicate all the classes and maintain them twice if my business layer is 99% identical to my entity layer? I have fully isolated my domains, and my business objects know nothing about the database, but there is no large impedance mismatch I have to bridge – so why not let the ORM fill the columns into my business objects straight away and write separate entity classes only when I actually require them?

I recently took another look at Microsoft’s Entity Framework. It can disconnect entities. It doesn’t require my properties to be virtual – it can use INotifyPropertyChanged or do a full state comparison. And it’s gaining a "Code First" interface where I can configure my mappings via a fluent interface.
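From what I’ve seen of the Code First previews, the fluent configuration looks roughly like this – a hedged sketch based on the CTP-era API, with hypothetical entity names:

```csharp
// Sketch of EF "Code First" fluent mapping configuration.
// (API as of the Code First previews; User/Role are hypothetical.)
public class MyContext : DbContext
{
    public DbSet<User> Users { get; set; }
    public DbSet<Role> Roles { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Configure the many-to-many association and its join table
        // in code, without an EDMX designer file.
        modelBuilder.Entity<User>()
            .HasMany(u => u.Roles)
            .WithMany(r => r.Users)
            .Map(m => m.ToTable("UserRole"));
    }
}
```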

Expect more Entity Framework content in the future. NHibernate has definitely burned me for now, and these experiences don’t fit well with an ORM that touts maturity as one of its advantages.
