In spring this year, I began work on a moderately sized business application. The time from learning what the program should do to the first prototype was less than two weeks. To pull it off, I basically ditched any sane development practice and started throwing code at the problem as fast as I could.
The application required a client/server model, and the server had to store its data in a database. At that point in time, I knew about ORMs but had never used one. Hoping for the best, I picked Microsoft’s Entity Framework because it had a visual designer and it looked like I could just drag & drop my data model together and then figure out how to perform queries.
It fell flat when I noticed that I could only choose between TPT (Table Per Type) and TPH (Table Per Hierarchy) globally for the whole freakin’ model. So I tried NHibernate and quickly bounced off its learning curve, resorting to writing SQL queries myself and loading them into objects with a very basic class mapper I had written for another, much smaller project some time ago.
It was tedious work, but at least hacking away for two hours got you two hours further, not one step forward and two steps back figuring out the ORM. And so I managed to get the prototype done with two all-nighters. In autumn, when the heat was off, I decided to do the database properly, using NHibernate…
Back to NHibernate
After creating a .NET 4.0 build of NHibernate 3.1, I began to set up my mappings, using the NHibernate criteria API to retrieve my data. This API lets you compose complex queries in code (no domain-specific query language and no LINQ expression tree dissection – you directly communicate your intent to NHibernate). It was primarily designed to facilitate queries that depend on user input – for example, a complex search form on a website – but I liked it because it was a fast and type-safe approach (and because NHibernate’s LINQ support was always rated as so-so in every review I read).
As I learned, the criteria API also does a very good job obfuscating your queries. For example, the following query counts all diagnoses that are visible to a certain client (diagnoses are mapped to a client via an intermediate many-to-many table):
// Subquery: select the ID of the client in question, correlated
// with the alias from the outer query.
DetachedCriteria visibleToClient = DetachedCriteria.For<Client>("client")
    .SetProjection(Projections.Id())
    .Add(Expression.Eq("Id", clientId))
    .Add(Property.ForName("client.Id").EqProperty("diagnosisClient.Id"));

// Outer query: count all diagnoses for which such a client exists.
return Session.CreateCriteria<Diagnosis>()
    .CreateAlias("Clients", "diagnosisClient")
    .Add(Subqueries.Exists(visibleToClient))
    .SetProjection(Projections.Count(Projections.Id()))
    .UniqueResult<int>();
You certainly won’t even understand the intent behind that code from a cursory glance. And fixing or changing the query without learning a good deal of NHibernate first is out of the question.
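For contrast, here is roughly the intent of that query spelled out as plain SQL (the table and column names are my assumptions about the generated schema, and the many-to-many link table is assumed to be called DiagnosisClient):

// Count the diagnoses linked to the given client through the link table.
object result = Session
    .CreateSQLQuery(
        @"SELECT COUNT(d.Id)
          FROM Diagnosis d
          JOIN DiagnosisClient dc ON dc.DiagnosisId = d.Id
          WHERE dc.ClientId = :clientId")
    .SetGuid("clientId", clientId)
    .UniqueResult();
int count = Convert.ToInt32(result);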
Here’s also a small issue that irks me: I normally write my programs to avoid exceptions in normal program flow. If something is wrong – throw right away. But for expected situations, say an optional parameter stored in a dictionary, I’m going to check if it’s there, not attempt to retrieve it and catch the KeyNotFoundException.
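To illustrate with a hypothetical options dictionary – this is the difference I mean:

// Probing for an expected, optional value – no exception in normal flow.
static int GetPageSize(IDictionary<string, string> options)
{
    string value;
    if (options.TryGetValue("pageSize", out value))
        return int.Parse(value);
    return 25; // sensible default

    // The pattern I avoid – using the exception as control flow:
    //   try { return int.Parse(options["pageSize"]); }
    //   catch (KeyNotFoundException) { return 25; }
}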
NHibernate seems to think differently. Whenever I use a query with an alias (as in the example above), I see a "first-chance exception" roll by in the debug window: NHibernate first tries to retrieve a column with the specified name and – only if that throws an exception – checks whether an alias with the same name exists.
But the fun was just getting started.
Using Cascades is Horribly Inefficient …and Goes Wrong
I’m using GUIDs for my IDs, so I configured all my keys with generator="assigned" to prevent a useless roundtrip to the database.
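In code, that pattern looks roughly like this (the Entity base class is hypothetical – the point is that the application assigns the key itself):

// With generator="assigned", NHibernate expects the application to set
// the key. Doing it in the constructor means no round trip is needed.
public abstract class Entity
{
    protected Entity()
    {
        Id = Guid.NewGuid();
    }

    public virtual Guid Id { get; protected set; }
}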
That works. But when I enable cascades to insert, delete and update entire object hierarchies at once, NHibernate performs a simple insert operation like this:
- Insert all of the child entities (correct parent ID already assigned by me)
- Insert all of the parent entities
- Update all of the child entities, setting the parent ID
Had NHibernate inserted the parents first, no update would be required – even if the database had generated the keys – but it seems that’s just not the way it is. Inefficient, but at least it’s working. Or is it? Let’s see the procedure for deleting one of the parents:
- Update all child entities, setting their parent ID to null
- Delete all parent entities
- Delete all child entities
That would still work, had NHibernate not conveniently missed the fact that the parent ID column is defined as NOT NULL. So deleting something results in an exception from the constraint violation. Oops.
SchemaExport Generates Funny Schemas
With the SchemaExport class, you supposedly can turn your NHibernate mappings into a complete database schema. That’s pretty cool, because this way you can initialize a temporary database (like an in-memory SQLite database) and use it in your unit tests without having to maintain a second database schema by hand.
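The usual pattern looks something like this (a sketch – configuration and connection details are illustrative):

using NHibernate.Cfg;
using NHibernate.Tool.hbm2ddl;

// Create the schema from the mappings on the test session's connection
// (for in-memory SQLite, the schema only lives on that one connection).
var configuration = new Configuration().Configure();
var sessionFactory = configuration.BuildSessionFactory();

using (var session = sessionFactory.OpenSession())
{
    // Execute(script, export, justDrop, connection, exportOutput)
    new SchemaExport(configuration)
        .Execute(false, true, false, session.Connection, null);

    // ...run the unit test against 'session' here...
}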
Except that the schema it generates won’t necessarily be valid for your mappings…
Assume you had two entities, a Person and an Employee derived from Person. Both are stored in a TPH table that I imaginatively named TPH. An Employee’s EmployeeNumber property must not be null. But since the TPH table stores both Employees and Persons, the corresponding column must still be nullable – since otherwise you couldn’t store any Persons in the table.
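In code, the two entities would look something like this (a sketch – only the EmployeeNumber constraint matters here):

public class Person
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
}

public class Employee : Person
{
    // Mapped as not-null for an Employee – but in the shared TPH table
    // the column must stay nullable, because Person rows leave it empty.
    public virtual string EmployeeNumber { get; set; }
}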
Guess what SchemaExport gets wrong? Yep. I had to mess up my mappings just to get my unit tests to work.
Sometimes… You Lose a Table or Two
It’s the scenario out of a programmer’s nightmare. You just added this big, complicated ORM, the inner workings of which are a mystery to you, but you’re happy that it somehow gets your complex queries translated into efficient SQL that your database understands. But suddenly it gets the idea to do a DELETE FROM instead of an UPDATE, and now you can try to reconstruct several tables. Aaargh! Thankfully, it was just a fever dream you woke up from. A delete is completely different from an update. Surely the code paths don’t even touch. Such a thing could never happen. Or could it?
In my business application, I designed a typical permissions system:
If this was some kind of forum software, you’d have Roles, like "Member", "Moderator" and "Administrator" that are associated with certain Permissions like "CreateThread", "ReplyThread", "CloseThread" or "BanUser". The beauty in this is that Users can simply be assigned a Role and you don’t have to individually manage the permissions for each User.
Now I didn’t want to always retrieve all the Roles and all their Permissions each time I had a UserId somewhere. To stay with the forum software example: if you want to display a list of threads, you only need the Users so you can display their names, not their entire two-levels-deep roles and permissions tree.
That’s why I used the seemingly innocent access="noop" attribute, which tells NHibernate that there is a mapping between users and roles, but that I don’t want it to be visible in code. I’ll query for it explicitly when I need it.
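The explicit query then goes through HQL, roughly like this (names are illustrative – thanks to access="noop", the Roles association exists in the mapping even though the User class has no such property):

IList<Role> roles = Session
    .CreateQuery("select r from User u join u.Roles r where u.Id = :id")
    .SetGuid("id", userId)
    .List<Role>();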
Well, the next time I queried for an entity that had a User in it, all Roles assigned to that User were deleted from the database.
I searched and searched, because I couldn’t imagine that something like that could happen in an ORM. I looked at the SQL statements being sent. In the end, I had reduced the code to a simple query that fetched a single record and then didn’t touch the result. NHibernate was issuing a DELETE during the query.
Conclusion
There never was any danger for my production data, because I checked any new operation I added to the data layer first through my unit tests (massaging the code until the query did the right thing and was as efficient as possible), then did integration tests against my development database (which I can restore in 5 seconds), then released to a test installation for the client, which is only taken over into the production installation when both my client and I give our okays.
It may very well be that I’m doing something dumb. I’ve heard several times on the NHibernate mailing list that many-to-many associations are exotic and that it’s better to manage the intermediate table (some call it a "reference table") yourself. I don’t see why – I’m using an ORM precisely so I don’t have to do that stuff.
There are also some lingering issues, like there being no way to disconnect an entity returned by NHibernate from the ISession. I should not use my entities as business objects, I’m being told; I can use AutoMapper to copy data from my entities to my business objects. But why duplicate all the classes and maintain them twice if my business layer is 99% identical to my entity layer? I have fully isolated my domains – my business objects know nothing about the database – but there is no large impedance mismatch I have to bridge, so why not let the ORM fill the columns into my business objects straight away and only write special entity classes when I actually require them?
I recently took another look at Microsoft’s Entity Framework. It can disconnect entities. It doesn’t require my properties to be virtual – it can use INotifyPropertyChanged or do a full state comparison. And it’s gaining a "Code First" interface where I can configure my mappings via a fluent interface.
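For a taste of that fluent style, a sketch along the lines of the Code First preview bits (the context class and property names are mine, and the API may still change before release):

using System.Data.Entity;

public class ShopContext : DbContext
{
    public DbSet<Person> People { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Mappings are configured in code instead of an EDMX model.
        modelBuilder.Entity<Person>()
            .HasKey(p => p.Id);
        modelBuilder.Entity<Employee>()
            .Property(e => e.EmployeeNumber)
            .IsRequired();
    }
}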
Expect more Entity Framework stuff in the future. NHibernate has definitely burned me for now, and these experiences don’t fit very well with an ORM that touts maturity as one of its advantages.