Fast Deep Cloning
When you ask around how to implement deep cloning in .NET, the first, second and third recommendation you’ll hear is to use the BinaryFormatter to save your object tree to memory and then load it again, thereby creating a copy. While simple to implement, it’s incredibly slow, uses several times the memory of the entire object tree, and most implementations I’ve seen even require all cloned classes to be serializable.
With .NET’s reflection capabilities, it’s pretty easy to traverse an object tree and build a deep clone of it, so I wrote a small class that does just that. Then I got the idea to use Linq Expression Trees to compile the instructions required for cloning a type at runtime…
I implemented three different "cloners", all doing the exact same thing via different means. The only requirements were:
- A cloner needs to be able to clone any object, whether serializable or not
- A cloner must be able to clone objects that do not have a default constructor
- A cloner must support two modes: cloning objects by their fields and cloning objects by their properties
The SerializationCloner uses the BinaryFormatter from the .NET Framework, equipped with a surrogate selector that forces it to always use a surrogate for serialization. The surrogate serializes all of a type’s fields, similar to what is described in "A Generic Method for Deep Cloning in C# 3.0".
The ReflectionCloner traverses an object tree via .NET’s reflection API, creating copies of any object it encounters. Accessing types via reflection is pretty slow, but still about an order of magnitude faster than the BinaryFormatter. It also has the advantage of having no setup time and of working in Silverlight (and possibly the related .NET Compact Framework).
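To make the reflection approach concrete, here is a minimal sketch of a field-based deep clone. It is not the ReflectionCloner itself: the names `ReflectionCloneSketch` and `TreeNode` are made up for illustration, it assumes a true object tree (no cycles, shared references get duplicated), and it only handles one-dimensional arrays. It also assumes modern .NET, where `RuntimeHelpers.GetUninitializedObject` creates an instance without running a constructor; on the .NET Framework, `FormatterServices.GetUninitializedObject` is the counterpart.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sample type: note that it has no default constructor.
public sealed class TreeNode
{
    public readonly string Name;
    public TreeNode[] Children;

    public TreeNode(string name, params TreeNode[] children)
    {
        Name = name;
        Children = children;
    }
}

public static class ReflectionCloneSketch
{
    // Fields declared on one level of the class hierarchy, public or not.
    private const BindingFlags DeclaredInstanceFields =
        BindingFlags.Instance | BindingFlags.Public |
        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    public static T DeepClone<T>(T source)
    {
        return (T)CloneObject(source);
    }

    private static object CloneObject(object source)
    {
        if (source == null) return null;

        Type type = source.GetType();

        // Primitives, enums and strings are immutable, so they can be shared.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string))
            return source;

        if (type.IsArray)
        {
            var original = (Array)source;
            if (original.Rank != 1)
                throw new NotSupportedException("Sketch handles one-dimensional arrays only.");

            // Array.Clone() gives a shallow copy; each element is then deep-cloned.
            var copy = (Array)original.Clone();
            for (int i = 0; i < copy.Length; i++)
                copy.SetValue(CloneObject(original.GetValue(i)), i);
            return copy;
        }

        // Assumption: allocate without calling any constructor (modern .NET API).
        object clone = RuntimeHelpers.GetUninitializedObject(type);

        // Walk up the hierarchy so private fields of base classes are copied too.
        for (Type t = type; t != null; t = t.BaseType)
            foreach (FieldInfo field in t.GetFields(DeclaredInstanceFields))
                field.SetValue(clone, CloneObject(field.GetValue(source)));

        return clone;
    }
}
```

Calling `ReflectionCloneSketch.DeepClone(root)` on a `TreeNode` returns a new node whose `Children` array and child nodes are all fresh instances. Every field access goes through `FieldInfo.GetValue` and `FieldInfo.SetValue` on every call, which is where the reflection overhead comes from.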
The ExpressionTreeCloner is pretty advanced stuff. Since .NET 4.0, LINQ Expression Trees can be used to represent nearly any method or function as a tree of nodes. This cloner builds an expression tree to clone each type it encounters and then compiles it to a normal method which is kept in a cache. So once the cloner has encountered a type, cloning further instances of it is as fast as assigning the fields in code by hand.
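The build-compile-cache cycle described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the actual ExpressionTreeCloner: the names `ExpressionCloneSketch`, `Customer` and `Order` are hypothetical, value-type fields are copied by plain assignment (references nested inside structs stay shared), cyclic graphs are not handled, and `RuntimeHelpers.GetUninitializedObject` assumes modern .NET. Since `Expression.Assign` rejects read-only fields, the sketch falls back to `FieldInfo.SetValue` for those.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sample types, neither with a default constructor.
public sealed class Customer
{
    public string Name;
    public Customer(string name) { Name = name; }
}

public sealed class Order
{
    private readonly Customer customer;
    public int[] Quantities;
    public Order(Customer customer, int[] quantities) { this.customer = customer; Quantities = quantities; }
    public Customer Customer { get { return customer; } }
}

public static class ExpressionCloneSketch
{
    // One compiled cloner per type; building it is the "cold start" cost.
    private static readonly ConcurrentDictionary<Type, Func<object, object>> Cache =
        new ConcurrentDictionary<Type, Func<object, object>>();

    private static readonly MethodInfo CloneMethod =
        typeof(ExpressionCloneSketch).GetMethod(nameof(CloneObject), new[] { typeof(object) });
    private static readonly MethodInfo CreateMethod =
        typeof(RuntimeHelpers).GetMethod(nameof(RuntimeHelpers.GetUninitializedObject));
    private static readonly MethodInfo SetValueMethod =
        typeof(FieldInfo).GetMethod(nameof(FieldInfo.SetValue), new[] { typeof(object), typeof(object) });

    public static T DeepClone<T>(T source) { return (T)CloneObject(source); }

    public static object CloneObject(object source)
    {
        if (source == null) return null;
        Type type = source.GetType();
        if (IsAtomic(type)) return source;
        if (type.IsArray) return CloneArray((Array)source);
        return Cache.GetOrAdd(type, t => BuildCloner(t))(source);
    }

    // Simplification: strings and all value types are copied by assignment.
    private static bool IsAtomic(Type type) { return type.IsValueType || type == typeof(string); }

    private static Array CloneArray(Array source)
    {
        var copy = (Array)source.Clone();
        if (IsAtomic(source.GetType().GetElementType())) return copy;
        if (source.Rank != 1) throw new NotSupportedException("Sketch handles one-dimensional arrays only.");
        for (int i = 0; i < copy.Length; i++)
            copy.SetValue(CloneObject(source.GetValue(i)), i);
        return copy;
    }

    private static Func<object, object> BuildCloner(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
        ParameterExpression input = Expression.Parameter(typeof(object), "input");
        ParameterExpression source = Expression.Variable(type, "source");
        ParameterExpression clone = Expression.Variable(type, "clone");
        var body = new List<Expression>
        {
            Expression.Assign(source, Expression.Convert(input, type)),
            Expression.Assign(clone, Expression.Convert(
                Expression.Call(CreateMethod, Expression.Constant(type, typeof(Type))), type)),
        };
        for (Type t = type; t != null; t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(flags))
            {
                Expression value = Expression.Field(source, field);
                if (!IsAtomic(field.FieldType)) // recurse through the cache for nested objects
                    value = Expression.Convert(
                        Expression.Call(CloneMethod, Expression.Convert(value, typeof(object))),
                        field.FieldType);
                if (field.IsInitOnly) // read-only fields cannot be targets of Expression.Assign
                    body.Add(Expression.Call(Expression.Constant(field, typeof(FieldInfo)),
                        SetValueMethod, clone, Expression.Convert(value, typeof(object))));
                else
                    body.Add(Expression.Assign(Expression.Field(clone, field), value));
            }
        }
        body.Add(Expression.Convert(clone, typeof(object))); // the block's result
        return Expression.Lambda<Func<object, object>>(
            Expression.Block(new[] { source, clone }, body), input).Compile();
    }
}
```

The first `DeepClone` call for a given type pays for building and compiling the tree; later calls for that type only run the cached delegate, which for writable fields boils down to plain field assignments.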
As you can see, all of the cloners scale quite linearly and in a predictable manner. The expression tree cloner suffers a bit from the overhead of compiling expressions at runtime, but boy does it catch up once the number of clones exceeds 100. Given a large list (or deep object tree) with 10,000 objects to clone, the SerializationCloner would be busy for one and a half seconds whereas the ExpressionTreeCloner would hand you the finished clone after just
The ReflectionCloner is still useful because it works on Silverlight and possibly also on the .NET Compact Framework (I could only check it against Windows Phone 7 and the Xbox 360 via XNA). On the latter platforms, cloned objects need to supply a default constructor, though, because the methods required to pull new objects from thin air are only provided by the full framework.
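For readers wondering what pulling an object "from thin air" looks like, here is a small sketch. The text does not name the method the cloners rely on, so treat this as an assumption: on the full .NET Framework the usual candidate is `FormatterServices.GetUninitializedObject`, and the sketch uses its modern .NET counterpart `RuntimeHelpers.GetUninitializedObject`. The `Token` and `BlankObjectSketch` names are made up for the example.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose only constructor takes an argument.
public sealed class Token
{
    public int Value;
    public bool ConstructorRan;

    public Token(int value)
    {
        Value = value;
        ConstructorRan = true;
    }
}

public static class BlankObjectSketch
{
    // Allocates a zero-initialized instance without running any constructor,
    // so the type does not need a default constructor.
    public static T CreateBlank<T>() where T : class
    {
        return (T)RuntimeHelpers.GetUninitializedObject(typeof(T));
    }
}
```

`BlankObjectSketch.CreateBlank<Token>()` returns a `Token` whose fields are all at their default values and whose constructor never ran; a cloner then fills in the fields itself. Where no such method is available, creating the instance through a default constructor is the obvious fallback, which matches the restriction described above.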
The BinaryFormatter is definitely out. It has a setup time and there is no case where the ReflectionCloner wouldn’t mop the floor with it. On average, the ReflectionCloner is 6 times faster, while the ExpressionTreeCloner is more than 60 times faster (once it gets over its higher cold-start time).