When you ask around how to implement deep cloning in .NET, the first, second and third recommendation you’ll hear will be to use the BinaryFormatter to save your object tree to memory and then load it again, thereby creating a copy. While simple to implement, it’s incredibly slow, uses up several times the memory of the entire object tree and most implementations I’ve seen even require all cloned classes to be serializable.
With .NET’s reflection capabilities, it’s pretty easy to traverse an object tree and build a deep clone of it, so I wrote a small class that does just that. Then I got the idea to use LINQ Expression Trees to compile the instructions required for cloning a type at runtime…
"Cloners"
I implemented three different "cloners", all doing the exact same thing via different means. The only requirements were:
- A cloner needs to be able to clone any object, whether serializable or not
- A cloner must be able to clone objects that do not have a default constructor
- A cloner must be able to clone objects a) by their fields and b) by their properties
SerializationCloner
Uses the BinaryFormatter from the .NET Framework, equipped with a surrogate selector that forces it to always use a surrogate for serialization which serializes all of a type’s fields, similar to what is described in "A Generic Method for Deep Cloning in C# 3.0".
ReflectionCloner
A cloner that traverses an object tree via .NET’s reflection API, creating copies of any object it encounters. Accessing types via reflection is pretty slow, but still about an order of magnitude faster than the BinaryFormatter. It also has the advantage of having no setup time and working in Silverlight (and possibly the related .NET Compact Framework).
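To make the reflection approach concrete, here is a minimal sketch of a field-based deep clone. It is not the ReflectionCloner's actual source: the names ReflectionCloneSketch, Customer and Order are made up for illustration, and it assumes a runtime that offers RuntimeHelpers.GetUninitializedObject (on the classic .NET Framework, FormatterServices.GetUninitializedObject plays the same role). It only handles one-dimensional arrays and will overflow the stack on circular references.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sample types; Customer deliberately has no default constructor.
public sealed class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

public sealed class Order
{
    public int Id;
    public Customer Customer;
    public string[] Tags;
}

public static class ReflectionCloneSketch
{
    private const BindingFlags InstanceFields =
        BindingFlags.Instance | BindingFlags.Public |
        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    public static object DeepClone(object source)
    {
        if (source == null) return null;

        // Use the runtime type, not the declared type of the variable.
        Type type = source.GetType();

        // Primitives, enums and strings are immutable and can be shared.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string)) return source;

        if (type.IsArray)
        {
            Array original = (Array)source;
            if (original.Rank != 1)
                throw new NotSupportedException("Only one-dimensional arrays in this sketch.");
            Array copy = Array.CreateInstance(type.GetElementType(), original.Length);
            for (int i = 0; i < original.Length; i++)
                copy.SetValue(DeepClone(original.GetValue(i)), i);
            return copy;
        }

        // Allocate the object without running any constructor.
        object clone = RuntimeHelpers.GetUninitializedObject(type);

        // Walk the inheritance chain so private fields of base classes are copied too.
        for (Type t = type; t != null; t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(InstanceFields))
                field.SetValue(clone, DeepClone(field.GetValue(source)));
        }
        return clone;
    }
}
```

Calling ReflectionCloneSketch.DeepClone(order) on an Order returns a new Order whose Customer and Tags are fresh copies rather than shared references. Every field access goes through FieldInfo, which is where the reflection overhead comes from.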
ExpressionTreeCloner
This is pretty advanced stuff. As of .NET 4.0 (the C# 4.0 generation), LINQ Expression Trees can be used to represent any type of method or function as a tree of nodes. This cloner builds an expression tree to clone each type it encounters and then compiles it to a normal method which is kept in a cache. So once the cloner has encountered a type, every further cloning operation on that type is as fast as assigning the fields in code by hand.
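The build-compile-cache idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the ExpressionTreeCloner's real code: ExpressionCloneSketch and Node are hypothetical names, arrays and readonly fields are rejected (Expression.Assign refuses init-only fields), struct fields are copied by plain assignment, and circular references are not handled.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sample type with writable fields only.
public sealed class Node
{
    public int Value;
    public Node Next;
}

public static class ExpressionCloneSketch
{
    // One compiled cloner per runtime type, built on first use.
    private static readonly ConcurrentDictionary<Type, Func<object, object>> Cache =
        new ConcurrentDictionary<Type, Func<object, object>>();

    public static object DeepClone(object source)
    {
        if (source == null) return null;
        Type type = source.GetType();
        if (type.IsPrimitive || type.IsEnum || type == typeof(string)) return source;
        return Cache.GetOrAdd(type, BuildCloner)(source);
    }

    private static Func<object, object> BuildCloner(Type type)
    {
        if (type.IsArray) throw new NotSupportedException("Arrays are left out of this sketch.");

        ParameterExpression input = Expression.Parameter(typeof(object), "input");
        ParameterExpression original = Expression.Variable(type, "original");
        ParameterExpression clone = Expression.Variable(type, "clone");
        MethodInfo allocate = typeof(RuntimeHelpers).GetMethod(nameof(RuntimeHelpers.GetUninitializedObject));
        MethodInfo recurse = typeof(ExpressionCloneSketch).GetMethod(nameof(DeepClone));

        var body = new List<Expression>
        {
            // original = (T)input; clone = (T)GetUninitializedObject(typeof(T));
            Expression.Assign(original, Expression.Convert(input, type)),
            Expression.Assign(clone, Expression.Convert(
                Expression.Call(allocate, Expression.Constant(type, typeof(Type))), type))
        };

        for (Type t = type; t != null && t != typeof(object); t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(BindingFlags.Instance | BindingFlags.Public |
                                                    BindingFlags.NonPublic | BindingFlags.DeclaredOnly))
            {
                if (field.IsInitOnly)
                    throw new NotSupportedException("Readonly field: " + field.Name);

                Expression value = Expression.Field(original, field);
                if (!field.FieldType.IsValueType)
                {
                    // clone.f = (F)DeepClone((object)original.f);
                    value = Expression.Convert(
                        Expression.Call(recurse, Expression.Convert(value, typeof(object))),
                        field.FieldType);
                }
                body.Add(Expression.Assign(Expression.Field(clone, field), value));
            }
        }

        body.Add(Expression.Convert(clone, typeof(object))); // block result
        return Expression.Lambda<Func<object, object>>(
            Expression.Block(new[] { original, clone }, body), input).Compile();
    }
}
```

The first DeepClone call for a type pays for building and compiling the tree; later calls only run the cached delegate, which is where the cold start cost and the subsequent speed both come from.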
Results
As you can see, all of the cloners scale quite linearly and in a predictable manner. The expression tree cloner suffers a bit from the overhead of compiling expressions at runtime, but boy does it catch up once the number of clones exceeds 100. Given a large list (or deep object tree) with 10,000 objects to clone, the SerializationCloner would be busy for one and a half seconds whereas the ExpressionTreeCloner would hand you the finished clone after just 45 milliseconds.
The ReflectionCloner is still useful because it works on Silverlight and possibly also on the .NET Compact Framework (I could only check it against Windows Phone 7 and the Xbox 360 via XNA). On the latter platforms, cloned objects need to supply a default constructor, though, because the methods required to pull new objects from thin air are only provided by the full framework.
Cloning via BinaryFormatter is definitely out. It has a setup time and there is no case where the ReflectionCloner wouldn’t mop the floor with it. On average, the ReflectionCloner is 6 times faster, while the ExpressionTreeCloner is more than 60 times faster (once it gets over its higher cold start time).
Your reflection cloner fails on examples with circular references.
That’s right. Thanks, I hadn’t thought of those. The expression tree cloner should be affected as well.
I might update the code in the future, but at the moment I don’t have any circular references in the objects I’m using it with, so it’ll be a while until I get around to it.
Good post and comparisons.
But in my tests I’ve had the same problem with circular references. Cygon, have you found a solution? I tried changing the code and it resulted in a StackOverflowException, as expected…
It’s not so much a matter of looking for a solution as of me needing to sit down and write some boring code ;-)
What needs to be done is simply to pass a dictionary (with a ReferenceEquals-based comparer) around and use it to keep track of which reference type instances have already been cloned (and therefore need to be simply assigned instead of traversed).
The project I’ve written this component for currently has no circular references, but I’ll see if I can get around to adding support for them. I think it would be useful to have a fast and complete object cloning facility for C# instead of all those crutches out there :)
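The dictionary idea described above might look roughly like this. It is a minimal sketch, not code from the project: ReferenceComparer, CycleSafeCloneSketch and Person are hypothetical names, arrays are left out for brevity, and RuntimeHelpers.GetUninitializedObject is assumed to be available.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

// Compares keys by identity, ignoring any Equals/GetHashCode overrides.
public sealed class ReferenceComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y) { return ReferenceEquals(x, y); }
    int IEqualityComparer<object>.GetHashCode(object obj) { return RuntimeHelpers.GetHashCode(obj); }
}

// Hypothetical sample type that can form a cycle.
public sealed class Person
{
    public string Name;
    public Person Friend;
}

public static class CycleSafeCloneSketch
{
    public static object DeepClone(object source)
    {
        // Maps each original instance to the clone created for it.
        return Clone(source, new Dictionary<object, object>(new ReferenceComparer()));
    }

    private static object Clone(object source, Dictionary<object, object> visited)
    {
        if (source == null) return null;
        Type type = source.GetType();
        if (type.IsPrimitive || type.IsEnum || type == typeof(string)) return source;
        if (type.IsArray) throw new NotSupportedException("Arrays are left out of this sketch.");

        bool isReferenceType = !type.IsValueType;
        object known;
        if (isReferenceType && visited.TryGetValue(source, out known))
            return known; // already cloned: assign the existing clone instead of traversing again

        object clone = RuntimeHelpers.GetUninitializedObject(type);

        // Register the clone BEFORE descending into its fields, so that a field
        // pointing back at 'source' finds this entry instead of recursing forever.
        if (isReferenceType) visited.Add(source, clone);

        for (Type t = type; t != null; t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(BindingFlags.Instance | BindingFlags.Public |
                                                    BindingFlags.NonPublic | BindingFlags.DeclaredOnly))
                field.SetValue(clone, Clone(field.GetValue(source), visited));
        }
        return clone;
    }
}
```

Besides preventing the stack overflow, the same dictionary keeps reference identity intact: two fields that pointed at one object in the original point at one shared clone in the copy.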
I have tried to use the ReflectionCloner in Silverlight, but it’s giving an error: “FormatterServices inaccessible due to its protection level”.
Can you please tell me of any workaround?
That error you’ve been getting has pointed you to the following 5 lines:
Have you tried the most obvious thing, aka just adding SILVERLIGHT to the #if? ;)
I changed the ReflectionCloner to handle circular references. It wasn’t much work really, maybe 45 mins of coding and testing. I’d like to tackle the expression tree cloner next, but the code is much harder to follow. I can post my change for the ReflectionCloner if you like.
Can you please add the implementation that solves circular references? By the way, very good code, thanks :)
Hi,
Download link does not seem to work.
Could you please email me the source code at yanik_bl@hotmail.com
Thank you very much!
Yan
Hi, could you give it one more try?
A recent PHP update caused segfaults when my download plugin looked up the mime type of files. I’ve added a workaround for that and downloads should be working normally again. Sorry!
Hi Cygon,
I’ve taken the liberty to put a slightly tidied up version of this code up on GitHub (crediting you of course), and think it might be useful on NuGet too. I’ve created the nuspec and nuget packages in the repository as well. If you would like to upload it to the NuGet Gallery yourself please feel free to do so.
p/s: I didn’t copy the BinarySerialization part of the project, and excluded the demo.
I would recommend making the following change:
Func cloner = getOrCreateDeepFieldBasedCloner(/*typeof(TCloned)*/ objectToClone.GetType());
I commented out “typeof(TCloned)” and replaced it with “objectToClone.GetType()” so that the code does not choke when it is given something declared as object that is actually some specific type.
As a matter of fact, the current code blows up with a NullReferenceException if it is given an Object to clone, because GetFieldInfosIncludingBaseClasses checks whether the parent of its first parameter is Object, but does not check whether the first parameter itself is Object.
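The difference between the two lookups can be shown in isolation. In this small sketch TypeLookupSketch and its two methods are hypothetical helpers, not part of the cloner: typeof(T) reports the type the compiler inferred from the declaration, while GetType() reports what the object really is.

```csharp
using System;

public static class TypeLookupSketch
{
    // Resolved at compile time from the declared type of the argument.
    public static Type CompileTimeType<T>(T value)
    {
        return typeof(T);
    }

    // Resolved at run time from the actual instance (value must not be null).
    public static Type RuntimeType<T>(T value)
    {
        return value.GetType();
    }
}
```

For a variable declared as object but holding a string, CompileTimeType yields System.Object and RuntimeType yields System.String, so only the latter finds a cloner that knows about the instance's real fields.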
Hi,
thx a lot for this great inspiration with the expression trees, this actually saved my day.
I’ve modified the expression cloner and enabled
– circular references
– read only fields
– maintaining reference equality (multiple references to same object)
If you don’t mind I would like to publish that on CodePlex, with reference to you of course.
Tim,
Did you end up posting the code that addresses circular references somewhere?
Also, could anyone suggest a way to handle List<T> properties (where T may be an arbitrary type)? This does not seem to be handled by the present code.
Great article, Implemented in code and works.
Only one small bug to be wary of: if you have empty lists in your class, it throws an exception. It is OK if the list is null.
I’d like to see protobuf-net in that graph of yours ‘-)
I’ve written up a small benchmark, but I must be doing something wrong. I’ve reused the MemoryStream and didn’t include any setup code, but protobuf-net is completely off the charts (in a rather horrible way, that is):
My ExpressionTreeCloner could have run the benchmark 51,000 times over during that time :-/
I’ve uploaded my benchmark code here: http://pastebin.com/ySCXMmW9
Scratch that, I found my mistake: the MemoryStream wasn’t reset properly and protobuf-net must have accumulated more and more data.
New results:
So protobuf-net is about 1/3rd faster than my ReflectionCloner and ~5 times slower than my ExpressionTreeCloner. That’s pretty impressive!
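The MemoryStream pitfall mentioned above is easy to reproduce without any serializer. The following is only an illustration of the general problem, since the benchmark's actual bug is not shown here; StreamReuseSketch and WriteRepeatedly are hypothetical names.

```csharp
using System;
using System.IO;

public static class StreamReuseSketch
{
    // Writes 'payload' into one reused stream 'rounds' times and
    // returns how many bytes the stream holds afterwards.
    public static long WriteRepeatedly(byte[] payload, int rounds, bool resetBetweenRounds)
    {
        using (var stream = new MemoryStream())
        {
            for (int i = 0; i < rounds; i++)
            {
                if (resetBetweenRounds)
                {
                    // Rewind and truncate so every round starts from an empty stream.
                    stream.Position = 0;
                    stream.SetLength(0);
                }
                stream.Write(payload, 0, payload.Length);
            }
            return stream.Length;
        }
    }
}
```

Without the reset, five rounds of a 10-byte payload leave 50 bytes in the stream, so anything that reads the stream back has more to process each round. With the reset, the stream never holds more than one payload.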