Wednesday 12 April 2017

Caching Strategy For Compiled Expressions

You may or may not be aware that Activator.CreateInstance is comparatively slow. If you create objects this way a lot, it is worth using (cached) compiled expressions instead.

Now if you don't know what an Expression is, there are many more qualified people than myself who can explain the concept.  A good explanation by Jon Skeet can be found here:

Using compiled expressions to speed these operations up is well documented on Stack Overflow and commonly used.
A quick implementation might look something like this:
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

static ConcurrentDictionary<Type,Delegate> cache = new ConcurrentDictionary<Type,Delegate>();

static T Create<T>(){
    var constructor = (Func<T>) cache.GetOrAdd(typeof(T), CreateConstructorDelegate);
    return constructor.Invoke(); // Invoke is available because the delegate is a strongly typed Func<T>
}

static object Create(Type t){
    // Use the runtime type t as the cache key (there is no T in this overload)
    var constructor = cache.GetOrAdd(t, CreateConstructorDelegate);

    // Note we must use DynamicInvoke because the type isn't known at compile time
    return constructor.DynamicInvoke();
}

static Delegate CreateConstructorDelegate(Type t){
    var body = Expression.New(t);
    return Expression.Lambda(body).Compile();
}
The problem with this approach is the non-generic overload, which has to call DynamicInvoke. After timing it I found it was no faster than Activator.CreateInstance (in my runs it was actually slightly slower).

For my purposes this was unacceptable, so I dug around looking for a way to call Invoke without knowing the type at compile time. Unfortunately that isn't directly possible, but it got me thinking... (almost) every type in C# derives from the standard .NET System.Object!
A Func<T> can't simply be cast to Func<object> in general, though: the covariance of Func<T> only applies to reference types, so a Func<int> (or a Func of any other value type) is not a Func<object>, and I had to figure out something else.
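
To make that limitation concrete, here is a small sketch (the class and method names are mine, not part of the original code). On .NET 4 and later, Func<out TResult> is covariant for reference types, so a Func<string> does convert to Func<object>; the conversion fails for value types such as int, which is why it can't be relied on for arbitrary types.

```csharp
using System;

static class VarianceDemo
{
    // Hypothetical helper: true when the delegate can be treated as a Func<object>.
    public static bool IsFuncOfObject(Delegate d) => d is Func<object>;

    static void Main()
    {
        Func<string> makeString = () => "hello";
        Func<int> makeInt = () => 42;

        // Reference type: covariance lets a Func<string> be used as a Func<object>.
        Func<object> asObject = makeString;
        Console.WriteLine(asObject());

        // The same check at runtime, starting from an untyped Delegate.
        Console.WriteLine(IsFuncOfObject(makeString)); // reference type: convertible
        Console.WriteLine(IsFuncOfObject(makeInt));    // value type: not convertible
    }
}
```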

I found that by modifying the expression to return an object, I could then call Invoke!
Instead of returning the concrete type directly, I wrap the constructor call in a Convert expression (Expression.Convert) that casts the result to object first.
This changes the return type of the generated function to System.Object.

Paired with an effective caching and abstraction layer, all that would be required is to cast the return to the type that you are actually working with.  A generic wrapper method is perfect for this.
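
As a rough illustration of such a wrapper, here is a minimal sketch. The names ObjectFactory, CreateAs and BuildFactory are hypothetical, and StringBuilder is used only as a convenient type with a public parameterless constructor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Text;

static class ObjectFactory
{
    static readonly ConcurrentDictionary<Type, Func<object>> Cache =
        new ConcurrentDictionary<Type, Func<object>>();

    // Looks up (or compiles and caches) a Func<object> for the runtime type, then invokes it.
    public static object Create(Type t) => Cache.GetOrAdd(t, BuildFactory)();

    // Hypothetical generic wrapper: the caller names the type it expects back,
    // and the only extra work is a cast on the returned object.
    public static T CreateAs<T>(Type t) => (T)Create(t);

    static Func<object> BuildFactory(Type t)
    {
        // Convert the "new t()" result to object so the lambda compiles to Func<object>.
        var body = Expression.Convert(Expression.New(t), typeof(object));
        return Expression.Lambda<Func<object>>(body).Compile();
    }

    static void Main()
    {
        Type runtimeType = typeof(StringBuilder); // imagine this was only known at runtime
        StringBuilder sb = CreateAs<StringBuilder>(runtimeType);
        sb.Append("created");
        Console.WriteLine(sb.ToString());
    }
}
```

If the requested T does not match the runtime type, the cast throws an InvalidCastException, so the wrapper is only as safe as the caller's knowledge of the type.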

There is some extra memory overhead when value types (int, double, etc.) are cast to object, because they have to be boxed. To mitigate this, I use a separate cache for the generic overload, which returns T directly and does no boxing; only calls that pass a System.Type go through the Func<object> cache.

All it took was changing the expression creator to this:
static ConcurrentDictionary<Type,Func<object>> cache = new ConcurrentDictionary<Type,Func<object>>();

static ConcurrentDictionary<Type,Delegate> genericCache = new ConcurrentDictionary<Type,Delegate>();

static T Create<T>(){
    var constructor = genericCache.GetOrAdd(typeof(T), x => CreateConstructorDelegate<T>());
    return ((Func<T>)constructor).Invoke(); // Func<T> returns T directly, so value types are not boxed
}

static object Create(Type t){
    // Use the runtime type t as the cache key (there is no T in this overload)
    var constructor = cache.GetOrAdd(t, CreateConstructorDelegate);

    // Note we can now use Invoke because it is a strongly typed Func<object>!
    return constructor.Invoke();
}

static Func<T> CreateConstructorDelegate<T>(){
    var body = Expression.New(typeof(T));
    return Expression.Lambda<Func<T>>(body).Compile();
}

static Func<object> CreateConstructorDelegate(Type t){
    var ctor = Expression.New(t);

    // Note we now cast the new T to an object inside the function
    // allowing us to ensure that it returns a Func<object> on compile
    var body = Expression.Convert(ctor, typeof(object)); 
    return Expression.Lambda<Func<object>>(body).Compile();
}
After running the performance tests I found the overhead of casting to object was negligible, and the result was still roughly 16 times (about one order of magnitude) faster than using DynamicInvoke. Here are the timings for creating 1,000,000 objects, running in release mode without the debugger, ordered slowest to fastest:
Method                                         | Seconds
-----------------------------------------------|-----------
Non Generic Compiled Lambda (DynamicInvoke)    | 01.1585493
Activator.CreateInstance                       | 01.0153902
Compiled Generic Lambda Without Cast (Invoke)  | 00.0759568
Object Casted Compiled Expression (Invoke)     | 00.0726109
Natively Calling Constructor                   | 00.0050353
As you can see, the expression that casts to object performed on par with the generic Func that has no cast.
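
For anyone who wants to repeat the comparison, here is a rough timing sketch. It is not the harness used for the numbers above: Widget, TimingSketch and Time are hypothetical names, each candidate is wrapped in a Func<object> so one loop can time them all (which adds a small constant overhead per call), and absolute results will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

// Hypothetical sample type with a public parameterless constructor.
public sealed class Widget { }

static class TimingSketch
{
    // Runs the factory once to warm up, then times the given number of calls.
    public static TimeSpan Time(int iterations, Func<object> create)
    {
        create();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) create();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        const int N = 1000000;
        Type t = typeof(Widget);

        // Untyped delegate: can only be called through DynamicInvoke.
        Delegate untyped = Expression.Lambda(Expression.New(t)).Compile();

        // Typed delegate: the Convert node makes the lambda a Func<object>.
        Func<object> typed = Expression.Lambda<Func<object>>(
            Expression.Convert(Expression.New(t), typeof(object))).Compile();

        Console.WriteLine("DynamicInvoke:            " + Time(N, () => untyped.DynamicInvoke()));
        Console.WriteLine("Activator.CreateInstance: " + Time(N, () => Activator.CreateInstance(t)));
        Console.WriteLine("Func<object>.Invoke:      " + Time(N, typed));
        Console.WriteLine("new Widget():             " + Time(N, () => new Widget()));
    }
}
```

Run it in release mode without a debugger attached, as with the figures above, or the differences between the methods will be distorted.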

That's all for now
Until next time,
Liam Morrow
