Just saw this post today:
http://www.outofmemory.co.uk/entity-framework-5-dramatically-faster-in-net-4-5/
Kudos to the EF team for great progress :-)
One of the cons of O/R mapping is that the abstraction is a bit too high.
You write object-oriented code and often forget about potential performance problems.
Take this (somewhat naive) example:
class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from order in this.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();
        return total;
    }
}
For a given customer, we iterate over all the orders and all the details in those orders and calculate the sum of quantity multiplied by item price.
So far so good.
This will work fine as long as you have all the data in memory and the dataset is not too large, so chances are that you will not notice any problems with this code in your unit tests.
But what happens if the data resides in the database and we have 1000 orders with 1000 details each?
Now we are in deep s##t: for this code to work, we need to materialize at least 1 (cust) + 1000 (orders) * 1000 (details) entities.
The DB needs to find those 1 000 001 rows, the network needs to push them from the DB server to the app server, and the app server needs to materialize all of them.
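Even worse, what if you have lazy load enabled and aren’t loading this data using eager load?
Then you will hit the DB once for the orders and once more per order for its details, so over a thousand round trips on top of the million rows… GL with that! :-)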
So clearly, we cannot do this in memory, neither with lazy load nor with eager load.
But what are the alternatives?
Make an ad hoc SQL query?
In that case, what happens to your unit tests?
Maybe we want to keep this code, but we want to execute it in the database instead.
This is possible if we stop being anal about “pure POCO” or “no infrastructure in your entities”.
Using a unit of work container such as https://github.com/rogeralsing/Precio.Infrastructure, we can rewrite the above code slightly:
class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from customer in UoW.Query<Customer>() // query the current UoW
                     where customer.Id == this.Id           // find the persistent record of "this"
                     from order in customer.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();
        return total;
    }
}
This code will run the query inside the DB if the current UoW is a persistent UoW.
If we use the same code in our unit tests with an in-memory UoW instance, this code will still work, provided our customer is present in the in-memory UoW, that is.
So the above modification reduces what we materialize from 1 000 001 entities to a single value (we only materialize a double in this case).
I don’t know about you, but I’d rather clutter my domain logic slightly and get a million times better performance than stay true to POCO and suffer from a broken app.
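To make the unit-test side of this concrete, here is a minimal sketch of an in-memory unit of work. Everything except the Query<T>() call is an assumption made for illustration: the IUnitOfWork interface, the InMemoryUnitOfWork class, its Add method, the entity shapes, and passing the UoW as a parameter instead of reaching for an ambient one. It is not the actual Precio.Infrastructure API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface; only the Query<T>() call itself appears in the post.
public interface IUnitOfWork
{
    IQueryable<T> Query<T>() where T : class;
}

// Hypothetical in-memory implementation for unit tests: entities live in plain lists.
public class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly Dictionary<Type, List<object>> store = new Dictionary<Type, List<object>>();

    public void Add<T>(T entity) where T : class
    {
        List<object> list;
        if (!store.TryGetValue(typeof(T), out list))
        {
            list = new List<object>();
            store[typeof(T)] = list;
        }
        list.Add(entity);
    }

    public IQueryable<T> Query<T>() where T : class
    {
        List<object> list;
        return store.TryGetValue(typeof(T), out list)
            ? list.Cast<T>().AsQueryable()
            : Enumerable.Empty<T>().AsQueryable();
    }
}

// Simplified entity shapes, assumed for this sketch.
public class Detail { public int Quantity; public double ItemPrice; }
public class Order { public List<Detail> Details = new List<Detail>(); }

public class Customer
{
    public int Id;
    public List<Order> Orders = new List<Order>();

    // Same query as in the post, but the UoW is passed in explicitly.
    public double GetOrderTotal(IUnitOfWork uow)
    {
        return (from customer in uow.Query<Customer>()
                where customer.Id == this.Id
                from order in customer.Orders
                from detail in order.Details
                select detail.Quantity * detail.ItemPrice)
               .Sum();
    }
}
```

In a test you would Add the customer to an InMemoryUnitOfWork and call GetOrderTotal on it. A database-backed implementation of the same interface would hand the identical expression tree to its Linq provider, which is what lets the sum run inside the DB.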
I have blogged about this for quite a while now.
Now I’ve finally cleaned up the code and published it on GitHub: https://github.com/rogeralsing/Precio.Infrastructure
This is a small framework for UoW/Workspace support in .NET with Linq support.
The framework contains a Unit of Work implementation and providers for Entity Framework 4, NHibernate and MongoDB (using NoRM).
There is also a small, incomplete Blog sample project included.
I’ve started to build a Document DB emulator on top of SQL Server XML columns.
Sql Server XML columns can store schema free xml documents, pretty much like RavenDB or MongoDB stores schema free Json/Bson documents.
XML Columns can be indexed and queried using XPath queries.
So I decided to build an abstraction layer on top of this in order to achieve similar ease of use.
I’ve built a serializer/deserializer that deals with my own XML structure for documents (state + metadata) and also an early Linq provider for querying.
Executing the following code:
var ctx = new DocumentContext("main");
var customers = ctx.GetCollection<Customer>().AsQueryable();
var query = from customer in customers
            where customer.Address.City == "abc" && customer.Name == "Acme Inc5"
            orderby customer.Name
            select customer;
var result = query.ToList();
foreach (var item in result)
{
    Console.WriteLine(item.Name);
    Console.WriteLine(item.Address.City);
}
Will yield the following SQL + XPath query:
select *
from documents
where CollectionName = 'Customer' and
      ((documentdata.exist('/object/state/Address/object/state/City/text()[. = "abc"]') = 1) and
       (documentdata.exist('/object/state/Name/text()[. = "Acme Inc5"]') = 1))
order by documentdata.value('((/object/state/Name)[1])','nvarchar(MAX)')
The result of the query will be returned to the client and then deserialized into the correct .NET type.
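As a rough illustration of how such a translation could work, the sketch below turns a predicate built from == and && into the exist() filter shown above. It is not the provider's real code. The XPathPredicateSketch class, its method names, and the Customer and Address shapes are all assumptions, and it only handles property paths compared against string literals, with no escaping of the values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

// Assumed entity shapes, inferred from the query in the post.
public class Address { public string City { get; set; } }
public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

// Hypothetical helper; handles only "path == string literal" joined by &&.
public static class XPathPredicateSketch
{
    public static string Translate<T>(Expression<Func<T, bool>> predicate)
    {
        return TranslateNode(predicate.Body);
    }

    private static string TranslateNode(Expression node)
    {
        var binary = node as BinaryExpression;
        if (binary != null && binary.NodeType == ExpressionType.AndAlso)
            return "(" + TranslateNode(binary.Left) + " and " + TranslateNode(binary.Right) + ")";

        if (binary != null && binary.NodeType == ExpressionType.Equal)
        {
            var member = binary.Left as MemberExpression;
            var constant = binary.Right as ConstantExpression;
            if (member != null && constant != null && constant.Value is string)
            {
                // NOTE: the value is not escaped; a real provider must escape or parameterize it.
                return "(documentdata.exist('" + BuildPath(member)
                     + "/text()[. = \"" + constant.Value + "\"]') = 1)";
            }
        }
        throw new NotSupportedException("This sketch only handles <property path> == <string literal> and &&.");
    }

    // customer.Address.City -> /object/state/Address/object/state/City
    private static string BuildPath(MemberExpression member)
    {
        var names = new Stack<string>();
        for (var current = member; current != null; current = current.Expression as MemberExpression)
            names.Push(current.Member.Name);

        var path = new StringBuilder();
        foreach (var name in names) // a Stack enumerates from the outermost member inward
            path.Append("/object/state/").Append(name);
        return path.ToString();
    }
}
```

Calling XPathPredicateSketch.Translate<Customer>(c => c.Address.City == "abc" && c.Name == "Acme Inc5") produces the two exist() checks joined with "and". The CollectionName filter and the order by clause would have to be added by the surrounding query builder.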
As many of you might know, Entity Framework 4 still lacks support for mapping enum properties.
There are countless more or less worthless workarounds, everything from exposing integer constants in a static class to make it look like an enum, to totally insane generics tricks with operator overloading.
None of those are good enough IMO. I want to be able to expose real enum properties and make Linq queries against those properties, so I’ve decided to fix the problem myself.
My approach is Linq expression tree rewriting, using the ExpressionVisitor that now ships with .NET 4.
By using the ExpressionVisitor I can now clone an entire expression tree and replace any node in that tree that represents a comparison between a property and an enum value.
In order to make this work, the entities still need to have an O/R mapped integer property, so I will rewrite the query from using the enum property and enum constant to using the mapped integer property and a constant integer value.
For me this solution is good enough: I can make the integer property private so that it is invisible from the outside.
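To see what the rewriter has to deal with, it helps to look at the tree the C# compiler builds for an enum comparison: the enum operand gets wrapped in a Convert node to its underlying integer type. The sketch below inspects that node. The EnumTreeInspection class and the Pending and Shipped enum members are made up for the illustration; only OrderStatus.Cancelled appears in the post.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical enum members; the post only mentions Cancelled.
public enum OrderStatus { Pending, Shipped, Cancelled }

public class Order
{
    public OrderStatus Status { get; set; }
}

public static class EnumTreeInspection
{
    // Describes the left-hand side of "order.Status == OrderStatus.Cancelled".
    public static string DescribeLeftOperand()
    {
        Expression<Func<Order, bool>> predicate =
            order => order.Status == OrderStatus.Cancelled;

        var comparison = (BinaryExpression)predicate.Body;

        // The compiler emits Convert(order.Status) so the comparison happens on integers.
        var convert = (UnaryExpression)comparison.Left;

        return convert.NodeType + " from " + convert.Operand.Type.Name
             + " to " + convert.Type.Name;
    }
}
```

This is why a rewriter needs to handle two node kinds: the Convert node, which becomes unnecessary once the operand is an integer, and the member access underneath it, which is the part that gets swapped for the mapped integer property.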
Example
public class Order
{
    // this is the backing integer property that is mapped to the database;
    // it is named "e" + the enum property name so the rewriter can find it
    private int eStatus { get; set; }

    // this is our unmapped enum property
    public OrderStatus Status
    {
        get { return (OrderStatus) eStatus; }
        set { eStatus = (int) value; }
    }
    // ...other code
}
This code is sort of iffy and it does violate some POCO principles, but it is still plain code; there is nothing magic about it.
So how do we get our linq queries to translate from the enum property to the integer property?
The solution is far simpler than I first thought. Using the new ExpressionVisitor base class, I can use the following code to make it all work:
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

namespace Alsing.Data.EntityFrameworkExtensions
{
    public static class ObjectSetEnumExtensions
    {
        private static readonly EnumRewriterVisitor visitor = new EnumRewriterVisitor();

        private static Expression<Func<T, bool>> ReWrite<T>(this Expression<Func<T, bool>> predicate)
        {
            var result = visitor.Modify(predicate) as Expression<Func<T, bool>>;
            return result;
        }

        public static IQueryable<T> Where<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.Where(self, predicate.ReWrite());
        }

        public static T First<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.First(self, predicate.ReWrite());
        }
    }

    public class EnumRewriterVisitor : ExpressionVisitor
    {
        private const BindingFlags Flags =
            BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public;

        public Expression Modify(Expression expression)
        {
            return Visit(expression);
        }

        protected override Expression VisitUnary(UnaryExpression node)
        {
            if (node.NodeType == ExpressionType.Convert && node.Operand.Type.IsEnum)
            {
                // drop the conversion if the operand was rewritten to the target type,
                // otherwise convert whatever the operand is now to the target type
                var operand = Visit(node.Operand);
                return operand.Type == node.Type ? operand : Expression.Convert(operand, node.Type);
            }
            return base.VisitUnary(node);
        }

        protected override Expression VisitMember(MemberExpression node)
        {
            if (node.Type.IsEnum && node.Expression != null)
            {
                // look for the backing member named "e" + <enum property name>
                var newName = "e" + node.Member.Name;
                var backingIntegerProperty = node.Expression.Type.GetMember(newName, Flags)
                    .FirstOrDefault();
                // leave the node alone when there is no backing member (e.g. a captured variable)
                if (backingIntegerProperty != null)
                    return Expression.MakeMemberAccess(Visit(node.Expression), backingIntegerProperty);
            }
            return base.VisitMember(node);
        }
    }
}
The first class is an extension method class that replaces the default “Where” and “First” extensions of IQueryable of T.
The second class is the actual Linq Expression rewriter.
By including this and adding the appropriate using clause to your code, you can now make queries like this:
var cancelledOrders = myContainer.Orders.Where(order => order.Status == OrderStatus.Cancelled).ToList();
You can of course make more complex where clauses than that since all other functionality remains the same.
This is all for now. I will write a follow-up on how to wrap this up in a Linq query provider so that you can also use the standard Linq query syntax.
Hope this helps.
//Roger