O/R Mapping and domain query optimizations

One of the cons of O/R mapping is that the level of abstraction is a bit too high.
You write object-oriented code and easily forget about the performance problems it can cause.

Take this (somewhat naive) example:

class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from order in this.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();

        return total;
    }
}

For a given customer, we iterate over all the orders and all the details in those orders and calculate the sum of quantity multiplied by item price.
So far so good.

This will work fine as long as you have all the data in memory and the dataset is not too large, so chances are that you will not notice any problems with this code in your unit tests.

But what happens if the data resides in the database and we have 1000 orders with 1000 details each?
Now we are in deep s##t: for this code to work, we need to materialize at least 1 (customer) + 1000 (orders) * 1000 (details) entities, and that is before counting the 1000 order rows themselves.
The DB needs to find those 1 000 001 rows, the network needs to push them from the DB server to the app server, and the app server needs to materialize all of them.
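Even worse, what if you have lazy load enabled and aren’t loading this data using eager load?
Then you could hit the DB up to 1 000 001 times in the worst case (a mapper that lazy-loads whole collections gets away with roughly one query per collection, about 1002 here, but still materializes every row)… GL with that! :-)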

So clearly, we cannot do this in memory, neither with lazy load nor with eager load.
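How many round trips lazy loading really costs depends on how the mapper loads associations. The sketch below is a toy simulation, not a real O/R mapper: `FakeDb`, its `Load` method and `LazyLoadDemo` are hypothetical names, and it assumes each collection is fetched with one query the first time it is touched. It only counts simulated queries so the cost of the in-memory sum becomes visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database session that counts round trips.
public sealed class FakeDb
{
    public int RoundTrips { get; private set; }

    // Simulates one query that returns `count` rows built by `factory`.
    public List<T> Load<T>(int count, Func<int, T> factory)
    {
        RoundTrips++;
        return Enumerable.Range(0, count).Select(factory).ToList();
    }
}

public sealed class Detail
{
    public int Quantity = 1;
    public double ItemPrice = 2.0;
}

public sealed class Order
{
    private readonly Lazy<List<Detail>> _details;

    public Order(FakeDb db, int detailCount)
    {
        // Assumption: the whole Details collection is loaded on first access.
        _details = new Lazy<List<Detail>>(() => db.Load(detailCount, _ => new Detail()));
    }

    public List<Detail> Details => _details.Value;
}

public sealed class Customer
{
    private readonly Lazy<List<Order>> _orders;

    public Customer(FakeDb db, int orderCount, int detailCount)
    {
        _orders = new Lazy<List<Order>>(
            () => db.Load(orderCount, _ => new Order(db, detailCount)));
    }

    public List<Order> Orders => _orders.Value;

    // Same shape as the naive in-memory query from the post.
    public double GetOrderTotal() =>
        (from order in Orders
         from detail in order.Details
         select detail.Quantity * detail.ItemPrice).Sum();
}

public static class LazyLoadDemo
{
    public static (double Total, int RoundTrips) Run(int orders, int details)
    {
        var db = new FakeDb();
        var customer = db.Load(1, _ => new Customer(db, orders, details))[0];
        double total = customer.GetOrderTotal();
        return (total, db.RoundTrips);
    }
}
```

Under these assumptions `LazyLoadDemo.Run(10, 10)` reports 12 simulated queries (1 for the customer, 1 for its orders, 10 for the detail collections) and a total of 200. The number of queries grows with the number of orders, and every detail row is still materialized in the application.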

But what are the alternatives?
Write an ad hoc SQL query?
In that case, what happens to your unit tests?

Maybe we want to keep this code, but we want to execute it in the database instead.

This is possible if we stop being anal about “pure POCO” or “no infrastructure in your entities”.

Using a unit of work container such as https://github.com/rogeralsing/Precio.Infrastructure, we can rewrite the above code slightly:

class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from customer in UoW.Query<Customer>() //query the current UoW
                     where customer.Id == this.Id           //find the persistent record of "this"
                     from order in customer.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();

        return total;
    }
}

This code will run the query inside the DB if the current UoW is a persistent UoW.
If we use the same code in our unit tests with an in-memory UoW instance, it will still work, provided that our customer is present in the in-memory UoW.
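To make the in-memory case concrete, here is a minimal sketch of what such a unit of work could look like in a test. It is not the Precio.Infrastructure API: `IUnitOfWork`, `InMemoryUnitOfWork` and the constructor injection are assumptions made for illustration, whereas the post reaches the current UoW through `UoW.Query<Customer>()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction; the real framework's interface may look different.
public interface IUnitOfWork
{
    IQueryable<T> Query<T>() where T : class;
}

// In-memory implementation for unit tests: queries run as LINQ to Objects.
public sealed class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly List<object> _entities = new List<object>();

    public void Add(object entity) => _entities.Add(entity);

    public IQueryable<T> Query<T>() where T : class =>
        _entities.OfType<T>().AsQueryable();
}

public class OrderDetail
{
    public int Quantity { get; set; }
    public double ItemPrice { get; set; }
}

public class Order
{
    public List<OrderDetail> Details { get; } = new List<OrderDetail>();
}

public class Customer
{
    private readonly IUnitOfWork _uow; // assumption: injected instead of ambient

    public Customer(int id, IUnitOfWork uow)
    {
        Id = id;
        _uow = uow;
    }

    public int Id { get; }
    public List<Order> Orders { get; } = new List<Order>();

    // Same query as in the post, but against the injected unit of work.
    public double GetOrderTotal() =>
        (from customer in _uow.Query<Customer>()
         where customer.Id == this.Id
         from order in customer.Orders
         from detail in order.Details
         select detail.Quantity * detail.ItemPrice).Sum();
}

public static class UnitOfWorkDemo
{
    public static double Run()
    {
        var uow = new InMemoryUnitOfWork();
        var customer = new Customer(1, uow);

        var order = new Order();
        order.Details.Add(new OrderDetail { Quantity = 2, ItemPrice = 10.0 });
        order.Details.Add(new OrderDetail { Quantity = 1, ItemPrice = 5.0 });
        customer.Orders.Add(order);

        uow.Add(customer); // the customer must be registered, or the query finds nothing
        return customer.GetOrderTotal();
    }
}
```

`UnitOfWorkDemo.Run()` returns 25 here. If the `uow.Add(customer)` call is left out, the same query returns 0, which is the in-memory equivalent of the customer not being present in the UoW.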

So the above modification will reduce the number of materialized entities from 1 000 001 to 1 (and that one is just a double, not an entity).

I don’t know about you, but I’d rather clutter my domain logic slightly and get a million times better performance than stay true to POCO and suffer from a broken app.

UoW / NWorkspace with Linq support

I have blogged about this for quite a while now.
Now I’ve finally cleaned up the code and published it on GitHub: https://github.com/rogeralsing/Precio.Infrastructure

This is a small framework for UoW/Workspace support in .NET with Linq support.

The framework contains a Unit of Work implementation and providers for Entity Framework 4, NHibernate and MongoDB (using NoRM).
There is also a small, incomplete Blog sample project included.

Linq to SqlXml – GitHub

An early alpha of Linq to SqlXml is now available on GitHub: https://github.com/calyptus/linq-to-sqlxml

There is neither documentation nor setup scripts, so you’re on your own if you try it.
If you get it to run, be sure to add all indexes to the XML field: the primary index plus the secondary path, value and property indexes.

Linq to SqlXml: Projections

I’ve managed to add projection support to my Linq to Sql Server Xml column implementation.

Executing this Linq query:

var query = (from order in ctx.GetCollection<Order>().AsQueryable()
                where order.OrderTotal > 100000000
                where order.ShippingDate == null
                where order.OrderDetails.Sum(d => d.Quantity * d.ItemPrice) > 10
                select new
                {
                    OrderTotal = order.OrderDetails.Sum(d => d.ItemPrice * d.Quantity),
                    CustomerId = order.CustomerId,
                    Details = order.OrderDetails
                })
                .Take(5);

will yield this SQL + XQuery:

select top 5 Id,DocumentData.query(
'<object type="dynamic">
 <state>
  <OrderTotal type="decimal">
   {fn:sum( 
              for $A in /object[1]/state[1]/OrderDetails[1]/object/state[1] 
                      return ($A/ItemPrice[1] * $A/Quantity[1]))}  </OrderTotal>
  <CustomerId type="guid">
   {xs:string(/object[1]/state[1]/CustomerId[1])}
  </CustomerId>
  <Details type="collection">
   {/object[1]/state[1]/OrderDetails[1]/object}
  </Details>
 </state>
</object>') as DocumentData

from documents
where
CollectionName = 'Order'  and
(documentdata.exist('

/object/state[(fn:sum( 
        for $A in /object[1]/state[1]/OrderDetails[1]/object/state 
             return ($A/Quantity[1] * $A/ItemPrice[1])) > xs:decimal(10)) and

/object[1]/state[1]/ShippingDate[1][@type="null"] and
(/object[1]/state[1]/OrderTotal[1] > xs:decimal(100000000))]

')) = 1

Linq to SqlXML

I’m hacking along on my Document DB emulator on top of SQL Server XML columns.

I have some decent Linq support in place now.
The following query:

var query = from order in orders
            //must have status shipped
            where order.Status >= OrderStatus.Shipped      
            //must contain foo or bar products
            where order.OrderDetails.Any(d => d.ProductNo == "Foo" || d.ProductNo == "Bar")
            //must have an order total > 100
            where order.OrderDetails.Sum(d => d.ItemPrice * d.Quantity) > 100 
            select order;

will yield the following SQL + XQuery on SQL Server:

select *
from documents
where CollectionName = 'Order'  and 
--must have an order total > 100
(documentdata.exist('/object/state[(
     fn:sum( 
          for $A in OrderDetails[1]/object/state 
                return ($A/ItemPrice[1] * $A/Quantity[1])) > xs:decimal(100))]') = 1) and 
--must contain foo or bar products
(documentdata.exist('/object/state[OrderDetails[1]/object/state[((ProductNo[1] = "Foo") or 
 (ProductNo[1] = "Bar"))]]') = 1) and 
--must have status shipped
(documentdata.exist('/object/state[(Status[1] >= xs:int(2))]') = 1)
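To give a feel for how a predicate can end up as XQuery text like the above, here is a deliberately tiny sketch that walks a LINQ expression tree and emits a string. It is not the Linq to SqlXml implementation: `XQueryPredicateSketch` and its `Translate` method are hypothetical names, and it only handles property access on the lambda parameter, string and int constants, and a handful of binary operators. Enum comparisons, `Any` and `Sum` are left out.

```csharp
using System;
using System.Linq.Expressions;

public static class XQueryPredicateSketch
{
    // Turns a simple predicate lambda into an XQuery-like boolean expression.
    public static string Translate<T>(Expression<Func<T, bool>> predicate) =>
        Visit(predicate.Body);

    private static string Visit(Expression e)
    {
        switch (e)
        {
            case BinaryExpression binary:
                return "(" + Visit(binary.Left) + " " + Operator(binary.NodeType)
                     + " " + Visit(binary.Right) + ")";

            // d.ProductNo  ->  ProductNo[1]
            case MemberExpression member when member.Expression is ParameterExpression:
                return member.Member.Name + "[1]";

            case ConstantExpression text when text.Value is string s:
                return "\"" + s + "\"";

            case ConstantExpression number when number.Value is int i:
                return "xs:int(" + i + ")";

            default:
                throw new NotSupportedException(e.NodeType.ToString());
        }
    }

    private static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "=";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.AndAlso: return "and";
            case ExpressionType.OrElse: return "or";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}

// Minimal entity used only to build the example expression.
public class OrderDetail
{
    public string ProductNo { get; set; }
}
```

Calling `XQueryPredicateSketch.Translate<OrderDetail>(d => d.ProductNo == "Foo" || d.ProductNo == "Bar")` produces `((ProductNo[1] = "Foo") or (ProductNo[1] = "Bar"))`, the same shape as the product filter in the generated query. A real provider would also have to deal with parameters, escaping and type conversions.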
