Using Command Pattern to capture language and intent for services


This is a follow-up to my previous post: http://rogeralsing.com/2013/12/01/why-mapping-dtos-to-entities-using-automapper-and-entityframework-is-horrible/

That post was about both the technical and conceptual problems you face when using Data Transfer Objects (DTOs) as the base of your service design.

In this post I will try to show you another way to model services, one that does not suffer from those problems on either plane.

First, let’s talk about language.
I want to capture language and intent when modelling services; that is, I want to see the same vocabulary that the domain experts on my team use reflected in my code.

Let’s reuse the same scenario as in my previous post:

//entities
class Order
{
    public int Id { get; set; } //auto inc id
    public string DeliveryAddress { get; set; }
    public ICollection<Detail> Details { get; set; }
}

class Detail
{
    public int Id { get; set; } //auto inc id
    public decimal Quantity { get; set; }
    public string ProductId { get; set; }
}

Maybe our domain experts have told us that we need to be able to do the following:

Place new orders.
Edit an existing order.
Add a product to an order.
Change the quantity of a product in an order.
Remove a product from an order.
Set a delivery address for an order.
Ship an order.
Cancel an order.

If you are used to thinking in terms of CRUD, you might look at this and think:
Well, “placing an order” is the same as “CreateOrder”.
And “editing an existing order” is the same as “UpdateOrder”.
And “cancelling an order” is the same as “DeleteOrder”.
And I’ll solve the rest just by passing my DTO: let’s put a DeliveryAddress and a nullable ShippingDate property on Order and just copy the details from the DTO to the order.

OK, you will probably get away with it (for now), but the code will end up looking like the code in the previous post linked at the top of this one.
And most importantly, you will lose semantics and intention.

e.g. Does “order.ShippingDate = DateTime.Now” mean that the order is shipped now? That it is up for shipping?
What if we reset order.ShippingDate to null? Does that mean that the order has been returned to us?
Is that even a valid state transition?
Your code might be smallish, except for the weird change-tracking replication you have in your update service.

Wouldn’t it be nice if we instead could capture valid state transitions and the domain experts language at the same time?
What if we do something like this:

public int PlaceNewOrder()
{
    using (UoW.Begin())
    {
        var order = new Order();
        UoW.Add(order);
        UoW.Commit();
        return order.Id;
    }
}

public void AddProduct(int orderId, string productId, decimal quantity)
{
    using (UoW.Begin())
    {
        var order = UoW.Find(o => o.Id == orderId);
        if (order.IsCancelled)
            throw new Exception("Can not add products to cancelled orders");

        var detail = new Detail
        {
            ProductId = productId,
            Quantity = quantity,
        };
        order.Details.Add(detail);
        UoW.Commit();
    }
}

public void ShipOrder(int orderId,DateTime shippingDate)
{
  ...
}

This is of course a very naive example, but still, we managed to capture the language of the domain experts, and even if this makes our code more verbose, it captures explicit state transitions in a very clean way.
Someone new to the project can open up this code and figure out what is going on in each individual method.
e.g. you will most likely not misunderstand the intention of a method called “ShipOrder” that takes an argument called shippingDate.

So far so good, but from a technical point of view this code is horrible: we can’t expose a service interface this fine-grained over the wire.
This is where the command pattern comes in.

Let’s redesign our code to something along the lines of:

public static class OrderService
{
    public static void Process(IEnumerable<Command> commands)
    {
        var ctx = new MyContext();
        Order order = null;
        foreach (var command in commands)
            order = command.Execute(ctx, order);
        ctx.SaveChanges();
    }
}

public abstract class Command
{
    public abstract Order Execute(MyContext container,Order order);
}

public class CreateNewOrder : Command
{
    public override Order Execute(MyContext container, Order order)
    {
        order = new Order();
        container.OrderSet.Add(order);
        return order;
    }
}

public class EditExistingOrder : Command
{
    public int OrderId { get; set; }
    public override Order Execute(MyContext container, Order order)
    {
        order = container.OrderSet.Single(o => o.Id == OrderId);
        return order;
    }
}

public class SetDeliveryAddress : Command
{
    public string DeliveryAddress { get; set; }

    public override Order Execute(MyContext container, Order order)
    {
        order.DeliveryAddress = DeliveryAddress;
        return order;
    }
}

public class RemoveProduct : Command
{
    public string ProductId { get; set; }

    public override Order Execute(MyContext container, Order order)
    {
        var detail = order.Details.Single(d => d.ProductId == ProductId);
        order.Details.Remove(detail);
        return order;
    }
}

public class AddProduct : Command
{
    public string ProductId { get; set; }
    public decimal Quantity { get; set; }

    public override Order Execute(MyContext container, Order order)
    {
        var detail = new Detail
        {
            ProductId = ProductId,
            Quantity = Quantity,
        };
        order.Details.Add(detail);
        return order;
    }
}
...etc

So, what happened here?
We have created a single service operation, Process, that receives a sequence of commands.
Each command is modeled as a subclass of “Command”.
This enables us to send a sequence of commands and process them all in the same transaction.
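As a usage sketch (the commands are the ones defined above; how they get serialized over the wire is up to you), a client could compose a batch like this and have it processed in one transaction:

var commands = new Command[]
{
    new CreateNewOrder(),
    new SetDeliveryAddress { DeliveryAddress = "Main Street 1" },
    new AddProduct { ProductId = "Banana", Quantity = 2m },
    new AddProduct { ProductId = "Apple", Quantity = 5m },
};

OrderService.Process(commands);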

We have captured language, intentions, and each command can easily be tested.
And as a bonus, it doesn’t suffer from the weird mapping problems that the DTO approach suffers from.

Another nice aspect of this is that if we want to edit a few details of a large object, e.g. change the quantity of a single product in an order with 100 order details, we can send a single small command instead of the whole object graph.

So to sum it up, pros:

  • Captured domain language
  • Explicit state transitions
  • Network efficient
  • Testable
  • Easy to implement, no odd change-tracking problems like with the DTO approach
  • Extensible, we can easily add new commands to the existing pipeline without adding more service operations

Cons:

  • Verbose
  • Puts somewhat more responsibility on the client

The code in this example can of course be improved, we are using an anemic domain model right now, but the essence of the command pattern shows through.

I will post a more “correct” implementation in my next post.

That’s all for now.

O/R Mapping and domain query optimizations


One of the cons of O/R mapping is that the abstraction is a bit too high.
You write object-oriented code and easily forget about the performance problems that can follow.

Take this (somewhat naive) example:

class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from order in this.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();

        return total;
    }
}

For a given customer, we iterate over all the orders and all the details in those orders and calculate the sum of quantity multiplied by item price.
So far so good.

This will work fine as long as you have all the data in memory and the dataset is not too large, so chances are that you will not notice any problems with this code in your unit tests.

But what happens if the data resides in the database and we have 1000 orders with 1000 details each?
Now we are in deep s##t, for this code to work we need to materialize at least 1 (customer) + 1 000 (orders) + 1 000 × 1 000 (details) entities.
The DB needs to find those million-plus rows, the network needs to push them from the DB server to the app server, and the app server needs to materialize all of it.
Even worse, what if you have lazy load enabled and aren’t loading this data using eager load?
Then you will hit the DB with one round-trip per lazy-loaded collection, thousands of them… GL with that! :-)

So clearly, we cannot do this in memory, neither with lazy load nor eager load.

But what are the alternatives?
Make an ad hoc SQL query?
In that case, what happens to your unit tests?

Maybe we want to keep this code, but we want to execute it in the database instead.

This is possible if we stop being anal about “pure POCO” and “no infrastructure in your entities”.

Using a unit of work container such as https://github.com/rogeralsing/Precio.Infrastructure, we can rewrite the above code slightly:

class Customer
{
    ...
    public double GetOrderTotal()
    {
        var total = (from customer in UoW.Query<Customer>() //query the current UoW
                     where customer.Id == this.Id           //find the persistent record of "this"
                     from order in customer.Orders
                     from detail in order.Details
                     select detail.Quantity * detail.ItemPrice)
                    .Sum();

        return total;
    }
}

This code will run the query inside the DB if the current UoW is a persistent UoW.
If we use the same code in our unit tests with an in-memory UoW instance, it will still work, provided our customer is present in the in-memory UoW.
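For illustration, a test could look something like the sketch below; the InMemoryUnitOfWork type and the UoW.Begin overload are assumptions made up for this example, not the actual Precio.Infrastructure API:

//a minimal sketch, the infrastructure names are assumptions
using (UoW.Begin(new InMemoryUnitOfWork()))
{
    var customer = new Customer { Id = 1 };
    var order = new Order();
    order.Details.Add(new Detail { Quantity = 2, ItemPrice = 10 });
    customer.Orders.Add(order);
    UoW.Add(customer);

    var total = customer.GetOrderTotal(); //runs against the in-memory UoW
    //total == 20
}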

So the above modification reduces the number of materialized entities from a million-plus to a single scalar (we materialize one double in this case).

I don’t know about you, but I’d rather clutter my domain logic slightly and get a million times better performance than stay true to POCO and suffer from a broken app.

Entity Framework 4 Enum support in Linq


As many of you might know, Entity Framework 4 still lacks support for mapping enum properties.
There are countless more or less worthless workarounds, everything from exposing constant integers in a static class to make it look like an enum, to totally insane generics tricks with operator overloading.

None of those are good enough IMO; I want to be able to expose real enum properties and make Linq queries against those properties, so I’ve decided to fix the problem myself.

My approach will be to use Linq expression tree rewriting, using the ExpressionVisitor that now ships with .NET 4.
By using the ExpressionVisitor I can now clone an entire expression tree and replace any node in that tree that represents a comparison between a property and an enum value.

In order to make this work, the entities still need to have an O/R-mapped integer property, so I will rewrite the query from using the enum property and an enum constant to using the mapped integer property and a constant integer value; e.g. a predicate like order => order.Status == OrderStatus.Cancelled becomes the equivalent of order => order.eStatus == (int)OrderStatus.Cancelled.

For me this solution is good enough: I can make the integer property private and invisible from the outside.

Example

public class Order
{
    //the backing integer property that is mapped to the database.
    //note: the visitor below expects it to be named "e" + the name of the enum property
    private int eStatus { get; set; }

    //this is our unmapped enum property
    public OrderStatus Status
    {
        get { return (OrderStatus)eStatus; }
        set { eStatus = (int)value; }
    }

    //.....other code
}

This code is sort of iffy and it does violate some POCO principles, but it is still plain code, nothing magic about it.

So how do we get our Linq queries to translate from the enum property to the integer property?

The solution is far simpler than I first thought; using the new ExpressionVisitor base class, the following code makes it all work:

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

namespace Alsing.Data.EntityFrameworkExtensions
{
    public static class ObjectSetEnumExtensions
    {
        private static readonly EnumRewriterVisitor visitor = new EnumRewriterVisitor();

        private static Expression<Func<T, bool>> ReWrite<T>(this Expression<Func<T, bool>> predicate)
        {
            var result = visitor.Modify(predicate) as Expression<Func<T, bool>>;
            return result;
        }

        public static IQueryable<T> Where<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.Where(self, predicate.ReWrite());
        }

        public static T First<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.First(self, predicate.ReWrite());
        }
    }

    public class EnumRewriterVisitor : ExpressionVisitor
    {
        public Expression Modify(Expression expression)
        {
            return Visit(expression);
        }

        //strip away the Convert node the compiler wraps around enum members
        protected override Expression VisitUnary(UnaryExpression node)
        {
            if (node.NodeType == ExpressionType.Convert && node.Operand.Type.IsEnum)
                return Visit(node.Operand);

            return base.VisitUnary(node);
        }

        //redirect enum property access to the backing integer property,
        //which by convention is named "e" + the enum property name
        protected override Expression VisitMember(MemberExpression node)
        {
            if (node.Type.IsEnum)
            {
                var newName = "e" + node.Member.Name;
                var backingIntegerProperty = node.Expression.Type
                    .GetMember(newName, BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public)
                    .FirstOrDefault();

                return Expression.MakeMemberAccess(node.Expression, backingIntegerProperty);
            }

            return base.VisitMember(node);
        }
    }
}

The first class is an extension method class that hides the default “Where” and “First” extensions of IQueryable<T>.
The second class is the actual Linq Expression rewriter.

By including this and adding the appropriate using clause to your code, you can now make queries like this:

var cancelledOrders = myContainer.Orders.Where(order => order.Status == OrderStatus.Cancelled).ToList();

You can of course make more complex where clauses than that since all other functionality remains the same.
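For example, something like this sketch (assuming the Order entity also has an Id property) still goes through the rewriter:

var cancelledLargeOrders = myContainer.Orders
    .Where(order => order.Status == OrderStatus.Cancelled && order.Id > 1000)
    .ToList();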

This is all for now; I will make a follow-up on how to wrap this up in a Linq query provider so that you can use the standard Linq query syntax as well.

Hope this helps.

//Roger

Linq To Sql: POCO and Value Objects


Fetching POCO Entities and Value Objects using Linq To SQL

Linq To Sql supports neither POCO entities nor value objects when used as an O/R mapper.
What we can do instead is treat it as a simple auto-generated Data Access Layer.

By treating it as a DAL we can manually handle the data-to-object transformations in a type safe manner.
If we for example want to fetch a list of POCO Customers that also have an immutable Address value object associated with them,
we could use the following code to accomplish this:

//Poco prefix only used to distinguish between l2s and poco entities here
IList<PocoCustomer> FindCustomers(string name)
{
    var query = from customer in context.Customers
                where customer.Name == name
                select new PocoCustomer
                {
                    Id = customer.Id,
                    Name = customer.Name,
                    Address = new PocoAddress
                        (customer.AddressStreet,
                         customer.AddressZipCode,
                         customer.AddressCity)
                };

    return query.ToList();
}
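For completeness, the POCO side could look something like this sketch (the exact shape of these classes is an assumption, they are not part of the original code):

public class PocoCustomer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public PocoAddress Address { get; set; }
}

//immutable value object
public class PocoAddress
{
    public string Street { get; private set; }
    public string ZipCode { get; private set; }
    public string City { get; private set; }

    public PocoAddress(string street, string zipCode, string city)
    {
        Street = street;
        ZipCode = zipCode;
        City = city;
    }
}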

This approach is quite handy if you work with multiple data sources and don’t want to mix and match entities with different designs in the same domain.

I’m sure many will find this approach quite dirty, but I find it quite pragmatic;
You can be up and running with a clean domain model in just a few minutes and simply hide the Linq To Sql stuff behind your DAL classes.

This works extremely well if you are into the “new” Command Query Separation style of DDD.
You can use Linq To Sql to create typed transformations from your Query layer and expose those as services.

Personally I’ve grown a bit tired of standard O/R mapping frameworks, simply because they try to do too much.
There is a lot of magic going on; it’s hard to keep track of what gets loaded into memory and when you will hit the database.

If I’m required to use both a memory profiler and an O/R mapper profiler in order to use the framework successfully, then something is very wrong with the whole concept.

This dumbed-down DAL approach to Linq To Sql, however, makes the code quite explicit: you know when you hit the DB and what you get from it.
Sure, you lose features like dirty tracking that mappers generally give you, but that can be accomplished by applying a Domain Model Management framework on top of your POCO model.
Or maybe you just want to expose your objects as services and don’t care about those features.

[Edit]
In reply to Patrik’s comment:

If you go for Command Query Separation, you would only query the query layer, so you wouldn’t need to handle updates there.
And when it comes to writing data, you do that in the command layer; the commands carry the changes made in the GUI, so you wouldn’t need to “figure out” what has changed.
The commands will carry that information for you.

Tracking changes in the GUI could simply be done by storing snapshots of the view-specific data when you send a query.
Then pass a user modified projection together with the original snapshot to a command builder.
You could then submit the commands for processing.
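As a rough sketch of that flow (OrderView and BuildCommands are hypothetical names, and the commands are the ones from the command pattern post above):

//all names here are hypothetical, for illustration only
public static IEnumerable<Command> BuildCommands(OrderView original, OrderView modified)
{
    var commands = new List<Command> { new EditExistingOrder { OrderId = modified.Id } };

    if (original.DeliveryAddress != modified.DeliveryAddress)
        commands.Add(new SetDeliveryAddress { DeliveryAddress = modified.DeliveryAddress });

    //...compare the detail rows the same way and emit AddProduct/RemoveProduct commands

    return commands;
}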
[/Edit]

( hmmm, I somehow managed to turn a post about Linq To Sql into a rant about other O/R mappers, I usually do it the other way around :-) )

Linq To Sql: Dynamic Where Clause


Dynamic where clause using Linq To SQL:

Let’s say we need to implement a search method with the following signature:

IEnumerable<Customer> FindCustomers(string name, string contactName, string city)

The requirement is that you should be able to pass zero to three arguments to this method, and only apply a “where” criterion for the arguments that are not null.
We can use the following code to make it work:

IList<Customer> FindCustomers(string name, string contactName, string city)
{
    IQueryable<Customer> query = context.Customers;

    if (name != null)
        query = query.Where(customer => customer.Name == name);

    if (contactName != null)
        query = query.Where(customer => customer.ContactName == contactName);

    if (city != null)
        query = query.Where(customer => customer.City == city);

    return query.ToList();
}

This way we can pass different combinations of arguments to the method and it will still build the correct where clause that executes at database level.

Do note that this only works when the different criteria should be “AND”ed together, but it’s still pretty useful for use cases like the one above.
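For example, a call like this applies only the city criterion at the database level:

//name and contactName are null, so only the city predicate is added
var customersInLondon = FindCustomers(null, null, "London");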

Two flavors of DDD


I have been trying to practice domain driven design for the last few years.
During this time, I have learnt that there are almost as many ways to implement DDD as there are practitioners.

After studying a lot of different implementations I have seen two distinct patterns.

I call the first pattern “Aggregate Graph”:

When applying aggregate graphs, you allow members of one aggregate to have direct associations to another aggregate.
For example, an “Order” entity which is part of an “Order aggregate” might have a “Customer” property which leads directly to a “Customer” entity that is part of a “Customer aggregate”.

[figure: aggregate-graph]

According to Evans’ book this is completely legal: any member of an aggregate may point to the root of any other aggregate.
Evans is very clear on the matter that aggregate root identities are global while the identities of non-root entities are local to the aggregate itself.

The opposite pattern would be what I call “Aggregate Documents”:

Here the aggregates never relate _directly_ to other aggregate roots.
Instead, the associations may be designed as “snapshots”, where you store lightweight value object clones of the related aggregate roots.
An “Order” entity would have a “Customer” property which leads to a “CustomerSnapshot” value object instead of a Customer entity.
This way each aggregate instance becomes more of a free-floating document.

[figure: aggregate-document]
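A minimal sketch of the snapshot idea (the exact shape of these classes is an assumption):

//value object clone of a Customer aggregate root
public class CustomerSnapshot
{
    public int CustomerId { get; private set; }
    public string Name { get; private set; }

    public CustomerSnapshot(int customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }
}

public class Order
{
    public int Id { get; set; }
    //snapshot instead of a direct association to the Customer entity
    public CustomerSnapshot Customer { get; set; }
}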

Since I have been applying both of these patterns, I will try to highlight the pros and cons of them in the rest of this post.

Aggregate Graph

The Aggregate Graph pattern is the approach I used when I first started doing DDD and I think that it is the most common way to implement DDD.
Since I was an O/RM developer (NPersist) this felt very natural to me, I could design my object graph in our design tool and then draw a few boxes on top of it and claim that those were my aggregates.
I most often used eager load inside the aggregates and lazy load between aggregates, in order to avoid fetching the entire database when one aggregate instance was loaded.

This had a very nice “OOP” feel to it, I was working with objects and associations and I could ignore that there even was a database involved.

My “Repositories” were mere windows into my object graph; I could ask a repository to give me one or more aggregate roots, and from those objects I could pretty much navigate to any other object in the graph due to the spider-web nature of the aggregate graph.

[figure: repository-window]

The pros of this approach are that it is easy to understand and that you design your domain model just like any other class model.
It also works very well with O/R mappers; features like Lazy Load and Dirty Tracking make it all work for you.

However, there are a few problems with this approach too.
Firstly, Lazy Load in O/R mappers is an implicit feature; there is no way for a developer to know at what point he will trigger a round-trip to the database just by reading the code.
It always looks like you are traversing a fully loaded object graph while you are in fact not.
This often leads to severe performance problems if your development team doesn’t fully understand this.

I have seen reports on this kind of domain model where the implicit nature of Lazy Load caused some 700 round-trips to the database in a single web page.

This is what you get when you try to solve an explicit problem in an implicit way.

If you are going to use Lazy Load, make sure your team understands how it works and where you use it.

Another problem with this approach arises when you need to fill your entities with data from multiple sources.
Many of the applications I build nowadays rely on data from multiple sources; it could be a combination of services and internal databases.

When using Lazy Load to get related aggregates, there is no natural point where you can trigger calls to the other data sources and fill additional properties.
You will most likely have to hook into your O/R mapper in order to intercept a lazy load and call the services from there.
Nowadays, I mostly use the second approach, Aggregate Documents.

Aggregate Document

The Aggregate Document approach is much more explicit in its design.
For example, if you want to find the orders for a specific customer, then instead of navigating the “Orders” collection of “Customer” you will have to call a “FindOrdersByCustomer” query on the “OrderRepository”.
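In code, that could look something like this sketch (the names follow the text, the exact signatures are assumptions):

public interface IOrderRepository
{
    Order GetById(int orderId);
    IList<Order> FindOrdersByCustomer(int customerId);
}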

While I do agree that this looks less object oriented than the first approach, this allows developers to reason about the code in a different way.
They can see important design decisions and hopefully avoid pitfalls like ripple loading.

Another benefit is that since you only work with islands of data, you can aggregate data from multiple sources much more easily.
You can simply let your repositories aggregate the data into your entities.
(Whether you do it inside the actual repository or let the repository use some data access class for it is up to you.)
[figure: repo-prism]
You don’t have to hook into any O/RM infrastructure since you no longer rely on lazy load between aggregates.

Personally I use eager load inside my aggregates; that is, I fetch “Order” and “Order Detail” together as a whole.
A side effect of this is that since I don’t use Lazy Load between aggregates and don’t use Lazy Load inside my aggregates, my need for O/R mapping frameworks drops.
I can apply this design without using a full-fledged O/R mapping framework.
I’m not saying that you should avoid O/R mapping, just that this pattern is much easier to apply if you can’t use an O/R mapper for some reason.

This also makes it easier to expose your domain model in an SOA environment.
You can easily expose your entities or DTO versions of them in a service.

Lazy Load and services don’t play that well together.

Maybe it looks like I dislike the first approach; this is not the case. I may very well consider it in a smaller project where there is just one data source and where the development team is experienced with O/R mapping.
You can also create hybrids of the two approaches;
e.g. in Jimmy Nilsson’s book “Applying Domain-Driven Design and Patterns” there are examples where an “Order” aggregate has a direct relation to the “Product” aggregate while the same “Order” aggregate uses snapshots instead of direct references to the “Customer” aggregate.

Snapshots also come with the benefit of allowing you to store historical data.
The snapshot can for example store both the CustomerId and the name of the customer at the time the order was placed.

That’s all for now.

//Roger

Entity Framework 4 – Using Eager Loading


When Linq To Sql was released we were told that it supported eager loading, which was a bit misleading: it did allow us to fetch the data we wanted upfront, but it did so by issuing one database query per object in the result set.
That is, one query per collection per object, which is a complete performance nightmare (ripple loading).

Now in Entity Framework 4, we can actually do true eager loading.
EF4 will issue a single query that fetches the data for all the objects in a graph.
This has been possible in other mappers for a long time, but I still think it is awesome that Microsoft has finally listened to the community and created a framework that, from what I’ve seen so far, does exactly what we want.

So how do you use eager loading in EF4?

Eager loading is activated by calling “ObjectSet<T>.Include("Details.Product")”, that is, by passing a dot-separated property path.
You can also call Include multiple times if you want to load different paths in the same query.
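For example, a query with two load spans could look like this (the property paths are the ones used in the extension method further down):

var orders = context.OrderSet
    .Include("Details.ProductSnapshot")
    .Include("CustomerSnapshot")
    .ToList();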

There are also a few attempts out in the blog world to try to make it easier to deal with eager loading, e.g. by trying to remove the untyped string and use lambda expressions instead.

I personally don’t like the lambda approach since you can’t traverse a collection property that way: there is no way to write “Orders.Details.Product” as a short and simple lambda.

My own take on this is to use extension methods instead.
I always use eager loading on my aggregates, so I want a simple way to tell my EF context to add the load spans for my aggregates when I issue a query.
(Aggregates are about consistency, and Lazy Load causes consistency issues within the aggregate, so I try to avoid that)

Here is how I create my extension methods for loading complete aggregates:

public static class ContextExtensions
{
    public static ObjectQuery<Order> AsOrderAggregate(this ObjectSet<Order> self)
    {
        return self
            .Include("Details.ProductSnapshot")
            .Include("CustomerSnapshot");
    }
}

This makes it possible to use the load spans directly on my context without adding anything special to the context itself.
(You can of course add this very same method inside your context if you want, I simply like small interfaces that I can extend from the outside)

This way, you can now issue a query using load spans like this:

var orders = from order in context.OrderSet.AsOrderAggregate()
             select order;

And if you want to make a projection query, you can simply drop the “AsOrderAggregate” and fetch what you want.
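Something like this sketch, for instance (the Detail property names are assumptions):

var totals = from order in context.OrderSet
             select new
             {
                 order.Id,
                 Total = order.Details.Sum(d => d.Quantity * d.ItemPrice)
             };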

HTH.

//Roger