Tuesday, June 30, 2009

OUTER JOIN in LINQ

I’m quite disappointed to find out that LINQ has an OrderByDescending() and the “VB-ish” ThenBy() but seems to forget a dedicated extension method for OUTER JOIN. Although not as straightforward as I wanted it to be, SelectMany offers a working solution.

image 

For illustration, let’s consider the very naive model below. A left join from Child to Parent should yield 2 rows, with the second row having null values on the Parent columns.

image 

Dragging the tables to your LINQ-to-SQL canvass produces the model below. From this we can issue our statements.

image

The solution uses “statement” syntax, not the SQL-like syntax you always see when someone touts about LINQ features. To help with the breakdown of the LINQ statement structure, I included the SQL syntax and highlight the equivalent parts.

image

Part labeled 1 is the projection statement with the LINQ version utilizing anonymous type. Notice that type inference compels me to convert the ParentId to nullable type. Not doing so would have led the compiler to use Int32 as the type for ParentId. If this happens, an exception would occur for the second row because the value assigned to that property would be null.

Part labeled 2 qualifies the join. The object from which SelectMany is called returns all the rows. This is exactly what we wanted with the Child table. Also notice that we made sure that the qualifier columns from both table have the same type. In order to achieve this, the Value property of the Nullable<Int32> is called from Child.ParentId. You can also achieve the same by converting the other side into nullable as shown here:

image

You also need to call the DefaultIsEmpty() on the right table, in this case the Parent table. If you don’t call this method, the compiler would just ignore those rows with null values in the Parent table, effectively reverting to an INNER JOIN. Weird isn’t it?

Running the small snippet gives exactly what we wanted.

image

1 comment:

  1. I was so silly. I can't believe I went through all the agonizing syntax analyisis and criticizing when there is actually a much shorter solution, served on a silver platter:

    var childrenAndParent = children.Select(c => new { c.ChildId, ChildName = c.Name, ParentId = c.ParentId, ParentName = c.Parent.Name });

    This is possible because when using LINQ-to-SQL, VS generates the association to the Parent class in the form of a property Child.Parent. Similarly, the Parent class has the associated rows Parent.Childs.

    ReplyDelete