Sunday, April 19, 2009

Using Table Data Type for Variable-length Parameterized Search

The table data type was introduced in SQL Server 2008 to address the community's clamor for an efficient way to insert and update multiple rows in the database. What used to be a task involving multiple calls to the database can now be achieved in one swoop. True to this objective, it’s no surprise that almost all discussion of table data type usage deals with this topic. Another use that is rarely discussed is how the table data type addresses one of the most volatile code constructs in a data-driven application – the variable-length parameterized search. This is the construct for a search where you allow the user to specify 1 to n values for filtering a certain column. The user interface usually resembles the one below:

[image: sample search filter user interface]

The interface above is obviously simplified. Most of the time, there is a filter for almost every column in the result. The assumptions here are that the application does not cache any fetched data and that the team has been living inside a cave so long they haven’t heard about LINQ yet. Before SQL Server 2008 and LINQ, there were two approaches to achieving this, each with its own pros and cons.

The first approach is maintenance-free but opens your database to possible SQL injection attacks. It involves a dynamic query composed of multiple strings. One of these strings is passed from the front-end application because it contains the values for the filter. The structure of the stored procedure may look like this:

CREATE PROCEDURE FilterDynaQuery @filterValues NVARCHAR(4000) AS
DECLARE @select NVARCHAR(2000)
	, @sort NVARCHAR(500)
	, @sql NVARCHAR(MAX)
SET @select = N'SELECT col1, col2, col3 FROM Table1' -- remaining columns elided
SET @sort = N' ORDER BY col1, col2, col3'
SET @sql = @select + @filterValues + @sort
EXEC sp_executesql @sql

The @filterValues argument must contain all the hardcoded values from the front end. The coder must make sure that, regardless of what the front end sends, the concatenation of all the components of the query is valid. The argument passed should conform to the construct " WHERE <column> IN (val1, val2, …, valn)", like the one shown below:
' WHERE col1 IN (1,5,8,4)'
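
Just to make the risk concrete, here is a minimal sketch of how the front end might compose that argument (the helper name and the hard-coded column are my own, not part of the original sample). Forcing every selection through int.Parse is the bare minimum of defense; concatenating raw user text into the string is exactly what opens the door to SQL injection.

using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class FilterHelper
{
    // Hypothetical helper: builds the @filterValues argument for FilterDynaQuery
    // from the values the user selected in the UI.
    public static string BuildFilterClause(IEnumerable<string> selectedValues)
    {
        // Parse every selection as an integer so arbitrary SQL text cannot sneak in.
        string[] ids = selectedValues
            .Select(v => int.Parse(v, CultureInfo.InvariantCulture).ToString())
            .ToArray();

        if (ids.Length == 0)
            return string.Empty;   // no selection means no WHERE clause at all

        return " WHERE col1 IN (" + string.Join(",", ids) + ")";
    }
}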

The second approach ensures the safety of the database but couples the code to the number of possible values in every filterable column. If the business adds or removes an allowable value for that column, then the signature of the stored procedure and the front-end code have to change as well. For example, if there are 3 allowable values for that column, you need 3 parameters for it. Here’s how the procedure would look:

CREATE PROCEDURE FilterExplicitParams @value1 INT, @value2 INT, @value3 INT AS
SELECT col1, col2, col3 FROM Table1 WHERE col1 IN (@value1, @value2, @value3) ORDER BY col1, col2, col3
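
To show the coupling from the caller's side, here is a rough sketch of front-end code invoking the procedure (the connection handling and the padding strategy are assumptions of mine, not part of the original sample). Whenever the business adds a fourth allowable value, both this method and the procedure signature have to change.

using System.Data;
using System.Data.SqlClient;

static class ExplicitParamsCaller
{
    // Hypothetical calling code for FilterExplicitParams.
    public static DataTable Filter(string connectionString, int[] selectedValues)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("FilterExplicitParams", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // The procedure has exactly three parameters, so unused slots are
            // padded here by repeating the first selection -- one of the awkward
            // compromises this approach forces on the caller.
            cmd.Parameters.AddWithValue("@value1", selectedValues[0]);
            cmd.Parameters.AddWithValue("@value2",
                selectedValues.Length > 1 ? selectedValues[1] : selectedValues[0]);
            cmd.Parameters.AddWithValue("@value3",
                selectedValues.Length > 2 ? selectedValues[2] : selectedValues[0]);

            DataTable result = new DataTable();
            new SqlDataAdapter(cmd).Fill(result);
            return result;
        }
    }
}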

If there are many possible values for the column to be filtered, this approach would surely bloat the code. We all know that the more code you write, the greater the chance for bugs and delays. Most coders, me included, would rather compromise security a bit in exchange for easily maintainable code. This is why the dynamic query approach is usually preferred.

The New Approach

The table data type gives you the benefits of both approaches without the compromises. However, be aware that RAD tools do not fully support queries with table-valued parameters, at least for now. In ASP.NET, you cannot use the SqlDataSource directly because you will not be able to fetch the schema of the result. ASP.NET requires the schema to configure the data control to which the data source is bound, which means executing the procedure using the dialog shown below. The problem is that this dialog cannot accept the .NET types mapped to a SQL Server table type.

[image: parameter values dialog for executing the procedure]

With ObjectDataSource this is not an issue because the tool doesn’t have to query the database for the schema. The quickest way to create an object for the result of our procedure is to use a typed DataSet. Yes, this somewhat misunderstood and easily abused component is a boon in this kind of situation in a data-centric application. The stored procedure, which will be discussed momentarily, becomes the source for the select command of the TableAdapter. If you’re new to typed DataSets, an excellent walkthrough can be found here.

For the sample application, we retrieve all the current employees from the AdventureWorks database. The tables involved are shown below.

[image: AdventureWorks tables involved in the query]

Since the data is temporal, I decided to create a view which would contain only the current set of employees.

CREATE VIEW HumanResources.CurrentEmployee AS
SELECT e.NationalIDNumber, c.LastName, c.FirstName, c.MiddleName,
d.DepartmentID, d.Name AS Department, e.EmployeeID
FROM HumanResources.EmployeeDepartmentHistory h
JOIN HumanResources.Department d ON h.DepartmentID = d.DepartmentID
JOIN HumanResources.Employee e ON h.EmployeeID = e.EmployeeID
JOIN Person.Contact c ON e.ContactID = c.ContactID
WHERE (h.EndDate IS NULL)

The stored procedure accepts a table-valued parameter of type IntId. This is just a one-column table type that can be used to filter an integer column, such as the DepartmentID column above. Filtering is achieved by joining the table-valued parameter to the table or view being filtered; in our case, the CurrentEmployee view.

CREATE TYPE IntId AS TABLE
(
[Id] INT NOT NULL PRIMARY KEY
)
GO
CREATE PROC FilterEmployeeByDepartment
@DepartmentIds IntId READONLY
AS
SET NOCOUNT ON

SELECT e.EmployeeID, e.NationalIDNumber, e.LastName
, e.FirstName, e.MiddleName, e.Department
FROM HumanResources.CurrentEmployee e
JOIN @DepartmentIds ids ON e.DepartmentID = ids.Id
GO

Examining the parameter of the Select command in the TableAdapter reveals that the .NET type mapped to the SQL table type is System.Object.

[image: TableAdapter parameter mapped to System.Object]

Don’t be dissuaded by what the editor seems to suggest. According to the documentation, the only types you can actually pass are DataTable, DbDataReader, and IEnumerable<SqlDataRecord>. But even without having read the documentation, you would probably have used DataTable anyway; it’s just too obvious. Among the three, IEnumerable<SqlDataRecord> still eludes me and I’m still struggling to get my first shot at it. I couldn’t care less; for now, DataTable is perfectly fine for the job.
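
For readers who prefer to see the plumbing without the typed DataSet, here is a minimal sketch of passing a DataTable to the procedure with plain ADO.NET (the class and method names are made up for illustration). The parameter is marked SqlDbType.Structured and its TypeName points to the IntId table type created earlier.

using System.Data;
using System.Data.SqlClient;

static class CurrentEmployeeQuery
{
    // Hypothetical helper: calls FilterEmployeeByDepartment with a table-valued parameter.
    public static DataTable GetByDepartments(string connectionString, int[] departmentIds)
    {
        // Build a DataTable whose shape matches the IntId table type (one int column).
        DataTable ids = new DataTable();
        ids.Columns.Add("Id", typeof(int));
        foreach (int id in departmentIds)
            ids.Rows.Add(id);

        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("FilterEmployeeByDepartment", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            SqlParameter p = cmd.Parameters.AddWithValue("@DepartmentIds", ids);
            p.SqlDbType = SqlDbType.Structured;   // table-valued parameter
            p.TypeName = "dbo.IntId";             // the table type created earlier

            DataTable result = new DataTable();
            new SqlDataAdapter(cmd).Fill(result);
            return result;
        }
    }
}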

Setting up the ObjectDataSource with a typed DataSet is fairly easy. You can use the TableAdapter directly, as shown below. In the next step, Visual Studio lets you specify which method is mapped to which CRUD operation. In the sample application, the Get method of the adapter is used for the SELECT and the rest of the CRUD methods are just left blank. The argument for the parameter is set to get its value from the Session.

[image: ObjectDataSource configuration wizard using the TableAdapter]

The rows from the HumanResources.Department table have to be fetched too. In the sample application, a SqlDataSource is used for this. A list box is bound to that source, while a grid view is bound to the ObjectDataSource discussed above. No code is necessary to populate the list box. As for the grid view, a dummy (empty) table is supplied during page load because the ObjectDataSource does not yet support passing null for a table-valued parameter. In the Find button's click handler, the table is populated accordingly before it is handed to the data source.

protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        // Supply an empty filter table so the ObjectDataSource has something to pass.
        DataTable tbl = CreateDeptFilterTable();
        Session["DepartmentIds"] = tbl;
    }
}

protected void btnFind_Click(object sender, EventArgs e)
{
    DataTable tbl = CreateDeptFilterTable();
    GridView1.Caption = "Employee(s) from ";

    // Add a row to the filter table for every department selected in the list box.
    foreach (ListItem item in ListBox1.Items)
    {
        if (item.Selected)
        {
            tbl.Rows.Add(item.Value);
            GridView1.Caption += (item.Text + " ");
        }
    }

    Session["DepartmentIds"] = tbl;
    DataBind();
}

DataTable CreateDeptFilterTable()
{
    // One int column, matching the IntId table type.
    DataTable tbl = new DataTable();
    tbl.Columns.Add("id", typeof(int));
    return tbl;
}

Code

You may download the code here. If you don’t have the AdventureWorks database, you can download it here. Be sure to change the connection string before you run the application.

Friday, April 10, 2009

SqlUserDefinedAggregate.IsNullIfEmpty: Does it work?

No. At least not for me. I haven’t made this one work at all, and I always have to resort to a dirty workaround. According to the MSDN documentation, setting this property to true should direct your aggregate to return SQL null when applied to an empty table, which is exactly what you want. However, this is not the case. Let’s take a look at this naive example of a user-defined aggregate (UAG) which simply concatenates the string values of a column:

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
    Format.UserDefined, MaxByteSize = 8000, IsNullIfEmpty = true)]
public struct ConcatStr : IBinarySerialize
{
    SqlString _concatenatedStrs;

    public void Init()
    {
        _concatenatedStrs = SqlString.Null;
    }

    public void Accumulate(SqlString value)
    {
        if (value.IsNull)
            return;

        // Seed on the first value; concatenate afterwards.
        _concatenatedStrs = _concatenatedStrs.IsNull
            ? value
            : _concatenatedStrs + value;
    }

    public void Merge(ConcatStr group)
    {
        SqlString other = group.Terminate();
        if (!other.IsNull)
            _concatenatedStrs = _concatenatedStrs.IsNull
                ? other
                : _concatenatedStrs + other;
    }

    public SqlString Terminate()
    {
        return _concatenatedStrs;
    }

    #region IBinarySerialize Members

    void IBinarySerialize.Read(System.IO.BinaryReader r)
    {
        _concatenatedStrs = new SqlString(r.ReadString());
    }

    void IBinarySerialize.Write(System.IO.BinaryWriter w)
    {
        if (!_concatenatedStrs.IsNull)
            w.Write(_concatenatedStrs.Value);
    }

    #endregion
}

There’s nothing fancy about the code and its implementation is pretty much straightforward. The quirk is, you don’t get a NULL if you apply it to an empty table, as you can see here:

[image: aggregate applied to an empty table does not return NULL]

The workaround is simple. You need a flag you can check to determine whether the Accumulate() method has been called or not. Accumulate() is called for every row in a table, so no call means the table is empty. Here’s the modified code for our string aggregator:

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
    Format.UserDefined, MaxByteSize = 8000, IsNullIfEmpty = true)]
public struct ConcatStr : IBinarySerialize
{
    SqlString _concatenatedStrs;
    bool _isEmpty;

    public void Init()
    {
        _concatenatedStrs = SqlString.Null;
        _isEmpty = true;
    }

    public void Accumulate(SqlString value)
    {
        if (value.IsNull)
            return;

        _concatenatedStrs = _concatenatedStrs.IsNull
            ? value
            : _concatenatedStrs + value;
        _isEmpty = false;
    }

    public void Merge(ConcatStr group)
    {
        SqlString other = group.Terminate();
        if (other.IsNull)
            return;

        _concatenatedStrs = _concatenatedStrs.IsNull
            ? other
            : _concatenatedStrs + other;
        _isEmpty = false;
    }

    public SqlString Terminate()
    {
        return _isEmpty ? SqlString.Null : _concatenatedStrs;
    }

    #region IBinarySerialize Members

    void IBinarySerialize.Read(System.IO.BinaryReader r)
    {
        _isEmpty = r.ReadBoolean();
        _concatenatedStrs = _isEmpty
            ? SqlString.Null
            : new SqlString(r.ReadString());
    }

    void IBinarySerialize.Write(System.IO.BinaryWriter w)
    {
        // The flag must be serialized too; otherwise it reverts to its Init() value.
        w.Write(_isEmpty);
        if (!_isEmpty && !_concatenatedStrs.IsNull)
            w.Write(_concatenatedStrs.Value);
    }

    #endregion
}

Don’t forget to serialize the flag. Otherwise its value isn’t preserved and it will always hold whatever was set during initialization. If you put a breakpoint in the Terminate() method, you can clearly see that this one works.


[image: breakpoint in Terminate() showing the flag value]

And the result is what you expected:
[image: the aggregate now returns NULL for an empty table]

The same workaround can be applied whatever the data type of your return value, be it a value type or a UDT. For example, if you don’t apply a workaround like the one below to an integer UAG, you get zero instead of NULL for an empty table!

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
    Format.Native, IsNullIfEmpty = true)]
public struct SumInts
{
    int _accumulator;
    bool _isEmpty;

    public void Init()
    {
        _accumulator = 0;
        _isEmpty = true;
    }

    public void Accumulate(SqlInt32 value)
    {
        if (value.IsNull)
            return;

        _accumulator += value.Value;
        _isEmpty = false;
    }

    public void Merge(SumInts group)
    {
        SqlInt32 other = group.Terminate();
        if (other.IsNull)
            return;

        _accumulator += other.Value;
        _isEmpty = false;
    }

    public SqlInt32 Terminate()
    {
        return _isEmpty ? SqlInt32.Null : new SqlInt32(_accumulator);
    }
}

I still hope to be proven wrong about these findings. But until then, I just have to be content with this dirty code. What’s important is that it works.