Saturday, January 31, 2009

Simple Database Change Log in SQL Server 2008

Another pesky bug surfaced just recently. After scrutinizing the logs I discovered that it was caused by a parameter mismatch between the stored procedure and our service. Someone had modified the procedure, probably for testing purposes. There's no way for us to find out who actually did it because the database objects are not under any source control. If it were a SQL Server database, one thing the DBA could have done was implement a basic change logging mechanism using a Data Definition Language (DDL) trigger.

A Data Definition Language (DDL) trigger is a code block that SQL Server executes after a certain event concerning specific objects. The complete list of events is found here. DDL triggers were first introduced in SQL Server 2005, but I wasn't able to delve much deeper into them then because I was pretty much concentrated on UDTs and the rest of the SQL-CLR feature set. The fact that I can create a DDL trigger in plain T-SQL, without compiling a CLR assembly, made the feature very enticing. The benefit I ultimately got from DDL triggers made me hate myself for not learning them when they first came out four years ago. They provide a simple way of tracking changes in my database. If you are part of a team where database security is very lenient (everyone is an admin), then this tool is indispensable.

My database change log is nothing more than a table populated by a generic DDL trigger. I say generic because it takes care of all the events from the most volatile database objects: tables, procedures, views, triggers and functions. A change in any of these could surely wreak havoc on the objects in the upper tiers. The change log table for storing the change information is shown below:

[Image: the change log table]

The columns mirror the elements found in the schema returned by EventData(). If a DDL trigger is analogous to a CLR event handler, then EventData() is its EventArgs. The schema of EventData() varies depending on the object and the event; you can check the complete schema here. The columns are the ones common to all the events of the volatile objects I just mentioned. Take note that EventData() is a context-sensitive function: it can only be utilized inside the body of a DDL trigger.
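For reference, a table covering those common elements might look like this (the table name, column names and sizes here are illustrative, not a prescription):

```sql
CREATE TABLE dbo.DatabaseChangeLog
(
    LogId        int IDENTITY(1,1) PRIMARY KEY,
    EventType    nvarchar(128) NOT NULL,  -- e.g. ALTER_PROCEDURE
    PostTime     datetime      NOT NULL,  -- when the event fired
    LoginName    nvarchar(256) NOT NULL,  -- who ran the command
    DatabaseName nvarchar(128) NOT NULL,
    SchemaName   nvarchar(128) NULL,
    ObjectName   nvarchar(128) NULL,
    ObjectType   nvarchar(128) NULL,      -- TABLE, PROCEDURE, VIEW...
    TSQLCommand  nvarchar(max) NULL       -- the full command text
);
```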

The trigger tracks all the pertinent events and inserts the data from EventData(). The code might look novel to anyone who hasn't worked with the XML data type.

[Image: the DDL trigger]

value() is a method of the SQL XML data type for navigating nodes. Its first argument uses XPath syntax; the second argument is the data type to which the result is converted. Don't concern yourself too much with the syntax. This is all you really need to know about the XML data type as far as DDL triggers are concerned.
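Put together, a generic trigger along these lines does the job. The trigger and table names are my illustrative choices; the DDL_*_EVENTS groups cover every event for the volatile objects mentioned earlier:

```sql
CREATE TRIGGER trgDatabaseChangeLog
ON DATABASE
FOR DDL_TABLE_EVENTS, DDL_PROCEDURE_EVENTS, DDL_VIEW_EVENTS,
    DDL_TRIGGER_EVENTS, DDL_FUNCTION_EVENTS
AS
BEGIN
    SET NOCOUNT ON;
    -- EVENTDATA() is only valid inside the body of a DDL trigger
    DECLARE @data xml = EVENTDATA();

    INSERT dbo.DatabaseChangeLog
        (EventType, PostTime, LoginName, DatabaseName,
         SchemaName, ObjectName, ObjectType, TSQLCommand)
    SELECT
        @data.value('(/EVENT_INSTANCE/EventType)[1]',    'nvarchar(128)'),
        @data.value('(/EVENT_INSTANCE/PostTime)[1]',     'datetime'),
        @data.value('(/EVENT_INSTANCE/LoginName)[1]',    'nvarchar(256)'),
        @data.value('(/EVENT_INSTANCE/DatabaseName)[1]', 'nvarchar(128)'),
        @data.value('(/EVENT_INSTANCE/SchemaName)[1]',   'nvarchar(128)'),
        @data.value('(/EVENT_INSTANCE/ObjectName)[1]',   'nvarchar(128)'),
        @data.value('(/EVENT_INSTANCE/ObjectType)[1]',   'nvarchar(128)'),
        @data.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]',
                    'nvarchar(max)');
END;
```

Each value() call plucks one node out of the EventData() XML and converts it to the column's type; that is the entirety of the XML gymnastics involved.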

After you catalogue the trigger, it appears in the Database Triggers node under your database as shown below:

[Image: the trigger under the Database Triggers node]

That's it! Try creating, altering and dropping the objects monitored by the trigger and watch the rows get inserted. Oh, did I mention that the entire command is also available to you? Just check the TSQLCommand column for that.

Be aware, though, that SQL Server generates temporary objects in some instances, such as when you alter your table. These temporary objects can also fire the trigger. They are easy to spot because they follow a consistent name format, Tmp_<objectName>; for example, Tmp_Employee.
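A query like this skips that noise when reviewing the log (assuming a log table named dbo.DatabaseChangeLog with the columns described earlier):

```sql
-- Recent changes, ignoring SQL Server's temporary Tmp_* objects
SELECT PostTime, LoginName, EventType, ObjectName, TSQLCommand
FROM dbo.DatabaseChangeLog
WHERE ObjectName NOT LIKE 'Tmp[_]%'  -- [_] escapes the underscore wildcard
ORDER BY PostTime DESC;
```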

One thing you will probably want is for your future databases to have this simple change logging mechanism out of the box. If that's the case, then do yourself a favor and create the two objects in the model database.
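Seeding model is just a matter of running the same two scripts in its context; every database created afterwards inherits whatever model contains:

```sql
USE model;
GO
-- Run the change log table and DDL trigger scripts here.
-- Every CREATE DATABASE from now on starts with both objects in place.
```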

To the SQL Server 2005 folks, hey I'm not leaving you out. We'll tackle 2005 next time. ;)

Friday, January 30, 2009

Removing Duplicate Rows The Lazy Way

There is an easy way to remove duplicate rows in a table, albeit not a space-efficient one (but hey, space is cheap). Removing duplicate rows, no matter how crazy it sounds, is a very common task in database maintenance and data warehousing. I once worked on an SSIS project that involved one big and dirty database. Unique key constraints were non-existent, making the tables very susceptible to duplicates. I was faced with a similar situation today when verifying the bugs claimed by the developers on the business objects. It turned out to be a data-related issue: there were duplicate rows because the database developers had failed to create unique indices on some of the business tables. Hunting for those rows can be done using complicated queries involving DISTINCT and temp tables, but they're too tedious. I always opt for the lazy approach instead. I call it such because it does not require that much thinking. Here are the steps:

1) Create a duplicate table to contain the pristine data. This can be done easily in SQL Server Management Studio through the Script Table as context menu item as shown below:

[Image: the Script Table as context menu]

2) Run the script but make sure you rename the table.

3) Create a unique index with the option to ignore duplicate keys. The column that defines this index is the one duplicated in the dirty table. Unfortunately, there is no click-click way to create an index with that option. You need to execute this command:

CREATE UNIQUE INDEX IX_NoDuplicates ON NoDuplicates (Code) WITH (IGNORE_DUP_KEY = ON)

where the name of the index is IX_NoDuplicates, the target table is NoDuplicates and the column defining it is Code.

4) The last step is to execute an INSERT-SELECT statement to copy the rows:

INSERT NoDuplicates (Code) SELECT Code FROM WithDuplicates

Voila! No more duplicates!
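End to end, the lazy approach boils down to this (WithDuplicates, NoDuplicates and Code are the illustrative names from the steps above, and the column type is a placeholder for whatever the real table uses):

```sql
-- 1-2) A pristine copy of the dirty table, normally generated
--      via Script Table as and then renamed
CREATE TABLE dbo.NoDuplicates (Code varchar(50) NOT NULL);

-- 3) Unique index that silently discards duplicate keys
CREATE UNIQUE INDEX IX_NoDuplicates
    ON dbo.NoDuplicates (Code)
    WITH (IGNORE_DUP_KEY = ON);

-- 4) Copy the rows; duplicates produce a warning, not an error,
--    and simply never land in the new table
INSERT dbo.NoDuplicates (Code)
SELECT Code FROM dbo.WithDuplicates;
```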

This particular option was not added to SQL Server solely for this purpose. There are instances when reporting a duplicate-key error does not contribute any value to the business; it merely lengthens development time. Take, for example, a simple reference table like EmploymentType. Does it really matter to the company if an HR personnel happens to enter the value "Contractual" a second time?

Sunday, January 25, 2009

A Cow From Hell

I've been ruminating on blogging for quite some time already, but I just couldn't gather myself to start one. By “start”, I mean writing my first entry. I created accounts a couple of times only to leave them dormant later on. I usually get rid of extra luggage, and those dormant blog accounts happened to be on top of the list. It was stupid and one awful way to waste time. It’s like trying to do something and undoing it the next minute. I live in that world – a world where a mistake can be easily rectified by pressing some key combination. Unfortunately, life is not like that. Some great things in life can be achieved only by engaging oneself in something that cannot be undone. Blogging definitely is not one of those, but it could somehow prove to myself that I have the resolve to start something trivial and actually sustain it. It’s a small step toward banishing the fear that is always there to dissuade me whenever I set focus on something “life-changing”. The circumstance even raises the ante, for I’m commencing blogging at the least opportune time. My colleague just left for India last Friday for a 3-week vacation with possible extension. This means more work on a dual role for me. A déjà vu perhaps? Naah! I’ve already been to the hell of software development and back. This one is just basking under the warm Florida sun. ;)