Saturday, February 7, 2009

Pathetic LINQ Programming

One of the tasks I inherited from a colleague when he left for a vacation is the very mundane (and time-consuming) xcopy of source files from the DEV site to QA. That entails backing up the old versions of those files before replacing them with the new ones, just in case something breaks afterwards. It’s a very error-prone process. It’s easy to miss a file or two if the list is long. It’s also possible to back up the wrong file, which would prevent me from reverting to the previous working state. With my manager’s permission, I could definitely end this inefficiency with SVN or VSS, but I also thought it was a good time to flex those LINQ to Objects muscles. So I did, and the result is one pathetic utility program.

Steps

The steps are pretty straightforward. They are easier to understand when itemized together with the code. Here they are:

1) Get the list of the files from the source and destination folders.

These lists are compared with each other to find out which files are candidates for transfer and backup. IEnumerable<T> is a key component of LINQ to Objects: the standard query operators are extension methods on it (defined in System.Linq.Enumerable) and can be exploited for set-based operations. The good thing is, DirectoryInfo.GetFiles() returns a FileInfo[] array, and arrays implement IEnumerable<T>.

image
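A minimal sketch of this step might look like the following. The class name FolderListing and the parameter name folderPath are placeholders chosen for illustration, not names taken from the actual program.

```csharp
using System.Collections.Generic;
using System.IO;

static class FolderListing
{
    // Returns the files directly inside the given folder (subfolders are not searched).
    // GetFiles() returns FileInfo[], which already implements IEnumerable<FileInfo>,
    // so no conversion is needed before applying LINQ operators.
    public static IEnumerable<FileInfo> GetFiles(string folderPath)
    {
        return new DirectoryInfo(folderPath).GetFiles();
    }
}
```

Calling FolderListing.GetFiles once for the source folder and once for the destination folder gives the two lists that the remaining steps work on.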

2) Get the list of existing files.

The existing files are those common to the source and the destination. I need a LINQ to Objects statement with a restriction clause similar to SQL’s “WHERE sourceFileName IN (destinationFileName1, destinationFileName2, … destinationFileNameN)”, and IEnumerable<T> has a Where method for that.

image

Unlike SQL, which mainly deals with scalar values, LINQ deals with objects. It has to be told what to match and how to do the match when comparing. In the case of the file comparison, I am interested in matching only the file name at this point. The logic for this is contained in the IEqualityComparer, which is required by the overload I used in the Where statement.

image

The Equals() member dictates the logic of the comparison and specifies which property to compare. This one means that two FileInfo objects are equal if their names are the same. Had I omitted the IEqualityComparer in the Where statement above, .NET would have compared the two files using the default logic, which for FileInfo is object reference equality. That’s not what I wanted.
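As a sketch of the idea, a name-only comparer and the “IN”-style query could be written as below. It assumes the comparer is passed to Enumerable.Contains inside the Where predicate, and it assumes a case-insensitive name match (reasonable on Windows); both are assumptions, as is the helper class name ExistingFilesQuery.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Treats two FileInfo objects as equal when their file names match.
class FileNameEqualityComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // Assumption: names are compared case-insensitively.
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(FileInfo obj)
    {
        // Must be consistent with Equals: equal names give equal hash codes.
        return obj == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

static class ExistingFilesQuery
{
    // Source files whose name also appears in the destination folder.
    public static List<FileInfo> Find(IEnumerable<FileInfo> sourceFiles,
                                      IEnumerable<FileInfo> destinationFiles)
    {
        var byName = new FileNameEqualityComparer();
        return sourceFiles
            .Where(s => destinationFiles.Contains(s, byName))
            .ToList();
    }
}
```

The result holds FileInfo objects from the source list, which matters for the later steps that rely on reference equality.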

3) Get the list of files which are not updates.

These files may have been copied accidentally, with no real intention to deploy them. The logic is similar to step 2, except that it uses another IEqualityComparer that compares only the LastWriteTime property of the files.

image

image

The FileLastWriteTimeEqualityComparer dictates that two files should be treated as equal if their LastWriteTime properties are the same. On its own this is not always reliable: two totally different files can have exactly the same LastWriteTime. For my code, a better implementation would include another property, the file name perhaps. But since I’m applying the WHERE restriction to the existing files, I’m assured I get files only from the source folder.
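A sketch of a comparer along these lines, together with the query that uses it, is shown below. The helper class name SameVersionQuery is hypothetical, and the sketch carries the caveat described above: an existing file counts as "same version" if any destination file has the same LastWriteTime, regardless of name.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Treats two FileInfo objects as equal when their LastWriteTime values match.
class FileLastWriteTimeEqualityComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.LastWriteTime == y.LastWriteTime;
    }

    public int GetHashCode(FileInfo obj)
    {
        return obj == null ? 0 : obj.LastWriteTime.GetHashCode();
    }
}

static class SameVersionQuery
{
    // Existing (source) files whose timestamp matches some destination file.
    public static List<FileInfo> Find(IEnumerable<FileInfo> existingFiles,
                                      IEnumerable<FileInfo> destinationFiles)
    {
        var byTime = new FileLastWriteTimeEqualityComparer();
        return existingFiles
            .Where(e => destinationFiles.Contains(e, byTime))
            .ToList();
    }
}
```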

4) Get the list of update files.

For this step, I used the result from step 3 to perform something similar to SQL’s “WHERE existingFile NOT IN (sameVersionFile1, sameVersionFile2…)” restriction:

image

Notice that it uses an overload that does not require an IEqualityComparer. This time, it’s safe to do so because the items in existingFiles and sameVersions come from the same set: files from the source folder. Equality of object references also implies equality of the objects’ properties, so the FileNameEqualityComparer is redundant.
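One way to express this “NOT IN” restriction is sketched below, assuming the comparer-less Enumerable.Contains overload is the one in play. The class name UpdateFilesQuery is a placeholder. Because FileInfo does not override Equals, the default comparison is by reference, so this only works when both lists hold the same FileInfo instances.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class UpdateFilesQuery
{
    // Existing files that are not in the same-version list.
    // Contains() without a comparer falls back to reference equality for FileInfo.
    public static List<FileInfo> Find(IEnumerable<FileInfo> existingFiles,
                                      IEnumerable<FileInfo> sameVersions)
    {
        return existingFiles
            .Where(e => !sameVersions.Contains(e))
            .ToList();
    }
}
```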

5) Get the list of new files

New files are those found only in the source. Taking away the existing files from the set found in the source yields this list.

image
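"Taking away" one set from another maps naturally onto Enumerable.Except, so a sketch of this step could use it as below. Whether the actual program used Except or a negated Contains is not known; the class name NewFilesQuery is hypothetical. As in step 4, the default comparison is by reference, which is fine here because existingFiles was built from the same source FileInfo objects.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class NewFilesQuery
{
    // Source files that have no counterpart in the destination:
    // everything in the source minus the existing files.
    public static List<FileInfo> Find(IEnumerable<FileInfo> sourceFiles,
                                      IEnumerable<FileInfo> existingFiles)
    {
        return sourceFiles.Except(existingFiles).ToList();
    }
}
```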

6) Combine new files and update files.

These are the files that will be copied to the destination. The OR operator below means that a source file must be either new or an update to qualify.

image
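A sketch of the OR-based filter described here follows. The class name FilesToCopyQuery is a placeholder, and the comparer-less Contains calls again assume that all three lists share the same source FileInfo instances.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FilesToCopyQuery
{
    // A source file qualifies if it is new OR it is an update.
    public static List<FileInfo> Find(IEnumerable<FileInfo> sourceFiles,
                                      IEnumerable<FileInfo> newFiles,
                                      IEnumerable<FileInfo> updateFiles)
    {
        return sourceFiles
            .Where(s => newFiles.Contains(s) || updateFiles.Contains(s))
            .ToList();
    }
}
```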

7) Get the list of files to be backed up

The files to be backed up are the files in the destination that have the same names as the files in step 6.

image
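Since this step compares destination files against source files, reference equality no longer works and the match has to be by name. The sketch below does the name match inline with Any instead of a comparer class, purely to keep the example self-contained; the class name FilesToBackupQuery and the case-insensitive comparison are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FilesToBackupQuery
{
    // Destination files that are about to be overwritten by a file of the same name.
    public static List<FileInfo> Find(IEnumerable<FileInfo> destinationFiles,
                                      IEnumerable<FileInfo> filesToCopy)
    {
        return destinationFiles
            .Where(d => filesToCopy.Any(c =>
                string.Equals(c.Name, d.Name, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }
}
```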

8) Backup and copy

Need I say more?

image
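For completeness, a sketch of the final step using FileInfo.CopyTo is given below. The class name Deployer and the folder parameters are hypothetical. The backup runs first so the old versions are saved before anything in the destination is overwritten, and both lists are expected to be materialized (for example with ToList) before this is called, so that copying does not change the results of a lazy query.

```csharp
using System.Collections.Generic;
using System.IO;

static class Deployer
{
    public static void BackupAndCopy(IEnumerable<FileInfo> filesToBackup,
                                     IEnumerable<FileInfo> filesToCopy,
                                     string backupFolder,
                                     string destinationFolder)
    {
        // Make sure the backup folder exists; this is a no-op if it already does.
        Directory.CreateDirectory(backupFolder);

        // 1. Save the old versions from the destination.
        foreach (FileInfo file in filesToBackup)
        {
            file.CopyTo(Path.Combine(backupFolder, file.Name), true);
        }

        // 2. Copy new and updated files, overwriting where needed.
        foreach (FileInfo file in filesToCopy)
        {
            file.CopyTo(Path.Combine(destinationFolder, file.Name), true);
        }
    }
}
```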

Test

All this pathetic program needs is 3 folders like the ones below:

image

After running the pathetic program, Text1 should be copied to the Destination folder since it’s new. Both copies of Text2 should stay where they are. Text3 from Source should be copied to Destination since it’s an update. Finally, Text3 from Destination should be copied to the Backup folder. The following debug trace clearly shows these:

image

That’s it! Pathetic as it may seem, it spares me from staying late in the office. I can’t imagine what the code would have looked like without LINQ. I know it’s possible, but it would surely be messy and convoluted at best.
