Monday, March 10, 2014

Using the ANY operator

I'll be the first to admit that I don't know everything about T-SQL. But I do like to think that I've had broad enough exposure to know of the existence of certain things so that when their use is appropriate I know where in TFM to R.

Hit Any key to continue

Case in point: I was working with one of my developers recently who had an interesting problem. Of the columns that we were already capturing via CDC, he was only interested in a given change if any of a particular subset of those columns had changed. His initial approach was to use fn_cdc_has_column_changed, which was very slow (and the documentation even advises against using it for that very reason!).

By way of example, let's say that I have CDC set up against the Production.Product table in AdventureWorks2012, capturing changes for all columns in the table. But, for whatever reason, we have a process that only cares about changes to the Name, StandardCost, and ListPrice columns. If any (or all) of those columns are updated, we want to know. If none of those columns are updated, we don't want our query to return a row for that change.
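
If you want to follow along, here is a minimal sketch of how that setup might look. The role name and net-changes setting are my assumptions, not something dictated by the scenario: I pass NULL for the gating role, and I turn on net changes support because the query later on relies on the net changes function. Leaving the capture instance and column list at their defaults should give a capture instance named Production_Product that covers every column.

```sql
-- Sketch only: enable CDC on the database, then on Production.Product.
-- Assumes sysadmin/db_owner rights and that SQL Server Agent is running.
use AdventureWorks2012;

exec sys.sp_cdc_enable_db;

exec sys.sp_cdc_enable_table
      @source_schema        = N'Production'
    , @source_name          = N'Product'
    , @role_name            = null   -- assumption: no gating role
    , @supports_net_changes = 1;     -- needed for fn_cdc_get_net_changes_...
```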

I came up with a solution involving fn_cdc_get_column_ordinal and fn_cdc_is_bit_set that I thought was a little clever. Plus, it marks the first time that I've ever used the ANY operator in actual code. Here goes:

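-- Assumption: query the full available LSN range; substitute your own from/to LSNs.
declare @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('Production_Product');
declare @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

with cte as (
    -- names of the columns we care about
    select * from (values
        ('Name'        ),
        ('StandardCost'),
        ('ListPrice'   )
    ) as x(col)
)
, column_ordinals AS (
    -- position of each of those columns in the change table
    select sys.fn_cdc_get_column_ordinal('Production_Product', col) as [ordinal]
    from cte
)
SELECT fcgacpp.*
FROM cdc.[fn_cdc_get_net_changes_Production_Product](
      @from_lsn
    , @to_lsn
    , 'all with mask'
) AS fcgacpp
WHERE [__$operation] = 4 --after update
    AND 1 = ANY (
        select [sys].[fn_cdc_is_bit_set]([ordinal], [__$update_mask])
        from [column_ordinals]
    );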

Let's break it down. First, I've got two common table expressions. The first just takes the names of the columns we're interested in and pumps them through a table value constructor to produce a table. The second CTE takes the results of the first and figures out each column's ordinal position in the CDC change table. For instance, we find that the Name column is column number 2.
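
To see what the two CTEs produce on their own, something like the following sketch should work. It assumes the capture instance is named Production_Product, as in the example above; the ordinals you get back depend on how your capture instance was defined.

```sql
-- Sketch: list each column of interest next to its ordinal in the change table.
with cte as (
    select * from (values
        ('Name'        ),
        ('StandardCost'),
        ('ListPrice'   )
    ) as x(col)
)
select col
     , sys.fn_cdc_get_column_ordinal('Production_Product', col) as [ordinal]
from cte;
```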

Next, we use the cdc.fn_cdc_get_net_changes_... function to, well, get the net changes for that capture instance. The magic comes in the where clause. The first predicate just says that we're only interested in "after update" rows. The second predicate is where ANY comes in. For each column that we care about, the subquery checks that column's bit in the update mask that comes along with the CDC record, returning a 1 if the column was updated and a 0 otherwise. That result set is fed to the ANY operator, and if any of the columns that we care about was updated, the predicate evaluates to true and the row shows up in the result set.
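
If ANY is as unfamiliar to you as it was to me, here is a small sketch of just that comparison, with no CDC required. The hard-coded 0s and 1s are made-up stand-ins for what fn_cdc_is_bit_set would return for each tracked column.

```sql
-- Sketch: "1 = ANY (...)" is true when at least one row of the subquery equals 1.
select case when 1 = any (select b from (values (0), (1), (0)) as t(b))
            then 'row kept' else 'row filtered out' end as one_bit_set
     , case when 1 = any (select b from (values (0), (0), (0)) as t(b))
            then 'row kept' else 'row filtered out' end as no_bits_set;
```

The first column comes back as 'row kept' and the second as 'row filtered out'. Since = ANY behaves the same as IN, the predicate could also be written as 1 IN (...); I just find that ANY reads closer to the intent here.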
