Monday, April 27, 2020

SQL Change Data Capture vs Change tracking

If you have ever installed Management Reporter on Dynamics GP, you know that it increases SQL resource usage and can create performance issues on your SQL server.  This is in large part due to its heavy use of SQL Change Tracking.

Change Tracking (CT) was introduced in SQL 2008.
It uses a synchronous process.  That means it runs in-line with your data operation, adding time to your save, update, insert, or delete operation.  This translates directly into reduced performance for the user in Dynamics GP.  It does not use much disk space, since it is only capturing which rows have changed and the type of change, not the data that changed.  Since there is no point in tracking changes if you don't act on them, this creates the same overall background hit to performance that CDC does, in addition to the front end in-line performance hit. 
For example, lets say I have an application that is interested in changes to the GL10000 records. 

    • I turn on CT for GL10000.  
    • A user creates a GL transaction.  Their save operation takes slightly longer while CT records that fact that they inserted a record.  
    • My application gets this notification and then separately queries the GL10000 table to get the data I an interested in from the new record, which uses overall resources from the SQL server, as well as hitting the GL10000 table directly.

Change Data Capture (CDC) was also introduced in SQL 2008, but only in Developer and Enterprise additions.  It is now also supported in Standard edition starting with SQL 2016sp1.
It uses an asynchronous process.  That means it does not directly slow down your data operations.  The asynchronous process does use resources such as processor, memory, and disk IO, so it has an indirect effect on performance for the user.  It also uses more disk space than CT because it is storing a default of 3 days of historical data for the table being tracked in order to capture the actual data that was changed.
 For example, lets say I have the same application that is interested in changes to the GL10000 records. 

    • I turn on CDC for GL10000.  This creates a shadow table of GL10000 that will hold historical data.  
    • A user creates a GL transaction.  There is no performance hit to their save operation.
    • CDC's asynchronous process reads the database log file and sees the insert operation.  It saves the data from the record to the shadow table, which uses overall SQL resources but does not hit the GL10000 table directly.
    • My application reads the shadow table on a predefined interval to get the data I am interested in, which uses overall SQL resources but does not hit the GL10000 table directly.

Both of these solutions will perform similarly in a controlled test environment with a small data change rate.  The difference really becomes apparent as the system scales up.  My assumption is that this is why CDC was only included in Enterprise edition early on.  

In my opinion, it is much easier to address overall performance of the SQL instance by adjusting resources, than to try to address the performance degradation of the individual user operations.  So, where possible, I would recommend using CDC rather than CT for tracking changes in a Dynamics GP database.  Hopefully, Management Reporter will make this adjustment soon!


SQL 2022 TSQL snapshot backups!

  SQL 2022 now actually supports snapshot backups!  More specifically, T-SQL snapshot backups. Of course this is hardware-dependent. Here ...