Monday, October 5, 2020

Which RAID should I use for my GP SQL server?

I get this question from time to time.  Quite a bit of confusion seems to come from the way in which RAID is marketed.  There are also considerations specific to how a GP install uses SQL Server and the IO load it generates.

Typically the people setting up RAID arrays are more concerned with capacity versus cost.  However, they like to mix in performance numbers, and this can be misleading.

First things first.  The RAID levels we should be concerned with for SQL Server are:

RAID 0 - No redundancy.  Striping only.

RAID 10 - Mirror + striping.

RAID 6 - Double parity.

(RAID 2-5 have been deprecated, and RAID 7 is pricey and overkill for most GP installations.)


When asking a vendor about the differences in RAID, the following is a typical table you will see:


As you can see, it shows a modest read penalty with a huge upside in capacity.  Normally they leave off the write gain for RAID 6 (not sure why).  They also typically leave out RAID 0, probably because capacity is 1 to 1 for RAID 0.

This is useful for showing the difference in capacity for the same number of disks.  It is misleading, though, when it includes the read and write gains.  Those numbers are best compared at similar capacities, because the different RAID levels leave a differing number of disks available for IO.

Here is a much better table for comparing performance at the same capacity.
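To put rough numbers behind that, here is a back-of-the-envelope calculation.  The per-disk IOPS figure, disk counts, and target capacity are assumptions for illustration; the write penalties used (2 for RAID 10, 6 for RAID 6) are the standard figures.

    -- Rough same-capacity comparison: 4TB usable built from 1TB disks delivering ~180 IOPS each.
    -- RAID 10 needs 8 disks for 4TB usable; RAID 6 needs 6 disks.
    DECLARE @disk_iops int = 180;

    SELECT 'RAID 10' AS RaidLevel,
           8 * @disk_iops     AS ReadIOPS,   -- reads can be served from all 8 spindles
           8 * @disk_iops / 2 AS WriteIOPS   -- each logical write lands on 2 disks
    UNION ALL
    SELECT 'RAID 6',
           6 * @disk_iops,                   -- reads can be served from all 6 spindles
           6 * @disk_iops / 6;               -- each write costs 6 physical IOs (data + both parities)

At the same usable capacity, RAID 10 comes out roughly four times faster on writes in this sketch, and that write number is what matters most for GP.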


Now let's talk about how GP uses a SQL Server installation.
GP is an ERP.  Nearly everything the user does in the system requires a write operation to SQL.  There are plenty of reads, but the write speed seems to affect user experience more than read.  Since it is a multi-user system, resource intensive operations from a single user can cause a perceived slow-down to all the other users.  These large operations typically cause heavy use on tempdb and the log file of the company database.  There are also SQL jobs running throughout the work day related to GP that use resources in msdb as well as the DYNAMICS and company databases.  If SSRS is being used, there are additional read operations and heavy tempdb use (for ad-hoc table joins) that factor in.

In my experience, it is best to push the system RAM as high as possible, then move tempdb to a separate SSD, and then upgrade disks to bring write speed up as high as possible (read speed will follow) in that order.
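For the tempdb move itself, something like this does it (the drive letter and folder are placeholders; tempdev and templog are the default logical file names):

    -- Point tempdb's files at the dedicated SSD.  The files are recreated at the new
    -- location the next time the SQL Server service restarts.
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\tempdb\tempdb.mdf');
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\tempdb\templog.ldf');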

RAID 6 is often pushed as the most economical way to add capacity, but in the GP world, write speed is king.  A GP database that is more than 100GB in size is starting to be crushed under its own weight anyway.  The table and index structure, and the way in which the Dexterity language uses the data, begin causing performance degradation that is increasingly difficult and expensive to overcome.  The organization begins experiencing diminishing returns on further hardware investment.  In this scenario, the organization should stop investing in hardware until it develops, and is ready to implement, a strategy to reduce the size of the company database, or at least significantly slow its growth.

This can be done by either purging data or archiving it.  If data is archived out of the company database and into another database, you now have 2 smaller databases.  While the overall capacity does not decrease, the options for storing and accessing the data multiply.  GP will perform better when reading and writing data on two 500,000 record tables than on one 1,000,000 record table.  It is even better if the archive database can be moved to a different drive from the production database, so the two are not competing over read IO.
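At its simplest, an archiving pass looks something like the sketch below.  The database, table, and column names are placeholders only; a real GP archive routine has to keep related tables in sync and should be tested against a copy of production first.

    -- Illustrative only: copy aged history rows into an archive database, then remove
    -- them from the production company database.
    USE COMP1;  -- hypothetical company database
    BEGIN TRANSACTION;

    INSERT INTO COMP1_ARCHIVE.dbo.GL_History_Archive
    SELECT *
    FROM   dbo.GL_History           -- placeholder for a large history table
    WHERE  TRXDATE < '2015-01-01';

    DELETE FROM dbo.GL_History
    WHERE  TRXDATE < '2015-01-01';

    COMMIT TRANSACTION;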

In a time when laptop hard drives are measured in terabytes, it often does not make sense to build a huge RAID 6 array to hold GP SQL data.  This is data that needs to be separated from the rest of the organization's data, and its maintenance and backup procedures need to be handled differently.

So, to the question of which RAID to use for a GP SQL server, my answer is: all of them.

Best practice is to have at least 5 disks in a GP SQL Server setup.
Disk 1 - OS and SQL binaries
Disk 2 - tempdb
Disk 3 - system and user database data files
Disk 4 - system and user database log files
Disk 5 - backup files

Depending on your disk sizes and fault tolerance requirements:
Disk 1 - RAID 10
Disk 2 - no RAID - single SSD drive
Disk 3 - RAID 10
Disk 4 - RAID 10
Disk 5 - RAID 6 (or RAID 0 with frequent cloud backups)

Let's talk about these choices:

Disk 1
The OS and SQL binaries do not take enough space to warrant the capacity advantages of RAID 6, and do not need fast write speed.

Disk 2
There is no reason to need redundancy on the tempdb drive.  tempdb rebuilds itself every time SQL is restarted.  It is volatile, "in-flight" data, such as large data aggregations for reports or month-end routines.  It does not need to be backed up, but it does need the fastest disk IO possible.  That is why I recommend it be placed on a separate SSD.

Disk 3
The data files need fast read and write speed.  If there is a large amount of data, it may seem worth the lower cost to sacrifice performance and go with RAID 6.  However, if you find yourself considering that, first find out why there is so much data. (More on that in a later post)
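A quick way to see where the space is going is a query like this, run in the company database (the TOP value is arbitrary):

    -- Top 20 tables by reserved space in the current database.
    SELECT TOP (20)
           t.name AS TableName,
           SUM(ps.reserved_page_count) * 8 / 1024 AS ReservedMB,
           SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS [RowCount]
    FROM   sys.dm_db_partition_stats AS ps
    JOIN   sys.tables AS t ON t.object_id = ps.object_id
    GROUP BY t.name
    ORDER BY ReservedMB DESC;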

Disk 4
The log files need fast write speed.  The log file disk does not need to be as large as the data file disk.

Disk 5
The backup drive needs lots of capacity but not necessarily fast disk IO.  RAID 6 is a good fit for this drive.  Depending on the procedures in place for copying and retaining off-site backups, you may be able to forgo fault tolerance altogether and cut cost by using RAID 0.

Disk 6?
If you have an archive database, it may be beneficial to add a Disk 6 to move that capacity and IO off the production disks.  In this scenario, I would recommend placing both the data and log files for that database on Disk 6 and using RAID 0 or no RAID.  This database will only periodically experience writes and should be backed up immediately after each archiving session, eliminating the need for fault tolerance.
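As a sketch, with made-up names and drive letters:

    -- Hypothetical archive database with both of its files on the dedicated archive drive (G:).
    CREATE DATABASE COMP1_ARCHIVE
    ON     (NAME = COMP1_ARCHIVE_Data, FILENAME = 'G:\Archive\COMP1_ARCHIVE.mdf',     SIZE = 20GB)
    LOG ON (NAME = COMP1_ARCHIVE_Log,  FILENAME = 'G:\Archive\COMP1_ARCHIVE_log.ldf', SIZE = 2GB);

    -- Back it up right after each archiving session, since the drive has no fault tolerance.
    BACKUP DATABASE COMP1_ARCHIVE
    TO DISK = 'E:\SQLBackups\COMP1_ARCHIVE.bak'
    WITH INIT, COMPRESSION, CHECKSUM;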



Here are some useful links to help you learn how to calculate IO and capacity for RAID.







Wednesday, September 16, 2020

How should I configure my disks for my new GP/SQL installation?

It has been a while since I posted anything on disk setup for SQL Server, and since technology marches on, below is my recommendation for a Great Plains/SQL installation.

Important considerations here are:

  • GP databases are categorized as high IO databases.
  • GP databases are mission critical and depend on high reliability storage.

SQL Setup Footprint

  • SQL Server has .mdf files, which are the main data files, and .ldf files, which are the transactional log files. These files should be sized with at least 10% empty space to prevent auto-growth.
  • SQL Server has 4 system databases that are required for any SQL instance to work
    • Master
      • Holds the information that is global to the SQL instance.
      • This is not a high use database and should not grow much over time.
      • Typical size is less than 100MB.
      • Best practice is to place its files in the same location as the user database files, but it installs on the OS drive by default and would need to be moved immediately after the SQL installation completes (see the sketch after this list).
      • These files should be backed up monthly or any time major changes are made, such as adding or removing databases.
    • Model
      • This is a template that is used for creating user databases.
      • Configuring it to properly reflect your most common desired settings can save much configuration time of databases later.
      • Typical size is less than 10MB.
      • Best practice is to place its files in the same location as the user database files, but it installs on the OS drive by default and would need to be moved immediately after the SQL installation completes.
      • These files should be backed up monthly.
    • Msdb
      • Holds the objects used by maintenance plans and SQL Agent objects (such as jobs and dbmail requests).
      • Typical size is under 2GB.
      • It will grow progressively larger due to job history records if maintenance is not regularly performed to clear those records. If this database gets too large, the SQL agent jobs may start failing due to poor performance.
      • Best practice is to place its files in the same location as the user database files, but it installs on the OS drive by default and would need to be moved immediately after the SQL installation completes.
    • Tempdb
      • This is used by the SQL instance to process data reads and writes.
      • Depending on the size of the databases being queried and the size of the data sets being returned, this database can grow quite large. The total size of all files for tempdb in a typical GP instance is around 10GB, but month end processes can cause it to temporarily grow up to 100GB on a large system.
      • The files for tempdb are deleted and rebuilt each time the SQL Services are restarted, so there is no need to back them up.
      • It should be configured with the same number of files as there are active processors available for the SQL service.
      • Because of its high IO load and its large effect on overall system performance, it is highly recommended that the tempdb files (data and log) be stored on a dedicated SSD drive.
      • To discourage storage of other files on the drive and to prevent the need for file growth, best practice is to pre-size tempdb so that it takes up 80% of the drive space by default.
  • A GP/Signature install will contain the following user databases 
    • DYNAMICS
      • This is the main database for the GP instance.
      • Typical size is around 2GB on a large system and it does not typically grow very fast.
      • Should be stored on user database and user log drives.
    • Company database(s)
      • There will be one database for each GP company. It will have a maximum of 5 characters in the name.
      • Initial size with no testing data is around 2GB. Once the size hits 100GB, it is considered a large company database and steps should be taken to put procedures in place to archive or purge data regularly for performance.
      • Some of these databases will have a high IO load during peak usage hours and would benefit from being stored on a dedicated drive, otherwise they should be stored on user database and user log drives.
    • RESCOXRM_CONFIG
      • Holds configuration data for Mobiletech.
      • Not a high use database.
      • Typical size is less than 10MB.
      • Should be stored on user database and user log drive
    • {company}_RESCOXRM
      • There will be one database for each company that MobileTech is installed on.
      • Initial size is less than 100MB. Size will vary based on the amount of transactional data being exchanged with the mobile devices, but should not exceed the size of the company database.
      • Should be stored on user database and user log drive
    • ReportServer and ReportServerTempDB
      • Holds SSRS reports and cached report data
      • Typical size is less than 1GB.
      • This database is high use for read operations.
      • If possible, Report Server should be installed on a completely separate SQL instance with separate storage. Otherwise, the files should be stored on the user database and user log drive.
    • Other user databases
      • Should be stored on the user data and user log drives.
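Here is a sketch of the system database moves and the tempdb layout described above.  Drive letters, sizes, and the four-file count are placeholders for your environment; the logical file names are the SQL Server defaults.

    -- model and msdb can be relocated with ALTER DATABASE; the new paths take effect
    -- after the files are physically moved and the SQL Server service is restarted.
    ALTER DATABASE model MODIFY FILE (NAME = modeldev, FILENAME = 'D:\SQLData\model.mdf');
    ALTER DATABASE model MODIFY FILE (NAME = modellog, FILENAME = 'L:\SQLLogs\modellog.ldf');
    ALTER DATABASE msdb  MODIFY FILE (NAME = MSDBData, FILENAME = 'D:\SQLData\MSDBData.mdf');
    ALTER DATABASE msdb  MODIFY FILE (NAME = MSDBLog,  FILENAME = 'L:\SQLLogs\MSDBLog.ldf');
    -- master is moved by editing the -d and -l startup parameters in SQL Server
    -- Configuration Manager rather than with ALTER DATABASE.

    -- tempdb: one data file per active processor (4 assumed here), pre-sized so the
    -- files fill most of the dedicated SSD and never need to auto-grow.
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev,  SIZE = 2048MB);
    ALTER DATABASE tempdb ADD FILE    (NAME = tempdev2, FILENAME = 'T:\tempdb\tempdb2.ndf', SIZE = 2048MB);
    ALTER DATABASE tempdb ADD FILE    (NAME = tempdev3, FILENAME = 'T:\tempdb\tempdb3.ndf', SIZE = 2048MB);
    ALTER DATABASE tempdb ADD FILE    (NAME = tempdev4, FILENAME = 'T:\tempdb\tempdb4.ndf', SIZE = 2048MB);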
SQL Backups

  • Whether Maintenance plans or SQL Jobs are used, a SQL instance will not run for very long without regular maintenance.
  • Backups are a part of that maintenance. The routines that write transactional data from the log file into the data file and clear out old transactional data will only run when a backup is taken of the database.
  • SQL needs a drive to write the backup files to.
  • Best practice is to store these on a separate drive and to regularly make a copy of the backups and move off-site.
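As a minimal sketch, with a placeholder database name and a path on the backup drive:

    -- Nightly full backup to the dedicated backup drive.
    BACKUP DATABASE COMP1
    TO DISK = 'E:\SQLBackups\COMP1_Full.bak'
    WITH INIT, COMPRESSION, CHECKSUM;

    -- If the database is in FULL recovery, regular log backups are what let the
    -- log file be cleared and its space reused.
    BACKUP LOG COMP1
    TO DISK = 'E:\SQLBackups\COMP1_Log.trn'
    WITH INIT, COMPRESSION, CHECKSUM;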
Drive setup

Drives should be sized to at least 25% larger than the initial size of the data. 
  • Drive 1
    • OS
    • SQL binaries
    • GP Test client
  • Drive 2
    • SSD if possible and physically located inside the server
    • Data and log files for Temp DB
  • Drive 3
    • User database data files
    • System database data files
  • Drive 4
    • User database log files
    • System database log files
  • Drive 5
    • Backup files
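Once everything is in place, a quick query against sys.master_files will confirm which drive each database's files actually landed on:

    -- List every database file on the instance and its physical location.
    SELECT DB_NAME(database_id) AS DatabaseName,
           name                 AS LogicalName,
           physical_name,
           size * 8 / 1024      AS SizeMB
    FROM   sys.master_files
    ORDER BY DatabaseName, type_desc;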

Types of disks

  • HDD
    • Spinning hard drive.
    • Today's 15k HDD drives can transfer data at around 130MB/s
    • A drive in the drive setup represents a separate physical HDD if that is what is being used.
  • SSD
    • Solid State Drive
    • A drive in the drive setup above represents a separate physical SSD if that is what is being used.
  • SAN
    • While the speed of the drives in the SAN can affect the performance of SQL Server, the bigger bottleneck is usually the Fibre Channel connection between the server and the SAN.
    • Specs aside, a 4Gb Fibre Channel connection will perform in a real production environment about the same as a high speed enterprise local HDD. Adding more HBAs may help, depending on your SAN brand.
    • SAN monitoring software can help diagnose pathing issues, but different SAN vendors have different definitions for active/active, so you need to make sure you do some research into how the arrays must be set up and where the SQL files must be stored to take advantage of it.
    • Most virtual environments cannot truly take advantage of multipathing, which can severely limit SQL performance.
    • A drive in the drive setup above represents a separate LUN, preferably each LUN representing a drive group or RAID array.
  • RAID
    • May use SSD or HDD
    • If RAID is used locally, RAID 10 is recommended with a separate array for each drive in the drive setup.
    • RAID recommendations when part of a SAN generally follow the SAN vendor's recommendations.


Monday, April 27, 2020

SQL Change Data Capture vs Change Tracking

If you have ever installed Management Reporter on Dynamics GP, you know that it increases SQL resource usage and can create performance issues on your SQL server.  This is in large part due to its heavy use of SQL Change Tracking.

Change Tracking (CT) was introduced in SQL 2008.
It uses a synchronous process.  That means it runs in-line with your data operation, adding time to your save, update, insert, or delete operation.  This translates directly into reduced performance for the user in Dynamics GP.  It does not use much disk space, since it is only capturing which rows have changed and the type of change, not the data that changed.  And since there is no point in tracking changes if you don't act on them, the consuming application still has to query the tracked tables for the changed data, so CT creates the same overall background hit to performance that CDC does, in addition to the front-end in-line performance hit.
For example, let's say I have an application that is interested in changes to the GL10000 records.

    • I turn on CT for GL10000.  
    • A user creates a GL transaction.  Their save operation takes slightly longer while CT records the fact that they inserted a record.
    • My application gets this notification and then separately queries the GL10000 table to get the data I am interested in from the new record, which uses overall resources from the SQL server, as well as hitting the GL10000 table directly.
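For reference, turning CT on looks roughly like this (the company database name and retention period are placeholders; the table must have a primary key):

    USE COMP1;  -- hypothetical company database

    -- Enable Change Tracking at the database level, then for the table.
    ALTER DATABASE COMP1
    SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

    ALTER TABLE dbo.GL10000 ENABLE CHANGE_TRACKING;

    -- The consuming application asks "what changed since my last sync version?"
    -- and then goes back to GL10000 itself for the actual column values.
    DECLARE @last_sync bigint = 0;
    SELECT *
    FROM   CHANGETABLE(CHANGES dbo.GL10000, @last_sync) AS ct;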

Change Data Capture (CDC) was also introduced in SQL 2008, but only in Developer and Enterprise editions.  It is now also supported in Standard edition starting with SQL 2016 SP1.
It uses an asynchronous process.  That means it does not directly slow down your data operations.  The asynchronous process does use resources such as processor, memory, and disk IO, so it has an indirect effect on performance for the user.  It also uses more disk space than CT because it is storing a default of 3 days of historical data for the table being tracked in order to capture the actual data that was changed.
For example, let's say I have the same application that is interested in changes to the GL10000 records.

    • I turn on CDC for GL10000.  This creates a shadow table of GL10000 that will hold historical data.  
    • A user creates a GL transaction.  There is no performance hit to their save operation.
    • CDC's asynchronous process reads the database log file and sees the insert operation.  It saves the data from the record to the shadow table, which uses overall SQL resources but does not hit the GL10000 table directly.
    • My application reads the shadow table on a predefined interval to get the data I am interested in, which uses overall SQL resources but does not hit the GL10000 table directly.
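Turning CDC on for the same table looks roughly like this (it requires the SQL Agent to be running; the names follow the example above, and the change function name comes from the default capture instance dbo_GL10000):

    USE COMP1;  -- hypothetical company database
    EXEC sys.sp_cdc_enable_db;

    -- Creates the shadow (change) table and the capture/cleanup Agent jobs.
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'GL10000',
         @role_name     = NULL;

    -- The application polls the change data on its own schedule instead of
    -- touching GL10000 directly.
    DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn(N'dbo_GL10000');
    DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

    SELECT *
    FROM   cdc.fn_cdc_get_all_changes_dbo_GL10000(@from_lsn, @to_lsn, N'all');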

Both of these solutions will perform similarly in a controlled test environment with a small data change rate.  The difference really becomes apparent as the system scales up.  My assumption is that this is why CDC was only included in Enterprise edition early on.  

In my opinion, it is much easier to address overall performance of the SQL instance by adjusting resources, than to try to address the performance degradation of the individual user operations.  So, where possible, I would recommend using CDC rather than CT for tracking changes in a Dynamics GP database.  Hopefully, Management Reporter will make this adjustment soon!


Friday, January 10, 2020

Cloning SQL Servers


It is common for our customers to clone virtual servers.  Sometimes they even clone SQL servers.  We even tell them to do this sometimes, such as when we want to create an identical test environment.

It is a huge time saver, but when cloning a SQL server, there are a few additional things we need to watch out for.  If the customer's IT person is not a DBA, they will most likely have no idea they need to watch out for these things.

*** If the SQL server they are cloning has drive mappings or any type of alias to the storage location where SQL data, log, or backup files reside, these may need to be re-mapped on the clone.  Before using a cloned SQL server, verify the drives you see are not really the production drives. 

The easiest way to check is to launch the cloned server and save an empty text file to each storage location, then launch the production server and see if you can see the files.  If you can, STOP!  It means the new SQL server may try to modify production data.  The SQL services on the cloned server should be stopped immediately until the problem can be addressed.  Each server should have completely physically isolated storage.

*** If the SQL server has maintenance plans:  Maintenance plans create jobs, and those jobs have a hardcoded connection string.  On the cloned server, these connection strings will point to the production server.  This may cause SQL jobs to fire twice.  This will be noticeable for backup plans, because there will be 2 nightly backup files generated, or job failures due to conflicts.  This will also degrade performance during the day on the production server.

The correct way to prevent this starts BEFORE the cloning.  Record the steps in the maintenance plans on the production server, then disable all the plans and jobs on the production server before cloning it.  Re-enable them after cloning is complete.  Then on the clone server, delete the maintenance plans and re-create them.  Check all the other jobs to make sure their connection (bottom left of the job properties window) is correct before re-enabling them.
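A quick way to spot leftover references on the clone is to search the job steps in msdb for the production server's name ('PRODSQL' is a placeholder):

    -- List SQL Agent job steps whose command still references the production server.
    SELECT j.name     AS JobName,
           s.step_name,
           s.command
    FROM   msdb.dbo.sysjobs     AS j
    JOIN   msdb.dbo.sysjobsteps AS s ON s.job_id = j.job_id
    WHERE  s.command LIKE '%PRODSQL%';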

