Monday, October 5, 2020

Which RAID should I use for my GP SQL server?

 I get this question from time to time.  Quite a bit of confusion seems to come from the way in which RAID is marketed.  There are also considerations on how a GP install uses SQL Server and the IO load it generates.

Typically the people setting up RAID arrays are more concerned with capacity vs cost.  However, they like to mix in performance numbers and this can be misleading.

First things first.  The RAID levels we should be concerned with for SQL Server are

RAID 0 - No redundancy.  Striping only.

RAID 10 - Mirror + striping.

RAID 6 - Double parity

(RAID 2-5 have been deprecated and 7 is pricy and overkill for most GP installations)


When asking a vendor about the differences in RAID, the following is a typical table you will see:


As you can see, it shows a modest read penalty with a huge up side in capacity.  Normally they leave off the write gain for RAID 6 (not sure why).  They also typically leave out RAID 0, probably because capacity is 1 to 1 for RAID 0.

This is useful for showing the difference in capacity for the same number of disks.  It is misleading when including the read and write gain.  Those numbers are best compared at similar capacities because the different RAID levels leave a differing number of disks available for IO.

Here is a much better table for comparing performance at the same capacity.


Now lets talk about how GP uses a SQL Server installation.
GP is an ERP.  Nearly everything the user does in the system requires a write operation to SQL.  There are plenty of reads, but the write speed seems to affect user experience more than read.  Since it is a multi-user system, resource intensive operations from a single user can cause a perceived slow-down to all the other users.  These large operations typically cause heavy use on tempdb and the log file of the company database.  There are also SQL jobs running throughout the work day related to GP that use resources in msdb as well as the DYNAMICS and company databases.  If SSRS is being used, there are additional read operations and heavy tempdb use (for ad-hoc table joins) that factor in.

In my experience, it is best to push the system RAM as high as possible, then move tempdb to a separate SSD, and then upgrade disks to bring write speed up as high as possible (read speed will follow) in that order.

RAID 6 is often pushed as the most economical way to add capacity, but in the GP world, write speed is king here.  A GP database that is more than 100GB in size is starting to be crushed under its own weight anyway.  The table and index structure, and the way in which the dexterity language uses the data begin causing performance degradation that is increasingly difficult and expensive to overcome.  The organization begins experiencing diminishing returns on further hardware investment.  In this scenario, the organization should stop investing in hardware until they develop and are ready to implement a strategy to reduce the size of the company database, or at least significantly slow its growth.  This can be done by either purging data or archiving it. If data is archived out of the company database and into another database, you now have 2 smaller databases.  While the overall capacity does not decrease, the options for storing and accessing the data multiply.  GP will perform better when reading and writing data on two 500,000 record tables than one 1,000,000 record table.  It is even better if the archive database can be moved to a different drive from the production database, since that way the two are not competing over read IO.

In a time when laptop hard drives are measured in Terabytes, it often does not make sense to build a huge RAID 6 array to hold GP SQL data.  This is data that needs to be separated from the rest of the organizations data, and its maintenance and backup procedures need to be handled differently.

So my answer to the question of which RAID to use for my GP server, I say all of them.

Best practices is to have at least 5 disks in a GP SQL Server setup.
Disk 1 - OS and SQL binaries
Disk 2 - tempdb
Disk 3 - system and user database data files
Disk 4 - system and user database log files
Disk 5 - backup files

Depending on your disk sizes and fault tolerance
Disk 1 - RAID 10
Disk 2 - no RAID - single SSD drive
Disk 3 - RAID 10
Disk 4 - RAID 10
Disk 5 - RAID 6 (or RAID 0 with frequent cloud backups)

Let's talk about these choices:

Disk 1
The OS and SQL binaries do not take enough space to warrant the capacity advantages of RAID 6, and do not need fast write speed.

Disk 2
There is no reason to need redundancy on the tempdb drive.  tempdb rebuilds itself every time SQL is restarted.  It is volatile, "in-flight" data, such as large data aggregations for reports or month-end routines.  It does not need to be backed up, but it does need the fastest disk IO possible.  That is why I recommend it be placed on a separate SSD.

Disk 3
The data files need fast read and write speed.  If there is a large amount of data, it may seems to be worth the lower cost to sacrifice performance and go with RAID 6.  However, if you find yourself considering that, first find out why there is so much data. (More on that in a later post)

Disk 4
The log files need the faster write speed.  The log file disk does not need to be as large as the data file disk.

Disk 5
The backup drive needs lots of capacity but not necessarily fast disk IO.  RAID 6 is a good fit for this drive.  Depending on the procedures in place for copying and retaining off-site backups, you may be able to forgo fault tolerance altogether and cut cost by using RAID 0.

Disk 6?
If you have an archive database, it may be beneficial to add Disk 6 to remove the capacity and IO from the production disks.  In this scenario, I would recommend placing both the data and log files for that database on Disk 6 and using RAID 0 or no RAID.  This database will only periodically experience writes and should be backed up immediately after each archiving session, eliminating the need for fault tolerance.



Here are some useful links to help you learn how to calculate IO and capacity for RAID.







SQL 2022 TSQL snapshot backups!

  SQL 2022 now actually supports snapshot backups!  More specifically, T-SQL snapshot backups. Of course this is hardware-dependent. Here ...