Tuesday, July 9, 2013

Tweaking Office 2013

OK, time to make the switch to Office 2013.  After installing, the number one complaint (according to hundreds of message boards on the topic) is that the applications seem to have taken a huge step backward in the "look and feel" department.  There are even complaints that the white-and-gray color scheme is causing headaches.

While I agree that the default UI is strangely bland (considering the candy-coated colors of Windows 8), like any other Microsoft transition, a little poking around turns up ways to tweak Office that will make most users happy.

1. Hiding the Start screen
The Start screen is the page with the really huge icons that displays each time you open Word or Excel. If you are like me, you want a blank document 99% of the time, so skipping this screen saves a mouse click. Here is how to hide it:
a. Open the Office app
b. Click the blank document icon
c. Click File > Options > General
d. Uncheck the "Show the Start screen when this application starts" option (it is in the same place in Excel)

2. Getting your ribbon back
When I first started up Word, the ribbon was hidden.  It can be annoying to have to click a ribbon tab to display it each time.  To make it stick around:
a. First click a ribbon tab (such as HOME)
b. Then click the pin at the bottom right of the ribbon
 
 
3. Change the default save location
When you click Save As, it defaults to SharePoint or SkyDrive.  If you would like to change the default option that comes up and save another mouse click:
a. Go to File > Options > Save
b. Check the box for "Save to Computer by default" and set the default local file location


4. Remove the all-caps ribbon tab labels.
If you are tired of the flaming all-caps ribbon tabs, here is an easy way to change them.  I was able to change every tab except FILE.  It is amazing how much more readable the tabs are with this simple change.
a. Go to Options > Customize Ribbon
b. In the right-hand pane, right-click each ribbon name you want to change and choose Rename
c. The name already appears in mixed case.  Type a space after the name and click OK (renamed tabs are displayed exactly as typed instead of in all caps)

5. Change the information your status bar displays
Yes, the little bar at the bottom of the app window can display lots of helpful info and functions.
a. Right-click the status bar and choose the options you want.  I like to display line numbers in Word and add the Num Lock and Caps Lock indicators in Excel.

6. Color scheme
Now for the most annoying part: the extreme white-out color scheme.  The issue is not really the lack of color so much as the stark, high-contrast scheme.  No additional color schemes have been offered, so there is no real fix, but here are a few things you can try to make it more tolerable.
a. Turn down the brightness on your monitor.
b. Go to File > Account and pick a different Office Theme.  Granted, the only choices are White, Light Gray, and Dark Gray, but pick whichever is easiest on your eyes.
c. In Outlook, select View > View Settings > Other Settings and set the grid line style to Solid.
d. In Outlook, select View > View Settings > Conditional Formatting.  For the Unread messages rule, change the font to bold italic to help new messages stand out.

7. Reclaim some screen real estate in Outlook.
By default, the new Outlook adds a one-line preview to each message in the message list.  This greatly reduces the number of messages you can see at once and causes lots of scrolling.  To turn it off:
a. Click the View Tab
b. Choose Message Preview
c. Choose Off

Change the view to Compact
a. Click the View Tab
b. Click Change View
c. Choose Compact

Remove the profile picture from the email header
a. Click File > Options > People
b. Uncheck the option to show user photographs

Shrink the navigation bar at the bottom
a. Click the ellipsis (...) on the navigation bar and choose Navigation Options
b. Check Compact Navigation and edit the display order if you like.  This shrinks the navigation bar and confines it to the left pane, reclaiming almost an inch of screen space for the reading pane.
 
 
Change to the reading-pane-on-the-right layout
a. Click View > Reading Pane > Right
Being a longtime "reading pane on the bottom" guy, this one really ticks me off, but the engineers and testers have admitted that they all use the reading pane on the right, so they did not notice how badly they broke the other layouts with things like the email header that the user cannot remove.  Indeed, once you switch the pane to the right, it is immediately apparent that the entire Outlook layout was redesigned to accommodate only this view.
 
 
Did this help you?  If so, please leave a comment!

Tuesday, March 5, 2013

To embed or not to embed?

Here is a good read on the pros and cons of embedded queries in SSRS reports.
http://www.sqlchick.com/entries/2011/4/4/pros-and-cons-stored-proceduresvs-embedded-queriesvs-views-i.html

In a nutshell, the pros of embedding the SQL directly in the report are:
Pros to developer:
  • No “Create” permissions needed on the source database
  • One-step deployment
Pros to user:
  • None

Now let's look at the pros of using stored procs or views as the data source.
Benefits to developer:
  • Can reuse the same stored procedure for more than one report, so less code to write in the long run, and fewer bugs to track down in the short run
  • DBA can tune more effectively, if needed
  • Permits the DBMS to secure the object, if needed
  • Provides a layer of abstraction between the database tables & the report (for example: you can alias column names, create derived fields, minimize the effect of other physical object changes, or expose fewer fields to simplify the reporting process)
  • Provides an additional layer of security since "Execute" permissions are required for either the user running the report, or an impersonation account
  • Ability to query system tables to find usage of tables & columns, so easier change management
  • For a minor change, the stored procedure can be altered without redeploying the RDL, so no time is wasted on redeployment
  • Can be reused to gather data in applications or SmartLists with 100% assurance that the data matches what is on the report
  • Views give you one more benefit: they require only Select permission rather than Execute

Benefits to user:
  • Performance: stored procs can take advantage of query plan reuse to improve performance.
  • Flexibility: ad-hoc reporting (such as a SmartList) can be based on the same database object that generates the report, rather than having to muddle through the table joins.

Now that is a pretty lopsided set of lists.  However, in my opinion, good reports make happy users.  The report is the end product of the user's work, and for many managers, reports are their only interaction with software systems. 

So from that perspective, we should throw away the lists pertaining to developers and just look at the lists of benefits to the users.  That is when we can truly see that stored procs win this battle hands-down.
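For anyone who has not tried it, pointing a report at a stored proc instead of an embedded query is not much work.  Here is a minimal sketch of what such a proc might look like (the table and column names are hypothetical); in the SSRS dataset you set the query type to Stored Procedure and map the report parameter to @AsOfDate:

-- Hypothetical report proc: one parameter in, one aliased/derived result set out
CREATE PROCEDURE dbo.rpt_OpenInvoices
    @AsOfDate DATETIME
AS
BEGIN
    SET NOCOUNT ON;

    SELECT i.InvoiceNumber,
           i.CustomerName,
           i.InvoiceDate,
           i.Amount,
           DATEDIFF(DAY, i.InvoiceDate, @AsOfDate) AS DaysOld -- derived field lives here, not in the RDL
    FROM dbo.Invoices i
    WHERE i.PaidDate IS NULL
      AND i.InvoiceDate <= @AsOfDate;
END
GO

Grant Execute on the proc to the account the report runs under, and you pick up the security, tuning, and reuse benefits from the list above.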



Did this help you?  If so, please leave a comment!



Friday, February 8, 2013

Using dataset from stored proc in another proc

There are times when it is really convenient to gather data in a view and then use that view in one or more stored procs to build a complex dataset.  Sometimes, however, that base dataset runs much more efficiently if you use one or more temp tables to join the data.  Views can't handle that kind of logic, but stored procs can.  Historically, it was difficult to use the dataset returned from a stored proc in the logic of another proc, but nowadays it is relatively easy using table variables.

Say we have a stored proc that returns a dataset like this:


create proc myBaseData(
 @param1 int,
 @param2 datetime
)
as
begin
 -- return a one-row dataset built from the parameters
 select @param1 as int1,
   @param2 as date1
end
go


Now say we want to call that proc and use its dataset in another proc, but we don't want to save the records to a physical table.  We can insert the proc results into a table variable like so:

create proc MyCallingProc(
 @param1 int
)
as
begin
 -- table variable to capture the base proc's result set
 declare @mytable as table(
  col1 int,
  col2 datetime
 )

 -- strip the time portion from today's date (yyyymmdd round trip)
 declare @d datetime
 select @d=convert(varchar,GETDATE(),112)

 -- INSERT...EXEC captures the dataset returned by the base proc
 insert into @mytable
 exec myBaseData @param1,@d

 select * from @mytable
end
go


We can then execute it as follows:

exec MyCallingProc 1

The big bonus here is that I don't have to create a view and take the performance hit of filtering it every time I join to it.  The params on the proc can be used to filter the results before the join.

Another bonus is that I can put a primary key (a unique index) on the table variable, which is something I can't do on a plain view.
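For example, the key can be declared right on the table variable (a sketch using the same columns as above):

 declare @mytable as table(
  col1 int primary key, -- creates a unique clustered index on the table variable
  col2 datetime
 )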

Did this help you?  If so, please leave a comment!



Wednesday, September 26, 2012

Table variable tests show that they are much faster than #temp!

Working on a client's data migration recently has given me the chance to benchmark table variables against traditional temp tables in SQL 2008. 

The server I used was not especially fast and tempdb is even stored on the same drive as the rest of the databases. Still, the results have shown that table variables were at least 10x faster in every test.

The most telling scenario was as follows:

There was a table with 2.8 million rows.  One of the fields of interest was a 500-character comment field.
The application that populated this table added a record for each comment with no regard to keys (there was no unique key).  However, I noticed a combination of fields that could serve as a unique key, except for several thousand instances where I would need to concatenate the comment field from more than one record to make the key unique.

Example:
Key     Comment
1         abc
1         def
2         der
3         mmd
3         xyz

Need to be imported to the new table as:
1       abcdef
2       der
3       mmdxyz

There was a bit more complexity involved, but hopefully, you get the idea.

So I needed to use FOR XML PATH to concatenate the values, like this:
SELECT A.[KEY],
(SELECT COMMENTFIELD AS 'text()'
FROM MYTABLE
WHERE [KEY]=A.[KEY]
ORDER BY [KEY] ASC
FOR XML PATH(''))
FROM MYTABLE A

The problem is that this returns a recordset like this:
1   abcdef
1   abcdef
2   der
3   mmdxyz
3   mmdxyz

No problem, I just need to group them, right?  Well there are 2.8M rows here!  Time to benchmark!

I first wrote this using #temp tables.
SELECT A.[KEY],
(SELECT COMMENTFIELD AS 'text()'
FROM MYTABLE
WHERE [KEY]=A.[KEY]
ORDER BY [KEY] ASC
FOR XML PATH('')) AS COMMENTFIELD
INTO #MYTEMP
FROM MYTABLE A

The insert and grouping alone took 9.5 hours.
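(For reference, the grouping pass itself was nothing fancy; with the staged rows in #MYTEMP it boils down to something like the sketch below, though the real query carried a few more columns.)

-- Collapse the duplicated key/comment pairs down to one row per key
SELECT [KEY], COMMENTFIELD
FROM #MYTEMP
GROUP BY [KEY], COMMENTFIELD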

I then rewrote it to use table variables

DECLARE @MYTEMP AS TABLE(
[KEY] INT,
COMMENTFIELD VARCHAR(1000)
)

INSERT INTO @MYTEMP
SELECT A.[KEY],
(SELECT COMMENTFIELD AS 'text()'
FROM MYTABLE
WHERE [KEY]=A.[KEY]
ORDER BY [KEY] ASC
FOR XML PATH(''))
FROM MYTABLE A

This same logic took 21 minutes!  That means it only took 4% as long to run!

If that doesn't make the case for using table variables over temp tables, I don't know what will.  From now on, I will default to the table variable when writing code and only consider the temp table for special cases where I need to pass the dataset around. 

Did this help you?  If so, please leave a comment!

Thursday, August 9, 2012

Why you should not store user data on the same drive as SQL files

I am not aware of any official “Best Practice” statement from Microsoft about storing user files on the same drive as SQL database and log files; however, I would recommend against the practice for three reasons:

  1. Drive space.  SQL database and log files can grow quickly and without advance notice.  Best practice is to keep at least 15% of the drive free at all times to safeguard against unexpected growth filling the drive.  Users are much more unpredictable than the database and can run the drive out of space without warning, causing SQL to crash.
  2. Disk I/O.  Each HBA (Host Bus Adapter) has a finite amount of I/O throughput it can handle.  Even if the concerns from #1 are handled with disk quotas or some other limiting measure, I am not aware of a good way to cap certain users' I/O.  User file operations are typically larger and slower than SQL data operations, so even if the users are not filling up the drive, they can easily consume enough of the I/O throughput to create a choke point for SQL and degrade performance.  These performance hits are very difficult to diagnose and track down, so it is better to avoid them up front.
  3. Fragmentation.  When set up properly, SQL Server grows data and log files by allocating a large block of contiguous disk space.  Since user files are much smaller, they create thousands of tiny gaps in that space, and eventually SQL Server cannot find a contiguous block large enough for the next growth operation.  The data and log files become fragmented.  There is no safe way to defragment them with Windows tools, and the SQL Server maintenance operations that do so can only defragment the SQL files when there is sufficient contiguous disk space; they cannot move the user files out of the way.  The end result is a performance hit that can become significant over time.  This is also one of the reasons the user databases and tempdb data and log files should not be stored on the OS drive.
 (I did not list security concerns, since those can be overcome with proper AD setup and administration, though many organizations’ security policies still do not allow this for fear of undiscovered exploits in AD security.)
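(As an aside, the "set up properly" part of #3 mostly means presizing the files and using a fixed growth increment rather than the old 10% default, along these lines; the database and logical file names here are hypothetical:)

-- Grow in large fixed chunks so each growth event is one big contiguous allocation
ALTER DATABASE MyUserDb MODIFY FILE (NAME = N'MyUserDb_Data', FILEGROWTH = 1GB);
ALTER DATABASE MyUserDb MODIFY FILE (NAME = N'MyUserDb_Log', FILEGROWTH = 512MB);
-- Presize the data file up front so growth events are rare in the first place
ALTER DATABASE MyUserDb MODIFY FILE (NAME = N'MyUserDb_Data', SIZE = 50GB);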

If this is still not clear, hopefully this analogy will shed some light.
Imagine a swimming pool filled with barrels with a garden hose running to it that can be moved from barrel to barrel.

The swimming pool is the hard drive.
The barrels are the database growth allocation blocks.
The garden hose is the HBA. 
The water is the data. 

SQL pulls and pushes water through the garden hose, usually at a trickle. 
A user storing or reading a file blasts water at high pressure through the hose intermittently.  There is nothing stopping the user from blasting so much water in that they overflow the pool.  If they blast or suck water through the hose for an extended period of time (seconds is extended), then SQL has to wait for them to finish to resume the trickle. 
Users can put water in any barrel, but SQL prefers an empty barrel.  If there are no empty ones, SQL must use part of two or more barrels, but must remember which barrels go together. 

This quickly becomes laborious to keep track of, and translates into degraded performance.

Did this help you?  If so, please leave a comment!

 
 

Tuesday, July 3, 2012

Dream SQL Install

I see debates all the time over which way is best to set up the hardware for a particular SQL install.  So I decided to lay out what the hardware setup would look like in my ideal GP SQL installation if money was no object.

First, I would buy a Dell PowerEdge and an HP ProLiant and use them as doorstops while the boys carry in a new maxed-out SGI IDC3212-RP4.
Specs here: http://www.sgi.com/products/servers/infinitedata_cluster/configs.html

  We may also need step stools later, so I would toss them in the corner.  I would then hook up a SAN with a fibre channel backbone.  Since I am only installing TWO in my imaginary world, I would put in five 2 TB SSDs.  I would install the OS and SQL Server binaries on SSD1, then lay out the rest as follows:
SSD2: MDF files (Except tempdb)
SSD3: LDF files
SSD4: tempdb files
SSD5: BAK files
I would set the initial db size on DYNAMICS and TWO to 250GB to prevent file growth for a while.

I would then connect the entire rig to a 16 Tbps connection and replicate the install on the east and west coasts.  I would set up SQL clustering to mirror the entire setup between the two sites and use SQL replication features to allow seamless failover from one instance to the other.

So there is the upper limit.  Anything else is a compromise, and must be weighed against an organization's needs and budget.

I hope you enjoyed my little foray into dreamland.




Tuesday, June 19, 2012

SQL Table variables

How to use table variables (introduced in T-SQL in SQL Server 2000)

First, let's look at the reason we are using a non-permanent table in the first place.  Say I have a table with 100,000 rows.  I need to join that table to another, and I will only find matches in a small number of records.  Additionally, I only need to work with a few columns of my large table.  Joining this large table directly into my query can severely degrade performance.  I can often overcome this and increase performance by creating a table on the fly in code, loading it with only the records from the large table that I will actually be working with, and using that instead.

In prior versions of SQL, the only way to do this was with a temp table.

Here is a sample of a temp table declaration:
 CREATE TABLE #MYTEMPTABLE ([object_id] INT,[name] VARCHAR(128),max_length SMALLINT);

This creates a table in the temp database (on disk).  I can then load this table like so:
INSERT INTO #MYTEMPTABLE SELECT [object_id],[name],max_length FROM sys.all_columns WHERE max_length>2000;

Note that if I were to just use the all_columns table directly, my query would have to do a table scan through many thousands of records to find the ones with max_length>2000.  This way we do that scan one time up front, before any joins.

I can work with the temp table much like a normal table:
SELECT * FROM sys.columns C INNER JOIN #MYTEMPTABLE T ON C.[object_id]=T.[object_id];

I must remember to drop the temp table when I am done with it; otherwise it can cause name conflicts if the same session tries to create it again.
DROP TABLE #MYTEMPTABLE;
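A common habit is to guard the drop so the script can be rerun in the same session without erroring:

IF OBJECT_ID('tempdb..#MYTEMPTABLE') IS NOT NULL
    DROP TABLE #MYTEMPTABLE;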

The downside to doing it this way is that the handful of rows I actually need are staged in tempdb with all the overhead of a real table (allocation, logging, and recompilations), even though the disk hit itself is small if tempdb lives on an SSD.

This code can be made to run faster (sometimes by orders of magnitude) by using a table variable instead of a temp table. 
Here is the same functionality:
DECLARE @MYTEMPTABLE AS TABLE ([object_id] INT,[name] VARCHAR(128),max_length SMALLINT);

INSERT INTO @MYTEMPTABLE SELECT [object_id],[name],max_length FROM sys.all_columns WHERE max_length>2000;

SELECT * FROM sys.columns C INNER JOIN @MYTEMPTABLE T ON C.[object_id]=T.[object_id];

Advantages of table variables over temporary tables:
1. They have well-defined scopes. 
2. They're cleaned up automatically at the end of the stored proc in which they're defined
3. They tend to result in fewer recompilations
4. They result in less locking in stored procs and transactions
5. For recordsets smaller than 100,000 rows, they are almost always faster than a temporary table.  In 12 years, I have never run into a real-world case where the table variable was slower than the temporary table, as long as it held fewer than 100,000 rows.  For larger datasets, you typically get more benefit from being able to define indexes on the temporary table, as sketched below.
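For that large-set case, the ability to index is the usual reason to reach for the temp table; a sketch (with made-up names) looks like this:

-- Hypothetical large working set: a temp table with an explicit clustered index
CREATE TABLE #BIGWORKSET ([object_id] INT, [name] VARCHAR(128), max_length SMALLINT);
CREATE CLUSTERED INDEX IX_BIGWORKSET ON #BIGWORKSET ([object_id]);
INSERT INTO #BIGWORKSET SELECT [object_id], [name], max_length FROM sys.all_columns;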

Did this help you?  If so, please leave a comment!
