Posts Tagged Performance

Graphical Execution Plan Notes for SQL Server 2005

I often need a quick execution plan reference for query optimization in SQL Server 2005 that gives me a definition of the icon and a link to more details. The resource on page ms175913.aspx from the msdn website is very helpful but a bit incomplete.

For my personal use, I compiled the following document. I include the description of each plan icon and removed all items that do not exist in SQL Server 2005. The letters under the item name represent its type (N/A=none; P = Physical, L = Logical). If the item has any important optional Arguments, they are listed under the type. To keep the document short, details on optional arguments are not included.

You can download the document: ‘Graphical Execution Plan Notes‘. It is a zip file that contains a single Word document. If you have any suggestions on how to improve it, let me know!

Here’s a preview:

 

 Graphical Execution Plan Notes for SQL Server 2005

 

And here are some links to other blogs that discuss query optimization in SQL Server:

And a great book:
Inside Microsoft® SQL Server(TM) 2005: Query Tuning and Optimization

The document can be downloaded here, or on my download page. I hope you find this useful!

 

Tags: ,

2 Comments

What’s faster? SQL or looping to return a set of records?

When you need to get a set of records (for example, to return a list of items from a lookup table, to prepare a report, or to fill a read-only grid), which method should you use?

Of course, the answer is ‘it depends’ and many factors are at play. It is part of our job as developers (or programmers, if you’d prefer) to pick the solution that best fits the scenario. As much has been written on this topic already, I won’t be revealing anything new here: Just a re-package, if you will.

First, the basics. There are several ways to go through a table with the intention of producing a set of resulting records. You can SCAN, DO WHILE, use a FOR..NEXT loop, or use SQL syntax. The first three options are procedural and process one record at a time. The last (SQL) is set-based. To get an idea on what looping mechanism to use, take a look at Andy Kramek’s blog entry from March 2006 called “Writing Better Code (Part 2)”. In this entry, Andy does a great job of showing you the effects of indexes on the three looping mechanisms, and should give you a good idea which one to use and when.

Using SQL syntax, you can return a result of records as well. You can replace your looping syntax with SQL syntax and get the same results (using much less code). When gathering data for a report, for example, I have often used the loop method to populate a special report cursor. I was inclined to stick with the procedural approach in the past because (a) it was easy enough to write, (b) easy to debug, (c) allowed me to easily insert a custom progress bar if needed, and (d) because I thought it was best. As my SQL skills improved, I found that SQL syntax is actually (a) easier to write, (b) just as easy to debug, (c) capable of supporting a custom progress bar, and (d) may in fact be best under many circumstances. This is especially true if you are trying to write portable code (duh!).

To demonstrate my point, I first took a table with 125,000 records with no index (CDX) file. This first case will process each record in the table. My benchmarking methodology here would make anyone who has ever done benchmarking in a scientific way before blood boil, but hey, I never claimed to be a scientist (I don’t even play one on TV)!

First, are we all on the same page?

SET ANSI OFF
SET EXACT OFF
SET TALK OFF
SET DELETED ON
CLEAR

Carry on…

USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
LOCATE
DO WHILE NOT EOF()
    INSERT INTO crName VALUES (name_list.name)
    SKIP
ENDDO
? SECONDS() - nSecs && results: 0.313 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    INSERT INTO crName VALUES (name_list.name)
NEXT
? SECONDS() - nSecs && results: 0.281 seconds (10 runs)
USE IN name_list
 
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list INTO CURSOR crName
? SECONDS() - nSecs && results: 0.000 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    INSERT INTO crName VALUES (name_list.name)
NEXT
? SECONDS() - nSecs && results: 0.281 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list INTO CURSOR crName
? SECONDS() - nSecs && results: 0.000 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list INTO CURSOR crName
? SECONDS() - nSecs && results: 0.000 seconds (10 runs)
USE IN name_list

So what on earth is going on? The SQL Statement processes that entire result in (literally) no time at all! The answer can be found in a comment made in a recent blog posting of mine. Sergey explained that VFP doesn’t actually run any SQL. All it does is create a filtered cursor on the underlying dataset. This is why RECCOUNT could be “wrong”, as discussed in my blog entry. Check out that link to Sergey’s site to read more about this. You can check what you got simply by issuing JUSTEXT(DBF()) (as Sergey does): “DBF” means a filtered cursor was created, while “TMP” means that a new cursor was created.

Additional Notes:

When there is a CDX file (with no index set), the FOR loop seems to suffer a little: On average, the FOR loop takes an additional .020 seconds to run. The existence of a CDX file does not seem to bother the DO WHILE or SQL method in the above case.

Now, let’s add a condition that the name was added to the table on or after 01/01/2005. The first run of this test is with no CDX file at all. If a CDX file exists, but no index set, then the results are a bit different (see notes after the code):

 
USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
LOCATE
DO WHILE NOT EOF()
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
    SKIP
ENDDO
? SECONDS() - nSecs  && results: 0.234 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
NEXT
? SECONDS() - nSecs  && results: 0.235 seconds (10 runs)
USE IN name_list
 
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.125 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
NEXT
? SECONDS() - nSecs  && results: 0.235 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.125 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.125 seconds (10 runs)
USE IN name_list

Here, with no CDX file, VFP seems to actually process the SQL (it doesn’t create a filtered cursor), and does so rather fast. If you add an INDEX on created date, FoxPro goes ahead and creates the filtered cursor (which again is processed in zero seconds).

OK, so now what if I actually use the INDEX, and SET ORDER TO before all of this?

USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
LOCATE
DO WHILE NOT EOF()
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
    SKIP
ENDDO
? SECONDS() - nSecs  && results: 1.047 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
NEXT
? SECONDS() - nSecs  && results: 0.235 seconds (10 runs)
USE IN name_list
 
*-----
 
USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.000 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
CREATE CURSOR crName (name c(20))
SELECT name_list
FOR nRec = 1 TO RECCOUNT()
    GOTO nRec
    IF created>={^2005-01-01}
        INSERT INTO crName VALUES (name_list.name)
    ENDIF
NEXT
? SECONDS() - nSecs  && results: 0.235 seconds (10 runs)
USE IN name_list

*-----
 
USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.000 seconds (10 runs)
USE IN name_li
*-----
 
USE name_list IN 1 EXCL ORDER created
nSecs = SECONDS()
SELECT name FROM name_list WHERE created>={^2005-01-01} INTO CURSOR crName
? SECONDS() - nSecs  && results: 0.000 seconds (10 runs)
USE IN name_list

The Do WHILE really gets hit hard, as you can see. The FOR LOOP works as
well as before, and the SQL once again gives us a filtered cursor.

As you can see, you can really take advantage of this behavior to create simple cursors with simple predicates. Even if VFP can’t create the filtered cursor (for example, applying an ORDER BY clause on the SQL statements), the results are still usually faster.

In a subsequent posting, I’ll increase the complexity of these tests. In the meantime, I would be interested in any feedback to these cases.

Tags: , , ,

4 Comments

An Introduction to the Coverage Profiler Application

Visual FoxPro (starting in version 7) comes with a built-in tool called the Coverage Profiler Application. This tool is designed to both cover (log information about each line of code run during program execution) and profile (produce useful information about the lines of code that are run) source code. The primary use for this tool is to detect code bottlenecks (which lines of code are taking too long to execute) and to locate lines of code not executed. These two functions of the Coverage Profiler are worth their weight in gold.

Sometimes I get the feeling that software performance is an afterthought during the development cycle. The code must work first. If time permits, then it is customary to go back and fine tune the source for performance. But reality (which many of you are living in) soon sets in, and time just doesn’t permit for this type of engineering. Of course, if performance is absolutely horrible then that must be addressed, but a half a second here or a quarter second there isn’t going to stir any milkshakes.

But coverage logs can really help. Consider a process (like saving a record) that zips along on a development machine, but crawls on an older test machine built some time before iPods came out. Having some knowledge of the coverage profiler can not only identify the problem, but can also help you solve it. You can identify the trouble spots in the call stack right down to the line of code causing the problem. Each problem you encounter will be different from the last, but identifying the problem will be the same.

The other major benefit of this tool is seeing which lines of code did not run. For testing purposes, this ability can really go a long way in flexing a new function or in testing a ‘fixed’ bug. It can show you where test cases are falling short, and allow you to write updated scripts that touch each line of code. It will also expose unnecessary blocks of code that an eager developer might have thought necessary (Developers: when was the last time you added an OTHERWISE statement to a DO CASE that you knew would never run? Of course, this practice is a good idea for clarity and readability, but perhaps a comment or ASSERT would be sufficient?).

What’s more is that the source code for these tools (coverage and profiler applications) is included with VFP. The Fox Development team has always encouraged its users to modify and enhance these tools to suit individual needs. Unfortunately I have not seen too many enhancements online over the years from other developers. Maybe the core functionality is good enough? I think so. But Perhaps I’ll think of something to build and put it here on this blog…

Tags: , ,

3 Comments