Posts Tagged SQL

VFP INTERSECTing SQL

For a long time, we have had UNION and UNION ALL to write SQL statements that act as an ‘OR’ operation between two SQL statements. The ORing grabs records from either the first or the second SQL. INTERSECT, though, acts as an AND, meaning that the results must exist in both sets. Unfortunately, there is no INTERSECT keyword in VFP for us to use.

I have found that the need for INTERSECT usually accompanies partitioned files, or when doing comparisons between two versions of files. For example, to find all the customers that have placed an order in 2004 and 2005, using two partitioned customer tables (partitioned by calendar years 2004 and 2005), you can use intersect set logic. In SQL Server 2005, the syntax is quite simple:

*-- In SQL Server 2005, you can:
SELECT cust_id, cust_name, cust_dob FROM sales_2004
INTERSECT
SELECT cust_id, cust_name, cust_dob FROM sales_2005

But in VFP9, and in prior versions of SQL Server, you have to do a little more work:

*-- IN VFP9, the following is equivalent
SELECT cust_id, cust_name, cust_dob ;
  FROM (;
    SELECT cust_id, cust_name, cust_dob FROM sales_2004 ;
    UNION ALL ;
    SELECT cust_id, cust_name, cust_dob FROM sales_2005 ;
  ) AS tmp1 ;
  GROUP BY cust_id, cust_name, cust_dob ;
  HAVING COUNT(*) = 2 ;
  INTO CURSOR crIntersect

Please note these examples both make use of DISTINCT (SQL Server 2005 performs an implicit DISTINCT with INTERSECT). This means that it can only compare distinct records. What if the same customer ordered 12 products in 2004 and 8 in 2005? The above INTERSECT example would bring in the customer just once for each year. Although I have never had a need to write such a query, I have a few ideas on how it could be done. If anyone is bored or needs to burn 15 minutes, give it a try!

UPDATE

I failed to mention how the above code is better or different than simply using an inner join (as a colleague suggested). I had to think about this for a minute but then I recalled why using an inner join for intersecting isn’t perfect: NULLS. Inner joins treat nulls as different, whereas the above syntax does not. To test this out for yourself, run the above SQL on two tables that have a column set to NULL (for example, a NULL DOB) for the same customer in both tables. You’ll notice that the inner join refuses to recognize that NULL = NULL.

Tags: , , ,

2 Comments

UNION oddity (or maybe it’s just me)

I had a situation come up today where I needed to add the contents of one cursor (cursorA) to another (cursorB), without adding any duplicates in the process. Both cursors had the exact same structure. CursorA was going to be closed (a little later), while cursorB would persist (for another few processes). Immediately I turned to the UNION clause. My strategy was to select * from both tables, union them together, and put the results back into cursorB like this:

SELECT * FROM cursorB ;
    UNION ;
SELECT * FROM cursorA ;
INTO CURSOR cursorB READWRITE

I was surprised to see the open dialog box pop up, basically asking me to locate cursorB! What happens here is that the INTO clause seems to strike first, essentially closing cursorB. I suppose I was a little surprised because I do stuff like this all the time:

SELECT * FROM cursorB WHERE .T. INTO CURSOR cursorB

I worked around this by modyfing my SQL to the following:

SELECT * FROM (SELECT * FROM cursorB) as tmp ;
    UNION ;
SELECT * FROM cursorA ;
INTO CURSOR cursorB READWRITE

I should mention that this only works in VFP9. VFP7 will complain that the syntax is wrong (I presume the same in VFP8 too). The only solution that I can think of in VFP7 would be in two-steps:

SELECT * FROM cursorB ;
    UNION ;
SELECT * FROM cursorA ;
INTO CURSOR cursorC
 
SELECT * FROM cursorC INTO cursorB READWRITE

Tags: , ,

2 Comments

Sample Standard Deviation (and SQL)

The CALCULATE command contains a range of statistical operations that can be utilized in FoxPro. These commands work on fields in a table (or cursor) and can be filtered using a FOR clause and SCOPE operator. Calculating the mean, standard deviation, and variance is rather straightforward:

CALCULATE AVG(fieldname) TO variable
CALCULATE STD(fieldname) TO variable
CALCULATE VAR(fieldname) TO variable

And some other useful functions:

CALCULATE MIN(fieldname) TO variable
CALCULATE MAX(fieldname) TO variable
CALCULATE CNT(fieldname) TO variable
CALCULATE SUM(fieldname) TO variable

In a previous blog entry I wrote a program to calculate the median from a set of data. VFP does not have a native median function.

Population vs. Sample Data.

Visual FoxPro doesn’t have a built-in standard deviation formula for a sample. The functions CALCULATE STD() is a population function. The same is true of variance. The question then becomes, how can I get a sample standard deviation? I’m glad you asked! Here’s how:

FUNCTION std_devs
    LPARAMETERS tcTableField, tcPredicates
    LOCAL nRet, nsavg , nscnt AS float  
 
    SELECT(JUSTSTEM(tcTableField))
 
    IF !EMPTY(tcPredicates)
        CALCULATE AVG(&tcTableField) FOR &tcPredicates TO nsavg
        CALCULATE CNT(&tcTableField) FOR &tcPredicates TO nscnt
        SELECT SUM( (&tcTableField- nsavg) ^ 2 ) as ssum ;
            FROM JUSTSTEM(tcTableField)
            WHERE &tcPredicates
            INTO CURSOR crTmpSDEVS2
    ELSE
        CALCULATE AVG(&tcTableField) TO nsavg
        CALCULATE CNT(&tcTableField) TO nscnt
        SELECT SUM( (&tcTableField- nsavg) ^ 2 ) as ssum ;
            FROM JUSTSTEM(tcTableField)
            INTO CURSOR crTmpSDEVS2
    ENDIF
 
    nRet = SQRT(crTmpSDEVS2.ssum / (nscnt-1) )
 
    USE IN crTmpSDEVS2  RETURN nRet
 
ENDFUNC

What about STD or VAR in an SQL statement? Again, glad you asked!

SELECT DISTINCT std_devp("crtmp.value") AS stddev ;
    FROM crtmp INTO CURSOR cResult
SELECT DISTINCT std_devs("crtmp.value") AS stddev ;
    FROM crtmp INTO CURSOR cResult
 
*-- here is the population version fit for SQL:
FUNCTION std_devp
    LPARAMETERS tcTableField, tcPredicates
    LOCAL nRet AS float
 
    IF !EMPTY(tcPredicates)
        CALCULATE STD(&tcTableField) ;
            FOR &tcPredicates TO nRet IN JUSTSTEM(tcTableField)
    ELSE
        CALCULATE STD(&tcTableField) ;
            TO nRet IN JUSTSTEM(tcTableField)
    ENDIF
 
    RETURN nRet 
 
ENDFUNC

A big issue with a custom function in an SQL statement is performance. The sample standard deviation function I wrote above is certainly not going to win any sprints (but might just hold out during a marathon). Some environmental tests also need to be done in the above code (for example, resetting the alias back and checking for deleted would be nice improvements). For what it’s worth, this should be a good stepping stone for those in need of a good sample standard deviation formula in pure FoxPro. In addition, using it as a function call in an SQL statement increases its usability and worth.

Tags: , , , ,

1 Comment