The need to delete duplicate records from a table rises up all the time, whether you are importing data from a third party, integrating systems, or cleaning dirty data. Duplicate data is a reoccurring quality issue.

A duplicate records does not necessarily mean an exact copy of another record (as if you pressed CTRL+Y with SET CARRY ON in VFP). It could mean that all of the important identifying attributes are duplicated but other data, such as a surrogate key or a last updated timestamp are not. So here is a technique that I’ve used to delete a row if it is in fact a duplicate based on attributes in the table:

DELETE FROM [tablename] ;
  WHERE [primary key] NOT IN ;
    (SELECT MAX([primary key]) FROM [tablename] ;
      GROUP BY [attribute1], [attribute2], [attribute3])

To apply this to a Customer entity, for example, you can try the following (the first half of this code is setup, scroll down to see the delete):

*-- example table
CREATE TABLE Customer FREE   (;
    cust_key integer         ,;  && a surrogate key
    cust_id character(3)     ,;  && natural key
    cust_fname varchar(20)   ,;  && first name
    cust_lname varchar(20)   ,;  && last name
    cust_phone character(14) ,;  && phone number
    cust_lupd datetime       ,;  && last updated
    cust_user character(15)   )   && user last updated

*-- need data 
INSERT INTO Customer VALUES (1,"AM1","Andrew","MacNeill","555-5551",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (2,"AK1","Andy","Kramek","555-5552",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (3,"CC1","Cesar","Chalom","555-5553",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (4,"CB1","Craig","Baily","555-5554",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (5,"DC1","Dave","Crozier","555-5555",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (6,"DS1","David","Stevenson","555-5556",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (7,"GF1","Garrett","Fitzgerald","555-5557",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (8,"KR1","Kevin","Ragsdale","555-5558",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (9,"KR1","Kevin","Ragsdale","555-5558",DATETIME(),"ZorroMeansFox")
INSERT INTO Customer VALUES (10,"RS1","Rick","Schummer","555-5559",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (11,"RS2","Rick","Strahl","555-5510",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (12,"RS2","Rick","Strahl","555-5510",DATETIME(),"ZorroMeansFox")
INSERT INTO Customer VALUES (13,"RS2","Rick","Strahl","555-5510",DATETIME(),"FanakMeansFox")
INSERT INTO Customer VALUES (14,"SB1","Stephen","Bodnar","555-5511",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (15,"SD1","Stuart","Dunkeld","555-5512",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (16,"TR1","Ted","Roche","555-5513",DATETIME(),"TodMeansFox")
 
*-- delete duplicates (we don't care about the surrogate 
*-- key or the audit information).
DELETE FROM Customer ;
  WHERE cust_key NOT IN ;
    (SELECT MAX(cust_key) FROM Customer ;
      GROUP BY cust_id, cust_fname, cust_lname, cust_phone)
 
*-- results
SET DELETED OFF
BROWSE NORM
*-- need data 
INSERT INTO Customer VALUES (1,"AM1","Andrew","MacNeill","555-5551",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (2,"AK1","Andy","Kramek","555-5552",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (3,"CC1","Cesar","Chalom","555-5553",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (4,"CB1","Craig","Baily","555-5554",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (5,"DC1","Dave","Crozier","555-5555",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (6,"DS1","David","Stevenson","555-5556",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (7,"GF1","Garrett","Fitzgerald","555-5557",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (8,"KR1","Kevin","Ragsdale","555-5558",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (9,"KR1","Kevin","Ragsdale","555-5558",DATETIME(),"ZorroMeansFox")
INSERT INTO Customer VALUES (10,"RS1","Rick","Schummer","555-5559",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (11,"RS2","Rick","Strahl","555-5510",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (12,"RS2","Rick","Strahl","555-5510",DATETIME(),"ZorroMeansFox")
INSERT INTO Customer VALUES (13,"RS2","Rick","Strahl","555-5510",DATETIME(),"FanakMeansFox")
INSERT INTO Customer VALUES (14,"SB1","Stephen","Bodnar","555-5511",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (15,"SD1","Stuart","Dunkeld","555-5512",DATETIME(),"TodMeansFox")
INSERT INTO Customer VALUES (16,"TR1","Ted","Roche","555-5513",DATETIME(),"TodMeansFox")

*-- delete duplicates (we don't care about the surrogate 
*-- key or the audit information).
DELETE FROM Customer ;
  WHERE cust_key NOT IN ;
    (SELECT MAX(cust_key) FROM Customer ;
      GROUP BY cust_id, cust_fname, cust_lname, cust_phone)
 
*-- results
SET DELETED OFF
BROWSE NORM
*-- delete duplicates (we don't care about the surrogate 
*-- key or the audit information).
DELETE FROM Customer ;
  WHERE cust_key NOT IN ;
    (SELECT MAX(cust_key) FROM Customer ;
      GROUP BY cust_id, cust_fname, cust_lname, cust_phone)
 
*-- results
SET DELETED OFF
BROWSE NORM
*-- key or the audit information).
DELETE FROM Customer ;
  WHERE cust_key NOT IN ;
    (SELECT MAX(cust_key) FROM Customer ;
      GROUP BY cust_id, cust_fname, cust_lname, cust_phone)
        		
*-- results
SET DELETED OFF
BROWSE NORM
*-- results
SET DELETED OFF
BROWSE NORMAL

This works by selecting the most recent issuance of the primary key (using MAX) and discarding the others using the subquery. If you run the subquery on its own, you can see how SELECT MAX plus the GROUP BY clause gives you the most recent version of the row that is distinct from the others (‘Rick’ and ‘Kevin’ only show once) by leaving out fields that don’t identify the row (like auditing information or perhaps de-identified attributes). Next, we delete everything that is not in this list.

This works in all ANSI SQL environments were subqueries are supported in the DELETE statement.