Parsing City/State/Zip
Many applications today still carry the city, state, and zip in a composite field. The problem with this type of concatenation is that you can’t easily query the data or use it for other things (such as building a report by state). There really is no benefit or good reason to keep them together. I have yet to hear a good ‘excuse’:
1.) This is how I got the data
2.) It makes it easier for the end-user to enter everything in a single field
3.) I don’t have to worry about combining them on labels
4.) (insert your bad excuse here)
To add value to these legacy composite fields, you can write a program to split them up.
Usually, the data has some form such as “city, state zip” where the state is almost certainly abbreviated. The following program can be used to parse out city, state, and zip from a concatenated field into atomic parts. Two assumption are made with this program (important ones, at that): that state will be 2 digits and that city names don’t begin with a number. Another excellent approach would be to grab a zipcode file from the post office (or other data supplier) so that you could compare each segment to determine not only that it IS a zip code or city or state abbreviation (or full state name for that matter) but that it actually exists! I’ll leave that part up to you! Here is my simple solution to get things atomic:
lcCity = "" lcState = "" lcZip = "" lcCSZ= "Marlborough, MA 01752" *-- lcCSZ= "North Providence RI 02988" *-- lcCSZ= "Boston MA" ? parse_csz(lcCSZ , @lcCity, @lcState , @lcZip ) ? "city: " + lcCity ? "state: " + lcState ? "zip: " + lcZip FUNCTION parse_csz LPARAMETERS tcFullString , rcCity , rcState , rcZip LOCAL lcFullString AS String LOCAL lnSegments AS Number STORE "" TO rcCity , rcState , rcZip && reference parameters *-- remove any commas and convert to spaces lcFullString = STRTRAN( ALLTRIM(tcFullString ) , "," , " " ) *-- get the number of segments lnSegments = GETWORDCOUNT( lcFullString ) *-- try to be smart and figure out what to do depending *-- on the number of segments found DO CASE CASE lnSegments >= 3 rcZip = GETWORDNUM( lcFullString, lnSegments) rcState = GETWORDNUM( lcFullString, lnSegments- 1) FOR x = 1 TO lnSegments - 2 rcCity = rcCity + " " + GETWORDNUM( lcFullString, x ) NEXT rcCity = ALLTRIM( rcCity ) CASE lnSegments = 2 IF ISDIGIT( GETWORDNUM(lcFullString, 2) ) rcZip = GETWORDNUM(lcFullString, 2) ELSE IF LEN( GETWORDNUM(lcFullString, 2) ) > 2 rcCity = GETWORDNUM(lcFullString, 1) + " " + ; GETWORDNUM(lcFullString, 2) ELSE rcState = GETWORDNUM(lcFullString, 2) rcCity = GETWORDNUM(lcFullString, 1) ENDIF ENDIF CASE lnSegments = 1 IF ISDIGIT( lcFullString ) rcZip = lcFullString ELSE IF LEN( lcFullString ) > 2 rcCity = lcFullString ELSE rcState = lcFullString ENDIF ENDIF ENDCASE *-- Return in a mailing format RETURN rcCity + " " + rcState + " " + rcZip
I'm a Quant Technical Specialist (Data Warehousing and Business Intelligence), with expertise in business analysis, data modeling, and data integration. I have extensive experience developing vertical and integrated desktop, Internet, and BI applications spanning municipal, clinical, and financial industries.

March 14th, 2007 at 7:47 am
Using Outlook to do it for you is another possibility - http://www.civilsolutions.com.au/publications/parseaddress.htm