Tod means Fox

Business Intelligence, Data Warehousing, SQL, Visual FoxPro.

Archive for June, 2007


Published June 26th, 2007

SSIS Packages and setting up Jobs

I promise not to turn todmeansfox into an SSIS blog! But I thought that as I continued development on my SQL Server back to FoxPro project, I would share any acquired wisdom along the way.

Last post, I discussed some of the problems I faced with the drag-and-drop nature of SSIS. In a nutshell, I had to do a lot (and I mean A LOT) of extra non-drag-and-drop work to get the packages running. As of now, I have thoroughly tested the entire solution locally on my machine and have now uploaded the packages to the server. A couple of notes:

Upload Packages

The first thing I had to do was upload my packages to the server. Log into Integration Services, click ‘Stored Packages’ from the treeview and add the package to the appropriate folder (in my case, MSDB). Do this by right-clicking the folder and click ‘Import Package’. I had several packages to upload (a single general package that calls multiple specific packages). The Import Package dialog is easy to use, but takes a minute to get used to. There are some non-standard interface elements (the single ‘.’ next to the disabled text box ‘Protection level’ is actually very important to push).

Once uploaded, I went ahead and created a job using SQL Server Agent.

SQL Server Agent

Connect to the server through SSMS where you want to execute your package. Click SQL Server Agent, and then right-click Jobs and click ‘New Job…’. The New Job dialog will open.

There are quite a few important elements on this screen, all of which I’m sure I would do a poor job explaining. So instead, here are the basic things I did to get my job going:

  • On the General page, entered a name and description
  • On the Steps page, added a series of new steps designed to execute my main package in several different ways (passing in a variety of parameters for testing purposes):
    • Enter a Step name, and select the SQL Server Integration Services type
    • Select the Package source (SQL Server) and the Server name. Pick an uploaded package from the Package list
    • At this point, you’ll want to look into how you want to log and fail the job as needed
    • To get things up and running, I clicked on the ‘Set values’ tab to enter all relevant parameters (see Some Snags below)
  • On the Notifications page, I set myself up to receive an email when the job finishes

Some Snags

While setting up the job, I ran into a few snags that I thought I would pass along.

First on the list: Be sure to define and assign variables properly:
Property Path: \Package.Variables[Somevariable]
Value: The literal parameter (do not use quotes for character strings)

When you are satisfied with your job, and you have entered all required information, save it (press ‘Ok’) and create a script. Simply right on the job in the SQL Server Agent / Jobs treeview and click Script Job As - Create To - New Query Editor Window. You can now use this script to re-create the job. I find it infinitely easier to make minor modifications to the generated script (which is just some T-SQL) and re-create the package than use the bulky and slow interface (which I had to do). One thing, though, caught me by surprise:

If you added a variable that ends in a backslash, you need to escape it. The T-SQL script generated will treat the single backslash as an escape for the quote mark that is used for the command line parameters. For example, assume your value is (without the quotes): “\\machine\folder\temp\”. This will result in an error when executed because the last backslash is not escaped and on the command line will cause the a parenthesis mismatch. Instead, use something like “\\machine\folder\temp\\”

Hopefully someone, somewhere finds this useful!

Published June 21st, 2007

Broken Promises of Drag-and-Drop Programming in SSIS

I’ll come right out and say it: I am no fan of drag-and-drop programming. Perhaps I’ve been emotionally scarred by FrontPage 2000 or something? This doesn’t mean that I don’t appreciate the ability to drop a command button on a form, or create an HTML table with a pencil. But it does mean that I don’t like (or appreciate) companies telling me that their drag-and-drop environments will ‘jump start’ me or ‘do all the dirty work’, because inevitably, I find myself in a position where I have to either (a) debug the generated code, (b) work around an undesired behavior, or (c) write the code myself anyway.

Take SSIS for example.

I am in the process of a large ETL project involving VFP and SQL Server 2005. We are taking data from SQL Server and populating VFP free tables (for re-distribution with our installed apps). Yes, you read right: SQL Server back to FoxPro.

So the company, long ago (and maybe in a galaxy far, far away), decided to use SSIS as the ETL tool of choice. Now, everything is in SSIS and a lot of time and effort has been put into the endeavor. With scores of packages extracting data from a variety of sources, SSIS has proved reasonably useful from a management and execution point-of-view. But I have to tell you, writing complex SSIS packages is no breeze — and the whole ‘drag-and-drop’ approach is seriously getting in the (read: my) way.

I took an inventory of all the work I needed to do for this project. Here’s the skinny:

Variables

The extracted data can have one or multiple sources (spanning multiple SQL Server databases) and destinations (each with a different FoxPro data schema and each potentially on different computers on the Network). I also need to be able to call packages from other packages. As a result, a lot of time was spent on setting up variables (and configuring the child packages to receive them from their parent). These variables are used extensively throughout the solution (in the connection manager and in various expressions for example — and don’t get me started on the expression builder!).

Sources

Here’s were drag-and-drop programming really starts to fall short. My sources, almost every one of them (there are dozens of sources in the solution) use SQL statements — instead of actual Tables (or views) — to gather the data (using the OLE provider, not the native driver, by the way). This really isn’t a big deal and in fact gives me a ton of flexibility, and I’m sure this approach is rather common.

For example, the FoxPro datasets do not support NULLs in any column (don’t ask me why, and it’s too late to fix it!). In the drag-and-drop world, one might drag a Derived Column tool, fight with the expression field, and ‘replace’ the NULLed column using an expression with some non-null value one at a time. Ugh. I tried this once and let me tell you: just build NULL elimination into the SQL statement using coalesce instead. Save yourself some aggravation and a headache. Also, while you’re at it, do all your converts, trims, calculations in there as well. In my tests, performance was not affected by this approach (using a database with several million records). As I’ve mentioned many times before though, I’m not an expert on benchmarking. But you can write SQL much quicker than using that darned expression dialog box — not to mention everything is right out in front of you in an SQL statement, and if you need to debug or experiment with the SQL, just copy and paste it into SSMS and off you go (try doing that with the Derived Column tool).

Transformations

Why is it that, with the entire toolbox of transformations provided by SSIS, I am always forced into using the Script transformation, the Union All as a terminator (so my lookups don’t complain about error output), and non-standard tools (such as the Lookup) to manipulate and filter data? I’ve already mentioned that Derived Columns and similar tools can be accomplished using good old fashioned SQL.

Debugging

Sorry folks, I’m not going here. Forget it. I might be likely to say something I’ll regret.

Sigh

When looking at this integration project as a whole, I came to realize two important facts:

  • The core functionality of the package is hand-written code (the SQL, transformations, even setting up the variables);
  • And, that most of the time to put this together (about 60 hours of programming in total) was spent on getting SSIS to behave and ‘flow’.

It turns out that drag-and-drop programming isn’t really the answer IMO. Sure, it might get you off the ground, but I find that tools like SSIS just get in the way more than anything else.

So I have to ask: Why bother? If I had hand-coded this in VFP (or any other data-centric language), it would have been done last week, and it would have been just as easy to execute, maintain, and turn over. Especially if I had a framework to start off with. Hmmn…

Published June 19th, 2007

Certification revamp at MS does not inlude VFP

Well, this was certainly a logical development: no new FoxPro, no more certifications. Sorry that I’m not an expert on MS certs, but have VFP certifications been available in the pat year?Anyway, for those looking to get certified, there is a little hope: “We’re open to discussing it with customers but at this time we have no further plans to address FoxPro.” said Master of exams Lutz Ziob, GM of Microsoft Learning.

MasFoxProCerts anyone?

Published June 19th, 2007

Under rock? SP2 and Sedna betas released.

Others have blogged about this already, but just in case you use my humble blog for all your Visual FoxPro news and announcements (a man can dream, can’t he?), Microsoft has released FoxPro 9 Service Pack 2 and its “Sedna” extensions in beta.

Enjoy!

Published June 18th, 2007

Just a Cool Day

Ever get stuck on a word? Here it is, only 9:30am, and I must have used the word ‘cool’ 4 times already. Once in front of my boss (”Ok, that’s cool. Thanks”), another in front of our data operations director (”Cool, I like it.”), yet another with a colleague (discussing his recent trip to Scotland), and finally again on the phone with my doctor (I have a sore throat). What’s funny is just before I drove in I was thinking about Craig Bailey’s post from last year “VFP: How to make Visual FoxPro cool“.

But “cool” needs to take a back seat at the moment.

Now, for VFP, it’s about being a survivor.

In September, I will be presenting at Fox Forward. My discussion will be about using VFP for Data Warehousing — especially for the prototype/iterate phase of planning and as an ETL tool. There is nothing really ‘cool’ about using FoxPro for ETL (although, I’ll try my best on FoxETL.com, a site I’m in the process of putting together for this very topic). After all, we’re talking about lots of code, objects and design patterns, schedules, staging, shuffling data, etc. Nothing groundbreaking, nothing cool: But incredibly valuable.

When you think about how much it costs to build a full Business Intelligence solution (could be in the $-millions-$), you can really begin to lose sleep. Couple that with the fact that many small and medium organizations, who realize that their data is a valuable asset, simply do not have access to resources (DBAs, SQL Programmers, Business Objects gurus, etc.) to implement such a solution. FoxPro presents an interesting and powerful alternative to the big players, all for a fraction of the cost. Everything you can do in SSIS for example, you can do in FoxPro (What Tod? Are you serious? Yup. I am). And in my humble opinion, you can do it better.

I suppose — come to think of it — that this makes VFP pretty darn cool. No? And using VFP for these types of tasks not only fills a large gap in the market, but also gives us die-hard FoxPro programmers lots to do in the foreseeable future.

Published June 13th, 2007

Web Analysis with VFP: Download, Parse, and Stage

My Web Analysis application is beginning to take shape. At the moment, I have created a little road map document (which I’ll share shortly), defined all my base classes (the entire application will be made available in the coming months), and proved the general concept (see below).

To recap, this project is about building a tool to help me make better business decisions on and using the Internet. While there are a ton of canned products on the market, I feel that I can get what I need with VFP (and get it better). In general, I want to be able to download a variety of data from my server, analyze it, and use that data to make decisions (everything from what IPs to ban to how I might redesign a certain page). See my post “Business Intelligence Through Web Analysis” for more details.

The following zip file contains all the code necessary to prove the concept. It (a) establishes a database, (b) downloads the log files from the FTP server, and (c) parses the file, storing it in a stage directory, which I can then use to create a Dimensional Model for analysis purposes. Here’s the programs:

web_analytics.zip

In the zip, you will find:

  • wa_main.prg (the setup program, check the path and run this once)
  • functions.prg (a place to store my functions, this will become a class soon)
  • download.prg (set your server information up and run, it calls load_logfile(), which lives in functions.prg)
  • ftp/ftp.prg (FTP services class written by Robert Abram many moons ago — still works great, requires wininet.dll)

Now that the concept has been proved, and I can visually analyze my log files, I will start to build an application in VFP9 around it. Stay tuned!