In part 1 of this series, I discussed the ProvideComponentProperties method of my custom ID Lookup component in SSIS. In this post, I’ll discuss the AcquireConnections and PreExecute methods. Please read the introduction and part 1 if you haven’t already!
AcquireConnections
The purpose of AcquireConnections is to establish connections to the specified connection managers at design and runtime. Although not totally necessary, it is a best practice and is recommended to connect to all external data sources using this method.
The first step is to prepare a runtime connection. In the ProvideComponentProperties method, I created a new runtime connection using interface IDTSRuntimeConnection90. I added this connection to the RuntimeConnectionCollection of my ComponentMetaData and named it “IDLookupConnection”. The code in ProvideComponentProperties is straightforward and looks like this:
IDTSRuntimeConnectionCollection90 oRuntimeConnectionCollection
= ComponentMetaData.RuntimeConnectionCollection;
IDTSRuntimeConnection90 oRuntimeConnection;
oRuntimeConnection = oRuntimeConnectionCollection.New();
oRuntimeConnection.Name = "IDLookupConnection";As with my other examples, only the essential code is shown above. In my production copy, I use structured error handling and class members instead of literal strings.
Next, I wrote the following code in the AcquireConnection method:
private SqlConnection dbConn;
public override void AcquireConnections(object transaction)
{
IDTSRuntimeConnection90 conn = ComponentMetaData.RuntimeConnectionCollection["IDLookupConnection"];
ConnectionManager cm = DtsConvert.ToConnectionManage(conn.ConnectionManager);
ConnectionManagerAdoNet cmObj = cm.InnerObject as ConnectionManagerAdoNet;
dbConn = cmObj.AcquireConnection(transaction) as SqlConnection;
}In the above code, I am converting conn (IDTSConnectionManager90) to a managed ConnectionManager (cm). Then, I can use InnerObject to get an AdoNet Connection Manager (cmObj) object. The AcquireConnection method of cmObj returns a SqlConnection object (dbConn) which opens a connection to my SQL Server database. I’ll use dbConn later in PreExecute to create SqlCommands.
Not to be forgotten, I use the ReleaseConnections method (not shown) to release dbConn when the component finishes its work.
PreExecute
The PreExecute method is called once for the life of the component at runtime. It is a good idea to do as much ‘prep’ work in here as possible. This includes applying behavior based on property values, loading data for lookups, setting up ’static’ items, and preparing the component to start accepting input. There is a temptation at times to use ProcessInput for some of these things: resist it! ProcessInput runs once for each buffer.
It is in PreExecute where I think I could do some better optimization. Currently, I am returning about 100,000 values from a master lookup table in about 6 seconds (crude benchmarking). This data is then used in ProcessInput for matching. I believe I can do better in here, but for now, it works and won’t be modified any time in the near future. Of course, if you see something awry, let me know!
The meat of the programming is in this method, but a lot of the logic is repeated (for example, I fetch LineageIDs for several new output IDTSOutputColumn90 columns added to each of my Outputs in ProvideComponentProperties). Here is such an example:
// get references to the input so we can use its buffer metadata
IDTSInput90 input = ComponentMetaData.InputCollection["IDLookupInput"];
// get reference to the output collection and its columns
IDTSOutput90 output = ComponentMetaData.OutputCollection["IDLookupOut"];
IDTSOutputColumnCollection90 outputColumns = output.OutputColumnCollection;
foreach (IDTSOutputColumn90 column in outputColumns)
{
// NOTE: MS Documentation is not clear, cannot use output.buffer
// for synch transforms, need to use input.buffer instead (which
// is the same as what would be there for the output.buffer)
intLineageID = (int)BufferManager.FindColumnByLineageID(input.Buffer,column.LineageID);
strColumnName = column.Name.ToString().ToLower();
if (strColumnName == "id_matchvalue") {
indexMatchValue = intLineageID;
exit;
}
}
You might have noticed by now that much of what I am doing is more-or-less boilerplate code that is provided by Books Online. The trick is applying the right pieces for the right situation and building a solution that solves a specific problem better than what can be done using out-of-the-box tasks and components.
Despite all the good code examples in BOL, I still found some parts to be documented poorly. Take for example the issue with finding the LineageID of an output column for synchronous components. At first glance, you might think you can use BufferManager.FindColumnByLineageID(output.Buffer, column.LineageID). But for synchronous transforms, an output buffer is not created — it is the same as the input buffer. To find the LineageId, you need to use input.Buffer and not output.Buffer. Not very intuitive at all! But I digress…
In PreExecute, after I match a variety of LineageIDs to custom members of my IDLookup class, I continue by writing the code to fetch data from my lookup table. Actually, in the real implementation, this is also repeated for more than one data source — depending on how the developer set the properties of the component. But for the sake of simplicity, I’ll show one of the blocks of code for the primary lookup (the names have changed to protect the innocent!):
string sqlStr = "SELECT id, account, fname, lname, ssn, dob, ext_id, empid FROM mstr.MasterLookup WHERE status = 'A'"; SqlCommand sql = new SqlCommand(sqlStr, dbConn); daMstr = new SqlDataAdapter(); daMstr.SelectCommand = sql; dtMstr = new DataTable(); daMstr.Fill(dtMstr);
And there you have it! I simply fill a SqlDataAdapter (daMstr) with a DataTable (dtMstr) based on a SQL Statement (SqlCommand sql). The query returns several thousand rows from the master lookup table and stores them in daMstrSec, which I have defined as a custom member of my class.
Next, I’ll talk about ProcessInput!