Dear Friends,
I have been looking at several ways to populate a dimension table inside a dataflow, and it is not so simple to find the right solution...
Populating dimensions this way only makes sense when the input database has redundancy in its tables, in other words, when it is not normalized.
The database relationship needed for this example is:
and the transformations inside the dataflow are:
Now I will describe each of the dataflow transforms in more detail.
The first step is to create a global variable to store the value of the identity key. I am creating the identity key in my dimension table manually, so I need to know its value for each row of the dataflow. (This value will be incremented in the Script Component transform.)
Add an Execute SQL Task to the control flow to get the last identity key:
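In case it helps, the statement behind this task is nothing more than a query for the current maximum key, something along the lines of SELECT ISNULL(MAX(INST_IDENTITY), 0) FROM Instrumento (I am assuming here that the surrogate key column is called INST_IDENTITY; adapt the names to your own table). Set the task's ResultSet property to "Single row" and map the returned value to the INST_IDENTITY package variable created in the previous step.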
Add a Lookup transform to look up the InstrumentName in the dimension table "Instrumento".
Add a Multicast transform for the rows with newly found InstrumentNames (the rows that did not find a match in the Lookup).
Add an Aggregate transform to get the distinct InstrumentNames that were found.
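Taken together, the Lookup and the Aggregate leave this branch of the dataflow with exactly one row per InstrumentName that does not exist in the dimension yet. Just to illustrate the idea in SQL terms (assuming the dimension stores the name in a column also called InstrumentName, and with "source" standing for the input rows), this branch produces roughly: SELECT DISTINCT InstrumentName FROM source WHERE InstrumentName NOT IN (SELECT InstrumentName FROM Instrumento).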
Add a Script Component transform to increment the INST_IDENTITY global variable and assign the new key to each row (a code sketch follows the explanation below).
Script Component explanation:
PreExecute Method
Reads the global variable INST_IDENTITY, which holds the last identity value for the table Instrumento.
Initializes the local variable counter, which will be incremented for each row.
Input0_ProcessInputRow Method
Increments the counter variable and, for each row, saves it in the Row.INSTIDENTITY output column.
PostExecute Method
Refreshes the global variable INST_IDENTITY with the counter value.
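To make this more concrete, here is a minimal sketch of the script in C# (in SSIS 2005 the Script Component is written in VB.NET, but the structure is the same). Two details are my own assumptions rather than part of the original package: the INSTIDENTITY output column is a four-byte integer, and, because the Script Component only exposes read/write variables inside PostExecute, the sketch reads its seed from INST_IDENTITY listed under ReadOnlyVariables and writes the final value to a second, hypothetical variable INST_IDENTITY_LAST listed under ReadWriteVariables. If you prefer to keep the single INST_IDENTITY variable described above, read the seed through the component's VariableDispenser in PreExecute instead.

```csharp
// Only the methods you edit are shown; the rest of the designer-generated
// Script Component template stays exactly as it is.
public class ScriptMain : UserComponent
{
    // Running value of the surrogate key.
    private int counter;

    public override void PreExecute()
    {
        base.PreExecute();
        // Seed the counter with the last identity value loaded by the Execute SQL Task.
        // (The property name is generated from the variable listed under ReadOnlyVariables;
        // depending on the designer it may appear with or without the underscore.)
        counter = Variables.INSTIDENTITY;
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Hand out one new key per row; each row is a distinct InstrumentName
        // thanks to the Aggregate transform upstream.
        counter++;
        Row.INSTIDENTITY = counter;
    }

    public override void PostExecute()
    {
        base.PostExecute();
        // Publish the last key that was handed out (read/write variables are only
        // available here). INST_IDENTITY_LAST is a hypothetical second variable;
        // drop this step if nothing else in the package needs the final value.
        Variables.INSTIDENTITYLAST = counter;
    }
}
```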
Add another Multicast so that you can, on one branch, sort the new rows that now carry the identity column created in the previous step (the Merge Join below needs sorted inputs), and on the other branch load these new records (InstrumentNames) into the database.
The mappings that must be created for the OLE DB Destination are:
Add a Merge Join and configure it as a LEFT JOIN, like this:
Finally, you have all the rows with their respective identity values.
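In SQL terms, this Merge Join behaves roughly like a LEFT JOIN on InstrumentName between the rows coming out of the Lookup's no-match branch and the small set of (InstrumentName, identity) pairs produced by the Script Component, so every row leaves the join carrying the surrogate key that was just generated for its InstrumentName.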
I hope to receive some feedback from you!
Regards!