Data taps are one of the new features in SSIS 2012 in the data and performance troubleshooting
Let's break it down some more:
Run-time. Data taps are a runtime artifact, so as developer, you don't define them as you design and develop packages in SSDT. Rather, you do it after the packages are deployed to the server by way of running some stored procedures in SSMS.
Data Flow Path: Data taps are defined at the data flow level, and within a data flow, at the path level. Paths are the blue or red arrows in the data flow that connect inputs and outputs of data flow components. You can read the books online definition of a path here.
Execution Instance: A data tap captures data in a given data flow path and stores it in a .csv file. This happens within a single execution instance of the package. That means we need to add the data tap each time we run the package during our troubleshooting exercise.
You could think of data taps in a similar way you think about data viewers in
Setting up a Data TapNow, let’s see an example and some pictures to show how you can add a data tap to a data flow.
First, we create a package with a data flow in it and deploy it to the SSIS server. For this example, I have a pretty straight forward data flow that gets a list of 4 products via OLE DB Source, does some string manipulation and then loads the data into an OLE DB destination. As you can see in the picture, this data flow has 2 paths (blue arrows linking data flow components). Note the value of IdenificationString property of the data flow path as we will use it later.
Folder Name: The folder within SSIS server where the package is deployed.
Project Name: The name of the SSIS project that holds the package
Package Name: well, just the package name.
Data Flow Path ID: The value of the IdentificationString property of the data flow path where we want to add the data tap (highlighted in yellow in the picture above).
Data flow task path: The path of the data flow task within the package.
Once we have this information, it is time to add the data tap(s) and execute the package. We do all this in SQL Server Management Studio by running 3 stored procedures that are built-in the SSIS catalog:
- Create a execution instance for the package by running [catalog].[create_execution] stored procedure
- Add the data tap(s) by running [catalog].[add_data_tap]
- Run the package by running [catalog].[start_execution] package
If everything goes well, you will have the .csv files in the
|Folder with data tap files|
Is That it?Almost. The example in the post is quite simplistic and the bare minimum I could come up with to get a data tap working. There is additional information in books online about them that you may want to review:
Troubleshooting data issues can get hairy at times and data taps are an extra tool in our belt that can help in cases where logging (which has also improve a lot in SSIS 2012) and dumps may not give us the required level of information. In the other hand, I would have appreciated a more seamless user experience, specially when getting the parameters needed by the stored procedures, and perhaps having more control over the file format and its content. As it stands, we may always need to open a copy of the package in SSDT to get the required metadata and the jump back to SSMS to complete the work. It would be great to have a point and click interface within SSMS that allows to navigate the structure of the package/dataflow and let us add a data fow with few clicks. I would like to see an interface like the 'package explorer' in BIDS for this, but unfortunately that suggestion did not get too much traction last time around and it was closed as 'won't fix'. may be in SSIS 2014?