Pages

Showing posts with label Denali. Show all posts
Showing posts with label Denali. Show all posts

January 10, 2012

SSIS 2012: A Peek to Data Taps

Note: The example in post is based on SSIS 2012 RCO
Data taps are one of the new features in SSIS 2012 in the data and performance troubleshooting
category. In a nutshell, they allow us, at runtime,  to choose a given path from a data flow and capture a copy of the data at that specific point of the data flow in a .csv file on a given execution instance of the package.

Let's break it down some more:
Run-time. Data taps are a runtime artifact, so as developer, you don't  define them as you design and develop packages in SSDT. Rather, you do it after the packages are deployed to the server by way of running some stored procedures in SSMS.
Data Flow Path: Data taps are defined at the data flow level, and within a data flow, at the path level. Paths are the blue or red arrows in the data flow that connect inputs and outputs of data flow components. You can read the books online definition of a path here.
Execution Instance: A data tap captures data in a given data flow path and stores it in a .csv file. This happens within a single execution  instance of the package. That means we need to add the data tap each time we run the package during our troubleshooting exercise.

You could think of data taps in a similar way you think about data viewers in BIDS SSDT, except you don’t have to edit the package to add them.

Setting up a Data Tap

Now, let’s see an example and some pictures to show how you can add a data tap to a data flow.
First, we create a package with a data flow in it and deploy it to the SSIS server. For this example, I have a pretty straight forward data flow that gets a list of 4 products via OLE DB Source, does some string manipulation and then loads the data into an OLE DB destination.  As you can see in the picture, this data flow has 2 paths (blue arrows linking data flow components). Note the value of IdenificationString property of the data flow path as we will use it later.

We will need some pieces of information about the package we want to tap data from before we can setup the data tap:

Folder Name: The folder within SSIS server where the package is deployed.
Project Name: The name of the SSIS project that holds the package
Package Name: well, just the package name.
Data Flow Path ID: The value of the IdentificationString property of the data flow path where we want to add the data tap (highlighted in yellow in the picture above).
Data flow task path: The path of the data flow task within the package. As opposed to getting the Dataflow path ID, getting the task path  is tricky as it is not readily displayed in SSDT, an you typically have to resort to some sort of trick to get it. I will share a trick you can use to get value of the property path in another post. In this example, let’s assume we do know that the path to the data flow is \Package\Load Product Update: The path to the data flow task is readily available in the packagePath property of the task.  Thanks to David Joubert for pointing that out!




Once we have this information, it is time to add the data tap(s) and execute the package. We do all this in SQL Server Management Studio by running 3 stored procedures that are built-in the SSIS catalog:
  1. Create a execution instance for the package by running [catalog].[create_execution] stored procedure
  2. Add the data tap(s) by running [catalog].[add_data_tap]
  3. Run the package by running [catalog].[start_execution] package
Here is the script I used while running my sample package:

https://raw.github.com/gist/1558450/f0a254d68a047f6199e6f53566c195cfcf06e192/gistfile1.sql

If everything goes well, you will have the .csv files in the :\Program Files\Microsoft SQL Server\110\DTS\DataDumps folder of the machine you are running the stored procedures. You can always use the built-in package execution reports to validate the execution of the package was successful or to  look for error messages generated during the execution of the package.



image
Folder with data tap files

Is That it?

Almost. The example in the post is quite simplistic and the bare minimum I could come up with to get a data tap working. There is additional information in books online about them that you may want to review:

catalog.add_data_tap
catalog.remove_data_tap
catalog.execution_data_taps
catalog.execution_data_statistics


Conclusion


Troubleshooting data issues can get hairy at times and data taps are an extra tool in our belt that can help in cases where logging (which has also improve a lot in SSIS 2012) and dumps may not give us the required level of information. In the other hand,  I would have appreciated a more seamless user experience, specially when getting the parameters needed by the stored procedures, and perhaps having more control over the file format and its content. As it stands, we may always need to open a copy of the package in SSDT to get the required metadata and the jump back to SSMS to complete the work. It would be great to have a point and click interface within SSMS that allows to navigate the structure of the package/dataflow and let us add a data fow with few clicks. I would like to see an interface like the 'package explorer' in BIDS for this, but unfortunately that suggestion did not get too much traction last time around and it was closed as 'won't fix'.  may be in SSIS 2014?

January 3, 2012

SSIS 2012: Project Deployment Model and Build Configurations

With the introduction of the new project deployment model in SSIS 2012, we now have the ability to deploy a project to  a SSIS server and use parameters to packages and projects. With this, there may be instances where we need to change the name of the SSIS server and the value of parameters as we develop, debug and test packages within SQL Server Data Tools. Depending on the number of servers and parameters your project and package may have, making those changes may be rather tedious and prone to error. I am thinking in the many times people have mistakenly used the wrong connection strings when debugging a package in the old days - I have done it many time.
The good news is that projects using the new project deployment model in SSIS 2012 leverages build configurations from Visual Studio  in a more meaningful way than in earlier versions, making most of these changes a lot simpler. How? I am glad you ask. For one, both, project and package parameters can get their values from build configurations. For two, some project level properties can also be set via build configurations (things like Deployment Server name and project path); which makes debugging and deploying packages in SSDT (formerly known as BIDS)against different SSIS servers much simpler.


Binding Parameter values to Build Configurations

For this, you just need to go to the parameters tab (project or package) and click button with the weird icon. The tooltip actually says “Add Parameter to Configurations'”.

Then you have the opportunity to choose the parameters you want to associate with the multiple build configurations you may have created. In the picture below, you can see how my package has 5 parameters and I had created 2 build configurations, one called “Test” and one called ‘Development”. Now you can enter a value for each parameter and configuration combination.



So, how is this helpful? Well, Assuming that I need to change the value of the 5 parameters at design time as I debug the package in 2 different machines, I just need to switch between build configurations and to have all parameter values changed at once. Switching between build configurations is as simple as selecting a value from a dropdown list.










Binding Project Properties to Build Configurations

Similarly, there are some Build, Deploy and debugging properties of the project that can be assigned using build configurations. For that go properties of the project and select the the properties and the configuration and assigned the desired value.



Lastly, keep in mind that build configurations are not new to SSIS 2012 projects, but I honestly did not find them helpful until now.

December 28, 2011

SSIS 2012: Parameters and Variables, what is the difference?

Note: The information on this post is based SSIS 2012 RC0

This is a question I ran across recently while browsing the SSIS 2012 pre-lease forum. While there may be multiple ways to look at the differences and similarities among them, I thought I would give it a shot and share some of my observations. IMAG0833
If you are familiar with earlier versions of SSIS, it is worth noting that the concept and usability of variables have not changed much in the newer version. on the other hand, it  is also important to point that parameters are one of the new features introduced in SSIS 2012 – available when using the new project deployment model - that promises to address some of the shortcomings of package configurations – which are still available.

Scope

Parameters can be defined at 2 different levels:
Project. These type of parameters are available to all packages within the SSIS project. Think about them as global parameters. They come handy as it makes really easy to share a given value, such as the path to a file share or the name of a server, across all packages in a project.
         Package. These type of parameters are meant to affect only the package on which they were defined.

Variables are richer in scope as they can be scoped at the package, container, task or event handler level.

Execution Time behavior


The value of a parameter can’t change within the execution instance of a package. That means, its value remains the same for the entire execution of the package. Honestly, I don’t envision this being an issue in most cases.
On the other hand, the value of variables can change during the execution of the package, which make them suitable for scenarios where, for example,  looping, conditional or any other type of logic where the values of the variable must change during the execution of the package.

Consumption and Usage


Parameters are applied via expressions on the properties which are intended to be parameterized. The value of a parameter can be set at development time, after the package is deployed to the SSIS server (via SSMS) or at run time (SSMS or SS agent job) and seem the way to go to affect package execution imagewithout having to modify the package. At design time, you can quickly get access to the “parameterize” window by right clicking on a task, container or connection manager, which saves few click when compared to using the expression property of the object. Additionally, parameter values can be set via execute package task, when a package is called from another package.

The value of a variable can be set in many ways within a package and its value can be further referenced by other expressions, containers, tasks or components.
Note: Expressions are not new to this version of SSIS and something you should put in your short list of things to learn if you are serious about SSIS.

Data types


When compared to variables, parameters do not support the following data types: Char, DBNull, Object.

Visual Studio Configuration Manager


SNAGHTML1ac6e626One of the nicest improvements I have seen in  the RC0 build is the ability to bind parameters to the design time configurations offered by SQL Server Database Tools (formerly known as BIDS), which comes very handy when designing and  and debugging packages in Visual Studio in SQL Server Data Tools.



Conclusion


Parameters are the new kids on the block, and it is clear they are being treated as 1st class citizen within the new project deployment model. Although variables and parameters  have some similarities, trying to find out which one is better may not be the right thing to do. In my opinion, it is more a matter of understanding their capabilities and  choosing the right tool for the job.
I am pretty sure there may be more aspects to parameters and variables  than the ones I listed here, so feel free to leave your feedback in the comments below with anything I may missed.

November 17, 2010

Denali CTP 1: SSIS Parameters – Bring Them On!

The new version of SQL Server code-named Denali brings the concept of Parameters to SSIS. At first glance, I can say they close some of the existing gaps in the package configuration model form earlier versions and it seems a big step in the right direction for the product usability.
If you are interested in learning about the Developer Experience and Usability enchantments, you can read mi earlier post.

image

In a nutshell,you can think of SSIS parameters as the input parameters you would use in a function or stored procedure. See the link at the end of the post to learn how they work.

The Good

Flexibility. They can be declared at the project and/or package level and can take 3 types of values: Design default value, Server Default value, and Execution value. The last 2 type of values become relevant after the SSIS project has been deployed to the server.
Simplicity. SSIS parameters seems easier to understand and work with. You add them to the packages or projects and reference them via expressions. This is especially true when you compare them to package configurations in the previous version of the product.

Parameter –> Object Property (via expression)

Security. You can mark parameters as sensitive and have their values stored encrypted when the package is deployed. This is helpful when you have parameter values containing sensitive information like IDs or password. Notice that parameter values are not encrypted at design time (BIDS) even if you marked them as sensitive, so you are still responsible of protecting that information in developer’s machine, perhaps leveraging the ProtectionLevel property of the package.

Caveats

Parameters are available only in SSIS projects using the new Project Deployment Model. There are 2 models in the new version of SSIS: Project Deployment Model and Legacy Deployment Model. I will try to cover the differences about this model in a later post, but for now think about them as present and past. The legacy model promises to let you work on the way you were used to in the previous version of the product, but you will be forbidden from using parameters and the new server based SSIS catalog along with its deployment model. Any new project uses the new model by default  and there is a migration path from and to the Legacy model.
Learning curve. Are you of the type that don’t read the instructions and try to figure things out at your own? That did not work for me this time. I started playing around with the parameters and the deployment process, but could not get them to work quite well. It was not until I went trough some of the Wiki articles and videos some of the Microsoft folks put together that things starting to click in my head. You will need to understand how you could use 3 different type of values and how one could override the other. In the server side, you would need to get familiar with the concepts of Catalogs, Folder and Environments and how they play along with parameters. This is not rocket science, and certainly easier than package configurations to explain.

Final Comments and References

Parameters, in my opinion, are in overall better than the old package configuration model that was hard to explain, understand, and manage - especially for new users -, and they also seem a good fit for all the ETL scenarios I come across in a regular basis. However, I think the server part of the story around parameters may need some tweaking as there are some aspects that look little confusing and that could impact negatively the user experience. More on that later.

To learn more about parameters:
SQL Server “Denali” CTP1 - Integration Services (SSIS) Parameters

November 16, 2010

Denali CTP 1: SSIS Developer Experience & Usability Enhancements

This is the second post of a series about SSIS Enhancements in Denali CTP 1

Denali CTP 1 has been around for almost a week now, and I would like to take the opportunity to highlight some of the improvements and changes being introduced in usability area. This review is accurate to the best of my knowledge and is the results of  few hours playing with the tool, watching some demos and reviewing some of the documentation available – that means use with caution.


SNAGHTML26332afNew design surface. A slicker look and feel with a friendlier zoom functionality that is always visible. While zooming in and out is quicker now, one thing I am missing is the old functionality of the little pane in the bottom right corner (Does anyone know how it was called?) that allowed scrolling across the entire design surface with a single click– it came very handy to navigate control and data flows that have lots of objects.
  image
 
 











New SSIS toolbox. This is a new toolbox pane used only by SSIS and its items have a flashy look SNAGHTML25c61f9and are organized into a number of pre-defined groups similarly to earlier versions. However, something that seems to be missing from the previous version is the ability to create additional custom groups of items or Tabs – Not a big deal if you asked me, but something you may miss if your way of organizing toolbox items do not align with the one provided.
 
 
 
 
 





Flexible Authoring (data flow). Do you remember the annoying ‘Restore Invalid Column References’ SNAGHTML2689df0Editor? In case you don’t, it used to get in your way when trying to edit data flow components after breaking downstream changes were detected. I hated that editor as none of its 3 options did really help you, and it would get in your way again if you picked the wrong option. As far as I can tell, now you can:  
 
  • Edit transformations and components with invalid references. Now, the invalid references are imagemarked in the linkage between components (yes, the green arrows) while you can still access the downstream transformations. When you double click on invalid linkages you get a nice ‘resolve reference’ editor that allows you to re-map input and output columns. You can also export/import the mappings to/from excel – something handy when you have too many columns – or just delete the invalid references.
  • Add destination components without having any input connected to it and being able to add the connection settings and target object information only. Why would you want to do that? I am not sure either, but we are talking about being flexible. Right?

    image

Destination and Source Assistants (data flow). Just a convenience feature that allows saving few clicks when adding destinations and sources to your data flows. It also allows you to see the connection types for which you have drivers install if you wish.  
 
 
 
  






imageGrouping of control and data flow items. Now, you can select multiple items and group them. This is a design time only feature that may come handy to keep packages with lots of tasks and components visually clean and organized. It has no run time effect.
 
 
 
 
 
 







imageUndo and Redo. This long overdue functionality finally made it into SSIS. Now you can use CTRL+Z or CTRL+Y to undo and redo latest changes – I don’t know how deep it goes, but I can say it goes beyond the last save operation.
 




Multiline Annotations. Another minor but annoying issue from earlier versions. No more jumping to text editors or CTRL+ENTER in order to get annotations in multiple lines of text. Just hit enter image
 

 

 

Overall

While this new version has some notable improvements in this area, the most promising enhancements are in server components side, the overhaul of the parameterization and deployment processes, and the yet to be released data linage and impact analysis. Stay tuned for more details on those areas.

November 15, 2010

Denali CTP 1: What’s new in SSIS?

In short: A lot.
SNAGHTML5a893d
The first Community Technology Preview (CTP) of the next SQL Server version was made available to the general public last week during PASS Summit. I am very excited about some of the changes coming from the BI side of the product, especially from the SSIS side. After seeing some of the demos at the PASS Summit, and playing around with the tool, I tend to group the changes in 3 major areas:
I will cover changes in these areas in more detail in subsequent posts. But in the mean time you can use the following links if you want to read more - or even play with it:

Denali Resource center (download, videos, sample DB, etc.):
http://msdn.microsoft.com/en-us/sqlserver/denali_resource_center.aspx
Microsoft TechNet Wiki:
http://social.technet.microsoft.com/wiki/contents/articles/tags/Integration+Services/default.aspx
Denali Books on Line
http://msdn.microsoft.com/en-us/library/ms141026(v=SQL.110).aspx


Happy CTP!