Pages

Showing posts with label Lookup. Show all posts
Showing posts with label Lookup. Show all posts

July 7, 2010

SSIS: Enhancing Data Flow Performance with Raw Files

How often do you use raw files in your packages?
Last week, I used them for the first time when trying to improve performance in  a slow running package containing a data flow with large lookups.

What’s wrong with that package?
The data flow in question was very simple. An OLE DB source reading data from a staging table, a series of lookup transformations to add columns to the rows in transit, an additional lookup to check if the row existed , a conditional split, and 2 destinations, one for new rows and one of rows to be updated. I quickly came to the conclusion that the performance issue was caused by the large size of the 2 of the lookup’s data sets. Let’s call them the evil lookups. When the package ran, I noticed that it ran decently quick until it reached the 20GB of memory mark and the last lookup was still half way caching data, at that point things started to crawl. The picture below shows the layout of the data flow.



How did I fix it?
After running out of tricks, I decided to use 2 data flows instead of one with the goal of balancing the