October 3, 2007

Breaking the content of a flat file into several files using a grouping criteria

Today a user of the SSIS forum asked how he could shred a text file and generate individual files based in grouping column. E.g:

Source File:

Id, Name, Number
1 TOM 22
2 TOTO 44
3 SAM 44
4 RADIO 55

generate 3 output files:
USER_22.csv
USER_44.csv
USER_55.csv

where USER_44.csv contains:
2;TOTO;44
3;SAM;44


Well I created a sample package that does that by:

  1. Putting the distinct list of values of the grouping column (Number) into a RecordSet destination
  2. Using the content of the RecordSet Destination in a ForEach loop container to iterate to through the source file and create a destination file (one on each iteration).
  3. In order to name the destination file dynamically, I use an expression against the 'ConnectionString' Property of the destination file's connection manager.

Here are some screenshots of the package:

The Control Flow













The first dataflow:












The second dataflow:














You can download a copy of the .dtsx package and the source file here:




12 comments:

  1. I could not understand the third step for creating the package as shown in this article. Can you help me out.

    ReplyDelete
  2. Karthik,
    What is the part you don't understand? Did you download the package and try it out?

    ReplyDelete
  3. I created a package similar to this example. I need to create a different flat file for iterations 1-6. There is a column named catalog I am using. Based on Catalog value, I need a new file created. So for all records with catalog = 1 I need one file, catalog = 2, another. And so on. Can you help me with this? So far your package has really helped me out, I just need to figure out this last piece.

    ReplyDelete
  4. Hi Rafael,

    The package is corrupted. Could you please upload the package again. Thank you

    ReplyDelete
  5. Hello Rafael. Thank you for this nice blog, I've been looking everywhere for something like this as Im facing an scenario very similar to this one. I have tried to follow your explanation but im stucked into something. The information extraction and the foreach works just fine, but im having trouble while trying to read the information from the object variable using the expressions. Of course I do need to read that info in order to fetch the information from the database, in my case Im working with retail stores so the for each is iterating per each store number and I need every iterating value to use it as a condition for fetching its corresponding information, as well as for setting the file name. How did you achieve this in your example?

    Thanks in advance.

    ReplyDelete
  6. Anibal,
    You don't consume an Object type variable directly. You have to use additional variables inside of the ForEach Loop container to 'shred' the object variable. i hope this helps!

    ReplyDelete
  7. You are one AWESOME man.
    The package worked like charm.

    But I just have one problem.
    Instead of Flat File I am using SQL as a Source.
    In “SQL data source” I am running “SELECT * From Table”.
    And I place that DATA FLOW TASK in Foreach Loop Container.
    So when the package runs it is just keep running the same data over and over + it doesn’t stop (Foreach Loop Container) and it doesn’t even go to the next Number just keep running for example number “22” it doesn’t go to number “44”.
    It is not even creating another file with number “44” why because i guess in a data source I am running SELECT statement.
    How do I take care of this?
    Any Thoughts will help.


    Once again this Blog helped me a lot.

    ReplyDelete
  8. Hello Rafael,
    Jean Here.
    Right after i Posted Comment "Awesome man".
    I tried my package using flat file instead of SQL.
    What ever the data i needed i loaded them into Flat file from SQL and ran the package.
    But it is still running doesn't stop and keep writing the same data in the file OVER and OVER.

    Please Advice something on this issue.
    Thank You

    ReplyDelete
  9. Hello Rafael,
    Again this is Jean Here.
    Never mind,
    I figured it out.
    I was not checking the check box in SORT transformation “Remove rows with duplicate sort values” Which was actually looking for distinct values of column Qualifier in my package.
    Good Blog Rafael.

    Thank You

    ReplyDelete
  10. Hey Rafael,
    This is jean here.
    i have a question about this package.
    now i created the whole package and its working the way it suppose to be.
    but i want a little twist in this package.
    you are creating a FILES based on NUMBER field.
    but now i need to Split the files based on another field.
    i did place both filed in Record Set Destination.
    and i did all the transformation but package is failing saying "it can not open the file that just got created".
    is this because i am using same filename variable for both the files?
    Please let me know about this.
    if its not to much trouble for you could you please email me on "paterjean@gmail.com" so i can contact you on your email.
    i appreciate your help.

    Thank You
    Jean

    ReplyDelete
  11. Can't find the download file. Does anyone still have this?

    ReplyDelete

I will love to hear your feedback and comments. Thanks.