Pages

February 9, 2007

SSIS: Loop through files that are within a date range (if date is part of the file name)

For those lucky ones that have to process file system files based on a date range, AND the date is part of the file name; here is one possible solution:

This example assumes that the date is in the last 8 characters of the file name in a format mmddyyyy.

Use a ForEach Loop container to iterate through all files in the folder hosting the source files. The file name would be stored in a SSIS variable (FileName). Then using an additional SSIS variable (FileDate) with an expression, parse the file name and get the date portion to be compared with the StartDate/EndDate range. Then based on that result, decide if the file has to be processed or ignoreded. For that comparison I used a precedence constraint with an expression.

Here are a few screen shots of the package I built:

Control Flow


















Variables:









ForEach Loop Conatiner:


The Precedence constraint:
























This is the expression in Filedate variable to get the date portion of the file name:

(DT_DATE) (Substring(Right(@[User::FileName], 8),1,2) +"/" + Substring(Right(@[User::FileName], 8),3,2) +"/" + Substring(Right(@[User::FileName], 8),5,4))


The Expression in the precedence constraint:

@[User::FileDate] >= @[User::StartDate] && @[User::FileDate] <= @[User::EndDate]

28 comments:

  1. This is very helpful. What can I do if I don't want to hard code the FileName value?

    ReplyDelete
  2. The value you see 'hardcoded' in the FileName variable just acts as default values; I would put it there to test the expression in the FileDate variable. SSIS also needs it to validate the expression before executing the package. At run time, though, FileName will get diffrent values as the ForEach Loop container iterates through the files...

    ReplyDelete
  3. Rafael, where do you put (DT_DATE) (Substring(Right(@[User::FileName], 12),7,2) +"/" + Substring(Right(@[User::FileName], 12),5,2) +"/" + Substring(Right(@[User::FileName], 12),1,4))
    This is my version. You can't put it in the variable as it only allows date and time. Any help appreciated.

    ReplyDelete
  4. Armstrong,
    That is an expression, hence you have to place it in the expression property of the variable. Make sure you open the property window of the variable, set Evaluate as expression to true and then paste the expression into the expression variable.

    ReplyDelete
  5. Rafael, How will I pass a data function in a file. My situation is I have got a file name called ABC_20080209.csv and tomorrow its ABC_20089210.csv and every day it comes with a date stamp and I need to read that file. Another problem is its in an FTP location so I cannot use a script to modify the file either. I would appricate if you can help me out.

    ReplyDelete
  6. I have some data I need to export weekly. With ssis, can I create a new workbook dynamically, per month, then, for each week, dynamically create a worksheet for that data?

    ReplyDelete
  7. Uearl,

    Yes that is actually possible, and I have couple of entries on this blog that can help you on that. You can start here: http://rafael-salas.blogspot.com/2008/03/ssis-and-dynamic-excel-destinations_01.html

    ReplyDelete
  8. When I paste my version it errors with ... The expression for variable "jhhold_date" failed evaluation. There was an error in the expression.
    My expression is: (DT_DATE) (Substring(Right(@[User::jhhold_filename], 12),5,2) +"/" + Substring(Right(@[User::jhhold_filename],12),7,2) +"/" + Substring(Right(@[User::jhhold_filename], 12),1,4))

    ReplyDelete
  9. How would you do this if you're trying to delete an entire directory rather than individual files? My directory date format is like this: yyyymmdd-hhmmss

    ReplyDelete
  10. Charle,
    not sure what your problem is. You should try to use the evaulate expression option when entering the expression in the expression editor.

    LA,
    Interesting scenario. To delete the entire directory, I would probably use a foreach loop container to delete the individual files inside of the directory and the outside of the ForEach container I would use an Execute process task to delete the directory using a DOS command.

    ReplyDelete
  11. Rafael,
    What is the logic inside the Script Task? Please share that as well.

    ReplyDelete
  12. I am trying to use SSIS to create 1-many txt files. The file names to create will be file_name + ymd + seq
    where seq will be 1-? based on file sizes which will determine how many files to create.

    How would you advise on doing this?

    ReplyDelete
  13. Rafael, this is a pretty smart approach to looping through files. One issue that I have encountered though is that the file date variable only captures the first file date and it is not over-written with each loop. I am using 2008 so maybe there is a bug?

    On deleting a directory, there is a file system task to delete directory contents. This might be the simplest approach. But please note that if you intend to delete the files in a directory on another server, you need enterprise edition for this task to work on schedule. Otherwise if it is on the local server it will work fine.

    Mark

    ReplyDelete
  14. Hello Rafael,
    This is a perfect solution for a file deletion problem I have for SSIS 2008. Can you send me the SSIS package sample?

    jefobr@saif.com

    Thanks for your help.

    ReplyDelete
  15. Thanks. The script helped make my life easier.

    ReplyDelete
  16. Hi,
    My situation is that i have to read a file say product*.txt using SSIS. When i store the path in a variable along with the filename, following error get raised "The filename, directory name, or volume label syntax is incorrect.Cannot open the file."

    It will be very helpful if you can suggest any solution, as i have to read file and load the data.

    Thanks
    Preeti

    ReplyDelete
  17. How do I check from a specific folder if there are any files created or updated yesterday?

    ReplyDelete
  18. Bhaskardeb,
    This post will help you only of the date is withing the name of the file. You may need to use an script task if you need to check for other attributes of the file. The good news is that the MSDN forum has plenty of sample code for that. Here is an example: http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/2d15c297-4d5c-4b63-9f91-6691ff9d1bf5/

    ReplyDelete
  19. Preeti, I would check the expression. It look like the expression is not evaluating to a valid path. If that is ok, then make sure that the user running the package has access to that folder. Always avoid using network mapped drive. Use ONC paths instead.

    ReplyDelete
  20. Hi Rafael ;

    I am working on project and following my situation :

    I have to validate if the Coupon_YYMMDD.xml file exist in distination , once the xml file found I have to change the xml file to THANKYOU_YYMMDD and move this file to an SFTP , if the file doesn't exist I Have to send mail to Support team.

    Can you please let me know if exist any solution for validation step ( found xml file and change it name)

    Thank's
    Fateh

    ReplyDelete
  21. Fateh,
    You could create a table to log files that have been transferred. Then you can use an execute SQL task to query that table to do the validation. This is the approach I actually use in this type of scenarios.

    ReplyDelete
  22. Hi Rafael,
    This is the condition when ur package runs on daily basis. But if ur package runs 4 times in a day and the file name doesn't have any date time value then how we will identify the laste file.

    ReplyDelete
  23. dattatteys, you would need to look for another approach. an easy one is to keep track of processed files in a table and use it as lookup

    ReplyDelete
  24. Rafael, I am trying to do it through Window event watch..but that is not right solution. I wanna to know the other way to do it. I have tried many approach but not able to do it...Can u please tell me the way for doing it....

    ReplyDelete
  25. Gracielen Gonzalez5/12/10 12:40 PM

    Rafael it is clear thet the file name is a variable then each time that I run the package the name of the file is different how do you obtain the value of the the variable file naem if it is only created on the runtime and each time you run the package the name of the first file name is different for instance I run the package on november 23 my file name generated is:
    filename_11232010
    on december the 1st is:
    filename_12012010

    how do you hard code this value in order to evaluate the file date value I cannot never obtain the value for filedate because then I do not know how your solution can work for me is not working please give me details I do not understand
    how to make work the file date expression if I canot hard code the file_Name variable how do you do it?

    ReplyDelete
  26. You are awesome Rafael!!!!

    ReplyDelete
  27. Hi Rafael,

    I have a filename filname20120109.csv so i need to check for the latest file in the dir and need to pick it up to move to a different folder. i have used the foreach loop over the folder to check for the dates and compare it with today's date, i tried to use the expresssion but it failed can you give me some direction please

    ReplyDelete
  28. Hi Rafael,
    Could you please email the package to my emailID that would be very helpful.

    ranmehra@gmail.com

    Thanks
    Ran

    ReplyDelete

I will love to hear your feedback and comments. Thanks.