Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SSIS -- ETL, then move Excel file 2

Status
Not open for further replies.

RottPaws

Programmer
Mar 1, 2002
478
US
I've got an SSIS package that is supposed to loop through the tabs of an xlsx file, extract data, and then move that file to an archive folder. It sounds simple enough, but I can't it to work.

I have a Foreach Loop container that gets a collection from a network drive and gives the file name to an EXCEL Connection Manager. The Connection manager is then used by a series of Data Flow tasks that get data from the tabs of the spreadsheet and insert them into a SQL Server db through another Connection Manager object.

Next, it is supposed to move the file to an \Archive directory when it is finished. I'm trying to use a File System Task to accomplish this, but it fails. The EXCEL Connection Manager does not appear in the Source Connection drop-down list, so I apparently can't use that. I have a variable that uses an expression to build the path\filename to use as the source. If I disable all of the Data Flow tasks in the Foreach Loop container, it will move/rename the file as it should. But when the Data Flow tasks are enabled, it goes through them and then fails with an error message that says it can't move the file because it is in use. There are constraints on the Data Flow tasks so each runs in series and then the File System task is after they are all done.

Is there something I need to do force it to close the connection and release the file after the last Data Flow?

Is there some other work around?

Thanks in advance!


_________
RottPaws

If you don't report a problem, I probably won't fix it.
 
Is that one file with multiple tabs or multiple excel files?
if it is one file, file will be on lock until all data flow tasks completed. you should put all your data flow tasks in one Sequence Container and File System Task after Sequence Container
 
It's one Excel file with multiple tabs. The data flow tasks are all set up with constraints so they run in sequence and the last one leads to the file system task on success. I'll try the Sequence Container and see if that makes a difference.

Thanks!

_________
RottPaws

If you don't report a problem, I probably won't fix it.
 
And which driver are you using? Jet or ACE (advisable).

And have you tried using a normal OLEDB or ADO.NET connection to loop through the files?

sometimes the excel access does not release the file at the end hence the file in use error

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
I'm using the ACE driver. Provider=Microsoft.ACE.OLEDB.12.0;
I've never used an OLEDB or ADO.NET connection with an Excel file. I'll look in to that.

I put the data flow tasks in a sequence container, but it didn't make any difference. When it comes out of the sequence container, the file system task still fails saying the file is in use.

I tried setting the RetainSameConnection property to false for the data connection and now it fails sporadically.
SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER.

It sometimes fails on the first tab, sometimes the 2nd or 3rd, and once actually made it all the way through and even moved the file. I just don't know what to do to make it consistent.




_________
RottPaws

If you don't report a problem, I probably won't fix it.
 
one possible way.

instead of processing the file directly, create a copy of it on a temp folder. load tabs from that one - at the end move the real file to where you need, and using a c# script try and delete the temp file - do not fail task if you are unable to do it.

after the SSIS package is finished delete any file on that temp folder.

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
I looked at trying an ADO connection, but the examples I found seem to indicate that the format of all the sheets needs to be the same as it loops through the files and then a nested loop goes through the worksheets.

The file I'm working with has a different data structure for each sheet. So I have 6 different data flow tasks that each works with a specific sheet.

Here's what finally worked:

I moved the data flow tasks into a sequence container, set the RetainSameConnection property to False for the EXCEL Connection Manager, and added a Script Task in place of the File System task after the Sequence Container. The script task calls a C# script that copies the file to the archive folder and then deletes the original.

Thank you for your help!

_________
RottPaws

If you don't report a problem, I probably won't fix it.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top