Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Loop through all Sub Folders & copy data in .txt files into individual tabs 1

Status
Not open for further replies.

Nick V

Technical User
Oct 10, 2018
47
GB
Hi all,

Quickly : There are folders for each month of the calendar year and in those month's folders are sub-folders which belong to various companies. In the company's folder are more folders which holds .txt files that I need the the contents to be extracted from.

Essentially, in folder "03 March" there is a folder for company "Company Excel" and inside that company folder is the Workbook "03 March 2018 Company Excel.xlsm". This .xlsm file will look at all sub-folders within "Company Excel" folder (in "03 March") and copy the contents of all .txt files and paste it in their own individual tabs (with the tabs being labelled with the .txt filename). Within "03 March 2018 Company Excel" there is a tab calld "Master" which will remain blank.

Could the macro possibly not be directory specific? As going forward, I believe it would be best to copy the previous months' .xlsm file for the next month and would like to avoid having to rename the filepath in the code.

When copying the contents of the .txt files, could the name of the .txt files be put against the pasted contents?
Just as a rough idea, some .txt files contents could range from 0 - over 1,000 lines.



As a side note: there are folders called "Unsuccessful" which has .pdf files. I would like for their filenames to be pasted in a new tab with the same name as the folder. There are also .pdf files in the date folders (mentioned above) which I would like their filenames to be pasted in a tab called "Successful". (Maybe it would be easier to have those .pdf files in a folder called "Successful"?)

I think this is a big ask so thank you in advance!!!
 
Hi,

It is a big task. What have you done?

Please post any code you are using and where you are having specific issues.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Hi SkipVought,

I have some coding which works for the .txt files within the same location as my master workbook however, upon my travels of searching for something similar I believe I have over encumbered myself with too much coding. Really, I have confused myself.

Sub TxtImporter()
Dim f As String, flPath As String
Dim i As Long, j As Long
Dim ws As Worksheet
Dim PathName As String

Application.DisplayAlerts = False

Application.ScreenUpdating = False

Invoice .pdf\2018\03 March\BJW\_All .txt\"
flPath = ThisWorkbook.Path & "\"
flPath = ThisWorkbook.Path & Application.PathSeparator

i = ThisWorkbook.Worksheets.Count
j = Application.Workbooks.Count
f = Dir(flPath & "*.txt")

Do Until f = ""
'Workbooks.OpenText flPath & f, _

Workbooks.OpenText PathName & f, _
StartRow:=1, DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, _
ConsecutiveDelimiter:=False, Tab:=True, Semicolon:=False, Comma:=True, _
Space:=False, Other:=False, TrailingMinusNumbers:=True
Workbooks(j + 1).Worksheets(1).Copy After:=ThisWorkbook.Worksheets(i)
ThisWorkbook.Worksheets(i + 1).Name = Left(f, Len(f) - 4)
Workbooks(j + 1).Close SaveChanges:=False
i = i + 1
f = Dir

Loop

Application.DisplayAlerts = True

Worksheets("Master").Activate

End Sub

-----

I am finding it difficult to to implement the searching through subfolders and the consequent loop within this coding. Really annoying.
The above code works wonderfully however, with the amount of different .txt files that come in weekly I needed to file them away better.
I have coding for the .pdf part of my query but again, similar problem.

Many thanks in advance.
 
I’ve re-read your original post after briefly perusing your posted code.

You seem to be creating multiple workbooks along with multiple worksheets. What a mess!

You need a solution for drilling down through all folders to get to the text files of interest. That can be accomplished using File System Object objects in a recursive procedure (a procedurevthat calls itself to go to the next sub-folder.)

However, your solution for storing the data is absolutely horrible! It violates the best and accepted rules for data storage. You need one database/workbook and as few worksheets/tables as is reasonable. If you pursue your current objective, you will have a virtually useless and confusing storage system.

What do you intend to do with the data you plan to store in this multiple workbook “system”?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Thank you for your reply.

The coding grabs all .txt files that are stored in the same location as the database/workbook and puts them in their own tab. The code doesn't create any more workbooks, just worksheets.

The reason for there being a tab per .txt file is that in parallel to the .txt files are .pdf files that serve as invoices. So where the .txt files holds basic information (date/amount/invoice reference/more data), the .pdf shows that information in greater detail. This is a way of checking the data of the .txt files without having to open every single one. I then compile all tabs into the "Master" tab in one easy to view way.

Going forward, I thought it best having a workbook per month, per company. Potentially three-four workbooks within a month. In the host folder/main folder that sits outside the months' folders would be another workbook which would grab all that data from the "Master" tabs and put them into one summary sheet (and excluding all the additional tabs).

I don't know the coding that loops through all sub folders to look at the .txt files nor where to put it. Originally I had a folder labelled "_All" where I would dump all .txt files and then I'd add a line of code which would look at that folder. But that was just getting tiresome and I felt another folder wasn't neccessary (but due to my lack of knowledge, it seems I'm having to use this way).

Thanks again.
 
To add to my previous post is a look at the director string.

I've moved it localled away from the network so I can fiddle around with the structure without getting prompted when someone has a file open.

I've uploaded the picture via Engineering - hopefully its viewable.
 
 https://files.engineering.com/getfile.aspx?folder=da9a16e8-735a-44b6-872a-5b7fc28f974f&file=folder_dir.PNG

"I don't know the coding that loops through all sub folders" - here is one link and another that should give you an idea of how to do it. :)

May come handy in other coding, but I am with Skip - your approach is a bad idea. :-(


---- Andy

There is a great need for a sarcasm font.
 
Thanks Andy - would you be able to suggest an alternative approach?
 
Going forward, I thought it best having a workbook per month, per company. Potentially three-four workbooks within a month.
WHAT???? THAT is a disasterous design strategy.

I then compile all tabs into the "Master" tab in one easy to view way.
YES!!! THAT is a useable design strategy!!! You need ONE workbook with a sheet/table that you can use to query, analyze or answer any questions regarding these invoices. I’d dispense with all the other tabs and just assemble ALL the invoices into the Master table as they are encountered.

And certainly, NOT a workbook per month! The invoice date must be a field in the table.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
That all depends on the business rules and requirements of what you are trying to accomplish. What information someone is trying to access / analyze? All (most?) of your data is in txt and pdf format scattered all over some folder structure – is that the best way to have it? Maybe a little simple data base would do the trick instead?

Sometimes you need to step back and ask yourself – is there a better way to do it?


---- Andy

There is a great need for a sarcasm font.
 
RE: Skip
I would usually just compile all the information into one summary tab, throw in some formulas and that'll be it - I would have just summarised the data. The problem I find is within a single month under a single company, (under the current folder structure), there could be 11 subfolders :
(08 05 2018 LBCSBI / 08 05 2018 LBCSBN / 14 05 2018 LBCRT / 14 05 2018 LBCSBN / and so on) - I thought it best to lay it out this way.(?)

And also, the original .txt filename isn't specified by the LBCSBI/LBCSBN, therefore having them all in one folder would mean having to rename them. I do have a copy of the emails sent from the companies saved in their folders (08 05 2018 LBCSBI/Emails) to easily refer back to the communcation trail and attachments.

RE: Andy
The information needed within the .txt file would be the company name, date, amount, and a 10 character reference number.
The information needed from the .pdf file is in the filename by default as it includes the 10 character reference number. But it is within this .pdf file that hold specific information. (Also, there could be more than one .pdf with the same filename because it could be a recurring case, it is only identifiable by the month and week it is sent in via the company.

So essentially, for the month of May in the week of 14 05 2018 LBCRT there is a case XX12345678. I know to look within that same month and sub folder 14 05 2018 LBCRT for the .pdf file that relates to that case number.

Am I over complicating things? I thought I could just make do with folder for month, folder per week the invoice is sent in.
 
Don’t care about the FOLDER configuration.

I have not referred to changing any FOLDER configuration.

How you configure your folders is your business and concerns me not.

I am referring to many workbook with scads and scads of sheets.

I am suggesting that a much MUCH better Excel configuration would be ONE workbook with ONE worksheet containing ALL your invoices in ONE table. And this would not be a summary. You may have other sheets containing other data like references to your pdf files, which are basically like invoice images, if I understand you correctly.

We will need to know what your folder structure is in order that the essential data contained within the structure can be mapoedvtonthe file structure for your invoices, ie company name, date and sub date, etc

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
RE: Skip

Apologies for the miscommuncation.

You are saying that for the entirety of the year which invoices comes in to just have one Workbook with one Worksheet that holds all invoices. I suppose I coud have the end folder directoy against each invoice...
Invoice:
[D01CC345678 +0000011590105201820512123 Payment to Whoever & Whomever ] then have "05 May\Whoever & Whomever\08 05 2018 LBCRT"
 

“You are saying that for the entirety of the year now forward which invoices comes in to just have one Workbook with one Worksheet that holds all invoices.”

YES! In this way, there is ALWAYS one place for everything.

When your boss or the VP asks a question regarding first quarter or last year, the answer is there! This is how databases work. Professionals do not chop up data into arbitrary locations that makes such research or analysis much more difficult.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
So my method of having an _All folder which holds all the .txt files would be a valid way of storing the data which in turn would make it easier for the macro to pull the data together?

But then I am in a situation where the .txt files that have the same names would need to be relabelled...although...we do use SharePoint which allows the same name filenames as they are differentiated via metadata. I could always transfer it over there which would side step the conflicting filenames when storing the .txt files.

With over hundreds of .txt files to copy the data from and paste into one workbook (and into one worksheet), would that in itself be a burden? Is there a way to avoid having to start from the very beginning (or I suppose it would be safer to capture everything?).

Thanks.
 
You don’t need to change your folder structure. We just need to know how to map the data in your folder structure to the table in order to store the data to uniquely identify each invoice in the table.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
...so lets suppose your invoice file has this structure...
[tt]date/amount/invoice reference/more data[/tt]
4 fields. But that’s not enough data to identify this invoice uniquely from any other invoice. You need fields for...
[tt]
Invoice Nbr
Invoice Date (assuming that the date field is not the invoice date)
Company Name
[/tt]
...and maybe other fields as well.

This additional data will no doubt need to be extracted from the folder path structure.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
In which case, having the main (or host) Workbook sitting outside the Months' folder would be ideal. If possible, that Workbok would look through all subfolders from the main/host Workbooks location and compile .txt data into one sheet.
 
Yes. The workbook could reside at the top of the folder structure. The code could search that folder for sub-folders, drilling down to the .txt files and sub-folders within sub-folders.

What would be done with the .pdf files?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Skip said:
...so lets suppose your invoice file has this structure...
date/amount/invoice reference/more data
4 fields. But that’s not enough data to identify this invoice uniquely from any other invoice. You need fields for...

Invoice Nbr
Invoice Date (assuming that the date field is not the invoice date)
Company Name

...and maybe other fields as well.

This additional data will no doubt need to be extracted from the folder path structure.


True - details of the .txt are just simplified invoices:
[D01CC345678 +0000011590105201820512123 Payment to Whoever & Whomever ]
The CC######## is the case number.
The date is within the numeric segment [01052018]
The number at the end [20512123] is the number for the invoice/individual.
The numbers before the date [+000001159] is a debit amount £11.59 (credit would be a minus at the start.

I would then have the folder path structure assigned next to this information.
The filename of the .txt file has the date which differs from the date within said file.
.txt filename date = week of batch invoices
Probably 7 fields.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top