Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Help reformat file using excel or Access

Status
Not open for further replies.

malaygal

IS-IT--Management
Feb 22, 2006
192
US
I need help re-formatting a csv file with half-million records.
I created as script that will reformat the original file, but with this number of records, the script is still running after 10 hrs.

Let's say i have this records:

State Jan Feb Mar Apr May Jun

NY Days 31 29 31 30 31 30
NJ Days 30 30 30 30 30 30

I would like to reformat the file into:

NY Days Jan 31
NY Days Feb 29
NY Days Mar 31
NY Days Apr 30
NY Days May 31
NY Days Jun 30
NJ Days Jan 30
NJ Days Feb 30
NJ Days Mar 30
NJ Days Apr 30
NJ Days May 30
NJ Days Jun 30


''''''''''''''''''''here's my script
''there might be some syntax error (i fixed it for this thread), but it worked with smaller file

set fs = CreateObject ("Scripting.FileSystemObject")

strFile= "c:\test.CSV"

Set f = fs.OpenTextFile(strFile, 1, True)
sContents = ""

Do Until f.AtEndOfStream

'old file format
'NY,Days,31,29,31,30,31,30,,,,,,
'NJ,Days,30,30,30,30,30,30,,,,,,

'new file format
'NY,Days,Jan,31
'NY,Days,Feb,29
'NY,Days,Mar,31
'NY,Days,Apr,30
'NY,Days,May,31
'NY,Days,Jun,30
'......

nrec = ""
newline = ""

strLine = split(f.Readline,",")

nrec = strLine(0) & "," & strLine(1) & ","


for n = 2 to ubound(strLine)

if trim(strLine(n)) = "" then '
else
d = n - 5
newline = nrec & monthname(d,true) & "," & strLine(n)
sContents = sContents & newline & vbCrLf
end if

next

'sContents = sContents & newline & vbCrLf

loop

f.close

set f = fs.getfile(strFile)

nfile = fs.GetParentFolderName(f) & fs.GetBaseName(f) & "_REV." & fs.GetExtensionName(f)


Set f = fs.OpenTextFile(nfile, 2, True)
f.write sContents
f.close

wscript.echo("Completed")


'''''''''''''''''''''

With half-million records and Jan - Dec, my reformatted file will potentially have 6 million records.
I am hoping to import the file either to excel or access and format it from there.




 



If you have Excel 2007+, it can be done with a relatively simple process. Please repost in forum68 for assistance with this process.

Is this a recurring task or a one-timer?

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 



oops. I missed that you have 6M+ rows.

Consider MS Access and doing 11 Union queries
Code:
select state, Jan, 'Jan' as mon
from [Your Table]

union 

select state, feb, 'Feb' 
from [Your Table]

......

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Thanks, SkipVought, for that quick response.
I have tried it with the sql below:

SELECT testdb.[Field3], testdb.[Field5], 'Jan' as mon,testdb.[Field7]
FROM testdb
union
SELECT testdb.[Field3], testdb.[Field5], 'Feb' as mon, testdb.[Field8]
FROM testdb
union
SELECT testdb.[Field3], testdb.[Field5], 'Apr' as mon, testdb.[Field10]
FROM testdb;

But result has the Monht in Alpha order. 1st row is for "Apr", 2nd row is for Feb and 3rd row is "Mar"

Anyway I can have the rows display in the order of my sql query?

Thanks in advance.
 


THAT is the problem with month strings!

I'd add another column, for sorting, '01', 'Jan', '02', 'Feb', etc.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
From the looks of it, the data is merely a state abbreviation with the number of days in every month of the year. This data can easily be created instead of read, reorganized, and rewriten.
Additionally, there aren't even close to half-million combinations of states and days of the month (50 states * 2 possible sets of "days in a month") which suggests that the data is duplicated a few 1,000 times.

Also, your data seems flawed as February never has 30 days. It has 28 days in a normal year and 29 days in a leap year (which occur every 4 years).

Which begs the question, what is this data?

Anyway, I would have to think the issue is where is calculates the [red]monthname. You are passing it negative values. This is probably causing an error which is halting execution.[/red] Furthermore, you are concantenating string to an ever growing string. With each concatenation, the string is copied in memory before it is appended to. If you did hav 500,000 records, how much memory do you think is required to add to the string. Check to see if you CPU usage of wscript is at 50% and the memory usage is ever growing.

[blue]Solve the memory issue by writting to the output file at the end of every loop.[/blue]

Code:
do
   str = ""   
   for (n = 2 to ubound(arr))
      d = n - 5
      [red]str = str & monthname(d, true)[/red]
   next
   [blue]f.write str[blue]
loop

-Geates

Reconsider your approach to solving your goal. From what was posted, the logic is fraught with flaws.




"I hope I can chill and see the change - stop the bleed inside and feel again. Cut the chain of lies you've been feeding my veins; I've got nothing to say to you!"
-Infected Mushroom

"I do not offer answers, only considerations."
- Geates's Disclaimer
 

I wondered about that, but then I saw Feb with 30, in the example!

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
These are sample data and not accurate data.
What we have in the actual file are area, account and the twelve periods. Again, I revised the code for this thread, it should be

for n = 2 to ubound(strFile) 'which is the 3rd field to last field Jan - Dec

d = n - 1 '2-1 = Jan, 3-1 = Feb and so on.....

 
Could post a sample of the actual data? As you can see, it is difficult to assist when your problem is not correctly protrayed.

-Geates

"I hope I can chill and see the change - stop the bleed inside and feel again. Cut the chain of lies you've been feeding my veins; I've got nothing to say to you!"
-Infected Mushroom

"I do not offer answers, only considerations."
- Geates's Disclaimer
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top