I have a tab delimited text file pulled from an AS400 system which contains over 350 columns of data. What I need to do is go in and replace certain columns of data with placeholder data (data scrubbing). I have been working on this for several days and keep hitting my head against a wall.
My first problem consists of the files the program must use. They are generated by the AS400 and downloaded directly from that server. Every time I try to read one in using my current code I get an error message “Input past end of file.” Yet if I edit the file in Textpad, press Enter at the bottom and save it, the data will parse, somewhat.
My next problem is, due to the length of each line in the file, the data seems to not be processing correctly. I had thought that each line, separated by vbCrLf, would be considered a single record, but it seems that my lines are too long as my array into which I am parsing the data has an inconstant number of fields in each element and the data is not in a consistent order, which is of course impossible to use.
Here is the code I am running to input and parse the file (it works for smaller files saved in Notepad, but not larger files nor AS400 downloaded files):
If the above cannot be made to work, is there a method for parsing a file character by character in VB (I know how to do it in Java and Perl, but not VB). I was thinking I could read in each character, if it’s a tab (or whatever delimiter) I tick off a counter, and as I read in the file character by character I write it to the output file character by character, except when I get to the proper numbered column in which I will output the mask data.
Any help will be greatly, greatly, greatly appreciated as hitting my head against my keyboard has ceased being fun.
My first problem consists of the files the program must use. They are generated by the AS400 and downloaded directly from that server. Every time I try to read one in using my current code I get an error message “Input past end of file.” Yet if I edit the file in Textpad, press Enter at the bottom and save it, the data will parse, somewhat.
My next problem is, due to the length of each line in the file, the data seems to not be processing correctly. I had thought that each line, separated by vbCrLf, would be considered a single record, but it seems that my lines are too long as my array into which I am parsing the data has an inconstant number of fields in each element and the data is not in a consistent order, which is of course impossible to use.
Here is the code I am running to input and parse the file (it works for smaller files saved in Notepad, but not larger files nor AS400 downloaded files):
Code:
Private Sub cmdProcess_Click()
' On Error GoTo Error_Handler
If error_check_input = False Then Exit Sub
Dim delim As String
Select Case cmbDelim
Case "<Tab>"
delim = vbTab
Case Else
delim = cmbDelim
End Select
Dim lines() As String, i As Long
lines() = Split(ReadTextFileContents(txtInput), vbCrLf)
'to quickly delete all empty lines, load them with a special char
For i = 1 To UBound(lines)
If Len(Trim(lines(i))) = 0 Then lines(i) = vbNullChar
Next
'then use the Filter function to delete these lines
lines() = Filter(lines(), vbNullChar, False)
'create a string array out of each line of text and store it in
'a Variant element
ReDim values(0 To UBound(lines)) As Variant
For i = 0 To UBound(lines)
values(i) = Split(lines(i), delim)
Next i
ExportDelimitedFile values(), txtOutput
Error_Handler:
If Err Then Err.Raise Err.Number, , Err.Description
End Sub
Function ReadTextFileContents(filename As String) As String
Dim fnum As Integer, isOpen As Boolean
On Error GoTo Error_Handler
' Get the next free file number
fnum = FreeFile()
Open filename For Append As #fnum
Print #fnum,
Close #fnum
Open filename For Input As #fnum
'If execution flow got here, the file has been opened without error
isOpen = True
' Read the entire contents in one single operation.
ReadTextFileContents = Input(LOF(fnum), fnum)
'Intentionally flow into the error handler to close the file
Error_Handler:
'Raise the error (if any) but first close the file.
If isOpen Then Close #fnum
If Err Then
If Err.Number <> 62 Then Err.Raise Err.Number, , Err.Description
End If
End Function
Sub ExportDelimitedFile(values() As Variant, filename As String, _
Optional delimiter As String = vbTab)
Dim i As Long
'Rebuild the individual lines of text of the file
ReDim lines(0 To UBound(values)) As String
For i = 0 To UBound(values)
lines(i) = Join(values(i), delimiter)
Next
'Create CRLFs among records and write them.
WriteTextFileContents Join(lines, vbCrLf), filename
End Sub
Sub WriteTextFileContents(text As String, filename As String, _
Optional AppendMode As Boolean)
Dim fnum As Integer, isOpen As Boolean
On Error GoTo Error_Handler
'get the next free file number
fnum = FreeFile()
If AppendMode Then
Open filename For Append As #fnum
Else
Open filename For Output As #fnum
End If
'If execution flow gets here the file was opened correctly
isOpen = True
'Print to the file in one single operation.
Print #fnum, text
'Intentionally flow into the error handler to close the file
Error_Handler:
If isOpen Then Close #fnum
If Err Then Err.Raise Err.Number, , Err.Description
End Sub
If the above cannot be made to work, is there a method for parsing a file character by character in VB (I know how to do it in Java and Perl, but not VB). I was thinking I could read in each character, if it’s a tab (or whatever delimiter) I tick off a counter, and as I read in the file character by character I write it to the output file character by character, except when I get to the proper numbered column in which I will output the mask data.
Any help will be greatly, greatly, greatly appreciated as hitting my head against my keyboard has ceased being fun.