Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Comparing CSV Files 1

Status
Not open for further replies.

ihaveaproblem

Programmer
Jul 4, 2003
14
0
0
GB
I need to compare 2 large(ish) csv files which change on a daily basis. Once I have compared them, if there is a difference in ANY field I need to write the WHOLE of the correspondin record to another file.

The data contains approximately 1000 records and each records contains over 50 fields, seperated by pipes.

Does anybody know the best way to do this in VB and have any sample code I could look at.

Much Appreciated.
 
Hi There

I've included a sample of code I used (with slight changes !) that will compare all records in the CSV files. This will only work if there are no commas in the fields as it will treat these as field separators. On the Print# command, I am writing the contents of the record in file 2 but you can change this to 1 if you like.

Hope this helps !!!


Private Sub btnCompare_Click()
Dim strRecord1 As String, strRecord2 As String
Dim strArray1 As Variant, strArray2 As Variant
Open "c:\file1.csv" For Input As #1
Open "c:\file2.csv" For Input As #2
Open "c:\diffs.csv" For Append As #3
Do While Not EOF(1) And Not EOF(2)
Line Input #1, strRecord1
Line Input #2, strRecord2
strArray1 = Split(strRecord1, ",")
strArray2 = Split(strRecord2, ",")
If RecordIsDifferent(strArray1, strArray2) = True Then Print #3, strRecord2
Loop
Close #1, #2, #3
End Sub

Function RecordIsDifferent(strArray1 As Variant, strArray2 As Variant) As Boolean
Dim i As Integer
RecordIsDifferent = False
For i = 0 To UBound(strArray1) - 1
If strArray1(i) <> strArray2(i) Then
RecordIsDifferent = True
Exit Function
End If
Next i
End Function
 
Hookbob,

Thanks for that..it works a treat! The only thing I need to work out now is how I determine if new records are added to the latest csv extract and add these to the diffs file as well.

I should be able to sus this out, but if you have any ideas let me know.

Thanks again
 
Hi there. I'm glad I could help. The following code should dump any records that exist in file1 but not in file2. I haven't tested it yet but it should be ok.


Private Sub btnCompare_Click()
Dim strRecord1 As String, strRecord2 As String
Dim strArray1 As Variant, strArray2 As Variant
Open &quot;c:\file1.csv&quot; For Input As #1
Open &quot;c:\file2.csv&quot; For Input As #2
Open &quot;c:\diffs.csv&quot; For Append As #3
Do While Not EOF(1)
Line Input #1, strRecord1
If EOF(2) Then
Print #3, strRecord1
Else
Line Input #2, strRecord2
strArray1 = Split(strRecord1, &quot;,&quot;)
strArray2 = Split(strRecord2, &quot;,&quot;)
If RecordIsDifferent(strArray1, strArray2) = True Then Print #3, strRecord2
End If
Loop
Close #1, #2, #3
End Sub
 
Thanks again. I had managed to work that one out but i'm now stuck on the opposite problem to which I cannot work out the solution.

The problem is sometimes a record has been deleted, so once the code hits this all the corresponding rows are out of sync and therefore flagged as different. I therefore need some way to identify if a record has been deleted and if it has remove it from the array.

By using a unique identifier I can work out when a record has been removed as the idebtifiers are different, however i'm unsure how to remove this entry from the array. Do you have any suggestions?

I hope this makes sense!
 
Hi there.

This procedure can get quite complicated but I have a sample of code that might work. It assumes that the unique identifier is the first field in each record i.e strArray1(0) and strArray2(0). It compares each record identifier and if the numbers are out of sync then it writes the missing record to the diffs file. This will work if a record is deleted from either file. It will also check the differences and any new records added to the end of either file. (N.B. this procedure assumes that you will have less than 999999999 records in each file, otherwise, increase this value)

Private Sub btnCompare_Click()
Dim strRecord1 As String, strRecord2 As String
Dim strArray1 As Variant, strArray2 As Variant
Open &quot;c:\file1.csv&quot; For Input As #1
Open &quot;c:\file2.csv&quot; For Input As #2
Open &quot;c:\diffs.csv&quot; For Append As #3
Do
If EOF(1) Then
strArray1(0) = 999999999
Else
Line Input #1, strRecord1
strArray1 = Split(strRecord1, &quot;,&quot;)
End If
If EOF(2) Then
strArray2(0) = 999999999
Else
Line Input #2, strRecord2
strArray2 = Split(strRecord2, &quot;,&quot;)
End If
Do While strArray1(0) < strArray2(0)
Print #3, strRecord1
If EOF(1) Then
strArray1(0) = 999999999
Else
Line Input #1, strRecord1
strArray1 = Split(strRecord1, &quot;,&quot;)
End If
Loop
Do While strArray1(0) > strArray2(0)
Print #3, strRecord2
If EOF(1) Then
strArray1(0) = 999999999
Else
Line Input #2, strRecord2
strArray2 = Split(strRecord2, &quot;,&quot;)
End If
Loop
If (strArray1(0) = strArray2(0)) And (strArray1(0) <> 999999999 And strArray2(0) <> 999999999) Then
If RecordIsDifferent(strArray1, strArray2) = True Then Print #3, strRecord2
End If
Loop Until EOF(1) And EOF(2)
Close #1, #2, #3
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top