Tek-Tips is the largest IT community on the Internet today!

CompareFile for differences

Status
Not open for further replies.

Robertge

Programmer
Jul 4, 2002
18
IT
Hi,

I'm looking for a good algorithm to compare two text files and find the lines that were inserted and/or deleted.

On the TSO mainframe this function is called SUPERC.
On Unix it is called diff.

I found some samples in C and Java, but they are very complicated for me.

Could someone help me?

Thanks to everybody
 
Have you looked at thread222-287071? Let me know if this helps.
________________________________________________________________
If you are worried about how to post, please check out FAQ222-2244 first

'There are 10 kinds of people in the world: those who understand binary, and those who don't.'
 
In Visual Basic, look up these keywords:
InStr(), StrComp() and "Option Compare"
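
For what it's worth, here is a minimal sketch of how those keywords could be used when comparing lines. The function names SameLine and ContainsText are made up for illustration; only StrComp, InStr and the vbTextCompare / vbBinaryCompare constants come from the language itself.

```vb
Option Explicit
' Putting "Option Compare Text" here instead would make the = operator,
' and StrComp/InStr calls that omit the compare argument, ignore case.

' Hypothetical helper: True when the two lines are equal, ignoring case.
Public Function SameLine(ByVal sLeft As String, ByVal sRight As String) As Boolean
    SameLine = (StrComp(sLeft, sRight, vbTextCompare) = 0)
End Function

' Hypothetical helper: True when sPart occurs inside sLine (case-sensitive).
Public Function ContainsText(ByVal sLine As String, ByVal sPart As String) As Boolean
    ContainsText = (InStr(1, sLine, sPart, vbBinaryCompare) > 0)
End Function
```

Passing the compare argument explicitly keeps the behaviour the same whatever "Option Compare" setting the module uses.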

 
And then?
*******************************************************
General remarks:
If this post contains any suggestions for the use or distribution of code, components or files of any sort, it is still your responsibility to assure that you have the proper license and distribution rights to do so!
 
Thanks to everybody.
I have Visual Basic 5, but this is not a big problem.

I looked at the link from JonWm. It does not work very well, but it is a good suggestion.

Bye
 
You might use this as a starting point; it's nowhere near bulletproof yet ;-)

Private Sub Command1_Click()
    Dim iInFile1 As Integer
    Dim iInFile2 As Integer
    Dim iOutFile As Integer
    Dim iLineCount As Long      ' Long, so files over 32,767 lines do not overflow
    Dim sLine1 As String
    Dim sLine2 As String
    Dim sInFilename1 As String
    Dim sInFilename2 As String
    Dim sOutFilename As String

    iInFile1 = 1
    iInFile2 = 2
    iOutFile = 3
    sInFilename1 = "C:\jim1.txt" 'source file
    sInFilename2 = "C:\jim2.txt" 'source file
    sOutFilename = "C:\outjim.txt" 'file to hold diffs

    ' on any error (e.g. a missing file) jump straight to the clean-up code
    On Error GoTo close_files
    Open sInFilename1 For Input As #iInFile1
    Open sInFilename2 For Input As #iInFile2
    Open sOutFilename For Output As #iOutFile

    iLineCount = 1

    ' Compare line N of file 1 with line N of file 2. Note: this is a purely
    ' positional compare, so one inserted or deleted line makes every
    ' following line show up as different.
    While Not EOF(iInFile1)
        Line Input #iInFile1, sLine1
        'ensure we don't run out of lines in second source file
        If Not EOF(iInFile2) Then
            Line Input #iInFile2, sLine2
            If StrComp(sLine1, sLine2) <> 0 Then
                Print #iOutFile, "Line:"; iLineCount; " "; sInFilename1
                Print #iOutFile, sLine1
                Print #iOutFile, "Line:"; iLineCount; " "; sInFilename2
                Print #iOutFile, sLine2
            End If
        Else
            Print #iOutFile, "Line:"; iLineCount; " in file: "; sInFilename1
            Print #iOutFile, sLine1
            Print #iOutFile, "line not in file: "; sInFilename2
        End If
        iLineCount = iLineCount + 1
    Wend

    ' ensure we process all lines if second source file has more lines
    While Not EOF(iInFile2)
        Line Input #iInFile2, sLine2
        Print #iOutFile, "Line:"; iLineCount; " in file: "; sInFilename2
        Print #iOutFile, sLine2
        Print #iOutFile, "line not in file: "; sInFilename1
        iLineCount = iLineCount + 1
    Wend

close_files:
    ' reached both after normal completion and after an error
    Close #iInFile1
    Close #iInFile2
    Close #iOutFile

End Sub
 
WOW! Is there an echo!!
If you choose to battle wits with the witless be prepared to lose.
 
Hi

I saw some interesting information on the Internet about this.

Many people say that "this problem has different solutions, because the philosophies differ".

The only sure point for me, at this moment, is that
-- I have to load the two files into two arrays
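
Loading a file into an array could look roughly like the sketch below. LoadLines is a made-up name, and the starting size of 256 lines is an arbitrary assumption; the array is simply doubled whenever it fills up. It avoids VB6-only functions, so it should also work in VB5.

```vb
Option Explicit

' Hypothetical helper: reads every line of sFilename into sLines()
' (1-based) and returns the number of lines read. sLines() must be a
' dynamic array; it may end up larger than the returned count.
Public Function LoadLines(ByVal sFilename As String, sLines() As String) As Long
    Dim iFile As Integer
    Dim nCount As Long
    Dim nSize As Long

    nSize = 256                         ' arbitrary starting capacity
    ReDim sLines(1 To nSize)

    iFile = FreeFile                    ' ask VB for an unused file number
    Open sFilename For Input As #iFile
    Do While Not EOF(iFile)
        nCount = nCount + 1
        If nCount > nSize Then          ' array full: double it, keep contents
            nSize = nSize * 2
            ReDim Preserve sLines(1 To nSize)
        End If
        Line Input #iFile, sLines(nCount)
    Loop
    Close #iFile

    LoadLines = nCount
End Function
```

A caller would declare Dim sOld() As String and then write nOld = LoadLines("C:\jim1.txt", sOld), using only elements 1 to nOld afterwards.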

When comparing the two arrays, one can only TRY TO DECIDE which rows were inserted and which were deleted, because in a text file the same change can be explained in several different ways.

One solution is based on the LCS (Longest Common Subsequence), and it seems to be the most widely used: find the longest common subsequence of the two files, use it as a reference point, and decide whether the rows before and after each reference point were deleted or inserted.
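
The LCS idea described above can be sketched as follows. This is the plain textbook dynamic-programming version, not what SUPERC or diff actually do internally, and the names DiffLines and DemoDiff are made up for illustration. It builds the full table, so it is only reasonable for small files.

```vb
Option Explicit

' Hypothetical helper: returns a diff of sOld(1..nOld) against
' sNew(1..nNew), one line per row, prefixed with
' "  " (unchanged), "- " (deleted) or "+ " (inserted).
Public Function DiffLines(sOld() As String, ByVal nOld As Long, _
                          sNew() As String, ByVal nNew As Long) As String
    Dim nLcs() As Long
    Dim i As Long, j As Long
    Dim sOut As String

    ' nLcs(i, j) = length of the LCS of sOld(i..nOld) and sNew(j..nNew).
    ' The extra row and column stay at zero (empty remainder).
    ReDim nLcs(1 To nOld + 1, 1 To nNew + 1)
    For i = nOld To 1 Step -1
        For j = nNew To 1 Step -1
            If sOld(i) = sNew(j) Then
                nLcs(i, j) = nLcs(i + 1, j + 1) + 1
            ElseIf nLcs(i + 1, j) >= nLcs(i, j + 1) Then
                nLcs(i, j) = nLcs(i + 1, j)
            Else
                nLcs(i, j) = nLcs(i, j + 1)
            End If
        Next j
    Next i

    ' Walk forward through both files, following the table.
    i = 1: j = 1
    Do While i <= nOld And j <= nNew
        If sOld(i) = sNew(j) Then
            sOut = sOut & "  " & sOld(i) & vbCrLf    ' common line
            i = i + 1: j = j + 1
        ElseIf nLcs(i + 1, j) >= nLcs(i, j + 1) Then
            sOut = sOut & "- " & sOld(i) & vbCrLf    ' only in old file
            i = i + 1
        Else
            sOut = sOut & "+ " & sNew(j) & vbCrLf    ' only in new file
            j = j + 1
        End If
    Loop
    ' Whatever is left over in either file is a pure delete or insert.
    Do While i <= nOld
        sOut = sOut & "- " & sOld(i) & vbCrLf
        i = i + 1
    Loop
    Do While j <= nNew
        sOut = sOut & "+ " & sNew(j) & vbCrLf
        j = j + 1
    Loop

    DiffLines = sOut
End Function

' Small demonstration with made-up data.
Public Sub DemoDiff()
    Dim sOld(1 To 3) As String, sNew(1 To 3) As String
    sOld(1) = "a": sOld(2) = "b": sOld(3) = "c"
    sNew(1) = "a": sNew(2) = "c": sNew(3) = "d"
    Debug.Print DiffLines(sOld, 3, sNew, 3)
End Sub
```

For the demo data the common lines are "a" and "c", so "b" is reported as deleted and "d" as inserted. When two explanations are equally short the code simply prefers a delete; that tie-break is exactly the "try to decide" part, and another rule would be just as valid.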

OK, thanks to everybody again
Greetings from Italy.

 
LCS is the best algorithm at this time. The problem with it is that it requires an M*N two-dimensional array and M*N compares, where M and N are the number of lines in the "old" and "new" files respectively. For 1000 lines in each file, that means about 2 MB for the array, assuming 2-byte integers, and 1 million separate compares. From that point, various methods are used to "back off" from the absolute LCS to a good but not provably perfect result.

Forms/Controls Resizing/Tabbing Control
Compare Code (Text)
Generate Sort Class in VB or VBScript
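
To put numbers on the M*N estimate in the post above: 1000 x 1000 lines gives 1,000,000 table cells, which is about 2 MB with 2-byte Integers (or about 4 MB with Longs). If only the length of the LCS is needed, for example to judge how similar two files are, two rows of the table are enough. The sketch below shows that variant; LcsLength is a made-up name. Recovering which lines were inserted or deleted still needs the full table or a more elaborate technique.

```vb
Option Explicit

' Hypothetical helper: returns the length of the longest common
' subsequence of sOld(1..nOld) and sNew(1..nNew), keeping only two
' rows of the table in memory instead of all nOld * nNew cells.
Public Function LcsLength(sOld() As String, ByVal nOld As Long, _
                          sNew() As String, ByVal nNew As Long) As Long
    Dim nPrev() As Long     ' table row for old line i - 1
    Dim nCurr() As Long     ' table row for old line i
    Dim i As Long, j As Long

    ReDim nPrev(0 To nNew)  ' element 0 stays zero (empty prefix)
    ReDim nCurr(0 To nNew)

    For i = 1 To nOld
        For j = 1 To nNew
            If sOld(i) = sNew(j) Then
                nCurr(j) = nPrev(j - 1) + 1
            ElseIf nPrev(j) >= nCurr(j - 1) Then
                nCurr(j) = nPrev(j)
            Else
                nCurr(j) = nCurr(j - 1)
            End If
        Next j
        ' The current row becomes the previous row for the next old line.
        ' (Copied element by element so it also works in VB5.)
        For j = 1 To nNew
            nPrev(j) = nCurr(j)
        Next j
    Next i

    LcsLength = nPrev(nNew)
End Function
```

The number of compares is still M*N; only the memory drops, from M*N cells to 2 * (N + 1).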
 