Reading a Large text file line by line looking for items

Status
Not open for further replies.

bobbyforhire

Technical User
Mar 11, 2008
253
US
As the subject states, I have a large text file that repeats itself a lot, with the same strings but different data. Here is an example.

<TD CLASS="products">  <A HREF=" Hollow</A></TD>

<TD CLASS="products" ALIGN="center">$0.10</TD>
<TD CLASS="products" ALIGN="center">8</TD>
<TD CLASS="products" ALIGN="center">
<form action=" method=post >
<input type=text size=2 name="16646:qnty" value="1" >
<input type=hidden name="storeid" value="*2645aea004a115553e5ef7e05eb7f1d1994c7c9f8b">
<input type=hidden name="dbname" value="products">
<input type=hidden name="function" value="add">
<input type=hidden name="itemnum" value="16646">
<input type=submit value="Buy">
</form>
</TD>
</TR>
<TR>
<!------ Start Sub-Products Table ------>

<TD CLASS="products">  <A HREF=" Hollow *Foil*</A></TD>
<TD CLASS="products" ALIGN="center">$0.65</TD>
<TD CLASS="products" ALIGN="center">3</TD>
<TD CLASS="products" ALIGN="center">
<form action=" method=post >
<input type=text size=2 name="16647:qnty" value="1" >
<input type=hidden name="storeid" value="*2645aea004a115553e5ef7e05eb7f1d1994c7c9f8b">
<input type=hidden name="dbname" value="products">
<input type=hidden name="function" value="add">
<input type=hidden name="itemnum" value="16647">
<input type=submit value="Buy">
</form>
</TD>




And it keeps going on with that same format but the names and prices change.

So how could I tell VB to go through this text file, grab this info, and put it in its own text file?

Howltooth Hollow=0.10
Howltooth Hollow*=0.65

Any help would be awesome!
 
There are a lot of ways to do it. The simplest would be to use a stream reader.

Code:
    Public Sub ReadHtmlFile(ByVal path As String)
        If IO.File.Exists(path) Then
            'Using closes the file when the block exits, even on an error.
            Using sr As New IO.StreamReader(path)
                Do Until sr.EndOfStream
                    'Do something and read a line from the file sr.ReadLine.
                Loop
            End Using
        End If
    End Sub


-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 
My real question is how to tell it to extract the information that I need. Reading the text file line by line isn't the hard part; the

"'Do something and read a line from the file sr.ReadLine."

is where I am drawing a blank
 
Get exactly which info though? What do you want to do with it? I mean, you can simply say If sr.ReadLine = "<input type=hidden name="function" value="add">" Then hopscotch, but you're not giving enough information to fill in the "do something" part.

-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 
Out of all of that info, I only need this:

Howltooth Hollow=0.10
Howltooth Hollow*=0.65
 
What I think sorwen is trying to say is that you have not provided enough information but at the same time you have provided all the information as well.

You will need to read through the file one line at a time. You will need to evaluate each line, looking for your indicators. Your first indicator would be this line:

<TD CLASS="products"> <A HREF=" Hollow</A></TD>

This tells you that you have found a product. You then parse the line to pull the product name out of it. I would also use a flag in my code to record that I have found a product and am now searching for the price.

You continue reading lines, checking each one for a dollar sign ($). (Based on your sample, you would have to assume the dollar sign is always present.) Once you find a line with the dollar sign, you grab that value and then clear your product flag.

<TD CLASS="products" ALIGN="center">$0.65</TD>

Now you are searching for another product... Repeat until the end of the file.
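
The flag approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the module name, the ExtractPrices helper, and the sample lines are made up, and the HREF="#" value is a placeholder because the real link is cut off in the posted sample.

```vb
Imports System
Imports System.Collections.Generic

Module FlagParserSketch
    Const NameSuffix As String = "</A></TD>"

    ' Hypothetical helper: returns "name=price" strings from the given lines.
    Function ExtractPrices(ByVal lines As IEnumerable(Of String)) As List(Of String)
        Dim results As New List(Of String)
        Dim currentName As String = ""
        Dim lookingForPrice As Boolean = False  ' the "found a product" flag

        For Each rawLine As String In lines
            Dim line As String = rawLine.Trim()

            If line.EndsWith(NameSuffix, StringComparison.OrdinalIgnoreCase) Then
                ' Indicator 1: a product line. The name runs from the last ">"
                ' before the closing tags up to the closing tags themselves.
                Dim nameEnd As Integer = line.Length - NameSuffix.Length
                Dim nameStart As Integer = line.LastIndexOf(">"c, nameEnd) + 1
                currentName = line.Substring(nameStart, nameEnd - nameStart).Trim()
                lookingForPrice = (currentName <> "")
            ElseIf lookingForPrice Then
                ' Indicator 2: the first "$" after a product line marks its price.
                Dim dollar As Integer = line.IndexOf("$"c)
                If dollar >= 0 Then
                    Dim closeTag As Integer = line.IndexOf("<"c, dollar)
                    If closeTag > dollar Then
                        results.Add(currentName & "=" & line.Substring(dollar + 1, closeTag - dollar - 1))
                        lookingForPrice = False  ' clear the flag, look for the next product
                    End If
                End If
            End If
        Next

        Return results
    End Function

    Sub Main()
        ' Made-up lines shaped like the sample in the first post.
        Dim sample As String() = {
            "<TD CLASS=""products"">  <A HREF=""#"">Howltooth Hollow</A></TD>",
            "<TD CLASS=""products"" ALIGN=""center"">$0.10</TD>",
            "<TD CLASS=""products"" ALIGN=""center"">8</TD>"
        }
        For Each entry As String In ExtractPrices(sample)
            Console.WriteLine(entry)  ' Howltooth Hollow=0.10
        Next
    End Sub
End Module
```

To run it against the real file, something like IO.File.ReadAllLines(path) could be passed in place of the sample array.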

Since this is a sample of HTML, there may also be a way to parse the HTML and get just the values. This is definitely something I have never done and I don't even know if it is possible, but you could do some web searching to see.

No matter what, this setup has a potential for error...you will want to review/validate yourself as you traverse through the file.

sorwen has provided a starter piece of code....give it a shot and let us know what you get stuck on.

=======================================
People think it must be fun to be a super genius, but they don't realize how hard it is to put up with all the idiots in the world. (Calvin from Calvin And Hobbs)

Robert L. Johnson III
CCNA, CCDA, MCSA, CNA, Net+, A+, CHDP
VB.NET Programmer
 
OK, so after loading my file I was able to find a pattern, as stated above.

All of the names I need will look like this:
>NAME OF ITEM</A></TD>

So I will need to read that line, chop off the </A></TD> at the end, and grab everything before it back to the nearest >, then store it like this:

NOI = NAME OF ITEM FOUND

Then go until I find a $ and grab everything to the right of it until it sees a <, then store it like this:

POI = PRICE OF THE ITEM

Then write that out to a file like this

GOTO NEXT LINE OF TEXT FILE
WRITEFILE TO TEXT = NOI + "=" + POI

I'm guessing RegEx is the way to go and I'll have to look into how to use that. As usual, if anyone knows of a quick and dirty way to get this done, I'm all for it.
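
For what it's worth, the NOI/POI idea above might look something like this with System.Text.RegularExpressions. Treat it as a sketch, not a tested solution for the real file: the two patterns, the module name, and the sample lines (including the placeholder HREF="#") are assumptions based only on the snippet posted earlier.

```vb
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    Sub Main()
        ' Name: whatever sits between a ">" and a closing "</A></TD>" at the end of the line.
        Dim namePattern As New Regex(">([^<>]+)</A></TD>\s*$", RegexOptions.IgnoreCase)
        ' Price: digits (with an optional decimal part) between "$" and the next "<".
        Dim pricePattern As New Regex("\$(\d+(?:\.\d+)?)<")

        ' Made-up sample lines; the HREF value is a placeholder.
        Dim lines As String() = {
            "<TD CLASS=""products"">  <A HREF=""#"">Howltooth Hollow</A></TD>",
            "<TD CLASS=""products"" ALIGN=""center"">$0.10</TD>",
            "<TD CLASS=""products"" ALIGN=""center"">8</TD>"
        }

        Dim noi As String = ""   ' name of item
        For Each line As String In lines
            Dim nameMatch As Match = namePattern.Match(line)
            If nameMatch.Success Then
                noi = nameMatch.Groups(1).Value.Trim()
                Continue For
            End If

            Dim priceMatch As Match = pricePattern.Match(line)
            If priceMatch.Success AndAlso noi <> "" Then
                Dim poi As String = priceMatch.Groups(1).Value   ' price of item
                Console.WriteLine(noi & "=" & poi)               ' Howltooth Hollow=0.10
                noi = ""                                         ' wait for the next name
            End If
        Next
    End Sub
End Module
```

The quantity line (the bare 8) is skipped because it has no dollar sign, which matches the "go until I find a $" step.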


 
What mstrmage1768 said. I've always had a problem understanding RegEx, so I can only suggest substringing it. I still can't help you totally, as you still haven't said what you want to do with it. Here is an example. In many ways it isn't a very good one, because a lot could go wrong with it. For example, are we sure the alignment will always be the same, that there will never be more than one "$" on a line, etc.?
Code:
    Public Sub ReadHtmlFile(ByVal path As String)
        If IO.File.Exists(path) Then
            Dim Name As String = ""
            Dim Price As String = ""
            Dim colNamePrice As New Collection

            'Using closes the file when the block exits, even on an error.
            Using sr As New IO.StreamReader(path)
                Do Until sr.EndOfStream
                    Dim line As String = sr.ReadLine()

                    If line.Contains("<TD CLASS=""products"">  <A HREF=") Then
                        Dim EndTag As Integer = line.LastIndexOf("</A>")
                        Dim EndStartTag As Integer = line.LastIndexOf(""">")

                        'Skip the two characters of "> to land on the start of the name.
                        Name = line.Substring(EndStartTag + 2, EndTag - (EndStartTag + 2))
                    End If

                    If line.Contains("<TD CLASS=""products"" ALIGN=""center"">$") Then
                        Dim EndTag As Integer = line.LastIndexOf("</TD>")
                        Dim EndStartTag As Integer = line.LastIndexOf(""">$")

                        'Skip the three characters of ">$ so the price excludes the dollar sign.
                        Price = line.Substring(EndStartTag + 3, EndTag - (EndStartTag + 3))

                        If Name <> "" Then
                            'Store the pair, then reset for the next product.
                            Dim NamePrice As String() = {Name, Price}
                            colNamePrice.Add(NamePrice)

                            Name = ""
                            Price = ""
                        End If
                    End If
                Loop
            End Using
        End If
    End Sub

-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 
Well, I didn't go that route. I think I went a longer route, but it does get the job done and gives me a text file with what I need :)


    Dim PATH As String
    Dim CARDCHECK As String
    Dim PRICECHECK As String
    Dim x As String
    Dim ALLCHARS As String
    Dim STARTOFTHEEND As String
    Dim STARTOFTHEENDOG As String
    Dim STARTOFTHEFRONT As String
    Dim GETNAMECOUNTER As String
    Dim GETSETCOUNTER As String
    Dim cardlen As String
    Dim FOUNDNAME As String
    Dim GOTCHUFOO As String
    Dim GOTCHUTOO As String
    'Dim GOTCHUPRICE As Integer
    Dim FOUNDPRICE As Integer
    Dim GETPRICECOUNTER As String
    Dim GETPRICECOUNTER2 As Integer
    Dim FOUNDSET As String

    PATH = ("c:\source.txt")
    If IO.File.Exists(PATH) Then

        Dim sr As New IO.StreamReader(PATH)
        Do Until sr.EndOfStream = True
            TextBox1.Text = sr.ReadLine
            CARDCHECK = Microsoft.VisualBasic.Right(TextBox1.Text, 9)
            If CARDCHECK = ("</A></TD>") Then
                'RESET CHECKS
                FOUNDNAME = False
                FOUNDSET = False
                GETNAMECOUNTER = 0
                GETSETCOUNTER = 0
                GETPRICECOUNTER = 0
                FOUNDPRICE = False

                ALLCHARS = TextBox1.Text.Length
                STARTOFTHEEND = ALLCHARS - 9
                STARTOFTHEENDOG = ALLCHARS - 9
                'GET THE NAME OF THE CARD
                Try
                    Do Until FOUNDNAME = True
                        GETNAMECOUNTER = GETNAMECOUNTER + 1
                        x = Microsoft.VisualBasic.Mid(TextBox1.Text, STARTOFTHEEND, 1)
                        If x = ">" Then
                            cardlen = GETNAMECOUNTER - 1
                            STARTOFTHEFRONT = STARTOFTHEENDOG - cardlen
                            'THIS IS THE NAME OF THE CARD VVVVVVVVVVVV
                            GOTCHUFOO = (Microsoft.VisualBasic.Mid(TextBox1.Text, STARTOFTHEFRONT + 1, cardlen))
                            'THIS IS THE NAME OF THE CARD ^^^^^^^^^^^^^^
                            If GOTCHUFOO = "" Then 'instances where the tag shows up but displays nothing
                            Else
                                TextBox2.Text = GOTCHUFOO
                                'GET THE SET OF THE CARD
                                Do Until FOUNDSET = True
                                    GETSETCOUNTER = GETSETCOUNTER + 1
                                    x = Microsoft.VisualBasic.Mid(TextBox1.Text, STARTOFTHEEND, 1)
                                    If x = "/" Then
                                        GOTCHUTOO = (Microsoft.VisualBasic.Mid(TextBox1.Text, STARTOFTHEEND + 1, 3))
                                        TextBox3.Text = GOTCHUTOO

                                        FOUNDSET = True
                                    Else
                                        STARTOFTHEEND = STARTOFTHEEND - 1
                                    End If
                                Loop
                            End If
                            FOUNDNAME = True
                        Else
                            STARTOFTHEEND = STARTOFTHEEND - 1
                        End If
                    Loop
                Catch ex As Exception

                End Try
            End If
            FOUNDPRICE = False
            GETPRICECOUNTER2 = 0
            GETPRICECOUNTER = 39
            PRICECHECK = Microsoft.VisualBasic.Left(TextBox1.Text, 38)

            If PRICECHECK = " <TD CLASS=""products"" ALIGN=""center"">$" Then
                ALLCHARS = TextBox1.Text.Length
                Do Until FOUNDPRICE = True
                    x = Microsoft.VisualBasic.Mid(TextBox1.Text, GETPRICECOUNTER, 1)
                    If x = "<" Then
                        x = (Microsoft.VisualBasic.Mid(TextBox1.Text, 39, GETPRICECOUNTER2))
                        Dim swriter As StreamWriter
                        swriter = File.AppendText("C:\Users\broberts\Documents\MTGO_NS5_PRICELIST.txt")
                        swriter.WriteLine(TextBox2.Text & "(" & TextBox3.Text & ")" & "=" & x)
                        swriter.close()

                        FOUNDPRICE = True
                    Else
                        GETPRICECOUNTER = GETPRICECOUNTER + 1
                        GETPRICECOUNTER2 = GETPRICECOUNTER2 + 1
                    End If
                Loop
            End If
        Loop

    End If

End Sub
 
You found a solution...That is the most important thing. Glad you stuck with it and worked it out.

=======================================
People think it must be fun to be a super genius, but they don't realize how hard it is to put up with all the idiots in the world. (Calvin from Calvin And Hobbs)

Robert L. Johnson III
CCNA, CCDA, MCSA, CNA, Net+, A+, CHDP
VB.NET Programmer
 

bobbyforhire, you said the code works, but....

You have in your code:
Code:
Dim FOUNDNAME As String
Dim FOUNDSET As String

...
FOUNDNAME = False
FOUNDSET = False

...
Do Until FOUNDNAME = True
...
Do Until FOUNDSET = True
Shouldn't they be Boolean instead of String?
How can you assign the value of False or True to a String?

I may be too picky, lost, or that's something new I don't know about (big possibility)

Have fun.

---- Andy
 
Like mstrmage1768 said. The rest all comes with time and practice.

-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 
By the way sorwen, you deserve a star for your work and example. [smile]

=======================================
People think it must be fun to be a super genius, but they don't realize how hard it is to put up with all the idiots in the world. (Calvin from Calvin And Hobbs)

Robert L. Johnson III
CCNA, CCDA, MCSA, CNA, Net+, A+, CHDP
VB.NET Programmer
 
Andrzejek - Good point. It is working though, and if I remove XXX = True after it finds something it will just loop forever, so while it might be bad code, it works.
 
Thanks. :)

Yeah, bobbyforhire, you might want to add something else that would cause those other loops to stop. Perhaps add some Or conditions to your Do Until so that if any one of your counters exceeds a limit it will quit as well. Then add some error checking.
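
One way the extra stop condition could look is sketched below. The FindBackward helper and its names are hypothetical, not part of the posted code; the point is only that the loop also ends when the position runs off the front of the string, so a line without the expected character cannot keep it going.

```vb
Imports System

Module BoundedScanSketch
    ' Hypothetical helper: walks backward from startPos (1-based, like Mid)
    ' looking for target. Returns the 1-based position, or 0 if it is not found.
    Function FindBackward(ByVal source As String, ByVal startPos As Integer, ByVal target As Char) As Integer
        Dim pos As Integer = Math.Min(startPos, source.Length)
        Dim found As Boolean = False

        ' The second condition is the guard: stop once we pass the first character.
        Do Until found OrElse pos < 1
            If source.Chars(pos - 1) = target Then
                found = True
            Else
                pos -= 1
            End If
        Loop

        Return If(found, pos, 0)
    End Function

    Sub Main()
        ' Made-up line; the HREF value is a placeholder.
        Dim line As String = "<A HREF=""#"">Howltooth Hollow</A></TD>"
        Console.WriteLine(FindBackward(line, line.Length - 9, ">"c))      ' 12
        Console.WriteLine(FindBackward("no markers here", 15, ">"c))      ' 0
    End Sub
End Module
```

A return value of 0 gives the caller something explicit to check before calling Mid, which is where the error checking would go.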

-I hate Microsoft!
-Forever and always forward.
-My kingdom for a edit button!
 

I know WHY you did it; I just do not know why you used String instead of Boolean.

You may just do:
Code:
Dim FOUNDNAME As Boolean   'was: As String
and the same for all other Strings that should be Booleans, and you are done.
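
As background on why the String version runs at all: with Option Strict Off (the default for VB projects), the compiler quietly converts False to the text "False" on assignment and converts it back when comparing against True. Turning Option Strict On makes the compiler reject that mix-up. A small sketch with made-up names and data:

```vb
Option Strict On
Imports System

Module BooleanFlagSketch
    Sub Main()
        ' Dim foundName As String = False   ' rejected by the compiler under Option Strict On
        Dim foundName As Boolean = False

        Dim letters As String = "abc>def"   ' made-up data
        Dim index As Integer = letters.Length - 1

        ' A Boolean flag can be tested directly; no "= True" is needed.
        Do Until foundName OrElse index < 0
            If letters.Chars(index) = ">"c Then
                foundName = True
            Else
                index -= 1
            End If
        Loop

        Console.WriteLine(foundName)   ' True
        Console.WriteLine(index)       ' 3
    End Sub
End Module
```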

Have fun.

---- Andy
 