How do I move through a range of URL IDs

Status
Not open for further replies.

DrMingle

Technical User
May 24, 2009
116
US
My objective: to move through a range of IDs, pull the HTML down, and convert it to plain text.

Below is the actual link:

Code:
http://www.albme.org/index.cfm?fuseaction=app.LicenseeDetails2&ID=86699

An example range: 86650 - 87000

Below is the actual code to pull down the requested data:

Code:
import sys, urllib
from StringIO import StringIO
import html2text

if __name__ == '__main__':
    url = 'http://www.albme.org/index.cfm?fuseaction=app.LicenseeDetails2&ID=86699'
    encoding = 'utf-8'
    f = urllib.urlopen(url)
    try: s = f.read()
    finally: f.close()
    ustr = s.decode(encoding)
    b = StringIO()
    old = sys.stdout
    try:
        sys.stdout = b
        html2text.wrapwrite(html2text.html2text(ustr, url))
    finally: sys.stdout = old
    text = b.getvalue()
    b.close()
    print text

I think I need to supply a range and increment the ID by one on each pass, pulling down the data each time a new URL ID is reached.

Any ideas would be helpful...
 
Code:
start, stop = 86650, 87000
format = 'http://www.albme.org/index.cfm?fuseaction=app.LicenseeDetails2&ID=%i'
for i in range(start, stop+1):
    print format % (i,)

May I suggest that you go through the tutorials freely available on the web?
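
For what it's worth, the loop above only prints the URLs. A rough sketch of how it might be combined with the download and conversion step from the original post follows. It assumes Python 3 (so `urllib.request.urlopen` instead of `urllib.urlopen`), assumes the pages are UTF-8 as in the original code, and the names `URL_TEMPLATE`, `build_urls`, `fetch_html`, and `pages_as_text` are made up for illustration. The converter is passed in so that `html2text` is only imported if nothing else is supplied.

```python
from urllib.request import urlopen

# URL pattern from the thread; %d is replaced by each ID in turn.
URL_TEMPLATE = "http://www.albme.org/index.cfm?fuseaction=app.LicenseeDetails2&ID=%d"


def build_urls(start, stop):
    """Yield one URL per ID from start to stop, inclusive."""
    for record_id in range(start, stop + 1):
        yield URL_TEMPLATE % record_id


def fetch_html(url, encoding="utf-8"):
    """Download one page and decode it (the encoding is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode(encoding, errors="replace")


def pages_as_text(start, stop, fetch=fetch_html, convert=None):
    """Yield (url, plain_text) pairs for every ID in the range."""
    if convert is None:
        # Third-party package, imported lazily so the URL helpers
        # still work when it is not installed.
        import html2text
        convert = html2text.html2text
    for url in build_urls(start, stop):
        # As in the original post, the URL is passed as the base URL.
        yield url, convert(fetch(url), url)
```

Something like `for url, text in pages_as_text(86650, 87000): print(text)` would then walk the example range. Since `html2text.html2text` returns the converted text directly, this sketch skips the `sys.stdout` redirection used in the original.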
 