Hi.
I'm trying to extract telephone numbers from many files and save the result to a tab-delimited file. The output file has 3 columns per record: telephone number, fax number, cellphone number.
My input looks like this (it's an HTML page, and this is the part of it with the numbers):
<div class='phones'>
<p...
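For the three-column layout described above, here is a minimal sketch in Python (not the awk script discussed in this thread). The sample markup is cut off, so the structure inside the `phones` div is an assumption: each `<p>` is assumed to hold a label, a colon, and a number, and the label names `tel`, `fax`, and `cell` are hypothetical. The sample numbers are made up for illustration.

```python
import re

# Hypothetical label names; the real markup is truncated in the post.
LABELS = ("tel", "fax", "cell")

def extract_phones(html):
    """Return [tel, fax, cell] as strings; a missing value becomes ''."""
    block = re.search(r"<div class=['\"]phones['\"]>(.*?)</div>", html, re.S)
    found = dict.fromkeys(LABELS, "")
    if block:
        for para in re.findall(r"<p[^>]*>(.*?)</p>", block.group(1), re.S):
            # Drop any nested tags, then split "label: number".
            text = re.sub(r"<[^>]+>", "", para).strip()
            label, sep, number = text.partition(":")
            label = label.strip().lower()
            if sep and label in found:
                found[label] = number.strip()
    return [found[key] for key in LABELS]

def to_tsv_line(html):
    """Join the three columns with tabs, keeping empty columns in place."""
    return "\t".join(extract_phones(html))

# Made-up sample: no fax number, so the middle column stays empty.
sample = "<div class='phones'><p>tel: 12 345 67 89</p><p>cell: 600 100 200</p></div>"
print(to_tsv_line(sample))
```

Keeping an empty string for a missing number means every record still has exactly three columns. The non-greedy match stops at the first `</div>`, so this would need adjusting if the `phones` div contains nested divs.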
Hi
That script works like a charm :]. Awesome. But I have a question about a modification. I noticed that the company name in the <h1> tags sometimes has a graphic logo, and then the h1 tag looks like this:
<h1 class="wiz_tyt"><div style="height: 50px; float: left;"><img...
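One way to cope with a logo nested inside the heading is to strip every tag inside the `<h1>` and keep whatever text is left. The sketch below is in Python, not the awk script from this thread, and it rests on assumptions: the company name is assumed to appear as plain text somewhere inside the `<h1>`, and falling back to the image's `alt` attribute when no text remains is a hypothetical choice, since the rest of the tag is cut off above.

```python
import re

def company_name(html):
    """Return the text of the first <h1>, ignoring nested tags such as a logo."""
    match = re.search(r"<h1\b[^>]*>(.*?)</h1>", html, re.S | re.I)
    if not match:
        return ""
    inner = match.group(1)
    # Replace nested tags with spaces, then collapse runs of whitespace.
    text = " ".join(re.sub(r"<[^>]+>", " ", inner).split())
    if text:
        return text
    # Assumed fallback: use the logo's alt text when the h1 has no text.
    alt = re.search(r"<img\b[^>]*\balt=['\"]([^'\"]*)['\"]", inner, re.I)
    return alt.group(1).strip() if alt else ""

# Made-up headings for illustration.
with_logo = ('<h1 class="wiz_tyt"><div style="height: 50px; float: left;">'
             '<img src="logo.gif" alt="Acme"></div>Acme Sp. z o.o.</h1>')
print(company_name(with_logo))
```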
Hi
Thanks again for the fast reply. The attached package contains:
- run.bat (script running command)
- out.txt (output I'm receiving on my computer)
- test/ (directory with 65 html files I'm working on)
- program.awk (your script)
E:\>gawk --version
GNU Awk 3.1.6
Copyright (C) 1989...
Hi!
Thanks for the very fast response! Thanks a lot for the script, but I don't understand exactly how it works. And it doesn't do what I meant. Maybe I described it wrong. What your script actually does is strip HTML tags, and that's not the point ;). I will describe this with an example.
3 htm...
Hello everyone.
I'm new to this forum and to awk programming. I'm from Poland, so please forgive my English.
I'm looking for a fast way to extract data from over 1,000,000 HTML files (around 30 kB each) and save it to a txt file with tab as the delimiter. Each HTML file has a lot of information about one company...
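The overall job (many HTML files in, one tab-delimited row per file out) can be sketched as a single pass over a directory. This is a Python illustration under stated assumptions, not an awk solution: `extract_fields` is a hypothetical placeholder that pulls only the `<h1>` text, the `.html` extension and the `utf-8` default encoding are guesses, and the real list of fields is not given in the post.

```python
import csv
import os
import re

def extract_fields(html):
    """Placeholder extractor (hypothetical): returns only the <h1> text."""
    match = re.search(r"<h1\b[^>]*>(.*?)</h1>", html, re.S | re.I)
    name = re.sub(r"<[^>]+>", " ", match.group(1)) if match else ""
    return [" ".join(name.split())]

def html_dir_to_tsv(src_dir, out_path, encoding="utf-8"):
    """Write one tab-delimited row per .html file; return the row count."""
    rows = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        # os.scandir yields entries lazily, which suits very large directories.
        with os.scandir(src_dir) as entries:
            for entry in entries:
                if not (entry.is_file() and entry.name.lower().endswith(".html")):
                    continue
                # errors="replace" keeps the run going on undecodable bytes.
                with open(entry.path, encoding=encoding, errors="replace") as fh:
                    writer.writerow([entry.name] + extract_fields(fh.read()))
                rows += 1
    return rows
```

Each file is read, reduced to a row, and written immediately, so memory use stays flat however many files there are. The `csv` writer quotes any field that itself contains a tab, which keeps the column count stable.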