Is there software that can build a database by reading info from PDFs?

Status
Not open for further replies.

Mizugori

Programmer
May 21, 2007
56
US
Is there software that can read through a number of PDF files, look for specific pieces of text (such as "Policy Number: 123456789"), and use those pieces of information to create entries in a database?

I'm trying to find a simpler way to create a database from thousands of PDF documents made from scanned client files... Typing them all into Access would be quite a project... and error-prone...

Help? Please? Thanks!
 
Mizugori said:
...scanned client files...

You will need to look at OCR software since scanned documents are 'pictures'.

Adobe Acrobat 6.0 included this functionality, and I believe it is scriptable.

Hope this helps,
CMP

For the best results do what I'm thinking, not what I'm saying.
(GMT-07:00) Mountain Time (US & Canada)
 
You might want to look at Monarch. I use Monarch to take PDF files and get at the data inside them. It can build database files directly from PDFs, and you can create templates and apply them to standardized PDF files. It's not the cheapest software, but it's worth every penny in my opinion. You can get a demo to make sure it will meet your needs before buying.

=======================================
People think it must be fun to be a super genius, but they don't realize how hard it is to put up with all the idiots in the world. (Calvin from Calvin and Hobbes)

Robert L. Johnson III
CCNA, CCDA, MCSA, CNA, Net+, A+, CHDP
VB/Access Programmer
 
Our organization uses Monarch. What they have found is that if the document was originally created as a PDF, they can convert it (normally they create an Excel workbook). But if the PDF was created from a scan, they can't convert it, because a scan is just a picture of the original.
 
But I mean, I am talking about scanning it with OCR, not just as an image file. Sorry if I didn't really explain that.
 
I'm pretty sure any OCR software can capture text from an image at a certain location on the document....
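
Once OCR has turned the scans into plain text, the remaining step from the original question (find "Policy Number: 123456789" and create a database entry) is easy to script. Here is a minimal sketch in Python, not a finished tool. It assumes the OCR text for each file is already available as a string, and it uses SQLite from the standard library as a stand-in for Access. The function names, the `policies` table, and the sample text are all hypothetical.

```python
import re
import sqlite3

# Hypothetical field pattern: the label from the original question,
# followed by a run of digits.
POLICY_RE = re.compile(r"Policy Number:\s*(\d+)")


def extract_policy_numbers(text):
    """Return every policy number found in a block of OCR'd text."""
    return POLICY_RE.findall(text)


def load_documents(conn, documents):
    """Insert one row per (file name, policy number) pair.

    `documents` maps a file name to the text recovered from that file.
    Returns the number of rows inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS policies "
        "(source_file TEXT, policy_number TEXT)"
    )
    rows = [
        (name, number)
        for name, text in documents.items()
        for number in extract_policy_numbers(text)
    ]
    conn.executemany("INSERT INTO policies VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Made-up sample standing in for the OCR output of one scanned file.
    sample = {"client_001.pdf": "Name: J. Doe\nPolicy Number: 123456789\n"}
    conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
    print(load_documents(conn, sample))
    print(conn.execute("SELECT * FROM policies").fetchall())
```

OCR output is rarely perfect (a 0 read as an O, for example), so it would be worth logging the files where no match is found and checking those by hand rather than trusting the load blindly.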
 