Hi All,
So I've managed to make good progress on my OCR issues, using EZTWAIN and Transym. But the funny thing is, neither company has any VFP experience in this space.
EZTWAIN integrates nicely with VFP, and to that end I'm now able to scan (a business card) front and back simultaneously, and then output 2 (well 4) files.
First is a JPG image of the front and back of the card. Very useful for displaying in an image file which is linked by path in a Memo field.
Second, the scan with Transym, using one of its calls, outputs a PDF that is OCR'd. So I have the file sitting there.
But... how do I "scrape" the data from the PDF? If I do it manually, I can select the PDF, open it, press CTRL+A (select all), then CTRL+C (copy all). How can I then store that text in a Memo field, which would let me "parse/evaluate" its contents so that I can use the results to populate known fields? (I will figure out how to do all the parsing, maybe with some questions later.) If I can just OPEN the PDF, CTRL+A, then CTRL+C it, I can then REPLACE Table.OCR with the contents of that copy, and THEN I can manipulate the memo field data.
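To make the last step concrete, here is a rough sketch of what I have in mind once the text is on the clipboard. It assumes the copy has already happened (that is the part I don't know how to automate), and the cursor name Cards and the field CardId are just placeholders for illustration; only the OCR memo field matches what I described above.

```foxpro
* Sketch only: Cards and CardId are hypothetical names for illustration.
* Assumes the PDF text was already copied (CTRL+A, CTRL+C) by some means.
CREATE CURSOR Cards (CardId I, OCR M)
APPEND BLANK

* _CLIPTEXT is VFP's system variable holding the clipboard text.
REPLACE OCR WITH _CLIPTEXT IN Cards

* Split the memo into lines so each one can be examined later.
LOCAL lnCount, lnI
LOCAL ARRAY laLines[1]
lnCount = ALINES(laLines, Cards.OCR)

FOR lnI = 1 TO lnCount
    ? laLines[lnI]    && placeholder for the real parsing
ENDFOR
```

If there is a way to get the text straight out of the PDF without going through the clipboard at all, that would be even better, since the REPLACE and ALINES() part would stay the same.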
Any ideas?
Many thanks,
Best Regards,
Scott
ATS, CDCE, CTIA, CTDC
"Everything should be made as simple as possible, and no simpler."