Auto-Scan and OCR Success

Status
Not open for further replies.

Scott24x7

Programmer
Jul 12, 2001
2,826
JP
Hi All,
I know a few of you have been following this for a while, and I said that when I found a solution I would put it up. As of today we can, with one click, scan both sides of a business card (if the back is blank, it gets ignored), automatically populate a memo field with the location of the image, and automatically store the image based on the specific record ID (i.e. it makes directories to store the physical card image data, which is then referenced and updates an image base-class on the form).

So here's the solution (note this uses EZTWAIN and Transym OCR, though the latter is totally transparent):

[Note: we parameterized this with 4 parameters: Top, Left, Width and Height. This is because we discovered that cards basically come in 2 types: short ones and tall ones (no need to worry over any other part). Interestingly, a "tall" card is usually tall and narrow, but we just treat it as tall and wide. "Click" is the event you are seeing the code for. We put a different set of parameters on the RightClick and just call Click from the RightClick method: This.Click(0.0, 0.0, 3.55, 2.15). This handles taller and wider cards. The default (when no parameters are passed) is (0.0, 0.0, 3.4, 2.0), so slightly shorter and narrower.

Since the only difference is the TWAIN_SetRegion call, we can maintain a single simple code set to manage any card size we want, and just change the region. This is set up for our Canon P-208II duplex scanner, which we've hard-coded because we don't want users trying to use any input scanner willy-nilly. But you can adjust it as you see fit, and even leave the OpenSource with () to call a list.]

Code:
LPARAMETERS lnTop, lnLeft, lnWidth, lnHeight

IF PARAMETERS() = 0
	lnTop = 0.0
	lnLeft = 0.0
	lnWidth = 3.4
	lnHeight = 2.0
ENDIF

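* Declare the variables the routine actually uses, so none of them leak out as PRIVATE
LOCAL i AS INTEGER, lnHDIB AS INTEGER
LOCAL fileNameIMG AS STRING, fileNamePDF AS STRING
LOCAL lcSaveNameIMG AS STRING, lcSaveNamePDF AS STRING
LOCAL lcTextMergeStringIMG AS STRING, lcTextMergeStringPDF AS STRING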
*
ThisForm.LockScreen = .T.
*
TWAIN_SetHideUI(1)
TWAIN_SetFileAppendFlag(0)
OCR_SelectDefaultEngine()
TWAIN_SetAutoOCR(1)
TWAIN_SetJpegQuality(100)
IF TWAIN_OpenSource("Canon P-208II TWAIN")<>0
	TWAIN_EnableDuplex(1)
	TWAIN_SetPixelType(5)
	TWAIN_SetResolution(600)
	TWAIN_SetAutoDeskew(1)
	TWAIN_SetXferCount(-1)
	TWAIN_SetAutoScan(1)
	TWAIN_SetRegion(lnTop, lnLeft, lnWidth, lnHeight)
	TWAIN_SetMultiTransfer(1)
	i = 1
	DO WHILE .T.
		lcTextMergeStringIMG = ADDBS(SYS(5)+SYS(2003))+"CARDS\"+;
			ADDBS(ALLTRIM(STR(COMPANY.COMPANYID)))+ADDBS(ALLTRIM(STR(CONTACT.CONTACTID)))+;
			ALLTRIM(CONTACT.CONTACTFIRSTNAME)+ALLTRIM(CONTACT.CONTACTLASTNAME)+"<<i>>.jpg"
*
		fileNameIMG = TEXTMERGE(lcTextMergeStringIMG)
*
		lcSaveNameIMG = TEXTMERGE("\CARDS\"+;
			ADDBS(ALLTRIM(STR(COMPANY.COMPANYID)))+ADDBS(ALLTRIM(STR(CONTACT.CONTACTID)))+;
			ALLTRIM(CONTACT.CONTACTFIRSTNAME)+ALLTRIM(CONTACT.CONTACTLASTNAME)+"<<i>>.jpg")
*
		lcTextMergeStringPDF = ADDBS(SYS(5)+SYS(2003))+"CARDS\"+;
			ADDBS(ALLTRIM(STR(COMPANY.COMPANYID)))+ADDBS(ALLTRIM(STR(CONTACT.CONTACTID)))+;
			ALLTRIM(CONTACT.CONTACTFIRSTNAME)+ALLTRIM(CONTACT.CONTACTLASTNAME)+"<<i>>.PDF"
*
		fileNamePDF = TEXTMERGE(lcTextMergeStringPDF)
*
		lcSaveNamePDF = TEXTMERGE("\CARDS\"+;
			ADDBS(ALLTRIM(STR(COMPANY.COMPANYID)))+ADDBS(ALLTRIM(STR(CONTACT.CONTACTID)))+;
			ALLTRIM(CONTACT.CONTACTFIRSTNAME)+ALLTRIM(CONTACT.CONTACTLASTNAME)+"<<i>>.PDF")
*
* If you can't get a Window handle, use 0:
		lnHDIB = TWAIN_Acquire(THISFORM.HWND)
		IF lnHDIB = 0
			EXIT
		ENDIF
*
		SELECT (ThisForm.ActiveDBF)
* This processes the front of the card (in duplex this is still considered the "first pass"; it just buffers the 2nd page)
		IF i = 1
			REPLACE CONTACT.BUSINESSCARDFRONT WITH lcSaveNameIMG && stores the path and file name in a memo field
			DIB_WriteToFilename(lnHDIB, fileNameIMG) && writes the .JPG image file of the front of the card to that path
			DIB_WriteToFilename(lnHDIB, fileNamePDF) && writes the PDF, which has also been "magically" OCRd during scan
			REPLACE CONTACT.CONTACTCARDOCR WITH OCR_Text() && puts the OCR text output into a memo field in contact for use later
		ENDIF
* This processes the back of the card: we're only interested in keeping the image
		IF i = 2
			REPLACE CONTACT.BUSINESSCARDBACK WITH lcSaveNameIMG && writes the path and file name of the card image to memo
			DIB_WriteToFilename(lnHDIB, fileNameIMG) && saves the back card image file
		ENDIF
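* Cleanup and finish
		DIB_Free(lnHDIB)
		i = i+1
*
* No need to keep the OCRd PDF around. Delete it before the exit test,
* otherwise it is left behind when the scan finishes after a single page.
		IF FILE(fileNamePDF)
			DELETE FILE (fileNamePDF)
		ENDIF
		IF TWAIN_IsDone()<>0
			EXIT
		ENDIF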
	ENDDO
	TWAIN_CloseSource()
ENDIF
*
IF TWAIN_LastErrorCode()<>0
	TWAIN_ReportLastError("Unable to scan.")
ENDIF
*
SELECT (ThisForm.ActiveDBF)
*
IF NOT TABLEUPDATE()
	WAIT WINDOW "Unable to Save"
	TABLEREVERT()
ENDIF
*
ThisForm.Refresh()
ThisForm.LockScreen = .F.
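
The listing above does not show the directory-creation step mentioned at the top of the post. The following is only a sketch of how that step might look, assuming a drive-letter path such as the one built in fileNameIMG. EnsureCardDir is a hypothetical name, not part of the original code; it uses only native VFP functions.

```foxpro
* Hypothetical helper (not from the original post): create every missing
* folder level for a full file name before the image is written.
FUNCTION EnsureCardDir
LPARAMETERS tcFullFileName
LOCAL lcDir, lcBuild, lcPart, lnI
lcDir = JUSTPATH(tcFullFileName)
lcBuild = ""
FOR lnI = 1 TO GETWORDCOUNT(lcDir, "\")
	lcPart = GETWORDNUM(lcDir, lnI, "\")
	lcBuild = IIF(EMPTY(lcBuild), lcPart, lcBuild + "\" + lcPart)
	* Skip the bare drive spec ("C:"); create each level that is missing
	IF NOT (RIGHT(lcPart, 1) == ":") AND NOT DIRECTORY(lcBuild)
		MD (lcBuild)
	ENDIF
ENDFOR
* .T. when the target folder now exists
RETURN DIRECTORY(lcDir)
ENDFUNC
```

It would be called once per pass, before the first write, for example EnsureCardDir(fileNameIMG).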

It's actually a very elegant solution overall. EZTWAIN does an awesome job of just working with any default OCR engine, and we found Transym to be very reasonably priced ($100 USD) and very powerful (great OCR results).

Now, we have the text in a Memo field, which we can parse out into the various components... and that becomes the next step in our quest for "Card Automation".
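
As a starting point for that parsing, here is a minimal sketch that pulls the first email-looking token out of the OCR text. It is an illustration only, not the parser described in this thread: GetEmailFromOCR is a hypothetical name, and it assumes the "@" was recognized correctly, which OCR does not always manage.

```foxpro
* Hypothetical helper: return the first token that looks like an email address.
FUNCTION GetEmailFromOCR
LPARAMETERS tcText
LOCAL lcDelims, lcWord, lnI, lnAt
* Whitespace plus separators often printed around an address
lcDelims = " " + CHR(9) + CHR(10) + CHR(13) + ",;<>()[]"
FOR lnI = 1 TO GETWORDCOUNT(tcText, lcDelims)
	lcWord = GETWORDNUM(tcText, lnI, lcDelims)
	* Drop a label glued to the address, e.g. "Email:scott@example.com"
	IF ":" $ lcWord
		lcWord = SUBSTR(lcWord, RAT(":", lcWord) + 1)
	ENDIF
	* Drop trailing periods picked up from the end of a sentence
	DO WHILE RIGHT(lcWord, 1) == "."
		lcWord = LEFT(lcWord, LEN(lcWord) - 1)
	ENDDO
	lnAt = AT("@", lcWord)
	* Need text before the @ and a dot at least one character after it
	IF lnAt > 1 AND "." $ SUBSTR(lcWord, lnAt + 2)
		RETURN LOWER(lcWord)
	ENDIF
ENDFOR
RETURN ""
ENDFUNC
```

Usage would be along the lines of lcEmail = GetEmailFromOCR(CONTACT.CONTACTCARDOCR), with an empty string meaning nothing suitable was found.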


Best Regards,
Scott
ATS, CDCE, CTIA, CTDC

"Everything should be made as simple as possible, and no simpler."
 
Couple of additional notes about this:

There are a lot of settings that can be "tweaked" in EZTWAIN (and the OCR engine). We parameterized the height and width for two card sizes, but it could easily be made more dynamic. I'm now considering putting slider bars on the form to allow slight increases or decreases in card size (we're finding a LOT of odd cards out there). Also, out of about 400 cards we found half a dozen that just didn't scan/OCR well (or at all). One card has very, very light gray print (why they did that is beyond me; I can barely read the bloody thing). So far, we've not been able to get it to OCR.

Another issue is that some "clever" people have their card printed vertically instead of horizontally. There is a flag in EZTWAIN that will "reorient" the image after the scan, but this has proved problematic for sizing, because the size definitions don't change. We're considering making that more dynamic, so that if it "detects" an auto-rotate, we swap the lnWidth and lnHeight values at WRITE time.
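
The width/height swap could be as small as the sketch below. SwapIfRotated and the tlRotated flag are hypothetical names; how the auto-rotate is detected is left open here, since the post does not say.

```foxpro
* Hypothetical helper: exchange width and height when a card was rotated.
* The two size arguments must be passed by reference (with @).
FUNCTION SwapIfRotated
LPARAMETERS tlRotated, rnWidth, rnHeight
LOCAL lnSwap
IF tlRotated
	lnSwap = rnWidth
	rnWidth = rnHeight
	rnHeight = lnSwap
ENDIF
RETURN tlRotated
ENDFUNC
```

A call would look like SwapIfRotated(llRotated, @lnWidth, @lnHeight), leaving both values untouched for cards that were not rotated.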

This is another interesting point. The functions that start with TWAIN drive processing at SCAN time. Functions starting with DIB work on a "Device Independent Bitmap", which is the image held in memory after it has been acquired by the scanner. This is rather "raw" image data, so post-processing can be done on it independently of which scanner was used to acquire it. EZTWAIN has some really extensive functionality that can manipulate the image while it is still in memory.

I'm playing around a bit further with that today to see if I can better resolve the rotation and card sizing issues, which may remove the need for two card types if I can work out some "crop and size" at the DIB stage before writing...


Best Regards,
Scott
 
On further investigation I've found something that works extremely well (at least with the Canon scanner).

We discovered that the leading edge of the card always has a bit of a shadow to it, so we adjusted the Top element to skip 0.15 cm (1.5 mm). That took the shadow off.
But then we discovered that what works is to set the scan size larger than the card itself (in our case we set it to 12 cm, or around 5 inches) and then set:

TWAIN_SetAutoCrop(1)

We set both height and width to that larger size, as some cards are "vertical", as mentioned. This covers those, as we set auto-rotate on as well. The catch in that case is that you need to detect which cards have been "auto-rotated" at scan time, and then rotate them back with DIB_ROTATE(lnHDIB, -1) (this passes the handle of the image, and -1 is 1 step, counterclockwise).

Or you can do it manually later, but this works nicely and allows the OCR functions to work; otherwise you just get a sideways-scanned card.

Best Regards,
Scott
 
Hi Scott,

I'm pleased to hear you managed to find a solution with EZTWAIN. I think you were a bit negative about it for a while, but it's good that you got it working in the end. Thanks for posting your solution.

For various reasons, I was off line the whole of last week. I just returned to the forum this Monday morning, and found around 20 new threads. It's going to take me a while to read through them all.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

 
Mike,
No worries. I was just a little shocked that the EZTWAIN cost went from around $200 to $1,000 for a developer license. We contemplated the "free" version, but it's just not as robust, and to be fair, we've found the full EZTWAIN4 so powerful that we decided to bite the bullet and buy the full version 4 anyway, given its seamless work with Transym (another $200)... So it's a pretty amazing solution.

We're writing our own PARSE for the text now, though we're also investigating ABBYY FineReader Engine (another $1,000 tool) and are waiting for them to give us a trial version of the API. We played with their business card scanner, and it was really amazing in its ability to parse, so we will give it a try.

The approach we're taking now, since the Transym OCR is just a rock star: we've been able to nicely parse about 95% of email addresses (what we started with first), and now we're adding company URL and phone numbers... we may actually stop there anyway. We populate the OCR data into a text field in a "temporary" staging data table, which we present in a form. Then we populate the "fields we know" but leave the rest of the data in the memo field. Then, using drag and drop, we let the user select data elements and drop them onto the appropriate fields (like Company Name, First, Middle, Last, Address, etc.). We find this is actually quite useful because it allows users to "fix" OCR oddities (1, l, o, O, 0, @... all culprits for poor OCR). So it really creates a simple data staging cleanup.

Then we use that to launch three separate areas in the system and auto-create records for company, location and contact (or just add location and/or contact if the company already exists). It's very slick (in theory; I'm still building that form, but it should be ready in a day or two). And it will make adding new contact data extremely fast, even with the drag/drop. But because of OCR, it's actually not a bad thing to just take a moment to clean up a few bits.
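
Some of those look-alike characters can be repaired automatically once a value is known to be numeric, such as a phone number. The sketch below is an illustration only: FixPhoneOCR is a hypothetical name, and it covers just the letter-for-digit swaps from the list above.

```foxpro
* Hypothetical helper: fix letters that OCR commonly returns in place of
* digits. Only safe on values already known to be numeric (e.g. phone numbers).
FUNCTION FixPhoneOCR
LPARAMETERS tcPhone
* CHRTRAN maps character by character: O -> 0, o -> 0, l -> 1
RETURN CHRTRAN(tcPhone, "Ool", "001")
ENDFUNC
```

For example, FixPhoneOCR("+8l-3-l234-567O") returns "+81-3-1234-5670". It should not be run on names or addresses, where those letters are legitimate.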

In the long term, we'll probably keep that, and if we get better parsing algorithms it will just make that step that much faster.
Thanks for pointing us in the right direction. It was really my first experience with an API (particularly one that wasn't "meant" for VFP: Transym had no idea if it would work or not, and neither did EZTWAIN, so that was kind of "pioneering"). A little daunting for me, but I'm happy (very happy) with the result.

Best Regards,
Scott
 