I'm trying to find a way to run OCR on a JPG or other image from VB.NET.
The image will have a part number in a specific, known location, and I want to read it from there.
I want this part number placed in a VB variable so I can use it in my code.
Using your friends (Yahoo, Google, Ask, Answers, Bing) you should be able to find several different, good to excellent, OCR engines that will work with WIA (see Microsoft's documentation for more details on WIA).
They are out there. The company I got laid off from last summer used a couple of them (I don't remember the names, though). They aren't exactly cheap, and you'll go through a lot of trial and error to get good, reliable data.
Hi - I did the same thing - recognising invoice numbers on PDFs & then storing them away.
I got good results using Tesseract. It's open source (used to be an HP research thing, released to open source, looked after by Google). Search for "Tesseract OCR".
I used it as a standalone app - passed a trimmed tif to it & read the result text file. I believe there is an integrated .NET version of it.
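A rough sketch of the standalone approach, in case it helps. It assumes tesseract.exe is installed and takes the usual "tesseract imagename outputbase" arguments, which makes it write outputbase.txt. The install path, file paths and the RunTesseract name are made up for illustration; adjust them to your setup.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module TesseractRunner
    ' Runs the tesseract command-line program on one image and returns the
    ' text it wrote. All paths passed in are placeholders, not real defaults.
    Function RunTesseract(tesseractExe As String, imagePath As String, outputBase As String) As String
        Dim psi As New ProcessStartInfo() With {
            .FileName = tesseractExe,
            .Arguments = """" & imagePath & """ """ & outputBase & """",
            .UseShellExecute = False,
            .CreateNoWindow = True
        }

        Using proc As Process = Process.Start(psi)
            proc.WaitForExit()
            If proc.ExitCode <> 0 Then
                Throw New InvalidOperationException("tesseract exited with code " & proc.ExitCode.ToString())
            End If
        End Using

        ' Tesseract appends ".txt" to the output base name it was given.
        Return File.ReadAllText(outputBase & ".txt").Trim()
    End Function

    Sub Main()
        ' Hypothetical locations for the executable and the trimmed image.
        Dim partNumber As String = RunTesseract("C:\Tesseract\tesseract.exe", "C:\scans\part.tif", "C:\scans\part")
        Console.WriteLine("OCR result: " & partNumber)
    End Sub
End Module
```

The returned string is the value you would put in your VB variable. It is raw OCR output, so check it before trusting it.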
Some words of wisdom. By default - Tesseract only recognises basic text in a basic font. You can train it to recognise other fonts.
For me - I did some experimenting & then changed our default invoice to print the invoice number in a basic font. I then trimmed out the section of the PDF containing the number - saved it as a tif to present to Tesseract.
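For the original question (a part number at a fixed spot in a JPG), the trimming step could look something like the sketch below. It uses System.Drawing to cut out a rectangle and save it as a TIFF. The rectangle coordinates, file names and the CropToTiff name are assumptions for illustration, and it does not cover PDFs, which need to be converted to an image first by some other tool.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging

Module CropRegion
    ' Cuts the given rectangle out of an image file and saves it as a TIFF.
    Sub CropToTiff(sourcePath As String, region As Rectangle, tiffPath As String)
        Using source As New Bitmap(sourcePath)
            ' Clip the requested region to the image so Clone does not fail
            ' when the rectangle runs past an edge.
            Dim bounded As Rectangle = Rectangle.Intersect(region, New Rectangle(0, 0, source.Width, source.Height))
            If bounded.Width <= 0 OrElse bounded.Height <= 0 Then
                Throw New ArgumentException("Region lies outside the image.")
            End If

            Using cropped As Bitmap = source.Clone(bounded, source.PixelFormat)
                cropped.Save(tiffPath, ImageFormat.Tiff)
            End Using
        End Using
    End Sub

    Sub Main()
        ' Hypothetical position of the part number: 300x60 pixels at (50, 40).
        CropToTiff("C:\scans\part.jpg", New Rectangle(50, 40, 300, 60), "C:\scans\part.tif")
    End Sub
End Module
```

Keeping the crop tight around the number gives the OCR engine less stray text to misread.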
With a successful result - I stored the PDF away, named after the invoice number. With an unsuccessful result - e-mail the administrator the PDF & tell him to manually look at it.
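The success/failure split can be sketched roughly as follows. The six-digit pattern, the folder arguments and the RouteScan name are all assumptions, not the actual invoice format. To keep it short, the failure branch moves the file to a review folder instead of sending the e-mail described above; System.Net.Mail could be used at that point if you want the e-mail.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module InvoiceRouting
    ' Hypothetical format: exactly six digits. Replace with your own rule.
    Private ReadOnly InvoicePattern As New Regex("^\d{6}$")

    ' Files the PDF under its invoice number when the OCR text looks valid,
    ' otherwise moves it to a review folder. Returns the new path.
    Function RouteScan(ocrText As String, pdfPath As String, archiveDir As String, reviewDir As String) As String
        Dim candidate As String = ocrText.Trim()
        Dim target As String

        If InvoicePattern.IsMatch(candidate) Then
            ' OCR produced something that looks like an invoice number.
            target = Path.Combine(archiveDir, candidate & ".pdf")
        Else
            ' Anything else is left for a person to check.
            target = Path.Combine(reviewDir, Path.GetFileName(pdfPath))
        End If

        ' File.Move throws if the target already exists, which also
        ' flags duplicate invoice numbers.
        File.Move(pdfPath, target)
        Return target
    End Function
End Module
```

A strict pattern is what makes the "unsuccessful" case detectable at all: the OCR engine will nearly always return some text, so the check has to be on its shape.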
We do 20,000 printed invoices a month. In 18 months - not one has been e-mailed - and so far - when we reference the stored invoices - they are all named correctly.
Also - did you know there was an OCR tool that came with Office 2003 & 2007? AND did you know that as long as you own the relevant version of Office on the PC you run your code on - it's legal under the license?
It works very well - fully automated - aligns the image, auto-adjusts the image for best scan results. You can use it from within .NET code. I got demo code from the Code Project web site & went from there.
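A minimal sketch of calling it from VB.NET, on the assumption that the tool meant here is Microsoft Office Document Imaging (MODI). It needs a COM reference to the "Microsoft Office Document Imaging" type library added to the project, so it will not compile without Office 2003 or 2007 installed. This is not the Code Project demo, just an outline of the same idea; the ReadTextWithModi name is made up.

```vbnet
Imports System

Module OfficeOcr
    ' Runs MODI's OCR on the first page of an image file and returns the text.
    ' Requires a COM reference to the MODI type library (assumption, see above).
    Function ReadTextWithModi(imagePath As String) As String
        Dim doc As New MODI.Document()
        Try
            doc.Create(imagePath)
            ' The two True flags ask MODI to auto-orient and straighten the page.
            doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, True, True)
            Dim page As MODI.Image = CType(doc.Images(0), MODI.Image)
            Return page.Layout.Text.Trim()
        Finally
            ' Close without saving changes back to the image file.
            doc.Close(False)
        End Try
    End Function
End Module
```

As far as I know MODI reads TIFF input, so a JPG may need converting first.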
I got the results I needed for this project using Tesseract - but the MS Office OCR tool was very interesting...