Need Help.....pls. !!

goldaquila · Jan 21, 2009

I've searched everywhere for a solution to this, but in vain. Pls. Help...

I have taken digital images (color) of a few pages from a printed book. These are all double pages (2 facing pages in 1 shot). Pls. see the attached sample image.

I want to use Paint Shop Pro X2 to AUTOMATE the following:

1. Cut the single image (of two facing pages) into 2 single-page images (in batch mode) to NEARLY FIT the single-page dimensions of the book.

2. If possible, straighten the resulting single-page image (to look NEARLY like a flat eBook, with text in ALMOST straight lines).

3. If possible, remove the imprint of my fingers (seen at the bottom of the pic, where a finger is holding the pages flat; I've held the same place in all images).

4. Convert color image to black and white, to remove unnecessary details.

5. CREATE A BATCH process of all above steps, so it can be done for the 30 or so double pages I have, creating 60 single pages.

After step 5, I will run Omnipage to OCR text from all pages (This is simple, I can do it) and create a clean B&W book type PDF of 60 pages or so.

It's really important that I do this in batch mode for all of them, because in successive projects I need to digitize a few pretty old books.

At WikiHow, I did find some useful info on doing this with Photoshop, for steps 4-5. I'm reproducing that info here in case someone wants to incorporate it into the approach they suggest for me.

Pls. Help...I'm just an amateur with PSP. Thanks a ton..

-------------------- Notes from WikiHow ---------------
4 # Create a processing action: Photoshop (any version) is good software for processing, but other software may work just as well. To save file size and increase readability, you want to strip out as much extra color information as possible. You can automate this process: start recording a new action, then select Image > Adjustments > Threshold to convert to black and white, dragging the cutoff slider to where the text looks most readable. Save the images using File > Save for Web & Devices, in GIF format. You only need two colors, and you may safely crank up the lossy compression. Using this method you can compress six-megapixel images to under 200 KB, down from roughly 2 MB for the original files.

5 # Create your pages. Now that you have created your action, open Photoshop and select File > Automate > Batch. Select the action you just made, as well as the original images you want to work on. Then select a destination and numbering scheme. Run the process and you should end up with a set of compressed GIFs. Review your images again. If any of your images were underexposed or out of focus, they will be illegible.
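
For the "numbering scheme" part of a batch run, here is a rough Python sketch of one way 30 double-page shots could map onto 60 consecutively numbered single-page files. It is only an illustration, not something Photoshop or PSP produces: the function name `page_filenames`, the `page_` prefix, and the three-digit padding are all assumptions.

```python
def page_filenames(double_page_count, prefix="page", ext="gif", first_page=1):
    """Return one (left_name, right_name) pair per double-page photo.

    Assumes the photos are processed in reading order, so the left half
    of each shot comes before the right half.
    """
    names = []
    page = first_page
    for _ in range(double_page_count):
        left = f"{prefix}_{page:03d}.{ext}"
        right = f"{prefix}_{page + 1:03d}.{ext}"
        names.append((left, right))
        page += 2  # each photo consumes two page numbers
    return names


# 30 double-page photos give 60 single-page file names.
pairs = page_filenames(30)
print(pairs[0], pairs[-1])
```

Zero-padded numbers keep the files in page order when they are sorted by name, which matters when the whole folder is later handed to an OCR program.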

ChrisHunt · Jan 21, 2009

1, 2 and 3 probably aren't possible to automate, unless your book is lined up in exactly the same way in every photograph.
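
If the book really does sit in the same place in every photograph, the split comes down to two fixed crop rectangles that are reused for every image. A minimal Python sketch of that arithmetic follows. It is not PSP script; `split_boxes`, `gutter_x`, `margin`, and `bottom_trim` are names made up for illustration, and the pixel values are placeholders rather than measurements from the attached image. The boxes use the (left, upper, right, lower) order that many imaging libraries accept for cropping.

```python
def split_boxes(width, height, gutter_x=None, margin=0, bottom_trim=0):
    """Return (left_box, right_box) crop rectangles for a two-page photo.

    Each box is (left, upper, right, lower) in pixels.
    gutter_x    - x position of the book's spine; defaults to the middle.
    margin      - pixels shaved off the outer left, right, and top edges.
    bottom_trim - pixels shaved off the bottom, e.g. where fingers appear.
    """
    if gutter_x is None:
        gutter_x = width // 2
    if not (margin < gutter_x < width - margin):
        raise ValueError("gutter_x must lie between the two side margins")
    if margin + bottom_trim >= height:
        raise ValueError("margin and bottom_trim leave no image height")

    lower = height - bottom_trim
    left_box = (margin, margin, gutter_x, lower)
    right_box = (gutter_x, margin, width - margin, lower)
    return left_box, right_box


# Placeholder numbers: a 3000 x 2000 photo, 50 px border, 200 px finger strip.
left, right = split_boxes(3000, 2000, margin=50, bottom_trim=200)
print(left, right)
```

This only works as well as the alignment holds. If the book shifts between shots, the same boxes will clip text on some pages, which is the reason the steps are hard to automate in general.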

PSP is going to struggle to identify things like the edges of the pages and your fingers by "looking" at them. That's something humans are really good at, but computers struggle with.

In any case, Omnipage is quite clever at identifying where the blocks of text are (facing pages, snaking columns, or whatever), so cutting up the pages and splitting them into two files possibly isn't necessary.

It should be possible to adapt the Wikihow technique to PSP when it comes to converting the original images to black & white. I don't have a copy of X2 handy, so I can't give you a blow-by-blow of how to do it. Essentially, the first step is to go through the process manually, and then automate it.
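
To make the black & white step concrete, this is roughly what a threshold operation does to each pixel, sketched in plain Python on a tiny hand-made "image". It is an illustration of the idea, not PSP's or Photoshop's actual implementation: the function names, the cutoff of 128, and the sample pixel values are all assumptions, and the brightness weights are the commonly used ITU-R BT.601 ones.

```python
def luminance(r, g, b):
    """Approximate brightness (0-255) of an RGB pixel.

    Integer form of 0.299*R + 0.587*G + 0.114*B, rounded to nearest.
    """
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def to_black_and_white(rgb_rows, cutoff=128):
    """Return rows of 0 (black) or 255 (white), one value per input pixel.

    Pixels at or above the cutoff brightness become white, the rest black.
    """
    return [
        [255 if luminance(r, g, b) >= cutoff else 0 for (r, g, b) in row]
        for row in rgb_rows
    ]


# A 2 x 2 sample: off-white paper, dark ink, mid-gray shadow, light tan.
page = [
    [(250, 248, 240), (30, 30, 35)],
    [(120, 120, 120), (200, 190, 180)],
]
print(to_black_and_white(page))
```

The cutoff plays the role of the slider in the threshold dialog. Too high and shadows near the spine turn black; too low and thin strokes in the text disappear, so it is worth finding a good value by hand on one page before automating.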

And that's where I think your main problem will lie. If the attached image is a true example of what you have to work with, I don't think Omnipage is going to like it, no matter what you do in PSP - it just isn't big enough. OCR programs tend to work with really big image files, the sort of thing you get from a scanner, rather than regular digital camera images. Zoom in and look at just how few pixels you're offering the program to identify each letter. If it works at all, you'll be spending a lot of time correcting mistakes.
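
A rough way to put numbers on "not big enough" is to estimate the effective resolution of a photo and how many pixels tall a lowercase letter ends up. The Python sketch below does that arithmetic. Every input is an assumption to be replaced with real measurements: the 800-pixel image width, the 12-inch width of the open book, the 10-point type, and the guess that a lowercase letter is about half the point size. Around 300 dpi is a commonly quoted rule of thumb for OCR, not a figure taken from the Omnipage documentation.

```python
def effective_dpi(pixels_across, inches_across):
    """Pixels per inch that the photo actually captures across the book."""
    if inches_across <= 0:
        raise ValueError("inches_across must be positive")
    return pixels_across / inches_across


def letter_height_px(dpi, point_size=10.0, x_height_ratio=0.5):
    """Estimated height in pixels of a lowercase letter.

    One point is 1/72 inch; x_height_ratio is the assumed fraction of the
    point size taken up by a lowercase letter such as 'x'.
    """
    return dpi * (point_size / 72.0) * x_height_ratio


# Placeholder example: an 800 px wide photo of a 12 inch wide open book.
dpi = effective_dpi(800, 12.0)
print(f"about {dpi:.0f} dpi, letters about {letter_height_px(dpi):.1f} px tall")

# The same book at 300 dpi, for comparison.
print(f"at 300 dpi, letters about {letter_height_px(300):.1f} px tall")
```

Running the same calculation on the full-size camera originals, rather than on a downsized copy, shows whether the photos are worth feeding to the OCR program at all.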

PSP is the wrong tool for this job. Get a scanner instead.

-- Chris Hunt
Webmaster & Tragedian
Extra Connections Ltd

harebrain · Jan 25, 2009

I'm with ChrisHunt on this: a digital camera is the wrong tool for this job. You'll get the best results by placing each page flat on a scanner. I remember Google scanning old, rare books, where the process had to be non-destructive. They used a saddle scanner. I can't remember if they bought it or rigged it up themselves.


