Multiple page convert - PDF to Excel

Status
Not open for further replies.

Cliona

Technical User
Jan 13, 2003
6
FR
I want to convert a PDF table containing 4 columns and 229 pages into an XLS or RTF/DOC file for editing.
Converting each page one by one works fine (procedure used below) but takes way too long. How can I convert all pages in one go? My problem is with stage 3.

Procedure used:
1) Open PDF in Acrobat 5.0
2) Select “Table select tool”
3) Drag “table select tool” over page to convert (will not let me drag over more than 1 page at a time)
4) Right click>Check “Table”>Select “Save as”
5) Save as ANSI text (.txt)
6) Open the .txt file in Excel

Thanks,

 
Are you able to use the File > Save As command on this file? If so, you could Save As type .rtf.

Ahhhhh, I see you have a machine that goes Bing!
 
Unfortunately, the File > Save As command removes the table formatting.
I end up with a messy list of words separated by an inconsistent number of manual spaces (impossible to create a macro to rebuild the table using special format symbols).
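
For anyone who wants to test whether the flattened text is salvageable, here is a minimal sketch in Python. It assumes, and this may well not hold for this file, that cells are separated by at least two spaces and never contain more than one space in a row. The function name and the sample row are made up for illustration. Rows that do not split into exactly 4 columns come back as None so they can be fixed by hand.

```python
import re

def split_row(line, expected_columns=4):
    """Split one line of flattened table text into cells.

    Assumption: cells are separated by runs of two or more spaces,
    and no cell contains two spaces in a row.
    """
    fields = re.split(r" {2,}", line.strip())
    # Return None for rows that do not match the expected table width.
    return fields if len(fields) == expected_columns else None

# Hypothetical sample rows, not taken from the actual PDF.
print(split_row("Widget A   12    3.50  in stock"))
print(split_row("Widget A 12 3.50"))
```

If most rows come back as None, the spacing really is too inconsistent and a table-aware converter is the better route.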

Found a partial solution:
I used the BCL Drake plug-in, which worked like a charm, BUT the demo version only converts odd pages....
Maybe there is a way to renumber pages in Acrobat.
I'm not against buying the plug-in, but the price is a bit steep for "once-in-a-while" usage.
 
I suppose you could save a copy of the PDF under a different name, then insert it at the end of the original doc. As there are 229 pages in the original, page 1 of the inserted doc would actually be page 230, so page 2 and all the other 'even' pages would actually fall on odd numbers. It's still messy, but it would give you all the odd pages followed by all the even pages... if you see what I mean.

Ahhhhh, I see you have a machine that goes Bing!
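
The arithmetic behind this trick can be checked with a small Python sketch. It is only a model of the page positions, not anything that touches Acrobat or the plug-in, and the function name is made up for illustration. It lists which original pages land on odd positions once the document is appended to itself. Note that the trick depends on the page count being odd, as 229 is; with an even count the second copy lines up exactly like the first and the even pages are still skipped.

```python
def pages_on_odd_positions(page_count):
    """Original page numbers that sit on odd positions (1, 3, 5, ...)
    after the document has been appended to itself."""
    doubled = list(range(1, page_count + 1)) * 2
    # Index 0 is position 1, so every second item from index 0
    # corresponds to the odd positions.
    return doubled[0::2]

covered = pages_on_odd_positions(229)
print(sorted(covered) == list(range(1, 230)))  # every page covered once?
print(pages_on_odd_positions(4))               # even page count for comparison
```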
 
Thanks Murge. Got my (almost) final file now. Just got to find a way to automate removal of every second page in Word (the plug-in actually adds a Drake message page every second page) and I'm done.
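
The clean-up logic, sketched in Python rather than as a Word macro. This assumes the converted output is available as a list of pages, that the Drake message page sits at every second position starting with the second, and that the original page count is odd (229 here). The function name is hypothetical. It drops the message pages and then interleaves the odd-page run with the even-page run to get back the original order.

```python
def restore_page_order(converted, original_page_count):
    """Drop filler pages and restore original page order.

    Assumptions: filler occupies every second position, and the
    content pages arrive as all odd pages followed by all even pages.
    """
    content = converted[0::2]  # keep positions 1, 3, 5, ...
    odd_count = (original_page_count + 1) // 2
    odds, evens = content[:odd_count], content[odd_count:]
    restored = []
    for i, page in enumerate(odds):
        restored.append(page)
        if i < len(evens):
            restored.append(evens[i])
    return restored

# Small stand-in for a 5-page original; "MSG" marks a message page.
sample = ["p1", "MSG", "p3", "MSG", "p5", "MSG", "p2", "MSG", "p4", "MSG"]
print(restore_page_order(sample, 5))
```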

 
I think Adobe was being a bit too clever when they made it so difficult to extract features like tables from PDFs. The same problems apply to headers/footers, bullets, page numbers and graphics. I seem to recall a list of tools at the PDFzone website for extracting text from more than one page, but I don't recall whether any of them could handle tables. Wouldn't it just be easier to ask the originator of the PDF for the non-PDF version?
 