
Convert PDF file to excel 2

Status
Not open for further replies.

robkar

Technical User
Apr 1, 2003
CA
Does anyone know how to convert a PDF file into excel format?
 
This is not possible.

PDF uses a PostScript-style page model to position elements on a page. Objects are placed at absolute locations given by X and Y coordinates.

XLS relies on a grid of cells to position elements on a page. Objects are placed at relative locations given by column and row identifiers.
 
If you just want to dump the text elements from a PDF file into Excel, you could save the file as an .rtf file, open it in Word, and save it again as an .htm/.mht file. Then create a new Excel workbook and load the .htm/.mht text via Data / Import External Data / Import Data (MS Office XP).

Lloyd Freiday
 
Won't this just place all of the text in a single Excel worksheet cell? I thought HTML import into Excel made use of <table>, <tr>, & <td> tags to create Excel columns and rows. Word will not automatically create these when the RTF is imported.
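To illustrate the point about table tags, here is a minimal Python sketch of the markup an HTML import would need in order to produce separate rows and columns. The function name `rows_to_html_table` and the sample data are hypothetical, and this assumes you already have the text split into rows and cells, which the RTF-to-Word route does not do for you.

```python
from html import escape


def rows_to_html_table(rows):
    """Wrap each row in <tr> and each cell in <td>.

    Hypothetical helper: 'rows' is a list of lists of cell values.
    """
    lines = ["<table>"]
    for row in rows:
        # Escape cell text so characters like & and < stay literal.
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Made-up sample data, for illustration only.
sample = [["Part", "Qty"], ["A & B", "12"]]
print(rows_to_html_table(sample))
```

Saving that output as an .htm file gives the import explicit row and column boundaries instead of a run of plain paragraphs.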
 
Exactly, but as you mentioned, there isn't any way to save or export the PDF to Excel and preserve the layout. I proposed this method for cases where the goal is to dump a large volume of text into Excel for further formatting and use.

Lloyd Freiday
 
I should also have noted that the text appears to land in a single column, not a single cell, so the subsequent formatting of the text should be somewhat easier to perform.

Lloyd Freiday
 
Hello, actually there is a simple way to convert PDF to Excel, provided that the PDF file has the same layout as the Excel document that you want to create.

1. Open the PDF file
2. Go to VIEW and make sure that CONTINUOUS is checked
3. After you have done that go to EDIT and use SELECT ALL
4. Go to EDIT and Select Copy
5. Now Open NOTEPAD
6. Go to EDIT and Use PASTE
7. Save the file
8. Open Excel and select OPEN from the File Menu
9. Make sure that the Files of Type is set to ALL FILES
10. By doing this you are allowing EXCEL to open the Text file

11. When the file opens you will see a window titled TEXT IMPORT WIZARD. In this window, make sure DELIMITED is checked.

12. Press the Next button and make sure that TAB and SPACE are checked. If done successfully, you will see lines appear in the preview window below; these represent the columns. This creates an Excel file from the PDF. I hope this helps you.

Jon Daddysman
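For anyone who wants to see what the delimited import in steps 11 and 12 is doing, here is a rough Python sketch of the same idea: split each pasted line on tabs and spaces, then write the cells out as CSV. The names `split_delimited` and `to_csv` and the sample text are hypothetical, and the sketch assumes runs of spaces count as a single delimiter.

```python
import csv
import io
import re


def split_delimited(text):
    """Split each non-empty line into cells on tabs or runs of spaces."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.append(re.split(r"[\t ]+", line.strip()))
    return rows


def to_csv(rows):
    """Return the rows as CSV text that a spreadsheet can open directly."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up text standing in for what was pasted into Notepad.
sample = "Item  Qty\tPrice\nBolt  10\t0.25\n"
print(to_csv(split_delimited(sample)))
```

Note the limitation this shares with the wizard: any value that itself contains a space (for example a two-word description) is split into two cells.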
 
Hello,
I have tried to convert a PDF to Excel in all the ways described above, and unfortunately none of them works for me. When I tried Jon Daddysman's method, the text appears as strange characters, and changing the font makes no difference.
I have a 10-page PDF file containing a large amount of data. Any further help is highly appreciated.
 
Hello ra1518,
I hope this helps if you are still looking for a solution.
I have a PDF file (one page) with a graphic at the top and two columns of data at the bottom. To convert these two columns to XLS:
Open the PDF and save the page as an RTF file.
Open the RTF file in Word, delete the graphic part (it is no longer a graphic), and save the file as a TXT file.
Open the TXT file in Excel and import it using the fixed width option.
It works perfectly for me.
Good luck

lfk
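The fixed width import in the last step can be sketched in Python as slicing each line at set character positions. This is only an illustration: `parse_fixed_width`, the boundary offsets, and the sample lines are all hypothetical, and it assumes the TXT file keeps its columns aligned with spaces.

```python
def parse_fixed_width(text, boundaries):
    """Cut each non-empty line into cells at the given character offsets.

    'boundaries' lists where each column starts, e.g. [0, 10]; the last
    column runs to the end of the line.
    """
    edges = list(boundaries) + [None]
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        rows.append(
            [line[start:end].strip() for start, end in zip(edges, edges[1:])]
        )
    return rows


# Made-up sample: the second line has nothing in its first column.
sample = "Widget".ljust(10) + "12\n" + "".ljust(10) + "7\n"
print(parse_fixed_width(sample, [0, 10]))
```

Because the cut points are positions rather than delimiters, an empty cell stays empty instead of letting the next value slide into its place.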

 
I am not clear from reading this thread whether there was a solution for turning a PDF file into an Excel file. I do have the full version of PDF. The last entry referring to the McGill website works great for one page of documentation but not for multiple pages.

I get large documents of 50 pages and more. I have tried the other methods mentioned above and different variations of my own. When I transfer the PDF to Excel, the text comes out in a weird font and/or everything becomes left-justified.

Thank you
 
Acrobat V6.0 now has a table select tool. I have been using it to take tables from PDF to Excel, although I have still been going via a table in Word.
It only works on a page-by-page basis and is a bit hit-and-miss when it copies columns. For small single-page spreadsheets it is very useful.

Peter

 
I recently successfully copied a PDF to Excel. It was a bit long-winded, but worked perfectly.

I used the Column Select tool (Shift-V) in Reader, and selected each column of data separately, then dumped it into a Word doc. Each row in the column appeared in Word delimited by a paragraph mark. I kept adding each column to the Word file, adding a few returns between each column 'dump' so I could easily see where each column started and ended. Once all the data had been copied to the Word doc, I then selected each 'column' (now just a vertical list), copied and then clicked into the cell at the top of the column where I wanted it to appear in Excel. The data just flowed into a column, with each row placed correctly. I was amazed it worked so well.

One tip - make sure the cell format is set correctly. For example, although the data I was transferring was almost all numerals, Excel did not accept zeros at the start of the figures. By changing the format to 'text', everything came through exactly as it had been in the PDF.
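The leading-zero problem is easy to reproduce outside Excel. As a small Python illustration (the helper name `needs_text_format` and the sample values are hypothetical), converting the values to numbers drops the zeros, and a simple check can flag columns that should be kept as text:

```python
def needs_text_format(column):
    """Return True if any all-digit value would lose leading zeros as a number."""
    return any(v.isdigit() and len(v) > 1 and v[0] == "0" for v in column)


# Made-up sample values, for illustration only.
values = ["00123", "04560", "789"]

as_numbers = [int(v) for v in values]  # numeric conversion drops the zeros
print(as_numbers)
print(needs_text_format(values))
```

Keeping such values as strings, which is what the 'text' cell format does, preserves them exactly as they appeared in the PDF.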

If the text does not appear as it did in the PDF, it probably means that you don't have the font used in the PDF installed on the computer you are making the Excel file in. Highlight all the text in Word and change it to a font you know is on your computer.
 
This is my first use of TEK TIPS, please accept my apologies for any mistakes in this response.

I have found this discussion very useful and have used the solution from EGGLES successfully - thanks. I have 2 observations:

1) In most cases you do not need to go through WORD - you can column-select from Acrobat Reader and paste straight into Excel

2) I loaded the latest version of Acrobat Reader from a well respected magazine CD and found that Reader V6.0 does not appear to support "column-select" (loaded into Windows XP) - a wonderful step forward by Adobe!!

Thanks again for the help, and I hope this may also be useful.

Leslie
 
Leslie - I too have discovered since writing that post that you can go straight from the PDF to Excel, as long as only one column at a time is selected AND the formatting of the Excel column is set to 'Text' - otherwise some really weird things can happen to the data.

Are you serious that Adobe Reader 6 doesn't have a 'Select column' tool - that's terrible!!
 
Acrobat 6 has a column select function, but it's not a separate menu item. Use the Table Select tool to draw a box around a column you want to select, then copy and paste or drag into another application window. Or so says the onscreen help: I deal almost exclusively with text and haven't had Acrobat 6 long enough to have done this myself.
 
I have tried the above suggestions (I'm using Adobe Acrobat Reader). Eggles' suggestion works for me except when there are blank "cells" in the PDF table. In those cases, if I copy and paste a single column, the data below moves up. If I copy the table to Notepad and import it into Excel, the data to the right of the blank cell moves left. Has anyone else experienced this? Any other suggestions?
 
>>when there are blank "cells" in the PDF table. In those cases, if I copy and paste a single column, the data below moves up.<<

Yes, I did experience this, and the only solution was to look at each column after pasting and shift the cells down by one, starting at the cell that was supposed to be blank. I didn't find any workaround for this, but since I didn't have too many blank cells, I just worked through each one individually. It became easy after a while: once I had pasted the first column, it was easy to see where a blank cell had occurred in subsequent columns, as the bottom of the column didn't match up with the previous one(s).
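The manual fix described above can be sketched in Python, assuming each pasted column is held as a list and you have already worked out which rows were supposed to be blank. Both helper names (`short_columns`, `insert_blanks`) and the sample data are hypothetical.

```python
def short_columns(columns):
    """Return the indexes of columns shorter than the longest one.

    A short column is the sign that a blank cell was dropped on paste.
    """
    longest = max(len(col) for col in columns)
    return [i for i, col in enumerate(columns) if len(col) < longest]


def insert_blanks(column, blank_rows):
    """Return a copy of 'column' with empty cells put back.

    'blank_rows' holds the 0-based row positions that should be blank
    in the finished column.
    """
    result = list(column)
    for row in sorted(blank_rows):
        result.insert(row, "")
    return result


# Made-up sample: the second column lost its blank cell in row 1.
columns = [["1", "2", "3"], ["x", "y"]]
print(short_columns(columns))
print(insert_blanks(columns[1], [1]))
```

The sketch only automates the shifting. Finding which rows were blank still means comparing against the PDF, as described in the post.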
 
I stumbled across this thread searching for how to do a "Select Column" in Acrobat Reader 6 ... it does appear that they took it out of the menus, but if you choose "Select Text" and hold down ALT while selecting the text it operates in Column Select mode.

Hope this helps someone.
 