
Extract all hyperlinks from a Word 2010 Document

Status
Not open for further replies.

pattyjean

Technical User
Mar 3, 2006
53
US
I would like to extract all the hyperlinks from a Word document and list them in one document.



 
Hi strongm,

I concede your point re the ultimate aim being to extract the addresses (something I hadn't picked up from pattyjean's last post), whereas my code was designed to preserve the hyperlinks as such.

FWIW, I tested a document containing 100,000 hyperlinks amongst 4,735,000 words spread over 19,003 pages. The optimised loop code to extract the addresses to a new document took 00:04:04, whereas the optimised F/R to delete everything except the hyperlinks took 00:07:53. I also tried an optimised loop to copy the hyperlinks to a new document; I gave up waiting after 01:30:00, by which time only a third of them had been processed.

Cheers
Paul Edstein
[MS MVP - Word]
 
Thanks everyone for the information.

Well, after trial and error I found that an easy way to do the same thing is to save the Word document as XML; when you open it in Excel, it gives you a clean column named target that makes it easy to identify all the linked documents.
Now that I have this part of the process complete, the next step is to match up the link names with the friendly names (Excel formula =HYPERLINK()). It enables me to rename the links with the text name, but copying them back into the Word document is the new challenge for me. Any ideas? I am going to post this in another category if it makes sense to you all.
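
For anyone who would rather script this step than go through Excel, here is a minimal Python sketch of a similar idea. It assumes the document is saved as .docx (not the single-file Word XML format described above) and that the links are stored as external hyperlink relationships in the package. The function name hyperlink_targets is made up for illustration. Links that exist only as HYPERLINK field codes would not be picked up.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the relationship parts inside an Office Open XML package.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def hyperlink_targets(docx_path):
    """Return the Target of each hyperlink relationship of the main document part.

    Assumption: the file is a .docx package whose main part is word/document.xml.
    """
    with zipfile.ZipFile(docx_path) as package:
        rels_xml = package.read("word/_rels/document.xml.rels")
    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.iter(RELS_NS + "Relationship")
        # Relationship types are URIs; hyperlink relationships end in "/hyperlink".
        if (rel.get("Type") or "").endswith("/hyperlink")
    ]


if __name__ == "__main__":
    import sys

    # Usage: python list_links.py report.docx
    for target in hyperlink_targets(sys.argv[1]):
        print(target)
```

The list has one entry per relationship, so the same target can appear more than once.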

 
Hi pattyjean,

So you have a set of hyperlinks in Word and, in Excel, a corresponding set of hyperlinks in one column and their 'friendly' names in another, and you want the Word hyperlinks to display the 'friendly' names. Correct? If so, that's easily enough done. A couple of questions, though:
1. Are the hyperlinks in Word & Excel listed in the same order?
2. Are there any duplicates or instances of the same hyperlink with two or more 'friendly' names?
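
Both questions can be checked mechanically once the two columns are exported. The Python sketch below is only an illustration: it assumes the Excel data is available as (target, friendly name) pairs, and the name build_name_lookup is hypothetical. Because it matches on the target rather than on position, the order of the two lists stops mattering, and any target that carries more than one friendly name is reported.

```python
def build_name_lookup(rows):
    """Map each link target to its friendly name and report conflicts.

    rows: iterable of (target, friendly_name) pairs, e.g. exported from Excel.
    Returns (lookup, conflicts). conflicts maps a target to the set of
    differing friendly names seen for it.
    """
    lookup = {}
    conflicts = {}
    for target, friendly in rows:
        # The first name seen for a target wins; later different names are flagged.
        known = lookup.setdefault(target, friendly)
        if known != friendly:
            conflicts.setdefault(target, {known}).add(friendly)
    return lookup, conflicts


# Made-up sample data for illustration only.
rows = [
    (r"attachments\a.pdf", "Policy A"),
    (r"attachments\b.pdf", "Minutes"),
    (r"attachments\a.pdf", "Policy A"),       # harmless duplicate
    (r"attachments\b.pdf", "Board Minutes"),  # same link, different name
]
lookup, conflicts = build_name_lookup(rows)
print(lookup)
print(conflicts)
```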

Cheers
Paul Edstein
[MS MVP - Word]
 
Paul,
Thanks for the response. To answer your questions, I have to give you the whole picture.
There are 91 different Word documents with 2000 attachments in PDF format.
Each Word document contains the hyperlinks, but at the end of the document we want to add a list of evidence with the list of hyperlinks and their friendly names.
All the hyperlinked files are in one folder, with the Word documents outside the folder.
But the final document will be a PDF version with all sets of clickable links. So, after the list of evidence is added, the Word document will be saved as PDF. The hyperlinked files are on a drive and will be saved to flash drives.

1. Are the hyperlinks in Word & Excel listed in the same order? Could be; I haven't set it up yet.

2. Are there any duplicates or instances of the same hyperlink with two or more 'friendly' names? No. Each hyperlink might be in multiple documents, but it has the same friendly name.

Does that answer some of your questions?
 
Hi pattyjean,

If you're hyperlinking to documents, I think you'll find the hyperlinks will have the full filepaths, including drive letters, etc for the target files. So, when you do your PDF conversion, that's what'll be replicated in the PDF. If you then copy the files to a USB stick or CD and open them on another computer, the hyperlinks will still be looking for the original filepaths on your computer and, in all likelihood, will fail.

As for the "list of evidence with the list of hyperlinks and their friendly names", that suggests some form of table, but it's not clear how the 'list of evidence' entries are to be compiled and matched with the hyperlinks. Also, it seems to me you don't need both the 'hyperlinks and their friendly names'. Rather, you should be able to have the hyperlinks displaying only their friendly names.

Cheers
Paul Edstein
[MS MVP - Word]
 
In answer to:
"If you then copy the files to a USB stick or CD and open them on another computer, the hyperlinks will still be looking for the original filepaths on your computer and, in all likelihood, will fail."
The way we linked them, and it works, is to have the folder of attachments on each flash drive, and in the Word document each one is linked like attachments\filename.pdf. It works for the current links, but the List of Evidence is a different story.

I don't want to link each one separately. I used the list of hyperlinks from the Word document, so I already have their names from the other step mentioned above. I match them up with the friendly name for each document in an Excel table and use the HYPERLINK function. The problem is: how do I copy the friendly name with the link to paste into the Word document? It brings over the path of the Excel file instead of the real link. I need some kind of function to keep the PDF together with the friendly name. Any clue? Does this make sense?

 
Hi pattyjean,

Even if your hyperlinked files are on a flash drive, by default they'll include the drive's letter in Word. Put the flash drive into another PC where it gets assigned a different drive letter and the hyperlinks will fail.
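
Whether a given set of links carries a drive letter can be checked before the PDF conversion. The Python sketch below assumes the link targets are already available as plain Windows-style path strings (file:// URLs are not handled), and the helper names drive_anchored and relative_to_folder are invented for illustration.

```python
from pathlib import PureWindowsPath


def drive_anchored(targets):
    """Return the targets that include a drive letter or UNC share."""
    return [t for t in targets if PureWindowsPath(t).drive]


def relative_to_folder(target, base_folder):
    """Rewrite an absolute target as a path relative to base_folder.

    Raises ValueError if the target is not located under base_folder.
    """
    return str(PureWindowsPath(target).relative_to(PureWindowsPath(base_folder)))


# Made-up sample paths for illustration only.
targets = [r"E:\Review\attachments\a.pdf", r"attachments\b.pdf"]
for t in drive_anchored(targets):
    print(t, "->", relative_to_folder(t, r"E:\Review"))
```

PureWindowsPath only parses the strings, so the check runs on any operating system and never touches the drive itself.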

It's still not clear what you intend regarding the 'List of Evidence'. It is easy enough to modify the Word hyperlinks so they display the friendly names in the body of the document, rather than the actual paths, whilst hovering over them will display the actual paths. To that end, you don't need a separate 'List of Evidence'. If you want one, though, perhaps what you need is an Index to provide that list.

Cheers
Paul Edstein
[MS MVP - Word]
 
Yes, I need an index. We got the links to work on the flash drives in Word; now we are converting Word to PDF. I will work on this project later today to figure it out, then repost my questions. Thanks
 
We have approximately 3000 attachments that we have to hyperlink into different Word documents, with the final product in PDF (maintaining the hyperlinks). The final documentation will go out on flash drives to 7 different reviewers.

We can save the Word document as PDF and maintain the links, but when we click a hyperlink it opens the attachment, and closing the attachment closes everything.
For more detail, I found this same issue here:
There are 3 ways we can do this:
1: Uncheck the 'Open cross-document links in same window' setting in Edit>Preferences>Documents. This works great, but the SACS reviewer would have to follow these steps as well. (We are using Adobe X; we don't know what version they would use.)
2: We can ask the SACS reviewer to hold Ctrl and Enter to open the PDF in a new window, or
3: Can you deploy a configuration file (autorun) to add to the flash drive so they can just click the link and it opens in a new window?

We want to make this as easy as possible for them to review on a flash drive. Can you give us any advice or help with deployment? Is it possible?
Thanks for any information you can provide. This would have to work in a Mac environment as well.
 
Hi pattyjean,

You have 3000 attachments, or 3000 links?

It's still not clear to me how the hyperlinked content in the body of a given document is intended to relate to the 'index', which apparently uses the 'friendly' name. Doesn't the 'friendly' name get used as the display text in the body also? If not, how is a user meant to recognise which 'friendly' name in the 'index' relates to a given hyperlink in the body?

It's also still not clear as to how the 'index' is to be compiled. Is the idea to go through all the hyperlinks in the body, find the corresponding entries in the Excel workbook, then insert the 'friendly name' hyperlinks into the 'index'? What happens if the same hyperlink is found more than once? Should the 'index' entries be sorted and, if so, how?
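
To make those questions concrete, here is one possible set of answers sketched in Python. Every choice in it is an assumption rather than anything settled in this thread: repeated hyperlinks collapse to a single entry, entries are sorted alphabetically by friendly name ignoring case, and a link with no friendly name falls back to its path. The name build_index is hypothetical.

```python
def build_index(body_targets, friendly_names):
    """Build (friendly_name, target) index entries from the links in a document.

    body_targets: link targets in the order they occur in the document body.
    friendly_names: dict mapping a target to its friendly name.
    """
    seen = set()
    entries = []
    for target in body_targets:
        if target in seen:
            continue  # assumption: a link used several times is listed once
        seen.add(target)
        # Assumption: fall back to the raw path when no friendly name exists.
        entries.append((friendly_names.get(target, target), target))
    # Assumption: sort alphabetically by friendly name, ignoring case.
    return sorted(entries, key=lambda entry: entry[0].casefold())


# Made-up sample data for illustration only.
body = [r"attachments\b.pdf", r"attachments\a.pdf", r"attachments\b.pdf"]
names = {r"attachments\a.pdf": "Policy A", r"attachments\b.pdf": "Board Minutes"}
for friendly, target in build_index(body, names):
    print(friendly, "->", target)
```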

PS: I've been away for a few weeks, hence the delay in replying.

Cheers
Paul Edstein
[MS MVP - Word]
 