Identify Office 2007 Word docs

Status
Not open for further replies.

petermeachem

Programmer
Aug 26, 2000
2,270
GB
We have a piece of software that generates Word docs and always gives them a .doc extension. Unfortunately, one Office 2007 user wasn't set up to default to the Office 2003 format, so we now have a folder with a huge number of Word docs in it, some of which are actually .docx but all of which have .doc extensions.
So I need to identify which are the .docx ones. How can I do this short of trying to open them all with Office 2003?
 
I don't think there are any properties you can use, but the first four bytes of the file should contain a signature: 0xD0CF11E0 for the 97-2003 format and 0x504B0304 for the 2007 format.

Before relying on the codes, please confirm them as I have simply lifted them from random existing files of mine and I may have made a mistake.
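
If you would rather check from outside Word, the signature test can be sketched in Python roughly as follows. The names `sniff_word_format` and `report` and the returned labels are made up for illustration. The byte values assumed are D0 CF 11 E0 for the 97-2003 (OLE compound file) format and 50 4B 03 04 for a zip file; as with the codes above, confirm them against known-good files before relying on this.

```python
from pathlib import Path

# First four bytes of each container type (assumed values, see the lead-in).
OLE_SIGNATURE = bytes.fromhex("D0CF11E0")  # OLE compound file (Word 97-2003 .doc)
ZIP_SIGNATURE = bytes.fromhex("504B0304")  # zip local file header (Word 2007 .docx/.docm)


def sniff_word_format(path):
    """Return 'doc', 'docx' or 'unknown' based on the first four bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head == OLE_SIGNATURE:
        return "doc"
    if head == ZIP_SIGNATURE:
        return "docx"
    return "unknown"


def report(folder):
    """Map each *.doc file name in the folder to its sniffed format."""
    return {p.name: sniff_word_format(p) for p in sorted(Path(folder).glob("*.doc"))}
```

Calling `report` on the problem folder gives a dictionary of file names and formats. Nothing is renamed or modified. Note that 'docx' here only means "starts like a zip file", so a .docm would be reported the same way.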

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website
 
Good plan. I don't have a hex editor to hand, but the 2007 ones contain the string '[Content_Types].xml'. It is so unlikely that anyone would type this that I will filter on it.
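
For what it's worth, that string test is easy to script. This is a rough Python sketch, assuming the part name appears as plain ASCII in the raw file (zip stores member names uncompressed, so that should hold for files Word writes). The name `contains_content_types` is only illustrative. A 97-2003 document whose text happens to mention the string could also match, which is the unlikely case already mentioned.

```python
from pathlib import Path

MARKER = b"[Content_Types].xml"  # part name expected in an Office 2007 package


def contains_content_types(path):
    """True if the raw bytes of the file contain the '[Content_Types].xml' name."""
    # Reads the whole file into memory, which is fine for ordinary documents.
    return MARKER in Path(path).read_bytes()
```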

Thanks for your help
 
Hi Peter,

Try this FixWordFileExtensions macro:
Code:
Sub FixWordFileExtensions()
' Appends "x" to the name of every 'doc' file in the chosen folder that is really a zip (docx) file.
' Note: Application.FileSearch exists in Word 2003 and earlier; it was removed in Office 2007.
Dim fs As Object
Dim i As Long
Dim j As Long
Dim fNum As Integer
Dim IDChar As String * 2
Dim objFSO As Object
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set fs = Application.FileSearch
j = 0
With fs
  .LookIn = GetFolder(Title:="Find a Folder", RootFolder:=&H400)
  .FileName = "*.doc" ' file mask to test
  If .Execute(SortBy:=msoSortByFileName, SortOrder:=msoSortOrderAscending) > 0 Then
    MsgBox "There were " & .FoundFiles.Count & " Word documents found."
    For i = 1 To .FoundFiles.Count
      ' Read the first two bytes in binary mode; zip-based files start with "PK"
      fNum = FreeFile
      Open .FoundFiles(i) For Binary Access Read As #fNum
      Get #fNum, 1, IDChar
      Close #fNum
      If IDChar = "PK" Then
        objFSO.MoveFile .FoundFiles(i), .FoundFiles(i) & "x"
        j = j + 1
      End If
    Next i
    MsgBox j & " Word 2007 files were found and renamed."
  Else
    MsgBox "No files with a 'doc' extension were found."
  End If
End With
End Sub

Function GetFolder(Optional Title As String, Optional RootFolder As Variant) As String
On Error Resume Next
GetFolder = CreateObject("Shell.Application").BrowseForFolder(0, Title, 0, RootFolder).Items.Item.Path
End Function
The code allows you to browse to the folder where the suspect files are located. Once you have selected the folder, the macro tests all 'doc' files to see whether they start with the 'PK' prefix that identifies files in the 'docx' format and, if so, gives those files a 'docx' extension.
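
If Word is not available to run the macro, the same test-and-rename idea can be sketched outside Word. This is a Python approximation, not a translation of the macro: the function name mirrors the macro, but the `dry_run` parameter and the refusal to overwrite an existing 'docx' are additions of mine, and it takes the folder as an argument instead of browsing for it.

```python
from pathlib import Path


def fix_word_file_extensions(folder, dry_run=True):
    """Rename *.doc files that start with 'PK' to *.docx; return (old, new) name pairs."""
    renamed = []
    for path in sorted(Path(folder).glob("*.doc")):
        # Zip-based files (docx, docm) start with the two bytes 'PK'.
        with open(path, "rb") as handle:
            is_zip = handle.read(2) == b"PK"
        if not is_zip:
            continue
        target = path.with_name(path.name + "x")
        if target.exists():
            continue  # never overwrite an existing .docx
        if not dry_run:
            path.rename(target)
        renamed.append((path.name, target.name))
    return renamed
```

With the default `dry_run=True` nothing is touched and the return value just lists what would be renamed. Pass `dry_run=False` once the list looks right.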

Cheers

[MS MVP - Word]
 
The Word 2007 file format is just a collection of XML files zipped together and given the appropriate docx/docm extension. So if you give the files a .zip extension, the Word 2007 documents will open in WinZip (or any other ZIP utility) as a set of XML files. Other Word documents will be reported as having the wrong format.
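
The zip check can also be done without renaming anything or opening WinZip. This is a small Python sketch using the standard `zipfile` module. It assumes the main document part is at 'word/document.xml', which is where Word normally puts it; `looks_like_word_2007` is an illustrative name.

```python
import zipfile


def looks_like_word_2007(path):
    """True if the file is a zip archive containing the parts a Word 2007 file normally has."""
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    # Assumption: Word writes its main part to word/document.xml.
    return "[Content_Types].xml" in names and "word/document.xml" in names
```

This is stricter than testing for 'PK' alone, since an ordinary zip file that someone gave a 'doc' extension would not pass.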


Regards: tf1
 
Hi tf1,

That's why my macro tests each 'doc' file for the 'PK' identifier that denotes a zip file. If the file has both a 'doc' extension and the zip identifier, it's a fair bet it's a 'docx' (or 'docm') file. On a system with hundreds of files, making that change is a lot quicker than changing all the extensions to 'zip' just to see whether the files will open with WinZip (or any other ZIP utility).

Cheers

[MS MVP - Word]
 
Yes. I see that now. I should have looked harder at the code!


Regards: tf1
 
PeterMeachem, macropod and tf1 -- is there a way to generate PowerDesigner Reports that make use of MS Word templates?

I'm grappling with the way PowerDesigner creates headings (Heading 1, Heading 2, etc) in rtf output: they're all "normal" paragraphs with {tc} bookmarks -- perhaps the conversion has to occur in Word so that the database schema report looks like other user documentation.

Any pointers are much appreciated.
 
Hi yenelli,

Please start a new thread. Your post (above) has nothing to do with the subject matter of this thread.

As for me, I have no knowledge of PowerDesigner.

Regards

[MS MVP - Word]
 