Reading MS Word files in Java

Status
Not open for further replies.

lpm

Programmer
Jun 22, 1999
1
US
I am attempting to read and parse MS Word documents using Java's DataInputStream, for example 'DataInputStream in = new DataInputStream(new FileInputStream(filename...));'

This works fine for files saved as text, but not for files in .doc format.

Any help would be much appreciated.

Please respond to lpm@lpmconsulting.com
 
Hi!

It is very difficult but not impossible :).
Which version? Versions 1.0, 2.0, 6.0, and 7.0 (and above) have different structures.
MS Word files are not plain text files but binary files with a special internal structure, like a little file system.
You should search for the Word file format specification on the net or on the Microsoft Developer Network (CD). Once you have found the specification, you just have to write the code :)).
Good luck!
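
As a first step before any real parsing, it can help to check which kind of file you actually have. The sketch below is only an illustration, not a Word parser: it assumes the .doc was saved as an OLE2 compound file (the "little file system" container, used by Word 97 to 2003; older releases may differ) and tests for that container's 8-byte signature, D0 CF 11 E0 A1 B1 1A E1. The class and method names (WordFormatSniffer, isOle2Header, isOle2File) are made up for this example.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper: detects the OLE2 compound file container, not Word content itself.
class WordFormatSniffer {
    // Signature found at offset 0 of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes start with the OLE2 signature.
    public static boolean isOle2Header(byte[] header) {
        return header != null
            && header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first 8 bytes of a file and checks them against the signature.
    public static boolean isOle2File(String filename) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        try (DataInputStream in = new DataInputStream(new FileInputStream(filename))) {
            in.readFully(header);
        } catch (EOFException e) {
            return false; // file is shorter than the signature
        }
        return isOle2Header(header);
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": "
                + (isOle2File(name) ? "OLE2 container" : "not an OLE2 container"));
        }
    }
}
```

A file that passes this check still has to be decoded according to the format specification. A file that fails it is either plain text, some other format, or an older Word format with a different layout.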

 
When you want to read Word documents, I think it's best to use a COM-to-Java bridge. A tool like Linar JIntegra or Microsoft's Jactivex lets you create wrapper classes for the type library of a COM component (such as MS Word). When you import these wrapper classes into your project, you can, for example, start up an instance of Word, open the file you want, and perform any action on it as if you were in Word itself.

Hope this helps,
Evert
 