
What's the cleanest way to get Word text into Dreamweaver?

Status
Not open for further replies.

chuckdesign

Technical User
Sep 21, 2001
79
US
This is an example of advancing technologies making life more difficult for me.

I used to import MS Word text into Dreamweaver quite smoothly, with minimal cleanup. I would just save the file as HTML and import into Dreamweaver, and that was it.

Now I'm using Word X for OS X and Dreamweaver MX. Bringing in text is extremely messy!! There are numerous fonts, unnecessary <DIV> tags, extraneous codes, and CSS functions that Dreamweaver does not clean up, even when I use the "Clean up Word HTML" function. On a simple document, I have to spend as much as an hour hand-scouring the code to delete this stuff.

I've tried cutting and pasting text directly from Word into Dreamweaver, but of course that puts everything in one long paragraph.

Help!

-- Chuckdesign :)
 
Cutting and pasting isn't practical for long documents, because it merges everything into a single paragraph, with <BR>'s instead of <P>'s.

I did find that selecting "Clear Formatting" in MS Word's Style Sheet palette helps clear out a lot of the junk.

I'm still left with weird characters in place of apostrophes, ellipses, quotation marks, and dashes. I guess I have to do search-and-replace...unless anyone knows of a better way...? -- Chuckdesign :)
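For anyone who would rather script the search-and-replace than do it by hand, here is a minimal sketch in Python. It assumes the pasted text is already available as a Unicode string, and that HTML entities are the replacement you want; the function name and the mapping are illustrative, not anything built into Word or Dreamweaver.

```python
# Common Word "smart" punctuation mapped to HTML entities.
# Using entities (rather than plain ASCII) is an assumption; adjust to taste.
SMART_PUNCTUATION = {
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2013": "&ndash;",   # en dash
    "\u2014": "&mdash;",   # em dash
    "\u2026": "&hellip;",  # ellipsis
}


def replace_smart_punctuation(text: str) -> str:
    """Return text with each smart-punctuation character replaced by its entity."""
    for char, entity in SMART_PUNCTUATION.items():
        text = text.replace(char, entity)
    return text
```

Characters that are not in the mapping pass through unchanged, so the table can be extended as other stray characters turn up.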
 
(1) Copy and paste into a blank document.

(2) Click Edit, then Find and Replace, and replace <br> with <p>. Make sure you select "code" and "this document only".

(3) Copy and paste into your final document.

Don't ever save as HTML, import, and then try to fix it; it just doesn't work.
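The find-and-replace in step (2) could also be done outside Dreamweaver. The following is a rough sketch in Python, assuming the pasted markup is a string in which paragraphs are separated only by <br> tags; the function name is hypothetical. Unlike a literal replace, it wraps each chunk in a closed <p>...</p> pair.

```python
import re


def br_to_paragraphs(html: str) -> str:
    """Turn <br>-separated chunks of markup into <p>...</p> paragraphs."""
    # Split on <br>, <BR>, <br/> or <br /> (case-insensitive).
    chunks = re.split(r"<br\s*/?>", html, flags=re.IGNORECASE)
    # Drop empty chunks (e.g. from doubled <br> tags) and wrap the rest.
    return "\n".join(f"<p>{chunk.strip()}</p>" for chunk in chunks if chunk.strip())
```

For example, `br_to_paragraphs("One<BR>Two")` gives two paragraphs, one per line.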
 
I do a "get text" in Quark. Then cut and paste into Dreamweaver.
 
Yup! I load the Word doc into WordPerfect, then export that as Rich Text, or HTML...anything but Word's stuff!!! - "Oops! I've joined a club that'll have me as a member?" -
 
Hey people,

I often get BIG Word docs from my customers to be placed on the site. What I do:
1) try to clean up the Word doc somehow before converting;
2) save the Word doc as HTML;
3) open this HTML, go to Commands, choose Clean Up Word HTML...
4) unfortunately this command alone is not enough, so I also go to Commands, Clean Up HTML, check Specific Tag(s) and enter font, span there;
5) after this I do some "handwork" cleaning, such as removing the style attribute from all the tags, cleaning up tables and so on...
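Steps 4 and 5 can be approximated with a script when there are many files to process. This is only a sketch in Python, assuming the saved HTML has been read into a string; it uses regular expressions rather than a real HTML parser, so it may misfire on unusual markup, and the function name is made up for illustration.

```python
import re


def strip_word_markup(html: str) -> str:
    """Remove <font>/<span> tags and style attributes, keeping the text."""
    # Drop opening and closing <font> and <span> tags but keep their contents.
    html = re.sub(r"</?(?:font|span)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop style="..." or style='...' attributes from the remaining tags.
    html = re.sub(r"\s+style\s*=\s*(\"[^\"]*\"|'[^']*')", "", html, flags=re.IGNORECASE)
    return html
```

Tables and other structural cleanup would still need the manual pass described in step 5.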

Anyway, getting info out of a Word doc is not the best way to spend your time [bomb]

And one more thing about Word docs. Have you ever seen the code that Word creates when saving files as HTML??? It is SO big!!! I remember once I had a Word doc of about 350K; when I saved it as HTML I got something more than 1 MB (!!!), and after I optimized it I got something about 70K... No comments... Good Luck! :)
 
The more I think about this post, the more I think the question's upside down...it probably should be "what's the least dirty way to get a Word doc into DW?" [lol] - "Oops! I've joined a club that'll have me as a member?" -
 
When cutting and pasting from Word into Dreamweaver MX I have found that DWMX sees one paragraph mark as a line break, and two in a row as delimiting the end of a paragraph.

It deals with lists in an annoying way with some funny characters, but as long as your paragraphs are split with two paragraph marks the rest is easy to clean up. --
Dunx
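To make that rule concrete, here is a small Python sketch that imitates the behaviour described above: a single line ending becomes a <br>, and two or more in a row end the paragraph. It is an illustration of the rule as observed, not Dreamweaver's actual implementation, and the function name is hypothetical.

```python
import re
from html import escape


def text_to_html(text: str) -> str:
    """Convert plain text to HTML: blank lines split paragraphs, single newlines become <br>."""
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n{2,}", text.strip())
    out = []
    for para in paragraphs:
        if not para.strip():
            continue  # skip empty input
        # Escape &, < and > so the text is safe inside HTML.
        lines = [escape(line.strip()) for line in para.split("\n")]
        out.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(out)
```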
 
I have a weekly job of taking the text from a Word document and updating a web page. I used to have to cut and paste one paragraph at a time. Now that I have upgraded to Dreamweaver MX I've discovered that I can take the whole document and paste it and the paragraphs are preserved, unnecessary space is removed and I only have minor changes to make such as tidying up a table or applying a heading style. It takes me a few minutes only. I don't know why your Word document would be merged into one long paragraph - perhaps you don't have paragraphs set up correctly in the Word document. Is the Word document using styles instead of normal text? Do your paragraphs end with line break markers rather than paragraph markers? Is it still a mess if you cut and paste via Notepad or another text editor?

Because search and replace jobs are really easy in Word even for formatting commands, I think you need to find what the problem is in your Word document and change it before you cut and paste.

A bit of trial and error. :-D
 
I think the problem is that Dreamweaver is set up to accept Word documents set up the way most people use Word, and not in the way that you are &quot;supposed&quot; to.

Dreamweaver treats paragraph marks as line breaks. Some people will have Word set up so that they use paragraph marks as just that - they set the space between paragraphs and use shift-return to enter line breaks. Most people use the default settings with no spacing between paragraphs, and use empty paragraphs to break up their text. Indeed, this is how you would create a distinct paragraph in the majority of text editors.

If you do use spacing between paragraphs rather than empty ones, the best way of copying and pasting into Dreamweaver is first to do a Find and Replace in Word, replacing every paragraph mark (represented by ^p in the text box for Find) with two paragraph marks (^p^p in Replace). Then when you copy and paste Dreamweaver will see the paragraphs. Then you can Undo in Word, and your document is back to normal.

Also, there is an option in Word to save proper HTML without all the gunk. When you select Save As, look for Web Page, Filtered in the file format drop down. Apparently (I haven't really tried this) this will save as clean HTML with no extra tags. MS suggest you only save like this when you have finished editing the document. --
Dunx
 
Thanks for your advice. I don't think MS Word X for Mac has the "filtered" HTML option, but I'll keep digging.

But I tried cutting-and-pasting a Word doc with THREE (not two) paragraph spaces between each paragraph, and that worked beautifully! Except for bullets, which are still problematic, the text now comes in as separate paragraphs!

A bit of a pain, but at least it's a solution. -- Chuckdesign :)
 
Word 2000 for Windows doesn't have "filtered" HTML either.

You may copy-paste documents that have simple text, but if your docs have plenty of big tables - the only solution is to save as HTML and then clean it... Good Luck! :)
 
You might want to try this technique. It works for me every time. (I update an existing dreamweaver page with new text from a Word doc every month.)

1. Using MSWord, save the doc using the "Save as htm" command.

2. Close MSWord and forget it.

3. Using Dreamweaver, open the page that will receive the Word text. Leave it open.

4. Using Dreamweaver, also open the htm file that you just created from within MSWord.

5. Copy (or cut) and paste the MSWord created htm text into your destination page.

Some tabs may not hold well, but for the most part you'll have a properly spaced and formatted Dreamweaver page when you're done.

Now for tables: If your MSWord doc contains a table that spans a page break you might as well forget it. I've found nothing that will enable you to clean up the Dreamweaver page after you copy an MSWord multi-page table into Dreamweaver. Better to take the table into Excel (so there are no page breaks) and then copy it into Dreamweaver.

Best of luck to all!
 
So you're saying that you don't clean up the code you get from Word? I personally always do it (even if it takes a lot of time) and get files at least 4 times smaller. Good Luck! :)
 