Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

how to get VB to split big word doc into (auto named) smaller ones

Status
Not open for further replies.

6ftAndClean

Programmer
Sep 24, 2003
19
GB
Assistance please.

I require assistance with some Visual Basic (V.6.0) macro programming for Word 2000 under Win 2000.


THE BIG PICTURE:

I have the output of a query of a database which is a 6.8MB text file. The contents of the text file is a list of possible duplications of data within the database. Each line of text is an individual entry from the d/base. So in the simplest situation a potential duplication is shown by 2 lines of nearly identical text, folowed by an empty line, followed by the next potential duplication, etc. etc. There are occurences of 30 lines (or more) of text for 30 (or more) possible individual records which are all potential duplicates of each other.

This data needs to be sorted by eye.

MY SOLUTION SO FAR:

To aid this process I have sought to cut the work load up into manageable bite-sized chunks of 500 potential duplications per file.

So, I have used 'Find and Replace' in Word 2000 (I have a limited range of software available due to network restrictions) to replace the empty lines between each group of potential duplicates with a manual page break. This gives me a page per group of possible duplicate records.

I have then created a macro to select the 1st 500 pages (i.e. 1st 500 potential duplicate records), copy the text, open a new document based on a selected template and paste the contents. I then use 'Find and Replace' again to return the text to its original formatting of one empty line each group of lines of text.

Finally I save the document to a folder "To Do" and number the records 01.txt, 02.txt, etc.

All of the above I have running via recorded macros BUT for the saving part.

HERE'S WHERE YOU COME IN - hopefully!

Can I get my macro to copy the contents of the 1st 500 pages, paste it into a new document, save the document (automaticallygiving it a numerical name - the 1st being 01, the next 02 etc), close the new document, return to the original document, delete the still highlit text and then start the loop over and run automatically until it runs out of document to copy and paste from?

Your help would be very much appreciated.
 
Yes you can.

But in order to make things easier I think we would best try to integrate it with your macro coden as you already have part of the code done, so if you could post it here..


Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
the macros I have created so far appear at the end of this posting. I now offer this breakdown of what the 2 macros do and what I needs help with.

part1 - does the initial import (of data previously copied to the clipboard) and seperates the blocks of data to one per page. it then does a paste of the 1st 500 pages into a new doc and removes the page breaks from it.

at this point the document needs to be saved in a 'Text Only With Line Breaks' format. It also needs to be named 01.txt. The document then needs to be closed and focus switched back to the 1st doc.

part2 - (with focus back in the original doc) this macro removes the 1st 500 pages part1 just exported. (This is where the loop picks up - the code for this macro is a copy of the latter part of part1 macro). Copies and pastes 1st 500 pages to a new doc and removes the page breaks from it.

at this point the latest new doc needs to be saved, same file format but named 02.txt. The document then needs to be closed and focus switched back to the 1st doc. Repeat macro part2 until 1st document has been exhausted. On each loop the file name should be incremented to the next higher value (i.e. 01, 02, 03, 04, etc.).



Code:
Sub part1()
'
' part1 Macro
' Macro recorded 25/09/2003 by nick
'
    Documents.Add Template:= _
        "C:\Documents and Settings\dedpdpa\Application Data\Microsoft\Templates\Dedupe Page Splitter.dot" _
        , NewTemplate:=False, DocumentType:=0
    Selection.Paste
    Selection.HomeKey Unit:=wdStory
    Selection.MoveDown Unit:=wdLine, Count:=1
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^p^p"
        .Replacement.Text = "^m"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.HomeKey Unit:=wdStory
    Selection.GoTo What:=wdGoToPage, Which:=wdGoToNext, Count:=500, Name:=""
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "^p^p"
        .Replacement.Text = "^m"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.MoveUp Unit:=wdLine, Count:=1
    Selection.EndKey Unit:=wdLine
    Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
    Selection.Copy
    Documents.Add Template:= _
        "C:\Documents and Settings\dedpdpa\Application Data\Microsoft\Templates\Dedupe Page Splitter.dot" _
        , NewTemplate:=False, DocumentType:=0
    Selection.Paste
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^m"
        .Replacement.Text = "^p^p"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub




Sub part2()
'
' part2 Macro
' Macro recorded 25/09/2003 by nick
'
    Selection.Delete Unit:=wdCharacter, Count:=1
    Selection.Delete Unit:=wdCharacter, Count:=1
Selection.HomeKey Unit:=wdStory
    Selection.GoTo What:=wdGoToPage, Which:=wdGoToNext, Count:=500, Name:=""
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "^p^p"
        .Replacement.Text = "^m"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.MoveUp Unit:=wdLine, Count:=1
    Selection.EndKey Unit:=wdLine
    Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
    Selection.Copy
    Documents.Add Template:= _
        "C:\Documents and Settings\dedpdpa\Application Data\Microsoft\Templates\Dedupe Page Splitter.dot" _
        , NewTemplate:=False, DocumentType:=0
    Selection.Paste
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^m"
        .Replacement.Text = "^p^p"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Thanks for any help you can offer me. I just do not have the time available to learn VB sufficiently well to tackle this problem myself (nor are the help files available over the network!).

Thanks again.
 
Hi.
I have a similar situation.

I have got one word file that contains many mini reports. What I need to be able to do is to split the file into the various reports.

At the beginning of each report there is an identifier to indicate a new report: center 1234, center 2100, etc.
Each report has the same header.

So what I want to do is for the reports to split at each change in center keeping the report header with each report and then automatically save (preferred format is .pdf format) using the center number as the filename in a specified folder.

I would really appreciate help with this please.

Many thanks.

 
Fusion786,

Please create a new thread with your question.

The split over another .doc is possible and kind of easy, but doing a ".pdf" would imply a lot more time than what I am willing to spend, bearing in mind that you can do that automatically with pdfFactory (pdffactory.com) and other similar products.


6ftAndClean,

Sorry for a late answer.
I am on my holidays now and I will try and have a good look at your code during this week.




Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Cheers again Frederico,

I hope you do not mind but I have one other thing that the VB would ideally do. Could it prompt the user to specify/create an output folder? If so, great!

Enjoy your week off. Please don't spend any of your precious holidy working on this!

Have fun,

Nick.





 
Hi

Thanks Frederico

Enjoy your holiday.

If I can have help with the coding to be able to split the documents that would be great.

Saving the output into pdf is an ideal situation but not essential at this stage. I will have a look at pdffactory.

Being able to automatically split the reports and saving them even in the current word .doc format is ok.

All help is appreciated.

Thanks.
 
Hi.

I have created a new thread with my question. The thread is thread68-667786

I have been suggested a coding which enables to split the reports but they split at each page rather than at the beginning of each report as some reports go over a few pages.

An explanation and the coding can be seen at the above thread.

Any help and advice is appreciated.

Thanks
 
6ftAndClean

Here is my solution.
Your code was not bad, but was adding to much to it.

The following code assumes that you have your file loaded into an empty word document. This document will not be saved, and everything on it will be "lost".

There is one bit that I don't like (the repaginate) but this is required for the page count to work well.

If you do not understand a particular part of it just ask.

Here is the code.
Code:
Option Explicit
Const SplitPages = 501
Public filenumber As Long
Public basename As String

Sub main_macro()
'First clear everything and replace two sequencial linefeeds by a page break.
Selection.HomeKey unit:=wdStory
Selection.Find.ClearFormatting
With Selection.Find
    .Text = "^p^p"
    .Replacement.Text = "^m"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
    
' now we are ready to process the main document
' This is done a loop made of the following steps
' 1 - get a chunk of 500 pages at one go
' 2 - cut it
' 3 - paste it onto a new document
' 4 - replace the page breaks with 2 linefeeds
' 5 - save and close the new document
filenumber = 1
While ActiveDocument.ComputeStatistics(wdStatisticLines) > 0
    process_document
Wend
End Sub

Sub process_document()
Dim totalpages As Long
Dim totallines As Long

'Total lines is used to determine if we have processed all the document
'Total pages is used to determine if we have more than the number of pages
'we wish to process. If we do we need to go to the end of the document instead
'of a particular page
ActiveDocument.Repaginate
totalpages = ActiveDocument.ComputeStatistics(wdStatisticPages)
totallines = ActiveDocument.ComputeStatistics(wdStatisticLines)

If totallines = 0 Then
    Exit Sub
End If

If totalpages > SplitPages Then

    Selection.GoTo what:=wdGoToPage, Which:=wdGoToAbsolute, Count:=SplitPages, Name:=""
Else
    Selection.EndKey unit:=wdStory
End If

Selection.HomeKey unit:=wdStory, Extend:=wdExtend
Selection.Cut
Documents.Add Template:="c:\tmp\demo.dot", NewTemplate:=False, DocumentType:=0
Selection.Paste
Selection.HomeKey unit:=wdStory
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
    .Text = "^m"
    .Replacement.Text = "^p^p"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
If filenumber = 1 Then
    With Dialogs(wdDialogFileSaveAs)
        .Format = wdFormatTextLineBreaks
        .Name = "001.txt"
        .Show
    End With
    basename = ActiveDocument.Path
Else
    ActiveDocument.SaveAs FileName:=basename & "\" & Format(filenumber, "000") & ".txt", fileformat:=wdFormatTextLineBreaks
End If
ActiveDocument.Close
filenumber = filenumber + 1

End Sub


Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top