Redaction: Batch processing

Status
Not open for further replies.

notoriusbug

Technical User
May 10, 2010
5
CA
Hi,
I need to redact a large number (thousands) of documents spread out over 100 folders. I know how to use the Apply Redaction Marks command to accomplish this. I've got my batch sequence set up so that my new copies of the documents (the ones with the redaction marks applied) are named r_<xxxxx>.pdf.

However, I only want new documents created for files that actually have redacted information in them; i.e., if a document doesn't have any redaction marks, then an r_<xxxxx>.pdf document shouldn't be created for that document.

Any ideas/thoughts would be greatly appreciated, as this could save countless hours.

Andy
 
I think you'll have to put your logic into your redaction macro, so that it searches for the text to redact and, if it finds some, redacts and does a 'Save As' prepending R_ to the filename; if it doesn't find any, it exits without saving. I've been looking at the DOS FC command, and it wants to give a list of mismatched bytes, but doesn't seem to have a simple yes/no output to indicate Same or Different.

Fred Wagner

 
Thanks, FredWagner.

I re-read my initial post and I should clarify: the way the batch process is going now, an r_ document is being created for every document, regardless of whether it has any redaction in it. The goal is to only generate an r_ document if something was actually redacted.



 
I was thinking through two approaches. The first was to compare the input and output files in the batch process, and discard the output if it is not different from the input. You could try that: the DOS command is FC, and you get the arguments by entering FC /? at the command prompt.
The other would be to have some logic within Acrobat, or in the parameters you're passing it in the batch, to only redact if it found what it was looking for to redact.
Or you could just leave the process as is. The queries that look for redacted files (with the R_ prefix) will find something every time, whether there was actual redaction done or not. Disk space is cheap, getting cheaper!
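
The first approach (compare input and output, discard the output if it is not different) could be sketched in Python rather than with FC. This is only a sketch under assumptions not stated in the thread: each r_ copy sits in the same folder as its source document, and the function name `remove_unchanged_copies` is made up for illustration. Acrobat may also rewrite a file when it saves it, so an unredacted copy is not guaranteed to be byte-identical to its source; try it on a few files first.

```python
import filecmp
from pathlib import Path


def remove_unchanged_copies(root, prefix="r_"):
    """Delete prefix-named PDFs that are byte-identical to their source file.

    Assumes each copy lives next to its source (hypothetical layout).
    Returns the list of deleted paths.
    """
    deleted = []
    for copy in sorted(Path(root).rglob(prefix + "*.pdf")):
        # r_12345.pdf -> 12345.pdf in the same folder
        original = copy.with_name(copy.name[len(prefix):])
        if not original.is_file():
            continue  # no matching source document; leave the copy alone
        # shallow=False forces a byte-by-byte comparison of the contents
        if filecmp.cmp(original, copy, shallow=False):
            copy.unlink()
            deleted.append(copy)
    return deleted
```

Calling `remove_unchanged_copies(r"C:\docs")` (path is a placeholder) walks every subfolder, so the 100 folders can be handled in one pass.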

Reading up on the redaction topic on Adobe.com, I found an article cautioning that redacting an Acrobat file will leave metadata that a motivated person can recover. If you're really going to redact and make it stick, then once you've replaced the material to redact with other characters, or blacked it out, save all the pages as TIF images and reassemble the TIFs into a new PDF that is totally without metadata. If the redacted file is searchable, then the redacted material CAN be recovered by anyone who really wants it.

Fred Wagner

 
Thanks again FredWagner.
However, this morning I realized there is a much easier solution to my problem. After the batch process is finished, it provides a list of documents where no redaction marks were found. Using that list, I can go into my folder and just delete the r_ documents that were created for the documents in the list.
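
That cleanup could be scripted along these lines. It is a sketch with assumptions beyond what the thread says: the list has been saved by hand to a text file with one document name or path per line, and `delete_listed_copies` and its parameters are hypothetical names. It defaults to a dry run so the matches can be reviewed before anything is deleted.

```python
from pathlib import Path, PureWindowsPath


def delete_listed_copies(root, list_file, prefix="r_", dry_run=True):
    """Find (and optionally delete) the prefix-named copy of each listed document.

    list_file is assumed to hold one original file name or path per line.
    With dry_run=True nothing is deleted; the matches are only returned.
    """
    # Keep only the file-name part; PureWindowsPath accepts both \ and /
    names = {
        PureWindowsPath(line.strip()).name.lower()
        for line in Path(list_file).read_text().splitlines()
        if line.strip()
    }
    matches = []
    for copy in sorted(Path(root).rglob(prefix + "*.pdf")):
        if copy.name[len(prefix):].lower() in names:
            matches.append(copy)
            if not dry_run:
                copy.unlink()
    return matches
```

Because matching is by file name only, two documents with the same name in different folders would both be treated as listed; if names repeat across folders, compare full paths instead.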

As far as metadata goes, Acrobat 9 actually includes an extra feature that will remove metadata when you apply the redaction marks.
 
If the batch process provides the list, you could just add a bit more logic: have it pause for you to review the list, and then let it delete the redundant files for you. I'm running Acrobat Pro version 6. I will ask for an upgrade to 9 next time they replace/upgrade my PC, and will be getting Windows 7 then too.

Fred Wagner

 