Now I'm really desperate 1

Status
Not open for further replies.

ironpawz

IS-IT--Management
Oct 8, 2002
NZ
I have been scanning the web and books and annoying my friends, but I can't fix it. I have posted questions in many discussions and all were ignored. I am thinking someone must know, or at least have a lead to more info, so I am being bad and reposting it here.

Loads of companies change their names often (especially the gov). A tool that goes over a bunch of docs and a) records the last modified date, b) changes the company name, and c) resets the last modified date would be very handy. Our docs have had four different company names in 4 years! As soon as I can actually change that one field (and the script is almost there) I'll finish it and make it available.

I'm not a very good programmer so the fix is probably simple, please help if you can. I don't want to try doing it in VB (yuck).

The original code was adapted from (which does not quite work). You need the dsofile add-on installed (available from that link, or at least explained there). Is it likely that I am not using the functions the update expects? The section marked ###### is the part that does not work: it fails if I put in a fail condition, and as written it returns no error but has no effect.

Any leads, anything? I really am getting desperate, and VB is my last choice. Please save me from VB!!

use strict;
use Win32::OLE;
use File::Spec;


my $PropertyReader = Win32::OLE->new('DSOleFile.PropertyReader', 'Quit');

my $directory = "c:\\temp\\cst";


opendir(DNAME, $directory) || die "Unable to open the requested directory: $directory\n";

while( my $filename = readdir( DNAME ) ) {

    next if ($filename eq '.' or $filename eq '..');

    my $fullfilename = File::Spec->catdir($directory,$filename);
    my $title = "title";
    my $properties = $PropertyReader->GetDocumentProperties($fullfilename)
        || die("Unable read file properties for '$fullfilename' ", Win32::OLE->LastError());

    if ( !$properties->{title} || length($properties->{title}) >= 0)
    {
        print "File '$filename' --- Title not set setting to '$filename'\n";

        $properties->SetProperty('Title', $filename); #######not working#########

        print "Title =" . $properties->{title} ."'\n";
        print "\n\n";

    } else {
        print "File '$filename' --- Title property is set to '" . $properties->{title} ."'\n";

    }

}

closedir(DNAME);
 
What about adding a
Code:
|| die("Unable set 'Title' for '$filename' ", Win32::OLE->LastError());
after you call the SetProperty method. Does that return anything useful?

jaa
 

I just get the output below. The file is not read-only and I definitely have permissions to it. There is also a way to increase the debugging, isn't there? I'll give that a go as well.

Unable to change title: c:\temp\cst
 
Are you certain Win32::OLE and File::Spec are both installed?

I'm not familiar with Win32::OLE. What exactly is
Code:
SetProperty('Title', $filename)
supposed to do? Are you sure it is a method of the object created by
Code:
GetDocumentProperties()
?

I may be wrong, but I don't think
Code:
!$properties->{title}
will ever evaluate true if
Code:
$properties->{title}
was undefined since Perl will automatically create the reference for you. Try
Code:
!defined $properties->{title}
instead.

Also, if
Code:
length($properties->{title}) >= 0
then it seems to me that the "title" does exist, so the following line that says it's not set does not make sense to me.
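
For anyone comparing the three tests mentioned above, here is a minimal sketch of how they behave on a plain hash reference. It assumes an ordinary hash stands in for the properties object (the OLE object may not behave identically), and title_is_unset is a hypothetical helper name, not something from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the title is missing, undef, or empty.
sub title_is_unset {
    my ($props) = @_;
    my $title = $props->{title};
    return !defined($title) || length($title) == 0;
}

my $missing = {};                       # no title key at all
my $empty   = { title => '' };          # key present, empty value
my $named   = { title => 'Report' };    # key present, real value

# Reading a missing key returns undef; it does not create the key.
print "missing is unset\n"    if title_is_unset($missing);
print "key was not created\n" unless exists $missing->{title};
print "empty is unset\n"      if title_is_unset($empty);
print "named is set\n"        unless title_is_unset($named);

# length() is never negative, so a ">= 0" test matches every defined title.
print "'>= 0' matches a real title too\n" if length($named->{title}) >= 0;
```

On a plain hash, a simple read of a missing key comes back undef without creating the key, so both the negation test and the defined test can fire; only the ">= 0" comparison is unable to tell titles apart.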

How about this... after the line
Code:
my $properties =...
, go ahead and print out all of the values that object is supposed to hold, just to see what they are before you test against or change them.

Try checking out faq452-3023 for help debugging your Perl script. I slanted it towards CGI debugging, but there's some basic debugging techniques in there.

I hope this at least gives you some ideas!
Sincerely,

Tom Anderson
CEO, Order amid Chaos, Inc.
 
I noticed that you have set $title to "title", yet in your script you set and interrogate $properties->{title}.

This makes me suspicious: you are setting and reading a reference to a hash, i.e. $properties = {}; # A ref to a hash.

In my experience you need to use $properties->{$title} or, if you are using an ordinary word, enclose it in single quotes, i.e.

$properties->{'title'}.

What I'm even more suspicious about, though, is the fact that use strict has not complained sufficiently to make you aware of this. Or maybe your script is just silently ignoring $properties->{title}.

One way of helping with script debugging (apart from running perl with the -w switch, as in perl -w script.pl) is to add

use diagnostics -verbose; to the top of your script.

Anyway see if my suggestions help.
 
greadey,

hash keys are allowed to be bare words, even under the strict pragma.
So
Code:
use strict;
my %hash = ( title  => 'my title',
             author => 'me',
           );
my $properties = \%hash;       # a reference to the hash above
print $properties->{title};    # bare-word key, no quotes needed
is perfectly valid syntax.

ironpawz,

can you post the exact error message you are getting? As Tom pointed out, are you sure you are entering that part of the if-statement?

jaa
 
justice41
Sorry, I didn't read your message clearly enough, so I tried again with your suggestion. I also tidied up the script a bit, as it was causing other confusion. I now have:

use strict;
use Win32::OLE;
use File::Spec;


my $PropertyReader = Win32::OLE->new('DSOleFile.PropertyReader', 'Quit');

my $directory = "c:\\temp\\test";

opendir(DNAME, $directory) || die "Unable to open the requested directory: $directory\n";

while( my $filename = readdir( DNAME ) ) {

    next if ($filename eq '.' or $filename eq '..');

    my $fullfilename = File::Spec->catdir($directory,$filename);
    my $filenam3 = "somename";
    my $properties = $PropertyReader->GetDocumentProperties($fullfilename)
        || die("Unable read file properties for '$fullfilename' ", Win32::OLE->LastError());

    if ( !$properties->{title} || length($properties->{title}) == 0) # changed to only get empty title docs
    {
        print "File $filename --- Title not set setting to '$filenam3'\n";

        $properties->SetProperty('title', $filenam3)|| die("Unable set 'Title' for '$filename' ", Win32::OLE->LastError());

        print "title $properties->{title} ";
    }
}

closedir(DNAME);

The output I get is this:

File New Microsoft Word Document.doc --- Title not set setting to 'somename'
Unable set 'Title' for 'New Microsoft Word Document.doc' Win32::OLE::0.0810 error 0x80020006: "Unknown name"
in GetIDsOfNames "SetProperty" at C:\Test.pl line 25.

 
Tanderso
Win32::OLE and File::Spec: both are installed; we use 'em for lots of other things. My Perl is still very basic, but the main department creates scripts for loads of things and I see them using both in scripts, so they must be there.

>What exactly is SetProperty('Title', $filename)

It is supposed to set the title of a Word document (open it and go to Properties to see the title) to $filename (whatever.doc would have a title of whatever.doc).

!defined $properties->{title}: I'm not following. How will that set the property, or is that supposed to clear the current title? Sorry, I'll get there (I'm mostly still at the cut, paste, modify, read-the-book stage).

>= 0 I have changed this back to ==. I was only trying to make it run against all docs including those with current titles. I had it as if 0 == 0 at one point <G>.

my $properties = $PropertyReader->GetDocumentProperties($fullfilename);
print "\n";
print $properties;
print "\n";

returns Win32::OLE=HASH(0x888818)

Thanks I'll check out the debug.
 
greadey
I'm not quite following all that.

I have an update from another thread ("please help simple script error" in the general discussion; I am ironpaw there). A great chap ran the script and it works fine. It might be my older Perl version (5.00502). I am going to build a system to his spec (version 5.6.1 or later) and try again. Also, his dsofile is an older one than mine, so I'll build one all the same (only on 2000 versus 98, but I'll even go 98 if I have to).

Perl version seems the most likely problem then?

I got the original code from perlmonks

I am picking all is fine (what I originally posted had some post first try fiddling from me) with the original perlmonk code and it is my system? I'll test and repost results.

Thanks for helping, chaps, the debugging advice especially.

Expect that I'll miss basics like older Perl versions and you'll not be disappointed <G>.

Ironpaw(z)
 
I was able to get something similar to work fine using [tt]$properties->{'title'} = $filenam3;[/tt] whereas [tt]$properties->SetProperty()[/tt] fails with $properties being an unblessed reference (a different error, which might have something to do with version differences).
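
A rough sketch of that approach, for reference. It assumes Windows with Win32::OLE and the dsofile add-on registered, and it is untested here; set_title and the example path are hypothetical names used only for illustration. The module is loaded at run time so the script still compiles on a machine without it.

```perl
use strict;
use warnings;

# Sketch only: needs Windows, Win32::OLE, and the dsofile add-on registered.
# set_title() is a hypothetical helper; it returns 1 if the assignment ran, 0 otherwise.
sub set_title {
    my ($path, $title) = @_;

    # Load Win32::OLE at run time so the script still compiles elsewhere.
    return 0 unless eval { require Win32::OLE; 1 };

    my $reader = Win32::OLE->new('DSOleFile.PropertyReader') or return 0;
    my $props  = $reader->GetDocumentProperties($path)       or return 0;

    # Assign through the hash interface instead of calling SetProperty().
    $props->{title} = $title;
    return 1;
}

# Hypothetical path, for illustration only.
print set_title('c:\\temp\\test\\example.doc', 'somename')
    ? "title assigned\n"
    : "title not assigned\n";
```

If the assignment seems to do nothing, Win32::OLE->LastError() can be printed afterwards, as in the scripts earlier in the thread.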
 
rosenk

You are a champion. I have been at this for ages and it has been bugging me, especially seeing as it was so likely to be a small error. You are 100% correct: it works like a charm. THANK YOU THANK YOU THANK YOU

greadey
I see you were on track; I just couldn't quite grasp what you were saying (give me a few months and I'll be sweet).

Amazing how a little syntax can screw with your whole day.

I'll send the final result and do a tek-tip if I can ever figure out how to do one (I had a quick look, which should have been enough, but apparently not).

OHHH YES, half a page of code to complete a job that took so long in VB and didn't keep the date properties (though I realise it could have if I'd asked for it at the time :).

You've made my week people thank you all.
 
Yes, cooking with gas! This script will go through an entire directory structure and change the company property of every document.

My perl is pretty weak so there's a good laugh in here for the rest of you but it works and I am real happy about it! It is reasonably fast also. Test hard!

use File::Find;
use Win32::OLE;

$feeddir = $ARGV[0];
$feeddir =~ s/\\/\//; # If the dir is given using \ (like c:\temp) then substitute
$feeddir =~ s/\\/\//; # there is something strange here: it only works one dir
$feeddir =~ s/\\/\//; # deep, so c:\temp works but c:\temp\temp doesn't
if ($feeddir == " ")
{
    system("cls");
    print "$feeddir\n";
    print "\n\n\n\nNo directory specified\n\n";
    print "You need to specify a directory to search down EG... csvdup.pl c:\\csvdir\n\n\n\n\n";
    die;
}

chdir("$feeddir"); # Change working directory to the now corrected directory given on the command line

find(\&wanted, "$feeddir"); # use the built in find command with the wanted subroutine and send it the $feeddir variable

sub wanted # the wanted subroutine is used to select only documents and change the company name

{
    # pass over any file not ending in doc/DOC/xls/XLS etc

    /\.doc$/ or /\.DOC$/ or /\.xls$/ or /\.XLS$/ or /\.ppt$/ or /\.PPT$/ or /\.dot$/ or /\.DOT$/ or return;
    $filename = $File::Find::name;

    my $PropertyReader = Win32::OLE->new('DSOleFile.PropertyReader', 'Quit');
    my $comp = "Department of Child Youth and Family";
    print "$filename\n";
    my $properties = $PropertyReader->GetDocumentProperties($filename)
        || die("Unable read file properties for '$filename' ", Win32::OLE->LastError());

    if ( !$properties->{title} || length($properties->{title}) >= 0)
    {

        # heart of the file Record filename / current company field / current date
        # change the company name and print file name again and company name

        print "$properties->{'DateLastSaved'}\n";
        $properties->{'company'} = $comp || die("Unable set 'Title' for '$filename' ", Win32::OLE->LastError());


    }
}
closedir(DNAME);


I wrote a script I run before this one to capture all the current access dates for the files and another to set them all back afterwards. If you get here and want em try

bruce_taylor005@cyf.govt.nz
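
For readers who only want the idea behind the date scripts, here is a minimal sketch of recording and restoring file times with Perl's built-in stat and utime. It is not the author's script: get_times and restore_times are hypothetical helper names, and the demonstration runs on a throwaway temp file rather than a real document.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the (atime, mtime) pair for a file.
sub get_times {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat '$path': $!";
    return @st[8, 9];
}

# Hypothetical helper: put a saved (atime, mtime) pair back on a file.
sub restore_times {
    my ($path, $atime, $mtime) = @_;
    utime($atime, $mtime, $path) or die "Cannot reset times on '$path': $!";
    return;
}

# Demonstration on a throwaway file rather than a real document.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
utime(1043183211, 1043183211, $path);      # pretend it was last touched in 2003

my ($atime, $mtime) = get_times($path);    # 1. record the times
open my $out, '>>', $path or die "Cannot append to '$path': $!";
print {$out} "edited\n";                   # 2. change the file
close $out;
restore_times($path, $atime, $mtime);      # 3. put the times back
```

In a real run the recorded times would be written to a log between steps 1 and 3, which is what the separate before and after scripts described above do.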

I will also try a tek-tip if I can figure out how, as it took me ages to do this.

Thanks again rosenk, you were a massive help.








 
Any time. A couple quick tips:

[tt]$feeddir =~ s/\\/\//g;[/tt]

The trailing 'g' will do the replacement globally, not just for one occurrence.

[tt]/\.doc$/i[/tt]

The trailing 'i' makes the regex case insensitive, so no need for both [tt]/\.doc$/[/tt] and [tt]/\.DOC$/[/tt]

There are other regex modifiers as well, but those are certainly the ones I find most useful.
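
A small sketch of both modifiers in action, in case it helps. The helper names to_forward_slashes and is_doc are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every backslash into a forward slash.
sub to_forward_slashes {
    my ($path) = @_;
    (my $fixed = $path) =~ s/\\/\//g;   # /g replaces all occurrences
    return $fixed;
}

# Hypothetical helper: case-insensitive test for a .doc extension.
sub is_doc {
    my ($name) = @_;
    return $name =~ /\.doc$/i ? 1 : 0;
}

print to_forward_slashes('c:\\temp\\temp\\docs'), "\n";   # c:/temp/temp/docs
print is_doc('REPORT.DOC'), is_doc('notes.txt'), "\n";    # 10
```

Without the trailing g only the first backslash is replaced, which would explain a path working one directory deep but not two.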
 
$feeddir =~ s/\\/\//g;

That's a good one. I knew there was a better way; I'll use that.

/\.doc$/i

I did look at this, but then perl searches for
Doc, dOc, DOc, doC, DoC, dOC, DOC
I checked and there are only DOC and doc files. The /i apparently really slows down a big search, and I have 100,000-odd docs in a structure containing probably 300,000 files. Still, I'll only do at most 20,000 at a time. I might speed test the two methods just to see. You are correct of course, but is there a speed hit in running it?

Another thing I had pointed out to me is that
if ($feeddir == " ")
should be
if ($feeddir eq "") # the other was only kind of working so I'll try this also.
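
A quick sketch of why the numeric test only "kind of" works, assuming a typical path such as c:/temp. The helper no_dir_given is a hypothetical name for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when no directory argument was supplied.
sub no_dir_given {
    my ($dir) = @_;
    return !defined($dir) || $dir eq '';
}

{
    # "==" compares numbers. A path like 'c:/temp' and a lone space both
    # convert to 0, so the numeric test calls them equal.
    no warnings 'numeric';
    print "== says equal\n" if 'c:/temp' == ' ';
}

# "eq" compares the strings themselves.
print "eq says different\n" unless 'c:/temp' eq '';
print "nothing given\n"     if no_dir_given($ARGV[0]);
```

With warnings switched on, the numeric comparison also produces an "isn't numeric" warning, which is a handy hint that the wrong operator is in use.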

Thanks rosenk, your advice as always is top quality.
 
I wouldn't particularly expect there to be a performance hit when using case insensitive regular expressions. Perl does not actually compare the string against every possible capitalization mix; instead it compiles the regular expression (into a state machine, I believe).

One thing that might improve the speed a bit would be to use one regex for all the possible extensions:
[tt] /\.(doc|xls|ppt|dot)$/io[/tt]

I'd throw the [tt]o[/tt] regex modifier in there to have perl only compile the regex a single time. If you're running the test a lot that might improve performance, but I'm not certain.
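
Here is a minimal sketch of the single-pattern idea, using qr// to build the pattern once. The names $office_ext and is_office_file are hypothetical. As far as I know, the o modifier only matters when a pattern interpolates a variable, so it is left out of this literal pattern.

```perl
use strict;
use warnings;

# One precompiled, case-insensitive pattern for all four extensions.
my $office_ext = qr/\.(?:doc|xls|ppt|dot)$/i;

# Hypothetical helper: true for names the wanted() routine should process.
sub is_office_file {
    my ($name) = @_;
    return $name =~ $office_ext ? 1 : 0;
}

for my $name ('plan.DOC', 'budget.xls', 'slides.Ppt', 'readme.txt') {
    printf "%-12s %s\n", $name, is_office_file($name) ? 'process' : 'skip';
}
```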

If you run benchmarks on any of this I'd be interested in seeing results!

Keith
 
Ok....

Did a little benchmarking, with surprising results:

Two scripts:

First one using multiple regex's
# reg_test_multi.pl
while(<>){
next if /OPEN/;
next if /CLOSE/;
}

And then using 'or' in the regex
# reg_test_or.pl
while(<>){
next if /OPEN/;
next if /CLOSE/;
}

I ran these scripts over the same 45,000 line file. I did it several times to get rid of any timing problems resulting from the large file being read several times. The scripts were run on an otherwise quiet machine.

So -- not what you'd call a real benchmark, but close enough for jazz as they say.

# timex ./reg_test_or.pl tmp.txt

real 2.51
user 2.08
sys 0.03

# timex ./reg_test_multi.pl tmp.txt

real 0.47
user 0.38
sys 0.02

Using the | character in the, very simple, regex made it run 5 and a bit times slower.

Surprised me, as I say. Moral is -- don't use | in regexes if you want it to go quickly. Mike

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

It's like this; even samurai have teddy bears, and even teddy bears get drunk.
 
Using /\.doc$/ or /\.DOC$/ or /\.xls$/ or /\.XLS$/ or /\.ppt$/ or /\.PPT$/ or /\.dot$/ or /\.DOT$/

on about 530 documents
Start time 1043183211
finish time 1043183223 (12 seconds)

using /\.(doc|xls|ppt|dot)$/

Start time 1043183422
finish time 1043183435 (13 seconds)

Nothing in it, and too close to judge, as anything could slow the system down a few seconds either way (my increased logging probably slowed it a little also). This is on a PII400/256meg so I am happy with performance.

Piping the output to a file is the only true way to time it, as displaying to the screen really slows it down (this run was piped):

C:\>chgfinal.pl c:\temp > logchg.txt

Your other changes are all well advised and it is ready to go. One script to record all files and their access dates, one to change all the company fields, one to change all the access dates back to what they should be, and all with full logging. I'm not desperate now <G>.

perl is the source, thanks again
 
Doh!! Cut and paste error

Second example script should read:

And then using 'or' in the regex
# reg_test_or.pl
while(<>){
next if /OPEN|CLOSE/;
}

Mike

 
I just did a similar test and got approximately the same results as you did, Mike. I then tried the same test, but with the regex [tt]/(open|close)$/[/tt].

This last was still slower than the individual regex tests, but by a much lesser amount (and both were significantly faster.)

The addition of [tt]$[/tt] speeding things up isn't surprising, but I'm certainly surprised at how much slower | is.
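
For anyone who wants to repeat the comparison without timex, here is a rough sketch using the core Benchmark module. The input is synthetic (3,000 made-up lines, not the 45,000-line file used above), and count_multi and count_alternation are hypothetical names. Newer perls handle alternations of literal strings differently from the versions current when this thread was written, so the numbers may not match those reported above.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Synthetic input: made-up lines, one third each of OPEN, CLOSE and other.
my @lines;
for my $n (1 .. 3000) {
    my $tag = ('OPEN', 'CLOSE', 'other')[$n % 3];
    push @lines, "$tag $n";
}

# Count lines that survive two separate tests.
sub count_multi {
    my $kept = 0;
    for (@lines) {
        next if /OPEN/;
        next if /CLOSE/;
        $kept++;
    }
    return $kept;
}

# Count lines that survive one alternation test.
sub count_alternation {
    my $kept = 0;
    for (@lines) {
        next if /OPEN|CLOSE/;
        $kept++;
    }
    return $kept;
}

# Run each version 200 times and print the timings.
timethese(200, {
    two_tests   => \&count_multi,
    alternation => \&count_alternation,
});
```

Both subroutines keep the same lines, so any difference in the reported times comes from the matching strategy alone.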
 