Problem with Blank spaces when using the Unpack Function.

Geek8 · Mar 31, 2004

The problem I am having with the UNPACK function is the following:

I am currently unpacking a record of 220 bytes into 3 variables. The first is to be left untouched, the second has some search and replace stuff going on in it, and the third is untouched. When I print out the results after all the manipulation, I notice that the blank spaces in the 3rd variable are not there anymore. Does "unpack" do away with those? How can I keep them so that my byte size per record stays the same?

Here is the code I am using for unpack. If you have any suggestions, please let me know.

while (<INPUTFILE>)
{
my $begin;
my $names;
my $end;
chomp;
($begin, $names, $end) = unpack('A10 A80 A129',$_);

Thanks in advance for your help.

Geek

Coderifous · Mar 31, 2004

No, unpack doesn't take out spaces... when you output the record, are you making sure to add back the new-line that you are removing w/ the chomp?

This is going to be a forehead smacker. Can we get a shot of the code from the birth of $end (which I believe we see here) to the output of $end, where you are saying the symptom is noticeable? Perhaps something silly is happening in between those two spots in the code.

--jim

Geek8 · Mar 31, 2004

Thanks Jim for your quick response. Here is the code I am using. Actually in my previous post I made a mistake and said the "end" has the blanks, but the blanks are actually in the "names" record. Could it be the actual search and replace that is causing the blanks to vanish? If you have any other suggestions, please let me know.

while (<INPUTFILE>)
{
my $begin;
my $names;
my $end;
chomp;
($begin, $names, $end) = unpack("A10 A80 A129",$_); #get values from record#
my $i=0;

for ( $i=0 ; $i < 19 ; $i++ )
{
#Check to see if values are in the name fields#
my $search1;
$search1 = $SEARCH[$i];
my $replace1;
$replace1 = $REPLACE[$i];
my $newrec = $names;
$newrec=~s/$search1/$replace1/g;

$names=$newrec
#print OUTPUTFILE "2nd Output: $record[1]\n";#
}
print OUTPUTFILE "$begin$names$end\n";
$i++;
}

Coderifous · Mar 31, 2004

Below is a minor rewrite of your code to maximize the clarity of intent.
The code does multiple search-and-replace passes on EACH record and outputs the result. It appears you are storing your search and replace regexes in two arrays right now. I would argue that there is probably a better way to accomplish your goal, but that's not to say it shouldn't work this way. So with that in mind:

Code:

while (<INPUTFILE>){
    my($begin, $names, $end)  = unpack("A10 A80 A129",$_);

    for( my $i = 0; $i <= $#SEARCH; $i++) {
        die "Sync error" unless defined $SEARCH[$i] && defined $REPLACE[$i];       
        $names =~ s/$SEARCH[$i]/$REPLACE[$i]/g;
    }
      
    print OUTPUTFILE  "$begin$names$end\n";
    
}

Now, from this code, there is no doubt in my mind that your spaces are accidentally being removed during the substituting regex. So at this point, it is necessary to reveal the contents of the SEARCH and REPLACE arrays, and let's have a look at the regexes in them to shut this problem down.

--jim

Coderifous · Mar 31, 2004

You know what, a good way to debug this might be to put a

Code:

print "$SEARCH[$i] / $REPLACE[$i] ---> $names\n";

inside of the for-loop AFTER the substitution. This way you could watch the value change as the code executed, and possibly figure it out quite speedily.
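Trailing blanks are hard to spot in plain print output, so a variation on that debug line is to wrap the value in brackets and print its length. This is only a sketch: `show` is a hypothetical helper and the sample values are made up, not taken from the real data.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script: wraps a value in
# brackets and reports its length so trailing blanks become visible.
sub show {
    my ($label, $value) = @_;
    return sprintf '%s: [%s] (%d bytes)', $label, $value, length $value;
}

my $names = 'SMITH JOHN';                        # made-up sample value
print show('stripped', $names), "\n";            # 10 bytes, no padding
print show('padded', sprintf('%-15s', $names)), "\n";  # padded to 15 bytes
```

Calling something like `show` right after the unpack and again after each substitution would show at which step the field gets shorter.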

--jim

Geek8 · Mar 31, 2004

That code upgrade is great. Thanks for making it a lot more concise. I am pretty new to Perl as I am sure you can tell, but this helps a lot to learn. I know I need to learn "hashing" but basically the contents of the search and replace arrays are the following:

Search is:
my @SEARCH = (" SEP IRA "," IRA C "," IRA R "," CUST FOR "," TRS FOR "," JTWROS "," JT TEN "," JTTEN "," ROTH IRA "," UGMA/WA "," OR "," TTEE "," TTEES "," TEN COM "," TRUSTEES OF ",", IN TRUST FOR ",", TRUSTEE ",", TRS ",", TRUSTEES OF ");

and Replace is actually just the spaces for those values, so if we want to replace " IRA C ", the corresponding value would be a string of spaces of the same length. Does that make sense? I just want to basically delete those values, but not adjust the record byte count by removing them.

Sameer

Coderifous · Mar 31, 2004

I'm glad this is helpful for you. Learning is a fun experience. A lot of it seems inexplicable and nearly random, overwhelming at the very least, but I promise that after doing perl for a while, you will be able to explain each atom of each statement and what it is doing. We all started out asking questions, so it's nice to be able to return the favor.

Try changing your code to this:

Code:

my @SEARCH = (
' SEP IRA '      ,  ' IRA C '   ,  ' IRA R '      , ' CUST FOR ',
' TRS FOR '      ,  ' JTWROS '  ,  ' JT TEN '     , ' JTTEN ',
' ROTH IRA '     ,  ' UGMA/WA ' ,  ' OR '         , ' TTEE ',
' TTEES '        ,  ' TEN COM ' ,  ' TRUSTEES OF ',
', IN TRUST FOR ',  ', TRUSTEE ',  ', TRS '       , ', TRUSTEES OF '
);

open(INPUTFILE, '/path/to/file') or die $!;
open(OUTPUTFILE, '>', '/path/to/otherfile') or die $!;

while (<INPUTFILE>){
    my($begin, $names, $end)  = unpack("A10 A80 A129",$_);

    foreach my $search ( @SEARCH ){      
        $names =~ s/$search/' ' x length($search)/eg;
    }
      
    print OUTPUTFILE  "$begin$names$end\n";
    
}

close INPUTFILE;
close OUTPUTFILE;

The code is untested, but I don't think there will be any problems with it. The big change here is that instead of maintaining a separate array where you've manually created strings of the proper amount of space characters, we let perl do the work for us.
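To see the length-preserving substitution in isolation, here is a minimal sketch. The sample name string and the two-entry search list are made up for illustration. The `\Q...\E` quoting is an addition that is not in the code above: it makes each phrase match literally even if it ever contains a regex metacharacter.

```perl
use strict;
use warnings;

my @search = (' IRA C ', ' UGMA/WA ');      # hypothetical subset of @SEARCH
my $names  = 'JOHN SMITH IRA C 123 MAIN';   # made-up sample field
my $before = length $names;

for my $search (@search) {
    # \Q...\E quotes regex metacharacters so the phrase matches literally.
    # /e evaluates the replacement as Perl code; /g replaces every match.
    $names =~ s/\Q$search\E/' ' x length($search)/eg;
}

# The phrase is blanked out, but the field keeps its original length.
printf "[%s] %d -> %d\n", $names, $before, length $names;
```

Because every match is replaced by exactly as many spaces as it was long, the substitution itself cannot change the record size.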

--jim

Geek8 · Mar 31, 2004

Jim,
Thanks for that code change. It is so small and easy with what you have done. I tried it out and am still having the same problem. The values I want erased are going away, but when I print the record back out, the 2nd unpack field, "names", is dropping the blanks off the end of that section and thus shortening the record. Do you have any other suggestions?

Geek.

Geek8 · Mar 31, 2004

Jim,
I noticed that if I put the following print statement after the unpack function:

print "$names\t";

I get a page full of the second unpack variable. The records in this are all without the trailing blanks. Wouldn't that mean that Unpack is doing that?

Geek.

Coderifous · Mar 31, 2004

Oh, right: when unpacking, 'A' strips trailing spaces (and nulls), whereas 'a' returns the data verbatim...

So change your unpack template to use lowercase a's instead of uppercase ones ('a10 a80 a129'), and you should be set.
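A small sketch of the difference, using a made-up 20-byte record rather than the real 220-byte one:

```perl
use strict;
use warnings;

# Hypothetical record: two 10-byte fields, each padded with trailing spaces.
my $record = sprintf '%-10s%-10s', 'ABC', 'SMITH';

my ($id_A, $name_A) = unpack('A10 A10', $record);  # 'A' strips trailing spaces
my ($id_a, $name_a) = unpack('a10 a10', $record);  # 'a' returns the bytes verbatim

printf "A: [%s] length %d\n", $name_A, length $name_A;  # A: [SMITH] length 5
printf "a: [%s] length %d\n", $name_a, length $name_a;  # a: [SMITH     ] length 10
```

With the lowercase template, joining the fields back together gives a record of the original length.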

--jim

Geek8 · Mar 31, 2004

That was it! Thanks for all your help on that Jim. I really appreciate it. I hope through practice I'll be able to return the favour someday.

Thanks again.

Geek.

Coderifous · Mar 31, 2004

FYI: At the beginning of this troubleshoot, I believed that all spaces were being removed from the string: beginning, middle, and end. That's why I thought the regex was the culprit. Your post mentioning the "dropping blanks off the end" gave me the clue I needed to figure out what the problem was. A fine day's work, I'd say.

Good stuff, glad I could help. Have fun coding,

--jim

Geek8 · Mar 31, 2004

Just a few quick questions for you Jim.

1. In this statement "foreach my $search ( @SEARCH )", Does the statement dynamically create an "internal" array that it then uses to reference? I am not sure of the process here so I would like to know if you get a moment.

2. In the following statement "s/$search/' ' x length($search)/eg;" does the "x" mean multiply? This would make the most sense since you are also finding the value of the length of search.

Thanks again, and I look forward to reading your responses.

Geek.

Coderifous · Apr 1, 2004

If you haven't gotten it already, the "Programming Perl" book, 3rd edition, is worth the money. There are many tangents I could go on when answering your above two questions (mainly the first one), but who has the time? I'll try to keep it short and sweet, but make sure that you come away with a full understanding. And TIMTOWTDI.

Code:

foreach my $thing ( @list_of_things ){
 #... do something to each $thing
}

This is NEARLY equivalent to:

Code:

foreach $thing ( @list_of_things ){
#...
}

The 'my' just scopes the variable, and makes the code work if you are using the 'strict' pragma. For a short script, I wouldn't worry about it.
The 'foreach' keyword in perl is EXACTLY like the 'for' keyword in perl, it just reads nicer in a situation like this. You could use either one. OK, so the code above simply iterates over each element in the list (@list_of_things) and $thing holds the value for each iteration.

Code:

@canines = ('fido', 'spot', 'barfus', 'snoopy');

foreach $dog ( @canines ){
  print "$dog\n";
}

The code above would output a nice list of dog names. We could go ahead and just put the list definition in the foreach statement:

Code:

foreach $day (qw/Sun Mon Tues Wed Thurs Fri Sat/){
  # do something w/ $day
}

And there's a ton of other cool things you can do. But basically, the idea is the same: use the 'for' keyword and iterate over a list. Oh, and it's definitely worth knowing that if you leave out the variable name ($thing, $dog, or $day in the above examples) then the value is stored in $_, perl's default variable.

Code:

foreach ( qw/red green blue/ ){
   print;  # print will use $_ if you don't give it anything... hence the 'default'
}

OK, on to your second question... the 'x' operator. It is not the mathematical multiplication operator... that is '*'. 'x' is the repetition operator.

From `perldoc perlop` at the command line:

perldoc said:
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses, it repeats the list.

print '-' x 80; # print row of dashes

print "\t" x ($tab/8), ' ' x ($tab%8); # tab over

@ones = (1) x 80; # a list of 80 1's
@ones = (5) x @ones; # set all elements to 5

so:

Code:

print ' ' x length("foo");

Gives you 3 spaces (3 being precisely the return value of the length function when fed the string "foo").

--jim
