How to find if a string has more than 50 characters?

Ramnarayan

Programmer
Jan 15, 2003
56
US
Hi,

I am trying to parse a file that has multiple lines in an XML format. I need to find out whether a line starts with a tag like <B> or <I>. If the number of characters (alphanumeric, punctuation, spaces) starting from <B> is more than 50, I have to output the line to a file without the <B> tags.

For example:

<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.

OUTPUT:
this is a test; and cannot exceed more than 50 characters. But this line has more than 50 characters.

In the above example, the text inside the <B> ...</B> tag has more than 50 characters. Can someone guide me on how this should be coded in terms of a function?

 
maybe something like:

if ($line =~ /^<B>/) {
    # work on a copy and strip both <B> and </B>
    (my $line2 = $line) =~ s{</?B>}{}g;
    # print the stripped line only when it is longer than 50 characters
    print length($line2) > 50 ? $line2 : $line, "\n";
}


definitely not tested but might give you something to go on..




~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those Who Say It Cannot Be Done Are Usually Interrupted by Someone Else Doing It; Give the wrong symptoms, get the wrong solutions;
 
Doh

[bigglasses]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those Who Say It Cannot Be Done Are Usually Interrupted by Someone Else Doing It; Give the wrong symptoms, get the wrong solutions;
 
one possibility:

Code:
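$_ = '<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.';
# $1 = tag name, $2 = tagged text, $3 = rest of the line;
# only use the captures if the match actually succeeded
if (m#^<(\w+)>(.+)</\1>(.*)$# and length($2) > 50) {
   print "$2$3";
}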
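
Since the original question asked for a function, the same idea could be wrapped up roughly as follows. This is only a sketch under assumptions: the name strip_long_tag and the $limit parameter are made up for illustration, only a leading <B> or <I> is handled, and each line is assumed to be chomped already.

```perl
use strict;
use warnings;

# Hypothetical helper (name and $limit parameter are assumptions).
# Returns the line without its leading tag pair when the tagged text
# is longer than $limit characters; otherwise returns the line unchanged.
sub strip_long_tag {
    my ($line, $limit) = @_;
    $limit = 50 unless defined $limit;

    # Only <B> or <I> at the very start of the line qualifies
    if ($line =~ m{^<([BI])>(.+?)</\1>(.*)$}) {
        my ($inner, $rest) = ($2, $3);
        return $inner . $rest if length($inner) > $limit;
    }
    return $line;
}

my $sample = '<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.';
print strip_long_tag($sample), "\n";               # tags removed, 57 characters inside <B>
print strip_long_tag('<I>short</I> tail'), "\n";   # unchanged, only 5 characters inside <I>
```

To process a whole file, one would call this on each line read from the input handle and print the result to the output handle.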

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Can't call me lazy :p
Code:
use warnings;
use strict;
use OLE;
use Win32::OLE::Const "Microsoft Excel";

my $line = "<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.";

if ( $line =~ /^\<B\>/ )
{
   my $line2 = $line;
   $line2 =~ s/\<B\>//g;
   $line2 =~ s/\<\/B\>//g;

   my @count = split //, $line2;

   if ( &count_line ( @count ) >= 50 )
   {
      print "$line2\n";
   }
}

sub count_line
{
   my ( @items ) = @_;
   my ( $excel, $workbook, $sheet, $row, $last_row, $count );

   $excel = CreateObject OLE 'Excel.Application';
   $excel -> {Visible} = 0;
   $excel -> {DisplayAlerts} = 0;
   $workbook = $excel -> Workbooks -> Add;
   $sheet = $workbook -> Worksheets ( 1 );
   $row = 1;

   foreach ( @items )
   {
      $sheet -> Range("A".$row) -> {Value} = $_;
      $row++;
   }

   $last_row = $row - 1;
   $sheet -> Range("A".$row) -> {FormulaR1C1} = "=COUNTA(R[-$last_row]C:R[-1]C)";
   $count =  $sheet -> Range("A".$row) -> {Value};
   $workbook -> Close;
   $excel    -> Quit;

   return ( $count );
}
 
lol @ WinblowsME

I had to actually run that code to confirm that the return value actually matched the passed parameter.

Btw, thanks for teaching me that Win32::OLE had a visibility option. I've always favored Spreadsheet::ParseExcel over Win32::OLE, and all the examples for Win32::OLE I've seen always loaded up a visible copy of Excel.

Star for the laugh and tip :)

- Miller
 
KevinADC said:
one possibility:

Code:
$_ = '<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.';
m#^<(\w+)>(.+)</\1>(.*)$#;
if (length $2 > 50) {
   print "$2$3";
}

@ Kevin

That would of course work for almost all cases. However, one should always use non-greedy matching with the any-character class (.) unless the greedy behavior of regexes is intentionally desired.

Code:
$_ = '<B>this is a test; and cannot exceed more than 50 characters</B>. But this line has more than 50 characters.';
m#^<(\w+)>(.+?)</\1>(.*)$#;
if (length $2 > 50) {
   print "$2$3";
}
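
To make the greedy versus non-greedy difference concrete, here is a small sketch. The input line is made up for illustration (it is not from the original sample data) and repeats the same tag twice on one line, which is the case where the two patterns disagree.

```perl
use strict;
use warnings;

# Hypothetical input: the same tag appears twice on one line
my $line = '<B>first</B> middle <B>second</B> end';

my ($greedy, $lazy);

# Greedy .+ runs on to the LAST </B> on the line
$greedy = $2 if $line =~ m#^<(\w+)>(.+)</\1>(.*)$#;

# Non-greedy .+? stops at the FIRST </B>
$lazy = $2 if $line =~ m#^<(\w+)>(.+?)</\1>(.*)$#;

print "greedy: $greedy\n";   # greedy: first</B> middle <B>second
print "lazy:   $lazy\n";     # lazy:   first
```

With the greedy version the length check would count the text of both tags plus everything between them, so a line could be treated as "more than 50 characters" even when neither tag is that long on its own.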

- Miller
 
Yes, I was just going by his sample data. It could always be adjusted if necessary. But the point is well taken. [smile]

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 