TCL basic regexp

Status
Not open for further replies.

stratusmark

Programmer
Feb 23, 2013
Hi,

I am trying some regexp, but so far without major success... The thing I want to achieve is the following: having a string:
"-------------------------- some string ----------------------"

I want to check whether the string consists of three parts (leading dashes, then some text, then trailing dashes) and, if so, store the text in a variable. In other words, I need a procedure that returns 1 if the string passed to it matches this regexp and that hands back, via its second argument, the content of the comment (in the above example, 'some string').

Can you please advise how to achieve this? I tried something like:

set testString "--------------- check the wheels ----------------"
regexp -- {(-*)(([a-zA-Z0-9]+ )+)(-*)} $testString result dashes1 ans dashes2
puts $result

but that does not work and I cannot split original string into these three sections.

Thanks, Mark
 
This works:

statusmark.tcl
Code:
set mystring "------ some string ---------"
set result [regexp {(-+)\s*([^-]*)\s*(-+)} $mystring match first middle rest]

if {$result} {
  puts "String to apply the pattern: '$mystring'"
  puts "result=$result"
  puts "This was matched: '$match'"
  puts "First part (extracted): '$first'"
  puts "First part (extracted): '$middle'"
  puts "First part (extracted): '$rest'"
} else {
  puts "Nothing matched !"
}

Output:
Code:
C:\Work>tclsh85 statusmark.tcl
String to apply the pattern: '------ some string ---------'
result=1
This was matched: '------ some string ---------'
First part (extracted): '------'
First part (extracted): 'some string '
First part (extracted): '---------'
 
I had some typos in the output labels; please correct the script to
Code:
set mystring "------ some string ---------"
set result [regexp {(-+)\s*([^-]*)\s*(-+)} $mystring match first middle last]

if {$result} {
  puts "String to apply the pattern: '$mystring'"
  puts "result=$result"
  puts "This was matched: '$match'"
  puts "First part (extracted) : '$first'"
  puts "Middle part (extracted): '$middle'"
  puts "Last part (extracted)  : '$last'"
} else {
  puts "Nothing matched !"
}
Output:
Code:
C:\Work>tclsh85 statusmark.tcl
String to apply the pattern: '------ some string ---------'
result=1
This was matched: '------ some string ---------'
First part (extracted) : '------'
Middle part (extracted): 'some string '
Last part (extracted)  : '---------'
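
The original question asked for a procedure that returns 1 on a match and hands the text back through a second argument. A minimal sketch of that idea follows, assuming Tcl 8.5. The proc name parseDashed, the ^ and $ anchors, the use of [^-]+ (so that some text is required) and the string trim call are illustrative additions, not part of the thread. Like the pattern above, it will not match text that itself contains a dash.

```tcl
# Hypothetical helper (name is an assumption): returns 1 if $str has the
# shape "--- text ---" and stores the trimmed text in the caller's
# variable whose name is passed in $textVar; returns 0 otherwise.
proc parseDashed {str textVar} {
    upvar 1 $textVar text
    # ^ and $ require the whole string to consist of the three parts
    if {[regexp {^(-+)([^-]+)(-+)$} $str -> lead middle trail]} {
        set text [string trim $middle]
        return 1
    }
    return 0
}

if {[parseDashed "------ some string ---------" comment]} {
    puts "comment='$comment'"
} else {
    puts "Nothing matched !"
}
```

The upvar call links the local name text to the variable in the caller's scope, which is the usual Tcl way to return a second value.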
 
Hi Mikrom,

Thanks a lot - this is exactly what I wanted to achieve :). Still, in order to better understand how it works, can you please clarify the following... ?

1) I believe (-+) matches the dashes exactly so that they can be assigned to $first and $last. Is \s* mandatory for this to work? I tried (-+)([^-]*)(-+) and it seems to work as well.

2) As far as I know, brackets just collect the matched part of the regexp into the variables listed at the end of the command (here, middle). So why, when I put extra brackets around the middle part, i.e. {(-+)(\s*([^-]*)\s*)(-+)}, do I get this result:
First part (extracted) : '------'
Middle part (extracted): ' some string '
Last part (extracted) : 'some string '
Can you please clarify why this approach does not work?

3) Can you confirm that $match will always be equal to $mystring in this example (and for any other regexp similar to this one)?

Thanks,
Mark
 
1. If you have
Code:
set mystring "------     some string     ---------" 
set result [regexp {(-+)([^-]*)(-+)} $mystring match first middle last]
then you will get the result with leading and trailing spaces
Code:
Middle part (extracted): '     some string     '
But when you use
Code:
set mystring "------     some string     ---------" 
set result [regexp {(-+)\s*([^-]*)(-+)} $mystring match first middle last]
you will get it without leading spaces
Code:
Middle part (extracted): 'some string     '
If you are sure that there is at least one space between '----' and 'some string', then you can use
Code:
set mystring "------ some string ---------" 
set result [regexp {(-+)\s+([^-]+)\s+(-+)} $mystring match first middle last]
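which, for this input, extracts the result without leading and trailing spaces (with several trailing spaces only the last one is removed, because [^-]+ is greedy and \s+ needs just one)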
Code:
Middle part (extracted): 'some string'
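
If the number of spaces around the text is not known in advance, one option is to capture the middle part loosely and strip the whitespace afterwards. This is only a sketch of that alternative, not something proposed in the thread; it relies on the standard string trim command.

```tcl
set mystring "------     some string     ---------"

# Capture everything between the dashes, spaces included ...
if {[regexp {(-+)([^-]*)(-+)} $mystring match first middle last]} {
    # ... then remove leading and trailing whitespace in a second step
    set middle [string trim $middle]
    puts "Middle part (trimmed): '$middle'"
}
```

This keeps the pattern simple and does not depend on how many spaces surround the text.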


2. What you tried is to extract 4 groups
Code:
set mystring "------ some string ---------" 
set result [regexp {(-+)(\s*([^-]*)\s*)(-+)} $mystring match first middle1 middle2 last]

if {$result} {
  puts "String to apply the pattern: '$mystring'"
  puts "result=$result"
  puts "This was matched: '$match'"
  puts "First    : '$first'"
  puts "Middle1  : '$middle1'" 
  puts "Middle2  : '$middle2'"  
  puts "Last     : '$last'"  
} else {
  puts "Nothing matched !"
}
The output is:
Code:
C:\Work>tclsh85 statusmark.tcl
String to apply the pattern: '------ some string ---------'
result=1
This was matched: '------ some string ---------'
First    : '------'
Middle1  : ' some string '
Middle2  : 'some string '
Last     : '---------'
With the outer brackets
Code:
(\s*([^-]*)\s*)
^             ^
it extracts $middle1 together with the surrounding spaces, and with the inner brackets
Code:
(\s*([^-]*)\s*)
    ^     ^
it extracts $middle2 without the leading spaces.
Unfortunately the trailing spaces are still matched by
Code:
[^-]*
which means any character except '-', so the trailing
Code:
(\s*([^-]*)\s*)
           ^^^
has no effect here and you can leave it out.
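
Capture groups are numbered by the position of their opening bracket, which is why the extra pair shifted the variables by one. If the extra brackets are wanted only for grouping, Tcl's advanced regular expressions offer the non-capturing form (?: ... ). A short sketch, reusing the variable names from the examples above:

```tcl
set mystring "------ some string ---------"

# (?: ... ) groups without capturing, so first/middle/last
# still line up with the three capturing groups.
set result [regexp {(-+)(?:\s*([^-]*)\s*)(-+)} $mystring match first middle last]

puts "First : '$first'"
puts "Middle: '$middle'"
puts "Last  : '$last'"
```

Here $last receives the trailing dashes again, while $middle still carries the trailing space for the reason given above.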

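3. $match holds the whole substring that the pattern matched, provided there was a match, i.e. $result is true. It equals $mystring only when the pattern covers the entire string: the pattern is not anchored, so any text before the first dash or after the last dash is not part of $match.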
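
A small sketch of point 3, using a made-up input string with extra text around the dashes (the input is an assumption for illustration only):

```tcl
set mystring "note: ------ some string --------- (end)"

# Unanchored: the pattern may match anywhere inside the string
regexp {(-+)\s*([^-]*)\s*(-+)} $mystring match
puts "match='$match'"
puts "equal to input: [expr {$match eq $mystring}]"

# Anchored with ^ and $: the whole string must have the three-part shape
set anchored [regexp {^(-+)\s*([^-]*)\s*(-+)$} $mystring]
puts "anchored result=$anchored"
```

Anchoring is one way to make sure that a successful match implies $match equals the input.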
 
Hi,

thank you very much! I believe I now understand how it works in TCL. Great!
I am still struggling with the last step: I am trying to get all occurrences of a specific regexp and build a list that I can then sort.

My example is below. I plan to put everything in one list: each top-level element would be a row, itself a sub-list whose elements I can sort on later, e.g.
{{1 2 3 ... Gamma} {2 68 108633 ... Beta} {2588 9728 .... Alpha}}

However, when I run regexp (even with -all and -inline), the result differs from what I expected: I get not only the sub-matches but also each complete match:

set test {
1 2 3 4 5.55% 6.66% 7.77% 8 Gamma
2 68 108633 0 0.00% 0.00% 0.00% 0 Beta
3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha
}
set result [regexp -all -inline {([0-9]+)\s*([0-9]+)\s*([0-9]+)\s*([0-9]+)\s*([0-9]+.[0-9][0-9]%)\s*([0-9]+.[0-9][0-9]%)\s*([0-9]+.[0-9][0-9]%)\s*([0-9]+)\s*([A-Za-z\s]*)} $test]

{1 2 3 4 5.55% 6.66% 7.77% 8 Gamma
} 1 2 3 4 5.55% 6.66% 7.77% 8 {Gamma
} {2 68 108633 0 0.00% 0.00% 0.00% 0 Beta
} 2 68 108633 0 0.00% 0.00% 0.00% 0 {Beta
} {3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha
} 3 2588 9728 266 11.35% 2.84% 0.62% 0 {Alpha
}

(System32) 22 % lindex $result 0
1 2 3 4 5.55% 6.66% 7.77% 8 Gamma

(System32) 23 % lindex $result 5
5.55%

Can you advise what the most effective way to achieve this would be? In C I would create a class with all the elements as fields and a method to sort it, but TCL seems to work totally differently, and I am trying to find the best way to get it done :).

Thank you for help,
Mark
 
In your case the result contains a list of matched string and matched variables. You can iterate through the list as follows
Code:
set test {
1 2 3 4 5.55% 6.66% 7.77% 8 Gamma\
2 68 108633 0 0.00% 0.00% 0.00% 0 Beta\
3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha}
set result [regexp -all -inline {([0-9]+)\s+([0-9]+)\s+([0-9]+)\s+([0-9]+)\s+([0-9]+.[0-9][0-9]%)\s+([0-9]+.[0-9][0-9]%)\s+([0-9]+.[0-9][0-9]%)\s+([0-9]+)\s+([A-Za-z]+)} $test]

puts "\$result='$result'"
puts "----------------------------------------------------------------"

set count 1
foreach {match x1 x2 x3 x4 x5 x6 x7 x8 x9} $result {
  puts "\$count = $count:  \$match='$match'"
  puts "\$x1='$x1', \$x2='$x2', \$x3='$x3', \$x4='$x4', \$x5='$x5'"
  puts "\$x6='$x6', \$x7='$x7', \$x8='$x8', \$x9='$x9'"
  puts "----------------------------------------------------------------"
  incr count
}

Output:
Code:
C:\Work>tclsh85 statusmark2.tcl
$result='{1 2 3 4 5.55% 6.66% 7.77% 8 Gamma} 1 2 3 4 5.55% 6.66% 7.77% 8 Gamma {
2 68 108633 0 0.00% 0.00% 0.00% 0 Beta} 2 68 108633 0 0.00% 0.00% 0.00% 0 Beta {
3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha} 3 2588 9728 266 11.35% 2.84% 0.62% 0
 Alpha'
----------------------------------------------------------------
$count = 1:  $match='1 2 3 4 5.55% 6.66% 7.77% 8 Gamma'
$x1='1', $x2='2', $x3='3', $x4='4', $x5='5.55%'
$x6='6.66%', $x7='7.77%', $x8='8', $x9='Gamma'
----------------------------------------------------------------
$count = 2:  $match='2 68 108633 0 0.00% 0.00% 0.00% 0 Beta'
$x1='2', $x2='68', $x3='108633', $x4='0', $x5='0.00%'
$x6='0.00%', $x7='0.00%', $x8='0', $x9='Beta'
----------------------------------------------------------------
$count = 3:  $match='3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha'
$x1='3', $x2='2588', $x3='9728', $x4='266', $x5='11.35%'
$x6='2.84%', $x7='0.62%', $x8='0', $x9='Alpha'
----------------------------------------------------------------
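
To get from this loop to the sortable list of rows that was asked for, one possible approach is to collect the extracted fields of each match into a sub-list and then use lsort with -index. The following is a sketch under assumptions: the variable names rows, byName and byThird are made up for illustration, and the pattern uses \d and an escaped dot as a slightly stricter variant of the one above.

```tcl
set test {
1 2 3 4 5.55% 6.66% 7.77% 8 Gamma
2 68 108633 0 0.00% 0.00% 0.00% 0 Beta
3 2588 9728 266 11.35% 2.84% 0.62% 0 Alpha
}
set pattern {(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+\.\d\d%)\s+(\d+\.\d\d%)\s+(\d+\.\d\d%)\s+(\d+)\s+([A-Za-z]+)}

# Build a list of rows; each row is a list of the nine extracted fields.
# The full match is read from the result but not stored.
set rows {}
foreach {match x1 x2 x3 x4 x5 x6 x7 x8 x9} [regexp -all -inline $pattern $test] {
    lappend rows [list $x1 $x2 $x3 $x4 $x5 $x6 $x7 $x8 $x9]
}

# Sort by the name in the last column (index 8), ASCII order
set byName [lsort -index 8 $rows]
# Sort by the third column (index 2) as integers, largest first
set byThird [lsort -integer -decreasing -index 2 $rows]

foreach row $byName  { puts "by name : [lindex $row 8]" }
foreach row $byThird { puts "by third: [lindex $row 2] [lindex $row 8]" }
```

The percentage columns still carry the % sign, so sorting on them numerically would first need something like string trimright $x5 % when the row is built.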
 
Errata: ... result contains a list of matched string and extracted variables
 