
how to remove new line

Status
Not open for further replies.

domenicodl

Programmer
Jul 2, 2011
15
IT
Hi,
I would like to know how to remove the newline from a string.
While googling I found the command printf("%s", $0), but that is not what I'm looking for.
I have to parse a file like the following one:

000020E0 3D 20 00 40 lis r9,400000
000020E4 39 29 A0 00 addi r9,-6000
000020E8 80 09 00 00 lwz r0,00(r9)
000020EC 7C 03 03 78 mr r3,r0
000020F0 81 61 00 00 lwz r11,00(r1)
000020F4 83 EB FF FC lwz r31,-04(r11)
000020F8 7D 61 5B 78 mr r1,r11


I want to count the occurrences of each "rx" string (e.g., r1 appears twice, r9 three times, and so on).
The code fragment with which I find the rx strings is the following:
register = $7;
split(register, r, ",");
for (i in r) {
  sub(/\)/, "", r[i]); # remove "(", ")" and "-", which I do not need
  sub(/\(/, "", r[i]);
  sub(/-/, "", r[i]);
}

Now I also have to remove the newline from r. How can I do it?
Of course, I can't use printf("%s", r), because I need r for further processing, not for printing.
Thank you in advance.




 
Hi

I would use a slightly different approach:
Code:
{
  register = $7;
  split(register, r, /[,()-]/);  # <- other unwanted characters here
  for (i in r)
    if (r[i] ~ /^r[0-9]+$/)
      rcount[r[i]]++
}
It is easy to add more unwanted characters as delimiters for split(), for example the newline: /[,()\n-]/ .
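To see what split() does with a regular expression as the separator, here is a minimal sketch. The helper name split_operands and the sample operand string are only illustrative. Every character in the bracket expression acts as a delimiter, so adjacent delimiters produce empty pieces, which is why the later filter on /^r[0-9]+$/ is needed.

```shell
#!/bin/sh
# split_operands is a hypothetical helper for this sketch.
# It splits each input line on ",", "(", ")" and "-" and prints every piece.
split_operands() {
  awk '{
    n = split($0, r, /[,()-]/)
    for (i = 1; i <= n; i++) print i ": [" r[i] "]"
  }'
}

# Pieces 2 and 5 come out empty because two delimiters are adjacent
# or a delimiter ends the string.
printf 'r31,-04(r11)\n' | split_operands
```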

But note what the gawk manual says:
man gawk said:
RS The input record separator, by default a newline.
So newlines should be consumed by default as record separators. Are you sure newlines are your problem?
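A quick way to confirm this is to measure a record that ends with a plain newline. This is only a sketch, and record_lengths is a made-up helper name: if the newline were part of the record, the lengths would be one higher.

```shell
#!/bin/sh
# record_lengths is a hypothetical helper for this sketch.
# It prints the length of the whole record and of the second field.
record_lengths() {
  awk '{ print length($0), length($2) }'
}

# The trailing "\n" is consumed as RS, so it is counted
# neither in $0 ("mr r1,r11") nor in $2 ("r1,r11").
printf 'mr r1,r11\n' | record_lengths
```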


Feherke.
 
Hi feherke,
thank you for your reply. I think my problem is the newline because, after splitting and removing "()", if I use length on "00r1" I get 5 rather than 4. The same happens for r11 on the line at 0x20F8.
However, I tried your code fragment and it doesn't work properly.
For example, on the line at 0x20F8 it doesn't count the string r11.
If I understood your script correctly, it counts all strings of the form rx where x = 0,...,9.
I probably wasn't clear earlier: I also need to count strings of the form rxx,
e.g. r31, r10 and so on.
Thank you again
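The symptom described above (a length of 5 for "00r1") can be reproduced with a small sketch. It assumes, purely for illustration, that the input lines end in CRLF rather than LF, and last_piece_length is a hypothetical helper name. It follows the approach from the question: split the operand field on "," and strip "(", ")" and "-".

```shell
#!/bin/sh
# last_piece_length is a hypothetical helper for this sketch.
# It prints the length of the last comma-separated operand after
# removing "(", ")" and "-" from it.
last_piece_length() {
  awk '{
    n = split($7, r, ",")
    gsub(/[()-]/, "", r[n])
    print length(r[n])
  }'
}

# Assumed CRLF line ending: the "\r" stays glued to the last field.
printf '000020F0 81 61 00 00 lwz r11,00(r1)\r\n' | last_piece_length
# Plain LF line ending: nothing extra is left over.
printf '000020F0 81 61 00 00 lwz r11,00(r1)\n' | last_piece_length
```

If the two calls report different lengths, an invisible character is riding along on the last field of each line.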
 
Hi

domenicodl said:
For example, on the line at 0x20F8 it doesn't count the string r11.
It counts it for me on Linux using gawk and mawk, in Cygwin using gawk, and on Windows using awk95. What are you using?
domenicodl said:
If I understood your script correctly, it counts all strings of the form rx where x = 0,...,9.
I probably wasn't clear earlier: I also need to count strings of the form rxx,
e.g. r31, r10 and so on.
Code:
 +---------- string starts here
 |+--------- literal "r"
 ||  +------ any character between "0" and "9" inclusively
 ||  |  +--- previous entity repeated 1 or more times
 || / \ |+-- string ends here
 ||/   \||
/^r[0-9]+$/
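The pattern can be tried against a few candidate strings to see that it accepts register names with one or more digits and rejects everything else. This is just a sketch; is_register is a hypothetical helper name and the candidates are made up for illustration.

```shell
#!/bin/sh
# is_register is a hypothetical helper for this sketch.
# It reports, for each input line, whether it matches /^r[0-9]+$/.
is_register() {
  awk '{ print $0, ($0 ~ /^r[0-9]+$/ ? "match" : "no match") }'
}

# r1, r11 and r31 match; "00", a bare "r" and "00r1" do not.
printf '%s\n' r1 r11 r31 00 r 00r1 | is_register
```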

Feherke.
 
Hi

domenicodl said:
I'm using awk on Ubuntu
Code:
master # cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.2 LTS"

master # awk --version | head -1
GNU Awk 3.1.6

master # cat domenicodl.txt
000020E0  3D 20 00 40  lis      r9,400000
000020E4  39 29 A0 00  addi     r9,-6000
000020E8  80 09 00 00  lwz      r0,00(r9)
000020EC  7C 03 03 78  mr       r3,r0
000020F0  81 61 00 00  lwz      r11,00(r1)
000020F4  83 EB FF FC  lwz      r31,-04(r11)
000020F8  7D 61 5B 78  mr       r1,r11

master # cat domenicodl.awk
{
  print
  register = $7;
  split(register,r,/[,()-]/);
  for (i in r)
    if (r[i]~/^r[0-9]+$/) {
      rcount[r[i]]++
      print r[i]
    }
}

END {
  for (i in rcount) print i,rcount[i]
}

master # awk -f domenicodl.awk domenicodl.txt
000020E0  3D 20 00 40  lis      r9,400000
r9
000020E4  39 29 A0 00  addi     r9,-6000
r9
000020E8  80 09 00 00  lwz      r0,00(r9)
r0
r9
000020EC  7C 03 03 78  mr       r3,r0
r3
r0
000020F0  81 61 00 00  lwz      r11,00(r1)
r11
r1
000020F4  83 EB FF FC  lwz      r31,-04(r11)
r11
r31
000020F8  7D 61 5B 78  mr       r1,r11
r1
r11
r9 3
r11 3
r0 2
r31 1
r1 2
r3 1
Sorry, I think the bug is somewhere else.


Feherke.
 
Hi

The disassembler file attached to your thread271-1655176 reveals the problem: the stray characters are carriage returns ( \r ), not newlines ( \n ). Add \r to the regular expression, as in /[,()\r-]/ , and it will work.
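Putting the pieces together, a minimal sketch of the counting script with \r added to the separators could look like the following. The function name count_registers and the three CRLF-terminated sample lines are only illustrative, and the output is piped through sort because the order of a for (name in rcount) loop is not defined.

```shell
#!/bin/sh
# count_registers is a hypothetical helper for this sketch.
# It counts every rN token found in the operand field ($7).
count_registers() {
  awk '{
    # "\r" is one of the separators, so a CRLF line ending
    # no longer sticks to the last register name.
    n = split($7, r, /[,()\r-]/)
    for (i = 1; i <= n; i++)
      if (r[i] ~ /^r[0-9]+$/)
        rcount[r[i]]++
  }
  END {
    for (name in rcount) print name, rcount[name]
  }' | LC_ALL=C sort
}

# Sample lines taken from the thread, emitted here with CRLF endings.
printf '%s\r\n' \
  '000020F0 81 61 00 00 lwz r11,00(r1)' \
  '000020F4 83 EB FF FC lwz r31,-04(r11)' \
  '000020F8 7D 61 5B 78 mr r1,r11' | count_registers
```

An alternative that is not mentioned in the thread is to strip the character once per record with sub(/\r$/, "") at the start of the rule, which leaves the split() pattern unchanged.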


Feherke.
 
Uhm, OK, I will check the code around the fragment you gave me. Sorry for wasting your time.
Thank you.
 
Thank you,
it's very hard to find people as kind as you.
 