Help with sed script

Status
Not open for further replies.

petperson

Programmer
Oct 2, 2002
106
US
Hi,
I am learning UNIX and writing a sed script. I have a data file that has a pipe symbol as the first and the last character of each line, and I need to remove those two pipes. Every way I try, I end up removing the entire line. Can someone help me please?

thanks
P.P.
 
what is it that you're trying to do?
explain and provide sample input and a desired output.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
My data file looks like this:

|123-45-6789|Bob |Jones |10 |20 |30 |
|111-22-3333|Bill |Brown |15 |25 |35 |

I need to remove the first pipe and the last pipe on
each line.

I will then remove the rest of the pipes and replace
them with commas. I thought if I could first remove
the first and last pipe, I could then substitute the
commas for the remaining pipes. But each time I try
the entire line gets deleted.

thanks
 
something to start with

sed -e 's/^|\(.*\)|$/\1/g;s/|/,/g' file.txt

vlad
 
thank you, that worked. One more question though... can you explain what the first part is doing when it deletes the pipe? It's the \(.*\) that threw me off, and also the \1.

thanks for your help...
pp
 
s/^|\(.*\)|$/\1/g

^ - beginning of the line
| - followed by a pipe
\(.*\) - any number of characters (including none), captured - the \(...\) is called a capture group and can be back-referenced in the 'replacement' part

| - followed by a pipe
$ - end of line

\1 - back reference of the FIRST capture

vlad
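
To watch the two substitutions work one at a time, here is a small sketch that runs each step on one of the sample lines from earlier in the thread. The variable names (line, step1, step2) are only for illustration, and the brackets in the output are there to make the leftover trailing space visible.

```shell
# One sample line from the thread.
line='|123-45-6789|Bob |Jones |10 |20 |30 |'

# Step 1: ^| matches the leading pipe, |$ matches the trailing pipe,
# and \(.*\) captures everything in between; \1 puts only that back.
step1=$(printf '%s\n' "$line" | sed -e 's/^|\(.*\)|$/\1/')
printf '[%s]\n' "$step1"

# Step 2: every remaining pipe becomes a comma.
step2=$(printf '%s\n' "$step1" | sed -e 's/|/,/g')
printf '[%s]\n' "$step2"
```

The g flag is left off the first substitution here: a pattern anchored with both ^ and $ can match at most once per line, so the flag makes no difference there.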
 
Another way, without a back-reference, so possibly a bit faster:
sed 's!^|!!;s!|!,!g;s!,$!!' /path/to/input

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
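
As a quick sanity check, the sketch below runs both variants on one sample line from the thread and compares the results. It only checks that the output is the same; it makes no claim about speed. The variable names are illustrative.

```shell
line='|111-22-3333|Bill |Brown |15 |25 |35 |'

# Variant with a capture group and back-reference.
with_backref=$(printf '%s\n' "$line" | sed -e 's/^|\(.*\)|$/\1/;s/|/,/g')

# Variant without a back-reference: drop the leading pipe, turn the
# rest into commas, then drop the comma that ends up at end of line.
# The ! is just an alternative delimiter for the s command.
without_backref=$(printf '%s\n' "$line" | sed 's!^|!!;s!|!,!g;s!,$!!')

printf '[%s]\n[%s]\n' "$with_backref" "$without_backref"
```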
 
thanks for all your help. I have another problem now. My file now looks like this:

id ,last ,first ,wk1,wk2,wk3
pts ,,,100,150,100
111-78-7878,Jones ,Ken ,95,143,79
123-45-6789,Rich ,Donald,99,123,89

I need to remove all the empty spaces. So I tried using this:

sed -e 's/\([,]\)[ ]*/\1/g' test

but it doesn't appear to do anything. What am I missing?

thanks

 
assuming there are no embedded spaces:

sed -e 's/[ ][ ]*//g' file

vlad
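
The earlier attempt with s/\([,]\)[ ]*/\1/g left the file unchanged because it looks for spaces after each comma, while the padding in this data sits before the commas. A minimal sketch of the difference, using one line from the sample file (variable names are illustrative):

```shell
line='111-78-7878,Jones ,Ken ,95,143,79'

# Original attempt: comma followed by optional spaces. Nothing follows
# the commas here, so each comma is replaced by itself.
attempt=$(printf '%s\n' "$line" | sed -e 's/\([,]\)[ ]*/\1/g')

# Same idea with the spaces moved in front of the comma.
fixed=$(printf '%s\n' "$line" | sed -e 's/[ ]*\([,]\)/\1/g')

# Removing every run of spaces, as suggested above.
all_spaces=$(printf '%s\n' "$line" | sed -e 's/[ ][ ]*//g')

printf '%s\n' "$attempt" "$fixed" "$all_spaces"
```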
 
thanks..I was thinking I had to do something between the commas. Probably thinking too much!
 
A safer way:
sed 's! *,!,!g;s! *$!!' test

Hope This Helps, PH.
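
To illustrate where the two commands differ, here is a sketch using a made-up line with a space inside a name and a trailing space. The name "Van Dyke" is hypothetical and does not appear in the original data.

```shell
# Hypothetical record: embedded space in the last name, padding before
# the commas, and one trailing space at end of line.
line='111-78-7878,Van Dyke ,Ken ,95,143,79 '

# Deleting every run of spaces also joins the two words of the name.
all_spaces=$(printf '%s\n' "$line" | sed -e 's/[ ][ ]*//g')

# Deleting only spaces before a comma or at end of line keeps it intact.
padding_only=$(printf '%s\n' "$line" | sed 's! *,!,!g;s! *$!!')

printf '[%s]\n[%s]\n' "$all_spaces" "$padding_only"
```

Note that the second command does not touch spaces that follow a comma, so it assumes fields are padded on the right only.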
 
and what about:
111-78-7878,Jones ,Ken ,95,143,79


vlad
 
Now I cannot seem to reverse the order of the names. My file now looks like this:

id,last,first,wk1,wk2,wk3
pts,,,100,150,100
111-78-7878,Jones,Ken,95,143,79
123-45-6789,Rich,Donald,99,123,89

I've tried several different things to see if I have it right but nothing works. I think the SSN in the front confuses me. Is that the first 'field'? What determines the breaks between fields? I'm confused...

sed 's/\(..*\)\(..*\)/\3 \2/' test

sed 's/\(Jones\)\(Ken\)/\3 \2/' test
 
For field manipulations, you may consider awk:
awk -F',' 'BEGIN{OFS=","}{a=$3;$3=$2;$2=a;print}' test

Hope This Helps, PH.
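
On the question of what counts as a field: with -F',' awk splits each line at the commas, so the SSN is field 1, the last name field 2, and so on. The sketch below first prints the field numbers for one line and then applies the swap to the sample file from the thread. The variable names are illustrative.

```shell
data='id,last,first,wk1,wk2,wk3
pts,,,100,150,100
111-78-7878,Jones,Ken,95,143,79
123-45-6789,Rich,Donald,99,123,89'

# Number the fields of the third line to show how awk sees them.
numbered=$(printf '%s\n' "$data" | awk -F',' '
NR == 3 {
  for (i = 1; i <= NF; i++) printf "%s%d=%s", (i > 1 ? " " : ""), i, $i
  print ""
}')
printf '%s\n' "$numbered"

# Swap fields 2 and 3; OFS="," keeps commas when awk rebuilds the line.
swapped=$(printf '%s\n' "$data" | awk -F',' 'BEGIN{OFS=","}{a=$3;$3=$2;$2=a;print}')
printf '%s\n' "$swapped"
```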
 
sed -e 's/^\([^,][^,]*\),\([^,][^,]*\),\([^,][^,]*\),\(.*\)/\1,\3,\2,\4/g' test

vlad
 
If awk is not an option:
sed 's!^\([^,]\{1,\}\),\([^,]\{1,\}\),\([^,]\{1,\}\),!\1,\3,\2,!' test

Hope This Helps, PH.
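
sed has no built-in notion of fields, so here the pattern itself defines them: each \([^,]\{1,\}\) captures a run of one or more non-comma characters, and the literal commas between the groups act as the breaks. A small sketch on part of the sample file (variable names are illustrative):

```shell
data='id,last,first,wk1,wk2,wk3
pts,,,100,150,100
111-78-7878,Jones,Ken,95,143,79'

# \1 = first field, \2 = second, \3 = third; the replacement writes
# them back in the order 1, 3, 2.
swapped=$(printf '%s\n' "$data" |
  sed 's!^\([^,]\{1,\}\),\([^,]\{1,\}\),\([^,]\{1,\}\),!\1,\3,\2,!')
printf '%s\n' "$swapped"
```

The "pts" line passes through unchanged because its second and third fields are empty, so the pattern (which requires at least one character per field) does not match it. That is harmless here, since swapping two empty fields would change nothing.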
 
your suggestions are working. I just need to keep practicing to understand all of this. thanks so much.

p.p.
 
Let awk do it all.
awk -f petperson.awk file.txt

petperson.awk contains
Code:
BEGIN { FS="|"; OFS="," }
{ gsub( /^[|] *| *[|]$/, "" )
  gsub( / +[|]/, "|" )
  t=$2; $2=$3; $3=t
  print
}


Before reading the file, set the Field Separator and Output Field Separator:
  BEGIN { FS="|"; OFS="," }
Remove | and spaces at beginning and end of line:
  { gsub( /^[|] *| *[|]$/, "" )
Remove trailing spaces in each field:
    gsub( / +[|]/, "|" )
Swap field 2 and field 3; a side-effect is that $0 is rebuilt with OFS between the fields:
    t=$2; $2=$3; $3=t
    print
  }
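
For a quick try without creating petperson.awk, the same program can be passed inline. This sketch feeds it the two pipe-delimited sample lines from the start of the thread; the variable names are illustrative.

```shell
data='|123-45-6789|Bob |Jones |10 |20 |30 |
|111-22-3333|Bill |Brown |15 |25 |35 |'

result=$(printf '%s\n' "$data" | awk '
BEGIN { FS = "|"; OFS = "," }
{
  gsub(/^[|] *| *[|]$/, "")   # strip the outer pipes
  gsub(/ +[|]/, "|")          # strip padding before each inner pipe
  t = $2; $2 = $3; $3 = t     # swap fields 2 and 3, rebuilding $0 with OFS
  print
}')
printf '%s\n' "$result"
```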

 
There's a shorter way:
Code:
BEGIN { FS=" *[|] *"; OFS="," }
{ gsub( /^[|] *| *[|]$/, "" )
  t=$2; $2=$3; $3=t
  print
}

Before reading the file, set the Field Separator and Output Field Separator (FS will match | and the spaces on either side of it):
  BEGIN { FS=" *[|] *"; OFS="," }
Remove | and spaces at beginning and end of line:
  { gsub( /^[|] *| *[|]$/, "" )
Swap field 2 and field 3; a side-effect is that $0 is rebuilt with OFS replacing FS (this eliminates trailing and leading spaces in each field):
    t=$2; $2=$3; $3=t
Print $0 (the line):
    print
  }
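
The rebuild side-effect is easy to miss, so here is a sketch that isolates it on one sample line. It shows how the regular-expression FS splits the line, what print gives when no field is assigned, and what it gives after the common $1 = $1 idiom, which is used here only to force the rebuild (it is not part of the script above). Variable names are illustrative.

```shell
line='|123-45-6789|Bob |Jones |10 |20 |30 |'

# Show each field in brackets: the padding is consumed by FS.
fields=$(printf '%s\n' "$line" | awk '
BEGIN { FS = " *[|] *" }
{ gsub(/^[|] *| *[|]$/, ""); for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')

# No field assignment: $0 is printed as is, pipes and padding included.
not_rebuilt=$(printf '%s\n' "$line" | awk '
BEGIN { FS = " *[|] *"; OFS = "," }
{ gsub(/^[|] *| *[|]$/, ""); print }')

# Assigning any field makes awk rebuild $0 with OFS between the fields.
rebuilt=$(printf '%s\n' "$line" | awk '
BEGIN { FS = " *[|] *"; OFS = "," }
{ gsub(/^[|] *| *[|]$/, ""); $1 = $1; print }')

printf '%s\n' "$fields" "$not_rebuilt" "$rebuilt"
```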
 