Looking to take an input file that

Status
Not open for further replies.

stemp1ar

Programmer
Aug 3, 2001
10
US
Looking to take an input file that is not in the proper format and
change the file to the proper format.

current format is: 1234-7654321 Characters - more chars,1234

I am able to do what I need in multiple iterations of a file, but what
I really would like is to change the file in one pass.

requested format: 1234|7654321|Characters|1234|today's date|more characters

currently have:
echo ' {print substr($1,1,4) "|" substr($1,6,7) "|"}
' > ./awkscript.txt

awk -f ./awkscript.txt ./datafile > datafile.new

looking for a regular expression using awk or any other suggestions that will work...

I am able to parse the first 2 fields, but I am very new to awk and having a bit of trouble with the syntax of regular expression pattern matching. I would think the next step is a regular expression that captures $2 up to the comma delimiter. Any help would be great.

Thanks
 
Here is a solution using regular expressions in sed
Code:
sed "s/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|13Dec01|\4/"
You could also do it in awk using index to find the delimiter characters and substr to split up the line.
Hope this helps.
CaKiwi
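For anyone who wants to see the index/substr idea spelled out, here is a rough sketch of how it might look. It assumes every line has the shape shown in the question (one space after the number pair, " - " between the two text fields, and a single comma before the last number). The names reformat, trim and today are made up for this example and are not from the thread.

```shell
# Sketch only: reformat lines shaped like
#   1234-7654321 Characters - more chars,1234
# into
#   1234|7654321|Characters|1234|<date>|more chars
# reformat, trim and today are hypothetical names chosen for this example.
reformat() {
  awk -v today="$1" '
  # strip leading and trailing blanks from a string
  function trim(s) { sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s); return s }
  {
    sp    = index($0, " ")      # first space: end of the 1234-7654321 part
    dash  = index($0, " - ")    # separator between the two text fields
    comma = index($0, ",")      # comma before the trailing number
    id    = substr($0, 1, sp - 1)
    text  = trim(substr($0, sp + 1, dash - sp - 1))
    more  = trim(substr($0, dash + 3, comma - dash - 3))
    last  = substr($0, comma + 1)
    print substr(id, 1, 4) "|" substr(id, 6) "|" text "|" last "|" today "|" more
  }'
}

# Usage: pass the date string in from the shell, one pass over the input
echo '1234-7654321 Characters - more chars,1234' | reformat "$(date +%d%b%y)"
```

Passing the date with -v keeps the awk program in single quotes, so no shell quoting tricks are needed. Lines that contain an extra comma or " - " inside the text fields would need a different approach.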
 
Thanks...I did use sed in my first iteration...

Is there a way in sed to use a variable in the replacement string?

sed "s/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|13Dec01|\4/"

such as: "s/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|$VARNAME|\4/"

In my attempts the $VARNAME prints exactly that, even when attempting to use `$(date)` to execute the statement. Any suggestions?



 
The only way I can think of doing it is to echo the sed program to a file with the date formatted appropriately in it
Code:
echo "s/ ... |`date +%d%b%Y`| ... /" > sed.tmp
then run sed using this file as the program
Code:
sed -f sed.tmp ./datafile > ./datafile.new
I don't think there is a way to get sed to read its program from standard input.

Hope this helps. CaKiwi
 
The sed FAQ states that using double quotes, as in sed -e "s/pat/$VAR/g" text, will work in this situation, because the shell expands $VAR before sed sees the script.
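A small sketch of the quoting difference, using the expression from earlier in the thread. VARNAME is the variable name from the question; convert is a wrapper name made up for this example, and the %d%b%y format is only assumed from the 13Dec01 sample.

```shell
# Assumed date format, chosen to match the 13Dec01 sample in the thread
VARNAME=$(date +%d%b%y)

# convert is a hypothetical helper name for this example.
# Double quotes: the shell substitutes $VARNAME, while \( \) and \1 pass through to sed unchanged
convert() {
  sed "s/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|$VARNAME|\4/"
}
echo '1234-7654321 Characters - more chars,1234' | convert

# Single quotes: no expansion happens, so sed inserts the literal text $VARNAME
echo 'a|DATE|b' | sed 's/DATE/$VARNAME/'
```

One caveat: if the variable's value contains a / or an &, sed will treat it as part of the s command, so this only works as written for simple values such as a date without slashes.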
 
Thanks, I just tried that before coming here and yes it does work in this situation...

Again, thanks for the help
 
I have a question on the post above and the regular expression used:

sed "s/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|13Dec01|\4/"

The original input file: 1234-7654321 Characters - more chars,1234

In the example above, I am curious on the matching...

regexp used:/\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)/\1|\2|\3|\5|13Dec01|\4/

matching: /\(1234\)-\(7654321\)\ Characters - more chars,1234

I understand the first portion, but what do "\([^-]*\)-\([^,]*\),\(.*\)" and the replacement "/\1|\2|\3|\5|13Dec01|\4/" do?

 
stemplar,

A ^ as the first character in [] means that any character which is not in the [] will be matched, i.e. [^-] matches any character which is not a -. [^-]* means match any number of characters which are not a -. The first set of characters matched in a \( \) pair is saved in register 1, the 2nd in register 2, etc., and they are replayed in the replacement string by specifying \1, \2, etc. I hope this is clear. Post again if you need further explanation. CaKiwi
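To see what each register holds for the sample line, something like the following could be used. It runs the same pattern five times and replays one register per run; show_groups is a name made up for this example.

```shell
# show_groups is a hypothetical helper: print the contents of \1 .. \5
# for one input line, using the pattern from this thread.
show_groups() {
  re='\(....\)-\(.......\)\([^-]*\)-\([^,]*\),\(.*\)'
  for n in 1 2 3 4 5; do
    # "\\$n" becomes \1, \2, ... once the shell has processed the double quotes
    printf '\\%s = [%s]\n' "$n" "$(printf '%s\n' "$1" | sed "s/$re/\\$n/")"
  done
}

show_groups '1234-7654321 Characters - more chars,1234'
# \1 = [1234]
# \2 = [7654321]
# \3 = [ Characters ]
# \4 = [ more chars]
# \5 = [1234]
```

The brackets make it visible that \3 and \4 keep the spaces around the dash, which is why the converted line has blanks next to some of the | separators.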
 