Extracting string from file with variable length records

Status
Not open for further replies.

amkipnis

Programmer
Apr 15, 2003
21
US
Hello,
I have a file with variable-length records. I need to extract the 7-byte pattern "P??????", concatenate it with ".txt", and save the result to a file.
How can I do this with awk or ksh? Any help is greatly appreciated.
Here are some "rules":
Each field can contain spaces.
The pattern's column position varies from record to record.
"/" is the delimiter between fields.

Thanks in advance!

Here is an input file:

/Folder1/Next Directory/Productivity-LA/New Money/HFC & MBNA/Weekly/P6A3103_02-15-2004.roi
/Folder1/Next Directory/Affiliate/HCS/New Money/Weekly/P6A3301_02-15-2004.roi
/Folder1/Next Directory/Affiliate/HRS/New Money/Weekly/P6A3302_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Get Smart/New Money/Weekly/P6A3303_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Folder1 Direct/MBNA All/New Money/Weekly/P6A3502_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Folder1 Direct/MBNA Prior/New Money/Weekly/P6A3503_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Folder1 Direct/MBNA New/New Money/Weekly/P6A3504_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Folder1 Direct/MBNA No/New Money/Weekly/P6A3505_02-15-2004.roi
/Folder1/Next Directory/Affiliate/BET/New Money/Weekly/P6A3601_02-15-2004.roi

Here is the output I would like to get:
P6A3103.txt
P6A3301.txt
P6A3302.txt
P6A3303.txt
P6A3502.txt
P6A3503.txt
P6A3504.txt
P6A3505.txt
P6A3601.txt

Thanks!
 
Something like this?
Code:
sed -n '
s!.*/\(P......\)_.*!\1.txt!p
' </path/to/inputfile >output
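
For anyone reading along, here is a small sketch of how that substitution behaves on two records from the question. The variable names `input` and `result` are only illustrative, and the records are fed in with printf instead of being read from a file.

```shell
# Two sample records taken from the input file in the question.
input='/Folder1/Next Directory/Affiliate/HCS/New Money/Weekly/P6A3301_02-15-2004.roi
/Folder1/Next Directory/Affiliate/Get Smart/New Money/Weekly/P6A3303_02-15-2004.roi'

# .*/          greedy: consumes everything up to the last "/" that still allows a match
# \(P......\)  captures "P" plus exactly six more characters
# _.*          the underscore and the rest of the line are matched and thrown away
# -n with the p flag prints only lines where the substitution succeeded.
result=$(printf '%s\n' "$input" | sed -n 's!.*/\(P......\)_.*!\1.txt!p')

printf '%s\n' "$result"
```

Spaces inside the directory names do not matter here, because `.` matches any character including a space.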

Hope this helps, PH.
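
Since the question also mentions awk, here is a rough awk sketch of the same idea. It assumes the pattern always starts the last "/"-separated field, which holds for the sample records but is not stated as a rule in the question. The names `input` and `awk_result` are illustrative.

```shell
# Two sample records taken from the input file in the question.
input='/Folder1/Next Directory/Productivity-LA/New Money/HFC & MBNA/Weekly/P6A3103_02-15-2004.roi
/Folder1/Next Directory/Affiliate/BET/New Money/Weekly/P6A3601_02-15-2004.roi'

# -F/ splits each record on "/", so spaces inside a field are harmless.
# $NF is the last field (assumed to be the file name);
# substr keeps its first 7 characters, and ".txt" is appended.
awk_result=$(printf '%s\n' "$input" |
    awk -F/ '$NF ~ /^P/ { print substr($NF, 1, 7) ".txt" }')

printf '%s\n' "$awk_result"
```

Unlike the sed version, this one does not check for the underscore after the seventh character. It simply cuts the last field at a fixed width.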
 
I had been using your solution for a while until I ran into a different problem. Here is the list:
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Next Directory/Pending/P0I5002_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Next Directory/PendAppr/P0I5003_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Next Directory/Review/P0I5004_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Direct Lending/Pending/P0I5006_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Direct Lending/PendAppr/P0I5007_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Direct Lending/Review/P0I5008_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Canada/PendAppr/P0I5011_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Canada/Review/P0I5012_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/HHoldDirect/Pending/P0J01A10_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/HHoldDirect/PendAppr/P0J01A11_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/HHoldDirect/Review/P0J01A12_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/US/PendAppr/P0J01A3_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/US/Review/P0J01A4_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/Canada/Approved/P0J01A5_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/Canada/Pending/P0J01A6_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/Canada/PendAppr/P0J01A7_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/Canada/Review/P0J01A8_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Control/Rescission/PendFundSwChg/Daily/P0K7201PFSC_03-25-2004.roi

After executing slightly modified original command:
sed -n 's!.*/\(P......*\)_.*!\1.txt!p'

I receive the following result:

P0I5002.txt
P0I5003.txt
P0I5004.txt
P0I5006.txt
P0I5007.txt
P0I5008.txt
P0I5011.txt
P0I5012.txt
P0J01A10.txt
P0J01A11.txt
P0J01A12.txt
P0J01A3.txt
P0J01A4.txt
P0J01A5.txt
P0J01A6.txt
P0J01A7.txt
P0J01A8.txt
P0K7201PFSC.txt

As you can see, I got 4 extra alphabetic characters in the last record. It should say P0K7201.txt instead.

I appreciate your help! Thanks. Alex.
 
Alex,

The extra wildcard (*) in the command:

sed -n 's!.*/\(P......*\)_.*!\1.txt!p'

lets the last "." repeat any number of times, so the captured group runs from the P all the way to the underscore. If you just want the first 6 characters after the P, you have to remove the extra asterisk.

sed -n 's!.*/\(P......\)_.*!\1.txt!p'
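
A quick way to compare the two forms is to run both on three records from Alex's list. This is only a sketch, and the variable names `greedy` and `strict` are illustrative.

```shell
# Three records from the second list in this thread.
input='/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Canada/Review/P0I5012_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/HHoldDirect/Pending/P0J01A10_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Control/Rescission/PendFundSwChg/Daily/P0K7201PFSC_03-25-2004.roi'

# With the extra asterisk: "P", five characters, then any run of characters
# up to the underscore, so the whole name before "_" is captured.
greedy=$(printf '%s\n' "$input" | sed -n 's!.*/\(P......*\)_.*!\1.txt!p')

# Without it: "P" plus exactly six characters, immediately followed by "_".
strict=$(printf '%s\n' "$input" | sed -n 's!.*/\(P......\)_.*!\1.txt!p')

printf 'greedy:\n%s\n' "$greedy"
printf 'strict:\n%s\n' "$strict"
```

On these records the form without the asterisk prints only P0I5012.txt. The other two names have more than seven characters before the underscore, so the substitution does not match and `-n` suppresses those lines.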

John
 
John, you gave me a clue that helped me to build the correct command:
sed -n 's!.*/\(P......[0-9]*\).*!\1.txt!p'

Thank you for your help!
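
As a check, here is a sketch that runs the final command on four records from the list above. The names `input` and `final` are illustrative.

```shell
# Four records from the second list in this thread.
input='/Folder1/Next Directory/Operational/Pricing/Pricing Matrices/Next Directory/Pending/P0I5002_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/HHoldDirect/Pending/P0J01A10_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/SpecialRules/US/PendAppr/P0J01A3_03-25-2004.roi
/Folder1/Next Directory/Operational/Pricing/Control/Rescission/PendFundSwChg/Daily/P0K7201PFSC_03-25-2004.roi'

# \(P......[0-9]*\) captures "P", six more characters, then any trailing digits.
# The trailing .* discards whatever follows: letters such as PFSC, the
# underscore, the date, and the extension.
final=$(printf '%s\n' "$input" | sed -n 's!.*/\(P......[0-9]*\).*!\1.txt!p')

printf '%s\n' "$final"
```

This form no longer anchors on the underscore, so it relies on the file name being the last path component that starts with P, as it is in every record shown in this thread.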
 