nawk - how to remove comma from "26 Apr, 2007"?

Status
Not open for further replies.

kornShellScripter

Programmer
Apr 27, 2007
24
GB
Hi,

I have the following input:

* Report Title ******** Thu 26 Apr, 2007 6:00 AM *
<rest of report snipped>

I need to extract the date from this line.

I'm using:

OUT_TEXT_DATE=$(head -1 $FILE_NAME| nawk '{print $6, $7, $8}')

Which produces:
"26 Apr, 2007"

I'm looking for an elegant(1) way to remove the "," such that the output is:
"26 Apr 2007"
(1) That is, rather than splitting the fields up and using sed to remove it, which would make the script uglier.

Regards.
 
Hi

And what can appear in place of '********'? It may break your field-numbering assumption.
Code:
OUT_TEXT_DATE=$(nawk 'NR==1{sub(/,/,"",$7);print $6, $7, $8;exit}' $FILE_NAME)

Feherke.
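
To try this without the real report, the sample header from the question can be written to a scratch file first. This is only a sketch: it assumes the header looks exactly as posted (so the date sits in fields 6 to 8), uses plain awk in place of nawk, and relies on mktemp being available for the temporary file.

```shell
# Build a two-line stand-in for the report (contents copied from the question).
FILE_NAME=$(mktemp)
printf '%s\n' \
    '* Report Title ******** Thu 26 Apr, 2007 6:00 AM *' \
    '<rest of report snipped>' > "$FILE_NAME"

# Strip the comma from field 7 only, print the date fields, then stop reading.
OUT_TEXT_DATE=$(awk 'NR==1 { sub(/,/, "", $7); print $6, $7, $8; exit }' "$FILE_NAME")

rm -f "$FILE_NAME"
echo "$OUT_TEXT_DATE"
```

Because of the exit, awk never looks at the rest of the report, which is why no head is needed.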
 
One solution I've found:

OUT_TEXT_DATE=$(head -1 $FILE_NAME | nawk 'FS="[ ,]+"{print $10, $11, $12}')

but I'd ideally like something like this:
OUT_TEXT_DATE=$(head -1 $FILE_NAME | nawk '{print $10, subString($11,",",""), $12}')


hmm... now even that looks awful... Maybe my FS solution is the best I can get.

I'm still keen to see other people's ideas.

Thanks.
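
One caveat about assigning FS in the pattern position: POSIX awk splits a record with the FS that was in force when the record was read, so an assignment made there is only guaranteed to affect the following line (some older awks apply it straight away). Setting the separator with -F before any input is read sidesteps the question. A sketch against the sample line from the first post, where the date lands in fields 6 to 8 rather than 10 to 12; plain awk is assumed in place of nawk, and header_line is just an illustrative variable name.

```shell
# Sample header line as posted in the question (assumed layout).
header_line='* Report Title ******** Thu 26 Apr, 2007 6:00 AM *'

# Any run of spaces and commas counts as one separator, so "Apr," yields "Apr".
OUT_TEXT_DATE=$(printf '%s\n' "$header_line" |
    awk -F'[ ,]+' '{ print $6, $7, $8 }')

echo "$OUT_TEXT_DATE"
```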
 
Oh hi Feherke, our posts crossed.

************ is the report's header - literally a string of asterisk characters

So it's something like this (if I only knew how to post the following in fixed-pitch font):

* Report Name *********************** Thu 26 Apr, 2007 *
* *
* DATA INPUT REPORT *
* *
********************************************* Page 1 *
<rest of the report goes here>


They're not my reports - I'm just given them as input so I have no control over them.
 
Hi

There is a solution close to what you are thinking of, but it is [tt]gawk[/tt] only...
Code:
OUT_TEXT_DATE=$(head -1 $FILE_NAME | gawk '{print $10, gensub(/,/,"",1,$11), $12}')
kornShellScripter said:
if I only knew how to post the following in fixed-pitch font
Enclose it between [tt][ignore][tt][/ignore][/tt] and [tt][ignore][/tt][/ignore][/tt] tags according to TGML :

[tt]* Report Name *********************** Thu 26 Apr, 2007 *
* *
* DATA INPUT REPORT *
* *
********************************************* Page 1 *[/tt]

Feherke.
 
Hi

kornShellScripter, why not use a [tt]ksh[/tt] solution?
Code:
s="`line < $FILE_NAME`"
s="${s%  *}"
s="${s#*  *}"
OUT_TEXT_DATE="${s/,/}"
Note that the last line requires [tt]ksh[/tt] 93.

Feherke.
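
Where ksh 93 is not available, the comma can still be dropped with the older prefix and suffix expansions. A sketch only, assuming the date has already been isolated in s (the value below is typed in by hand rather than cut from a report):

```shell
s='26 Apr, 2007'

case $s in
    *,*) OUT_TEXT_DATE="${s%%,*}${s#*,}" ;;   # text before the first comma + text after it
    *)   OUT_TEXT_DATE=$s ;;                  # no comma present: keep the value unchanged
esac

echo "$OUT_TEXT_DATE"
```

The case guard matters: without a comma in s, both expansions would return the whole string and the result would be doubled.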
 
Another solution, which I think is closer to what I want:

nawk 'sub(",",""){print $6, $7, $8}'

Got this function from the O'Reilly "sed & awk" book by Dale Dougherty
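
As far as I can tell this works because sub() with two arguments edits $0, after which awk re-splits the fields, so $7 no longer carries the comma. Used as a pattern, it also means that only lines which actually contained a comma get printed. A sketch with two header lines taken from this thread, using plain awk in place of nawk:

```shell
# The first line contains a comma, the second does not.
OUT_TEXT_DATE=$(printf '%s\n' \
    '* Report Title ******** Thu 26 Apr, 2007 6:00 AM *' \
    '* DATA INPUT REPORT *' |
    awk 'sub(",", "") { print $6, $7, $8 }')

echo "$OUT_TEXT_DATE"
```

Only the first line produces output, so on a full report it would still be wise to keep head -1 or add an NR==1 condition.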


 
Off topic, but Feherke, congrats on being made 'Tipmaster Of The Week'. A well-earned accolade I'm sure!

I want to be good, is that not enough?
 
How about using the tr command? Its -d option deletes every occurrence of the characters you specify, which in this case would be the comma.
 
Perhaps like this. Not tested.
Code:
OUT_TEXT_DATE=$(head -1 $FILE_NAME| nawk '{print $6, $7, $8}')

OUT_TEXT_DATE=`echo $OUT_TEXT_DATE`| tr -d ','
 
Hi

Certainly not like that. Maybe :
Code:
OUT_TEXT_DATE=`echo $OUT_TEXT_DATE | tr -d ','`
But personally I would not involve another tool. That [tt]head[/tt] is also avoidable. [tt]awk[/tt] itself is very versatile, and it is usually faster to let it do as much as it can.

Feherke.
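
For completeness, here is the tr route with the variable quoted and $( ) in place of backticks. This is a sketch in which the starting value is typed in by hand; note that tr -d removes every comma in the string, not just the first one.

```shell
OUT_TEXT_DATE='26 Apr, 2007'

# The whole pipeline sits inside the command substitution, so tr's output is what gets assigned.
OUT_TEXT_DATE=$(printf '%s\n' "$OUT_TEXT_DATE" | tr -d ',')

echo "$OUT_TEXT_DATE"
```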
 