Print out first 3 strings in a specified column

Not open for further replies.


Aug 4, 1999
Separating columns via multiple spaces...

awk -F '[[:space:]][[:space:]]+' '{ print $1, $7"/"$6, $12 }'

Each column of data holds multiple values separated by single spaces.

Not sure how to print out only the first 3 items in $12...

Any ideas?


Joe Despres

So the fields are separated by two or more whitespace characters and sub-fields are separated by single whitespace characters.

The big question here is whether you need to preserve the original whitespace, that is, to keep a space in the output where there was a space in the input and a tab where there was a tab.

If you do not need to preserve the whitespace as it was, it is simple:
awk -F '[[:space:]][[:space:]]+' '{ split($12, m, /[[:space:]]/); print $1, $7"/"$6, m[1], m[2], m[3] }'

If you need to preserve the whitespace, it gets a bit complicated, but it is still reasonably simple if it only has to work in GNU Awk:
awk -F '\\s\\s+' '{ print $1, $7"/"$6, gensub(/(\S+\s\S+\s\S+).*/, "\\1", "", $12) }'

# or

awk -F '\\s\\s+' '{ match($12, /\S+\s\S+\s\S+/, m); print $1, $7"/"$6, m[0] }'

If you need to preserve the whitespace and be portable (or at least work with something other than GNU Awk):
awk -F '[[:space:]][[:space:]]+' '{ match($12, /[^[:space:]]+[[:space:]][^[:space:]]+[[:space:]][^[:space:]]+/); print $1, $7"/"$6, substr($12, RSTART, RLENGTH) }'

( As you can see, in GNU Awk you can use \s for [[:space:]] and \S for [^[:space:]]. That also works in original-awk ( available on Ubuntu, not sure about its origin ), but not in Mawk. There the closest alternative would be [ \t] for [[:space:]] and [^ \t] for [^[:space:]]. )
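
A small worked example may make the portable variant easier to try. This is only a sketch under assumptions: the function name first_three_words and the two-column sample input are made up for illustration, and the real data would use $12 rather than $2. It sticks to POSIX bracket classes because \s and \S are extensions (to my knowledge they only arrived in GNU Awk 4.0). It also guards the case where the field has fewer than three words, where match() fails and substr() would otherwise print nothing.

```shell
#!/bin/sh
# Sketch: keep the first three words of column 2, preserving the original
# whitespace between them. Columns are separated by two or more blanks.
# Hypothetical helper; the thread itself works on $12 instead of $2.
first_three_words() {
    awk -F '[[:space:]][[:space:]]+' '{
        # match() sets RSTART and RLENGTH when three words are found
        if (match($2, /[^[:space:]]+[[:space:]][^[:space:]]+[[:space:]][^[:space:]]+/))
            print $1, substr($2, RSTART, RLENGTH)
        else
            print $1, $2    # fewer than three words: print the field as is
    }'
}

# Demo: the tab between "alpha" and "beta" survives in the output.
printf 'job1  alpha\tbeta gamma delta\n' | first_three_words
```

Falling back to the whole field when there are fewer than three words is a design choice of this sketch, not something the thread specifies.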

didn't work....

I do like the ::---> '\\s\\s+'


Joe Despres

Joe said:
didn't work....
Sorry to hear that. Could you post some sample input and expected output? And specify which Awk implementation / version you are using.

awk -W version
GNU Awk 3.1.8

Using awk on an Avamar system :)

#### Here's the raw output from the mccli command ::--->
9145091880251509 Completed w/Exception(s) 10010      2015-12-23 20:00 EST 00h:59m:07s 2015-12-23 20:59 EST Scheduled Backup   6.2 TB         0.1%      yyy.com /xxxx Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-23 20:00 EST 2015-12-24 08:00 EST 00h:00m:36s  /xxxx/Windows 2008                Windows File System Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450918802270                        Avamar N/A
9145091880251709 Completed w/Exception(s) 10010      2015-12-23 20:59 EST 00h:05m:19s 2015-12-23 21:05 EST Scheduled Backup   42.8 GB        0.8%      yyyy.com /xxxx  Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-23 20:00 EST 2015-12-24 08:00 EST 00h:59m:45s  /xxxx/Windows 2008                Windows VSS         Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450918802270                        Avamar N/A
9145083240268209 Completed w/Exception(s) 10010      2015-12-22 22:11 EST 00h:48m:34s 2015-12-22 23:00 EST Scheduled Backup   6.2 TB         0.1%      yyyy.com /xxxx  Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-22 20:00 EST 2015-12-23 08:00 EST 02h:11m:46s  /xxxx/Windows 2008                Windows File System Retention_xxxx   D         xxxx Windows /xxxx/Windows_2008                       xxxx Windows-Windows 2008-1450832402416                        Avamar N/A

#### Output desired ::--->
9145091880251509 Completed w/Exception(s) /xxxx/yyy.com Windows File System
9145091880251709 Completed w/Exception(s) /xxxx/yyy.com Windows VSS
9145083240268209 Completed w/Exception(s) /xxxx/yyy.com Windows File System

Basically I want to check for exceptions from yesterday's backup results... Will apply this same info to the failures as well..


Joe Despres

Then the field separator theory does not seem good enough:
... yyy.com /xxxx Windows Server 2008 ...     ( only one space after /xxxx )
... yyyy.com /xxxx  Windows Server 2008 ...   ( two spaces after /xxxx )
... yyyy.com /xxxx  Windows Server 2008 ...   ( two spaces after /xxxx )
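
A quick way to spot this kind of mismatch is to print the field count per line with the same separator. The sketch below is an illustration only: count_fields is a hypothetical helper and the two input lines are shortened excerpts, not the full mccli rows.

```shell
#!/bin/sh
# Sketch: report how many fields each line has when columns are split on
# two or more whitespace characters. Lines with differing counts reveal
# places where a column is separated by a single space only.
count_fields() {
    awk -F '[[:space:]][[:space:]]+' '{ print NF }'
}

# Demo on two shortened excerpts: prints 1, then 2.
printf '%s\n' 'yyy.com /xxxx Windows Server 2008' \
              'yyyy.com /xxxx  Windows Server 2008' | count_fields
```

If every row reports the same NF, the separator theory holds; here the counts differ.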

As you have GNU Awk, I would say we had better use the match() function to collect the needed pieces. ( match()'s 3rd parameter is a GNU extension. )

But with only limited information about the input ( I assume those "xxxx" are placeholders for sensitive data ), the regular expression would get quite long. So I would suggest an off-topic solution: Perl, because its regular expressions support non-greedy quantifiers.
perl -ne 'print "$1 $3/$2 $4\n" if /^(.+?)\s+\d+\s+\d{4}-\d{2}-\d{2}.+?\s(\w+\.\w+)\s+(\/\w+).+\s{2,}\/\w+\/.+?\s{2,}(.+)\s+Retention/'

Actually the emphasis is on non-greedy quantifiers, so any tool/language with PCRE would do it.
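
For anyone limited to awk, the same anchors can be approximated without non-greedy quantifiers. The sketch below is not from the thread and rests on assumptions read off the posted sample: the status ends at the first all-digit word (the error code), the client and domain directly follow the column ending in "%", and the retention policy name starts with "Retention_". The function name summarize_exceptions is hypothetical.

```shell
#!/bin/sh
# Sketch only: anchors on tokens seen in the posted sample rows.
summarize_exceptions() {
    awk '
    {
        # Status: every word between the ID and the first all-digit word.
        status = ""
        for (k = 2; k <= NF && $k !~ /^[0-9]+$/; k++)
            status = status (status == "" ? "" : " ") $k

        # Client and domain follow the percentage column.
        client = ""; domain = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { client = $(i + 1); domain = $(i + 2); break }

        # Plug-in: the single-spaced words just before "Retention_",
        # preceded by a run of at least two spaces.
        plugin = ""
        if (match($0, /  [^ ]+( [^ ]+)* +Retention_/)) {
            plugin = substr($0, RSTART + 2, RLENGTH - 2)
            sub(/ +Retention_$/, "", plugin)
        }

        print $1, status, domain "/" client, plugin
    }'
}

# Demo on the second sample row, cut off after the "D" column.
line='9145091880251709 Completed w/Exception(s) 10010      2015-12-23 20:59 EST 00h:05m:19s 2015-12-23 21:05 EST Scheduled Backup   42.8 GB        0.8%      yyyy.com /xxxx  Windows Server 2008 R2 Enterprise Server Edition (No Service Pack) 64-bit 7.0.102-47     2015-12-23 20:00 EST 2015-12-24 08:00 EST 00h:59m:45s  /xxxx/Windows 2008                Windows VSS         Retention_xxxx   D'
printf '%s\n' "$line" | summarize_exceptions
# prints: 9145091880251709 Completed w/Exception(s) /xxxx/yyyy.com Windows VSS
```

Because the plug-in pattern cannot cross a run of two or more spaces, it picks up only the words immediately before the retention column, which is what the non-greedy Perl pattern achieves. If the real retention policy names do not start with "Retention_", that anchor has to change.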

Hey Feherke.....

That didn't work :(

Thanks! You shouldn't work on this any more...

Joe Despres

Well, it works for the sample input... I suppose the issue is with those "xxxx", which I try to match with \w+. If they contain non-word characters, those will break the matching.

Yeah, xxxx is just alphabet characters


Joe Despres
I totally forgot! mccli command can output xml!

      <StartTime>2015-12-26 20:11 EST</StartTime>
      <EndTime>2015-12-26 20:18 EST</EndTime>
      <Type>Scheduled Backup</Type>
      <ProgressBytes>22.7 GB</ProgressBytes>
      <OS>Windows Server 2008 R2 Enterprise Server Edition Service Pack 1 64-bit</OS>
      <Sched.StartTime>2015-12-26 20:00 EST</Sched.StartTime>
      <Sched.EndTime>2015-12-27 08:00 EST</Sched.EndTime>
      <Plug-In>Windows VSS</Plug-In>

Each backup generates one set of this...

All I really need is to strip out all the tags and put the data on one line separated by a comma

Joe Despres

Joe said:
All I really need is to strip out all the tags and put the data on one line separated by a comma
May I suggest another off-topic solution for that ? XMLStarlet :
xmlstarlet sel -t -m //Row -v ID -o , -v Status -o , -v Errorcode -o , -v Domain -o / -v Client -o , -v Plug-In -n
( Although I am not sure where the commas come into the picture, as until now the separators were spaces. )

Bummer...... I don't have "xmlstarlet" installed :(

#### This seems to work ::--->
raw-quickc () {
    export MCCLI=/usr/local/avamar/bin/mccli
    export BIN=/home/admin/bin
    echo "ID,Status,ErrorCode,StartTime,Elapsed,EndTime,Type,ProgressBytes,NewBytes,Client,Domain,OS,ClientRelease,Sched.StartTime,Sched.EndTime,ElapsedWait,Group,Plug-In,RetentionPolicy,Retention,Schedule,Dataset,WID,Server,Container"
    $MCCLI activity show --completed=true --verbose --xml | sed -n '/<Row/,/<\/Row/p'| sed 's/<\/\?[^>]\+>//g'|awk '{$1=$1}1'|awk -f $BIN/ONE_Line.awk|sed 's/\&amp\;lt\;//g'
}

ugly enough to back a buzzard off a gut wagon!

#### ONE_Line.awk ::--->
BEGIN { RS = ""; FS = "\n"; ORS = "" }   # one record per blank-line-separated block, one field per line
{
        x = 1
        while ( x<NF ) {
                print $x ","; x++
        }
        print $NF "\n"
}
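
The sed | sed | awk | awk chain above can probably be folded into a single awk program. What follows is a sketch, not a tested replacement for the Avamar output: it assumes each record sits between a <Row> line and a </Row> line with one element per line, as the sed range and the XML excerpt earlier in the thread suggest. The function name rows_to_csv is hypothetical, and the entity clean-up done by the last sed is left out.

```shell
#!/bin/sh
# Sketch: turn <Row>...</Row> blocks into one comma-separated line each.
# Assumes one XML element per line inside a Row.
rows_to_csv() {
    awk '
    /<Row[ >]/ { inrow = 1; n = 0; line = ""; next }        # record starts
    /<\/Row>/  { if (inrow) print line; inrow = 0; next }   # record ends
    inrow {
        val = $0
        gsub(/<[^>]*>/, "", val)           # drop opening and closing tags
        gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim surrounding blanks
        if (n++) line = line ","           # comma before every value but the first
        line = line val
    }'
}

# Demo with three elements taken from the XML excerpt in the thread.
rows_to_csv <<'EOF'
<Row>
      <StartTime>2015-12-26 20:11 EST</StartTime>
      <Type>Scheduled Backup</Type>
      <Plug-In>Windows VSS</Plug-In>
</Row>
EOF
# prints: 2015-12-26 20:11 EST,Scheduled Backup,Windows VSS
```

Values that themselves contain commas would need quoting before this output could be treated as real CSV.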

My next goal is to grep out part of a column :)


Joe Despres