printing columns

Guest_imported · Mar 10, 2002

Hello,

I want to search for a word in a file but I also want a clean output. So if I search for the word 'the' in a file I don't just want the records but I want something like this:

the guy thinks she's crazy
He's trying to hit the ball
I think the doctor is coming

Can somebody help me with that?
Thanx
Col81

aigles · Mar 11, 2002

awk -f Format.awk -v WORD=the file

--- Format.awk ---
BEGIN {
    pattern = "(^|" IFS ")" WORD "($|" IFS ")" ;
}
$0 ~ pattern {
    print " ",$0 ;
    next ;
}
{
    print
}
--- End of Format.awk ---

Jean Pierre.

Guest_imported · Mar 11, 2002

I'm sorry, my post didn't come out the way I wanted. I want the words 'the' lined up neatly under each other, like so:

  aaa the bbbbb
      the ccc
ddddd the e
(maybe it still doesn't look right)

I don't know what went wrong but aigles' script doesn't do anything with my data file.

Col81

aigles · Mar 11, 2002

Hi,

In the previous script, replace IFS with FS (IFS is for ksh, not for awk).
In any case, the script doesn't format the output the way you want.
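
A quick way to see the difference, as a minimal sketch (the function name show_vars is only illustrative): awk has no built-in IFS, so it is an ordinary unset variable that expands to the empty string, while FS defaults to a single space.

```shell
# IFS means nothing special to awk; FS is the real field separator.
show_vars() {
    awk 'BEGIN { printf "IFS=[%s] FS=[%s]\n", IFS, FS }'
}
show_vars
# prints: IFS=[] FS=[ ]
```

With IFS empty, the pattern built in Format.awk no longer requires a separator around WORD.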

What must the script do?

For each input line
    Find the word position
    If the word is in the line, then memorize the line, the position, and the maximum position
end

For all memorized lines
    Compute the spaces to print before the line text
    Print the spaces and the line
end

Find the word position in the input line:

With gawk, no problem :
[tt]
pattern = "\\<" WORD "\\>" ; # WORD is word
pos = match($0, pattern) ; # Pos != 0 if found
[/tt]
In other awk implementations:
[tt]
pattern1 = FS WORD "(^|" FS ")" ;  # WORD in middle/end of line
pattern2 = "^" WORD "(^|" FS ")" ; # WORD at start of line
pos = match($0, pattern1) ;        # Search WORD in the line
if (pos > 0)                       # If found
    pos += 1 ;                     #   skip FS
else                               # else
    pos = match($0, pattern2) ;    #   search WORD at start of line
[/tt]
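
As a runnable check of the two-pattern search above, here is a small sketch wrapped in a shell function (find_word is a hypothetical name, and the default FS of one space is assumed). Note that it uses "($|" FS ")" after WORD, meaning "end of line or separator", because a "^" can never match after the word.

```shell
# find_word WORD: for each input line, print the column where WORD starts
# as a whole word, or 0 if the line does not contain it.
find_word() {
    awk -v WORD="$1" '{
        pos = match($0, FS WORD "($|" FS ")")       # WORD in middle/end of line
        if (pos > 0)
            pos += 1                                # match starts on the separator
        else
            pos = match($0, "^" WORD "($|" FS ")")  # WORD at start of line
        print pos
    }'
}
printf '%s\n' "the guy" "other theory" "hit the ball" | find_word the
# prints 1, 0 and 5 on separate lines
```

The second line gives 0 because "the" only occurs inside "other" and "theory", never as a word of its own.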

Memorize line, word position, maximum position:
[tt]
if (pos != 0) { # If found
Line[++Cnt] = $0 ; # Memorize line (Cnt=line count)
Pos[Cnt] = pos ; # Memorize word position in line
if (pos > MaxPos) # Set max position
MaxPos = pos ; #
} #
[/tt]
Compute spaces to print before line text:
[tt]
for (i=1; i<= MaxPos; i++)    # Build the maximum spaces string
    spaces = spaces " " ;     #
SpacesCnt = MaxPos-Pos[l] ;   # Space count to print for line l
[/tt]
Print spaces and line:
[tt]
printf substr(spaces,1,MaxPos-Pos[l]) ;
print Line[l] ;
[/tt]

Now, the complete script :

--- AlignWord.awk (gawk version) ---
BEGIN {
    pattern = "\\<" WORD "\\>" ;
}
{
    pos = match($0, pattern) ;
    if (pos != 0) {
        Line[++Cnt] = $0 ;
        Pos[Cnt] = pos ;
        if (pos > MaxPos)
            MaxPos = pos ;
    }
}
END {
    for (i=1; i<= MaxPos; i++)
        spaces = spaces " " ;
    for (l=1; l<=Cnt; l++) {
        printf substr(spaces,1,MaxPos-Pos[l]) ;
        print Line[l] ;
    }
}
--- End of AlignWord.awk ---

awk -v WORD=the -f AlignWord.awk file

Jean Pierre.
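
For awks that lack gawk's \< and \> operators, the same two-pass idea can be sketched portably. This is only an illustration under stated assumptions: align_word is a made-up name, words are assumed to be separated by blanks or tabs, and the line is padded with a blank on each side so that a single pattern covers the start, middle and end of the line.

```shell
# align_word WORD: print the lines containing WORD, shifted right so that
# the first occurrence of WORD starts in the same column on every line.
align_word() {
    awk -v WORD="$1" '
    {
        # In the padded string the match starts on the blank before WORD,
        # which is exactly the column of WORD in the original line.
        pos = match(" " $0 " ", "[ \t]" WORD "[ \t]")
        if (pos > 0) {
            line[++n] = $0
            at[n] = pos
            if (pos > max) max = pos
        }
    }
    END {
        for (i = 1; i <= n; i++) {
            pad = ""
            for (k = at[i]; k < max; k++)   # one blank per missing column
                pad = pad " "
            print pad line[i]
        }
    }'
}
printf '%s\n' "aaa the bbbbb" "the ccc" "ddddd the e" | align_word the
```

On the three sample lines this pads the first line with two blanks and the second with six, and leaves the third unchanged.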

Guest_imported · Mar 11, 2002

Thank you aigles, your script works all right.
Do you perhaps also know how I can get the left and right context into variables, say left and right, so that I can use a printf statement:

printf("%30s %s %s\n", left, pattern, right)

I know your solution works very well but I want to know whether I could deal with it this way.

col81

aigles · Mar 12, 2002

Hi,

First a correction ...

In other awk implementations:

pattern1 = FS WORD "($|" FS ")" ;  # WORD in middle or end of line
pattern2 = "^" WORD "($|" FS ")" ; # WORD at start of line

Now, another version of the script, using a function that searches for a WORD and returns its position and the left (RLEFT) and right (RRIGHT) context.

When calling the script, define the variables:
WORD = Word to search for
COL = Col where to print the WORD

--- AlignWord2.awk (gawk version) ---
BEGIN {
    if (WORD == "")
        exit ;
    if (COL == 0 ) COL = 10 ;
    COL -= 1 ;
}
# Search String for Word as a whole word. Returns its position (0 if absent)
# and sets RLEFT/RRIGHT to the text before/after it.
function WordMatch(String, Word,    ere1, ere2, wpos, wlen) {
    ere1 = FS Word "($|" FS ")" ;
    ere2 = "^" Word "($|" FS ")" ;
    wlen = length(Word) ;
    wpos = match(String, ere1) ;
    if (wpos != 0)
        wpos += 1 ;
    else
        wpos = match(String, ere2) ;
    if (wpos > 0) {
        RLEFT = substr(String, 1, wpos-1) ;
        RRIGHT = substr(String, wpos+wlen) ;
    } else {
        RLEFT = String ;
        RRIGHT = "" ;
    }
    RSTART = wpos ;
    RLENGTH = wlen ;
    return wpos ;
}
{
    if (WordMatch($0, WORD) > 0) {
        RLEFT = substr(RLEFT, RSTART-COL) ;
        printf "%*s%s%s\n",COL,RLEFT,WORD,RRIGHT ;
    }
}
--- End of AlignWord2.awk ---

awk -v WORD=the -v COL=12 -f AlignWord2.awk file
Jean Pierre.
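
If the goal is simply to have left and right in variables for the printf statement asked about, a field loop is another way to get there. This is a hedged sketch rather than the script above: kwic is a hypothetical name, the field width of 30 comes from the question, and joining the fields with single blanks means the original spacing of the line is not preserved.

```shell
# kwic WORD: print the first occurrence of WORD on each line with its left
# context right-justified in 30 columns, followed by the right context.
kwic() {
    awk -v WORD="$1" '{
        for (i = 1; i <= NF; i++) {
            if ($i != WORD) continue
            left = right = sep = ""
            for (j = 1; j < i; j++)       { left  = left sep $j ;  sep = " " }
            sep = ""
            for (j = i + 1; j <= NF; j++) { right = right sep $j ; sep = " " }
            if (length(left) > 30)        # keep only the last 30 characters
                left = substr(left, length(left) - 29)
            printf "%30s %s %s\n", left, $i, right
            break                         # first occurrence only
        }
    }'
}
printf '%s\n' "He's trying to hit the ball" "I think the doctor is coming" | kwic the
```

Because the comparison is done field by field, "the" inside "other" or "theory" is never matched, and no regular expression is needed.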

Guest_imported · Mar 13, 2002

Hi aigles,

can I bother you one last time with this?
Is it possible to adapt your script so that I can specify the number of words in the left and right contexts? (so even involve the previous and the next record).

input:

aaaa bbb cc ddd eee
fff g the hh ii jjj
lll mm

output (-v context=4):

ddd eee fff g the hh ii jjj lll

TIA
col81

aigles · Mar 14, 2002

That needs a different approach, because the right context requires lookahead.

The file is read word by word rather than line by line, using the LwrGetWord function. You get the left and right contexts by calling LwrGetLeft and LwrGetRight.

--- Lwr.awk ---
BEGIN {
    if (WORD == "")
        exit ;
    if (COL == 0 ) COL = 20 ;

    if (CONTEXT == 0 ) CONTEXT = 3 ;
    LwrLen = CONTEXT ;             # Words kept on each side
    LwrSiz = CONTEXT * 2 + 1 ;     # Circular buffer size
    LwrCnt = 0 ;                   # Words read so far
    LwrWrd = 0 ;                   # Words returned so far
    LwrFld = 0 ;                   # Current field in current record
}

# Read ahead so the buffer always holds LwrLen words past the current one
function LwrGetInput(   p, rc) {
    while (1) {
        if (++LwrFld > NF) {
            # Skip blank records: an empty word would look like end of input
            while ((rc = getline) > 0 && NF == 0) ;
            if (rc <= 0)
                $0 = "" ;          # End of input: feed empty words
            LwrFld = 1 ;
        }
        p = LwrCnt++ % LwrSiz ;
        LwrCtx[p] = $LwrFld ;
        if (LwrCnt > LwrLen) break ;
    }
}

function LwrGetWord(   p) {
    LwrGetInput() ;
    p = LwrWrd++ % LwrSiz ;
    return LwrCtx[p] ;
}

function LwrGetLeft(   i,lp,lc) {
    for (i=LwrLen; i>0; i--) {
        lp = (LwrWrd - i + LwrSiz -1) % LwrSiz ;
        lc = lc " " LwrCtx[lp] ;
    }
    sub("^ *","",lc) ;
    return lc
}

function LwrGetRight(   i,rp,rc) {
    for (i=1; i<=LwrLen; i++) {
        rp = (LwrWrd + i - 1) % LwrSiz ;
        rc = rc LwrCtx[rp] " ";
    }
    sub(" *$","",rc) ;
    return rc
}

{
    ll = COL - 1 ;
    W = LwrGetWord() ;
    while (W != "") {
        if (W == WORD) {
            left = LwrGetLeft() " " ;
            right = " " LwrGetRight() ;
            lleft = length(left) ;
            if (lleft > ll)
                left = substr(left, lleft-ll+1) ;
            printf "%*s%s%s\n",ll,left,W,right;
        }
        W = LwrGetWord() ;
    }
    exit ;
}
--- End of Lwr.awk ---

Call the awk script with these variable definitions:
-v WORD=word        the word you are looking for
-v COL=col          column where to align the word (def=20)
-v CONTEXT=context  size (in words) of left and right contexts (def=3)

awk -f Lwr.awk -v WORD=the -v CONTEXT=4 file

Jean Pierre.
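
When the file is small enough to hold in memory, the lookahead problem can also be sidestepped by collecting every word first and printing the context at the end. This swaps the circular buffer for a plain array, so it is a simpler sketch and not a replacement for Lwr.awk: the function name context is made up here, and the column alignment is left out.

```shell
# context WORD N: print each occurrence of WORD with up to N words on
# either side, crossing line boundaries.
context() {
    awk -v WORD="$1" -v CONTEXT="$2" '
    { for (i = 1; i <= NF; i++) w[++n] = $i }    # collect every word of the input
    END {
        for (k = 1; k <= n; k++) {
            if (w[k] != WORD) continue
            out = sep = ""
            for (j = k - CONTEXT; j <= k + CONTEXT; j++)
                if (j >= 1 && j <= n) {           # clip at start/end of input
                    out = out sep w[j]
                    sep = " "
                }
            print out
        }
    }'
}
printf '%s\n' "aaaa bbb cc ddd eee" "fff g the hh ii jjj" "lll mm" | context the 4
# prints: ddd eee fff g the hh ii jjj lll
```

The trade-off is memory: the circular buffer keeps only 2*CONTEXT+1 words at a time, while this version keeps all of them.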
