EXTRACT DATA FROM PARENTHESIS

Status
Not open for further replies.

andrew_121

Programmer
Feb 23, 2017
7
US
Hi,
I want to retrieve the data between parentheses, but I am getting errors when I run the awk command. I have 2 files and want to look up each pattern from the 1st file in the 2nd file.

pattern_file.txt
--------------
ABCD
PQRS
XYZ

INPUT FILE.TXT
----------------

CREATE TABLE ABCD
(
A,
B,
C
);
CREATE TABLE PQRS
(
P,
R
);

So when the 1st pattern (ABCD) is read and looked up in INPUT FILE.TXT, and a match is found, it should return the data between the parentheses:
A,
B,
C
The next pattern, PQRS, also matches and returns:

P,
R

Next comes XYZ, which has no match, so nothing is returned.

MY CODE
-------
for i in `cat pattern_file.txt`
do
awk '/'{print "$i"}'/{flag=1;next}/);/{flag=0}flag' INPUT_FILE.txt
done
 
Hi

Try this :
Code:
while IFS='' read -r pattern; do
    echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
        /\(/, /\)/ { if (flag && ! /[()]/) print }
        /\)/ { flag = 0 }
    ' pattern="$pattern" INPUT_FILE.TXT
done < pattern_file.txt

Next time please use TGML tags to make the code (see [tt][ignore][code][/ignore][/tt] .. [tt][ignore][/code][/ignore][/tt]) and input data (see [tt][ignore][pre][/ignore][/tt] .. [tt][ignore][/pre][/ignore][/tt]) easier to read. I would also appreciate it if you avoided posting the same question in multiple forums.

Feherke.
feherke.ga
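
Splicing a shell variable into the awk program text by closing and reopening quotes is fragile: in the one-liner from the question, the unquoted space after {print appears to end the program argument early, so awk only receives a fragment. A minimal sketch of the more robust route, handing the value to awk with -v (which behaves much like the pattern="$pattern" operand used above); the helper name lines_matching is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of stdin (or of the given files)
# that match the regular expression passed in as a shell value.
lines_matching() {
    pat=$1
    shift
    # -v assigns the awk variable before any input is read;
    # "$0 ~ pat" treats the variable's content as a dynamic regex.
    awk -v pat="$pat" '$0 ~ pat' "$@"
}

# Demo with a cut-down copy of the sample input from the question.
printf 'CREATE TABLE ABCD\n(\nA,\n);\nCREATE TABLE PQRS\n' | lines_matching PQRS
```

The awk program stays inside one pair of single quotes, so the shell never has to re-split it.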
 
I flagged the other post and D&D will probably delete it. andrew_121, for future reference, do not post the same question in multiple forums.

==================================
advanced cognitive capabilities and other marketing buzzwords explained with sarcastic simplicity


 
Sure, feherke, but I am new to AWK and ran into problems with this scenario. In the other forum, people closed the thread without listening to the actual problem.

I am also facing a problem when running your code here. It works fine with the example given, but it fails when I test it with real data. I have given sample files below (Input.txt and Pattern.txt). When I run the code, I get only the 1st record of both.

INPUT_FILE.Txt
[pre]
CREATE TABLE XXX_ALUEUTRAN_DB.ABC

(
ABC DATE NOT NULL,
ABC_123 VARCHAR(6) NOT NULL,
ABC_1234 VARCHAR(32) NOT NULL,
XYZ123 VARCHAR(32) NOT NULL,
AVC_ID VARCHAR(32) NOT NULL,
LTECELL_ID VARCHAR(32) NOT NULL,
ID VARCHAR(32) NOT NULL

);

CREATE TABLE XXX_ALUEUTRAN_DB.XYZ

(
START_DATE DATE NOT NULL,
START_TIME VARCHAR(6) NOT NULL,
SERVER_NAME VARCHAR(32) NOT NULL,
ENBEQUIPMENT_ID VARCHAR(32) NOT NULL,
ENB_ID VARCHAR(32) NOT NULL

); [/pre]

PATTERN_FILE.txt
[pre]
XXX_ALUEUTRAN_DB.ABC
XXX_ALUEUTRAN_DB.XYZ
[/pre]


Code:
while IFS='' read -r pattern; do
   # echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
        /\(/, /\)/ { if (flag && ! /[()]/) print }
        /\)/ { flag = 0 }
    ' pattern="$pattern" Input_File.txt > output.txt
done < PatternFile.txt

Expected Output:

[pre]
ABC,
ABC_123,
ABC_1234,
XYZ123,
AVC_ID,
LTECELL_ID,
ID


START_DATE,
START_TIME,
SERVER_NAME,
ENBEQUIPMENT_ID,
ENB_ID

[/pre]

What I currently get by running this code is only the 1st record of all files:
Code:
        ABC            DATE NOT NULL,
        START_DATE           DATE NOT NULL,

 
Hi

I see. The intermediate "(" and ")" are messing up the logic. The easiest fix is to handle "(" only when it is at the beginning of a line and ")" only when it is followed by a semicolon ( ; ):
Code:
$0 ~ pattern { flag = 1 }
/^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
/\);/ { flag = 0 }

Feherke.
feherke.ga
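
To see why the anchors matter: with the unanchored range /\(/, /\)/, a column line such as ABC_123 VARCHAR(6) NOT NULL contains a ")" of its own, so it ends the range and resets the flag right after the first column. A small self-contained sketch of the anchored version, run against a cut-down copy of the sample data; the wrapper name table_body is hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper around the anchored version: print the column lines
# of the table whose CREATE TABLE line matches the given pattern.
table_body() {
    pattern=$1
    shift
    awk -v pattern="$pattern" '
        $0 ~ pattern { flag = 1 }                        # table header seen
        /^\(/, /\);/ { if (flag && ! /^\(|\);/) print }  # body, minus delimiter lines
        /\);/        { flag = 0 }                        # end of this table
    ' "$@"
}

printf '%s\n' \
    'CREATE TABLE XXX_ALUEUTRAN_DB.ABC' \
    '(' \
    'ABC DATE NOT NULL,' \
    'ABC_123 VARCHAR(6) NOT NULL' \
    ');' \
    'CREATE TABLE XXX_ALUEUTRAN_DB.XYZ' \
    '(' \
    'START_DATE DATE NOT NULL' \
    ');' | table_body 'XXX_ALUEUTRAN_DB.ABC'
```

Both column lines of the first table should be printed, including the one that contains VARCHAR(6).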
 
[pre]

Hi Feherke, thanks, this works fine for my requirement. One more small request: based on the same pattern, I now need to grep from the
file below. There will be only 1 entry per pattern in the file, but it may come in different styles: on a single line, on 2 lines, or on 3 lines. The sample shows the same statement in all 3 formats.
The file will always contain exactly one statement for each pattern.

[/pre]

PRIME_FILE.TXT

Code:
ALTER TABLE XXX_ALUEUTRAN_DB.ABC ADD CONSTRAINT ABC_2_PK
	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);

ALTER TABLE XXX_ALUEUTRAN_DB.ABC ADD CONSTRAINT ABC_2_PK 	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);

ALTER TABLE XXX_ALUEUTRAN_DB.ABC 
    ADD CONSTRAINT ABC_2_PK
	PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);

I want to include this in the existing code and write the result to a separate file.

Code:
Expected Output
-----------------
(ABC,ABC_123,ABC_1234,ID)
I am not getting the expected result with grep.
Code:
Sample code

while IFS='' read -r pattern; do
   # echo "=== filtering for $pattern ==="
    awk '
        $0 ~ pattern { flag = 1 }
         /^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
         /\);/ { flag = 0 }
    ' pattern="$pattern" sa.txt >output.txt
##need to add a command which can get the output###
grep $pattern PRIME_FILE.TXT > prime_out.txt     ##single line command to get the output (ABC,ABC_123,ABC_1234,ID)
done < pat.txt
 
Hi

I would go with similar Awk code, using a range pattern:
Code:
awk '
    $0 ~ pattern, /;/ {
        if (match($0, /\([^()]+\)/, found)) print found[0]
    }
' pattern="$pattern" PRIME_FILE.TXT

Feherke.
feherke.ga
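
One portability note: the three-argument form of match() is a GNU awk extension. Where gawk is not available, a sketch of the same extraction using RSTART and RLENGTH, which a two-argument match() sets in any POSIX awk, might look like this (the function name primary_key_list is made up for the example):

```shell
#!/bin/sh
# Hypothetical helper: print the first parenthesised group found between
# the line matching the pattern and the next line containing ";".
primary_key_list() {
    pattern=$1
    shift
    awk -v pattern="$pattern" '
        $0 ~ pattern, /;/ {
            # Two-argument match() sets RSTART and RLENGTH on success.
            if (match($0, /\([^()]+\)/))
                print substr($0, RSTART, RLENGTH)
        }
    ' "$@"
}

# Three-line layout of the statement from PRIME_FILE.TXT.
printf '%s\n' \
    'ALTER TABLE XXX_ALUEUTRAN_DB.ABC' \
    '    ADD CONSTRAINT ABC_2_PK' \
    '    PRIMARY KEY (ABC,ABC_123,ABC_1234,ID);' | primary_key_list 'XXX_ALUEUTRAN_DB.ABC'
```

Because a range whose start and end match on the same line covers just that line, the single-line layout is handled by the same code.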
 
[pre]
Hi feherke, thanks a lot. This works and suits my requirement. The only problem is that AWK has to match the pattern exactly, which is failing now.
[/pre]
Example:
Code:
I have 2 patterns in the pattern file:
XXX_ALUEUTRAN_DB.ABC
XXX_ALUEUTRAN_DB.ABC_1
[pre]
So in this case, when the 1st pattern (XXX_ALUEUTRAN_DB.ABC) comes, it should match only XXX_ALUEUTRAN_DB.ABC, but it is matching both
XXX_ALUEUTRAN_DB.ABC and XXX_ALUEUTRAN_DB.ABC_1, which produces duplicates. I need an exact pattern match.
[/pre]


 
Hi

Nothing the [tt]\<[/tt] and [tt]\>[/tt] anchors cannot solve:
Code:
$0 ~ "\\<" pattern "\\>" { flag = 1 }
/^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
/\);/ { flag = 0 }
Code:
$0 ~ "\\<" pattern "\\>", /;/ { if (match($0, /\([^()]+\)/, found)) print found[0] }

Feherke.
feherke.ga
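
The \< and \> word-boundary operators are GNU regex extensions and may not be available in other awk implementations; the "." in the table name also remains a regex wildcard. As an alternative sketch, assuming the table name is always the third whitespace-separated field of the CREATE TABLE line (true for the samples in this thread, not guaranteed in general), the name can be compared as a plain string. The name table_body_exact is hypothetical:

```shell
#!/bin/sh
# Hypothetical variant: select the table by string equality on the third
# field of the CREATE TABLE line instead of by regular expression.
table_body_exact() {
    name=$1
    shift
    awk -v name="$name" '
        # Assumption: header lines look like "CREATE TABLE <name>".
        $1 == "CREATE" && $2 == "TABLE" { flag = ($3 == name) }
        /^\(/, /\);/ { if (flag && ! /^\(|\);/) print }
        /\);/        { flag = 0 }
    ' "$@"
}

printf '%s\n' \
    'CREATE TABLE XXX_ALUEUTRAN_DB.ABC_1' '(' 'ONE DATE' ');' \
    'CREATE TABLE XXX_ALUEUTRAN_DB.ABC'   '(' 'TWO DATE' ');' |
    table_body_exact 'XXX_ALUEUTRAN_DB.ABC'
```

With string equality, XXX_ALUEUTRAN_DB.ABC_1 is simply a different name, so only the second table's column is printed.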
 
Hi feherke,

If I need to make a small change to the requirement, for example in the 1st example I want to get the data including the two parentheses (), what do I need to change? With your solution the parentheses are not printed. So I want everything between the pattern and the ;. I think it is a small change. I tried removing it but got a syntax error.
Code:
CREATE TABLE ABCD
(
A,
B,
C
);
CREATE TABLE PQRS
(
P,
R
);
Code:
earlier output
A,
B,
C
Now Want
Code:
(
A,
B,
C
)

Actual code
Code:
awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/) print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" CREATE_TABLE.txt
 
Hi

Indeed a small change. The range pattern includes the delimiting lines too; I had explicitly added a condition to skip them.
Code:
$0 ~ "\\<" pattern "\\>" { flag = 1 }
/^\(/, /\;/ { if (flag) print }
/\;/ { flag = 0 }


Feherke.
feherke.ga
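
Note that the wanted output in the question ends with ")" while the matching input line is ");". A sketch of one way to print the block with both parentheses but without the trailing semicolon; it uses a plain regex match on the table name for brevity, and the function name table_body_with_parens is hypothetical:

```shell
#!/bin/sh
# Hypothetical variant: print the table body including both parentheses,
# dropping the ";" that follows the closing one.
table_body_with_parens() {
    pattern=$1
    shift
    awk -v pattern="$pattern" '
        $0 ~ pattern { flag = 1 }
        /^\(/, /\);/ {
            if (flag) {
                line = $0
                sub(/;[ \t]*$/, "", line)   # ");" becomes ")"
                print line
            }
        }
        /\);/ { flag = 0 }
    ' "$@"
}

# First sample from the thread.
printf '%s\n' \
    'CREATE TABLE ABCD' '(' 'A,' 'B,' 'C' ');' \
    'CREATE TABLE PQRS' '(' 'P,' 'R' ');' | table_body_with_parens ABCD
```

Working on a copy of the line keeps $0 intact, so the /\);/ rule that resets the flag still sees the original text.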
 
Hi,
[pre]
Thanks for your help. This code works in most scenarios, but in one case it gives different output and breaks.

In the scenario below the code works and gives the correct output.
[/pre]
Example source data:
Code:
CREATE TABLE ALU.ABCD_1
(
  ABCD VARCHAR(64) NOT NULL ,
 WXYZ NUMERIC(37,10) NOT NULL 
);

Code:
while IFS='' read -r pattern; do
 
    awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/)  print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" Src_SQL.sql |awk '{print $1}' >Ou.txt
done < PATTERN.txt

Output
Code:
ABCD
WXYZ
[pre]

In the scenario below, the same code does not give the correct output.
[/pre]
Code:
CREATE TABLE ALU.ABC_1
(
  ABC VARCHAR(64) NOT NULL 
, XYZ NUMERIC(37,10) NOT NULL 
);

Code:
while IFS='' read -r pattern; do
 
    awk '
        $0 ~ "\\<" pattern "\\>" { flag = 1 }
         /^\(/, /\;/ { if (flag && ! /^\(|\;/)  print }
         /\;/ { flag = 0 }
    ' pattern="$pattern" Src_SQL.sql |awk '{print $1}' >Ou.txt
done < PATTERN.txt
[pre]
Here the output drops the last record, so XYZ is missing.
Output:
Code:
ABC
,
[/pre]

[pre]
The problem occurs when the comma (,) comes before the column; in the other examples the comma (,) is at the end of the line.
I also get an error in the scenario below: the 1st record is dropped.
[/pre]

Code:
CREATE TABLE ALU.ABC_1
(  ABC VARCHAR(64) NOT NULL ,
 XYZ NUMERIC(37,10) NOT NULL 
);
OUTPUT
xyz

It drops the 1st line (ABC) because it is on the same line as the opening parenthesis '('.
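
The thread ends without a reply to this last case. One possible approach, sketched under the assumptions that the table name is the third field of the CREATE TABLE line and that the closing ");" sits on a line of its own: strip any leading blanks, "(" and "," from each body line before taking the first word, instead of piping through awk '{print $1}'. The helper name column_names is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the column names of one table, tolerating
# leading commas and a first column on the same line as "(".
column_names() {
    name=$1
    shift
    awk -v name="$name" '
        # Header line: remember whether this is the requested table.
        $1 == "CREATE" && $2 == "TABLE" { inside = ($3 == name); next }
        inside {
            line = $0
            sub(/^[ \t(,]+/, "", line)               # drop leading blanks, "(" and ","
            if (line ~ /^\)/) { inside = 0; next }   # closing ");" ends the table
            if (line == "") next                     # nothing left, e.g. a lone "("
            split(line, part, " ")                   # default whitespace splitting
            print part[1]
        }
    ' "$@"
}

# Leading-comma layout from the failing scenario.
printf '%s\n' \
    'CREATE TABLE ALU.ABC_1' \
    '(' \
    '  ABC VARCHAR(64) NOT NULL ' \
    ', XYZ NUMERIC(37,10) NOT NULL ' \
    ');' | column_names 'ALU.ABC_1'
```

A column definition that shares a line with the closing ");" is not handled by this sketch and would need an extra rule.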
 