Little Batch Script

Status
Not open for further replies.

jayjaybigs

IS-IT--Management
Jan 12, 2005
191
CA
I have a file of records:
empid , Name, effdate, home_date, salary_date
200095,BIB STEVEN, 5/1/2003 10/1/1993 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2003 5/1/2002 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 5/23/2001 1/02/2001 1/1/2001
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 1/1/2001 5/23/2001 1/02/2001

Does anyone have a script that could loop through all the records and do the following:
extract the records for a particular empid where salary_date does not match any
effdate or home_date in any row for that employee (say 200095)?

Hence, based on the above, I should get:
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002

Thanks
 
based on the above description and a sample file, one would get:

200095,BIB STEVEN, 5/1/2003 10/1/1993 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2003 5/1/2002 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005


here's a sample code:

nawk -v empid=200095 -f jay.awk file.txt
Code:
BEGIN {
  FS=","
}

$1 == empid {
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   if ( datesA[1] != datesA[2] || datesA[1] != datesA[3] )
      print;
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
ooops, sorry - misread the date's locations:
Code:
BEGIN {
  FS=","
}

$1 == empid {
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   if ( datesA[3] != datesA[1] && datesA[3] != datesA[2] )
      print;
}

but I still get only this:
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
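
A quick way to see where those two rows come from, as a rough sketch only: the per-row test below prints a record when the third date differs from both of the first two. Note the `&&`; with `||` every 200095 row would pass, since no row has three identical dates. The function name `per_row` and the temp file are made up for illustration, and the rows are the 200095 records from the original sample.

```shell
# Hypothetical scratch file holding the 200095 rows from the original post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
200095,BIB STEVEN, 5/1/2003 10/1/1993 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2003 5/1/2002 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
EOF

# per_row EMPID FILE: compare the dates within each row only.
per_row() {
  awk -F, -v empid="$1" '
    $1 == empid {
      # The last comma-separated field holds the three blank-separated dates.
      split($NF, d, " ")
      # Print when salary_date (d[3]) equals neither date in the SAME row.
      if (d[3] != d[1] && d[3] != d[2]) print
    }' "$2"
}

per_row 200095 "$sample"
```

Because each row is checked in isolation, this cannot tell that 5/1/2002 shows up as a home_date in another 200095 row, which is why the second record still gets printed.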


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Thanks Gersh

Is it possible not to have to pass empid as a parameter, and instead to store each empid in a variable while reading the records and compare the dates until the variable changes?

thanx again.
 
Is it possible not to have to pass empid as a parameter, and instead to store each empid in a variable while reading the records and compare the dates until the variable changes?

I don't understand why you want to store it and see if it changes.

If you don't wanna pass empid and simply compare dates:

nawk -f jay.awk file.txt
Code:
BEGIN {
  FS=","
}

{
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   if ( datesA[3] != datesA[1] && datesA[3] != datesA[2] )
      print;
}


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Hi Gersh:

I actually reposted the sample file:
200095,BIB,STEVEN,5/1/2003,10/1/1993,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/2003,5/1/2002
200095,BIB,STEVEN,5/1/2003,5/1/2002,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/23/2001,1/02/2001,1/1/2001
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,1/1/2001,5/23/2001,1/02/2001

The trick is to eliminate the first subset and just split by ",".

Once that is done, I would like to loop through all the records for a particular employee (say 200095) and store them in a variable.

While it is still the same employee, I want to check the 6th element of each record against the 4th and 5th elements of every record for that employee, and print the record when the 6th element matches none of them.

I want to do the same for the next employee when the employee changes.

Thanks.
 
Code:
BEGIN { FS = "," }

empid != $1 {
  report( empid )
  empid = $1
  split("",dates)
  split("",lines)
  c=0
}

{
  dates[$4]++
  dates[$5]++
  lines[++c] = $0
}

END { report( empid ) }

function report( id )
{ if (id)
  { for (i=1;i in lines;i++)
    { date = lines[i]
      gsub(/.*,/, "", date)
      if ( !(date in dates) )
        print lines[i]
    }
  }
}
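
For comparison, here is a two-pass variant of the same idea, offered as a sketch rather than a replacement: it reads the file twice, so the rows do not need to be grouped by empid. The function name `unmatched` and the temp file are made up for illustration; the data is the reposted sample from this thread.

```shell
# Hypothetical scratch file holding the reposted sample.
sample=$(mktemp)
cat > "$sample" <<'EOF'
200095,BIB,STEVEN,5/1/2003,10/1/1993,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/2003,5/1/2002
200095,BIB,STEVEN,5/1/2003,5/1/2002,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/23/2001,1/02/2001,1/1/2001
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,1/1/2001,5/23/2001,1/02/2001
EOF

# unmatched FILE: the file is passed to awk twice on purpose.
unmatched() {
  awk -F, '
    # Drop trailing blanks so $6 compares cleanly against $4 and $5.
    { sub(/[ \t\r]+$/, "") }
    # Pass 1: remember every effdate ($4) and home_date ($5) per empid.
    NR == FNR { seen[$1, $4] = 1; seen[$1, $5] = 1; next }
    # Pass 2: print rows whose salary_date ($6) was never seen for that empid.
    !(($1, $6) in seen)
  ' "$1" "$1"
}

unmatched "$sample"
```

On the sample this should print the 200095 row with 5/1/2005 and the two 200099 rows with 5/04/2002. The trade-off is a second read of the file in exchange for not caring about row order.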
 
futurelet,

I think the desired output is:

200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002

I think the point is that $6 should NOT appear as $4 or $5 in any of the records/lines for a GIVEN employee.

I haven't had enough time today to look into this closer.


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Vlad,
I think you're right. Fortunately, that's also what I thought when I hacked out the code. My output is:

200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002

You do a lot of C programming, right?
 
futurelet,

I copied the sample file from the post and it did have some trailing blanks on some of the lines - that's what gave me the grief.
Gettin' rid of the blanks fixes the problem:
Code:
......
{
  dates[$4]++
  dates[$5]++
  sub(" *$","", $0)
  lines[++c] = $0
}
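
To illustrate why a stray blank matters, a small sketch: awk array keys are compared as exact strings, so a date with a trailing space is a different key from the same date without one. The function name `demo_lookup` is made up, and also stripping tabs and carriage returns is my own assumption about what a copy-paste from a web page might leave behind.

```shell
# demo_lookup: look up a date with a trailing blank, before and after trimming.
demo_lookup() {
  awk 'BEGIN {
    dates["5/1/2003"] = 1          # key as it would come from $4 or $5
    key = "5/1/2003 "              # last field of a line with a trailing blank
    print ((key in dates) ? "found" : "missing")
    sub(/[ \t\r]+$/, "", key)      # trim trailing blanks, tabs, CRs
    print ((key in dates) ? "found" : "missing")
  }'
}

demo_lookup
```

The first lookup fails and the second succeeds, which matches the symptom of extra lines being printed until the blanks are removed.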

You do a lot of C programming, right?

Yeah, used to. It shows, eh? [wink]
Good posts - keep it up!

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Vlad, the semicolons betray that you've been using C.

It's always good to see your posts instead of those of ***.
 