Little Batch Script

jayjaybigs · Feb 9, 2005

I have a file of records:
empid , Name, effdate, home_date, salary_date
200095,BIB STEVEN, 5/1/2003 10/1/1993 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2003 5/1/2002 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 5/23/2001 1/02/2001 1/1/2001
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 1/1/2001 5/23/2001 1/02/2001

Does anyone have a script that could loop through all the records and do the following:
extract the records for a particular empid where salary_date does not match any
effdate or home_date in any row for that employee (say 200095)?

Hence, based on the above, I should get:
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002
200099,BEN TAMMY, 5/29/2000 5/29/2000 5/04/2002

Thanks

vgersh99 · Feb 9, 2005

based on the above description and a sample file, one would get:

200095,BIB STEVEN, 5/1/2003 10/1/1993 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2003 5/1/2002 5/1/2003
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005

here's a sample code:

nawk -v empid=200095 -f jay.awk file.txt

Code:

BEGIN {
  FS=","
}

$1 == empid {
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   if ( datesA[1] != datesA[2] || datesA[1] != datesA[3] )
      print;
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

vgersh99 · Feb 9, 2005

Oops, sorry - I misread the dates' locations:

Code:

BEGIN {
  FS=","
}

$1 == empid {
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   # print only when the salary date matches neither of the other two dates
   if ( datesA[3] != datesA[1] && datesA[3] != datesA[2] )
      print;
}

but I still get only this:
200095,BIB STEVEN, 5/1/2004 10/1/2003 5/1/2002
200095,BIB STEVEN, 5/1/2004 10/1/1993 5/1/2005

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

jayjaybigs · Feb 9, 2005

Thanks Gersh

Is it possible not to have to pass empid as a parameter, and instead store each empid in a variable while reading the records and compare the dates until the variable changes?

thanx again.

vgersh99 · Feb 9, 2005

Is it possible not to have to pass empid as a parameter, and instead store each empid in a variable while reading the records and compare the dates until the variable changes?

I don't understand why you want to store it and see if it changes.

If you don't wanna pass empid and simply compare dates:

nawk -f jay.awk file.txt

Code:

BEGIN {
  FS=","
}

{
   tmp=$NF;
   gsub("/","", tmp);

   split(tmp, datesA, " ");

   # print only when the salary date matches neither of the other two dates
   if ( datesA[3] != datesA[1] && datesA[3] != datesA[2] )
      print;
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

jayjaybigs · Feb 9, 2005

Hi Gersh:

I actually reposted the sample file:
200095,BIB,STEVEN,5/1/2003,10/1/1993,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/2003,5/1/2002
200095,BIB,STEVEN,5/1/2003,5/1/2002,5/1/2003
200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/23/2001,1/02/2001,1/1/2001
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,1/1/2001,5/23/2001,1/02/2001

The trick is to eliminate the first subset and just split by ",".

Once that is done, I would like to loop through all the records for a particular employee (say 200095) and store that to a variable.

While it is still the same employee, I want to check the 6th element against the 4th and 5th elements of each record, and print each record whose 6th element has no match among the 4th and 5th elements for that employee.

I want to do the same for the next employee when the employee changes.

Thanks.

futurelet · Feb 9, 2005

Code:

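BEGIN { FS = "," }

# New employee: report the previous group, then reset the state.
# Assumes the records are already grouped by empid.
empid != $1 {
  report( empid )
  empid = $1
  split("",dates)   # portable way to empty an array
  split("",lines)
  c=0
}

# Collect every effdate ($4) and home_date ($5) seen for this employee,
# and buffer the record until the group is complete.
{
  dates[$4]++
  dates[$5]++
  lines[++c] = $0
}

# Flush the last group.
END { report( empid ) }

# Print buffered records whose last field (salary_date) is not among
# the dates collected for this employee.
function report( id )
{ if (id)
  { for (i=1;i in lines;i++)
    { date = lines[i]
      gsub(/.*,/, "", date)   # keep only the text after the last comma
      if ( !(date in dates) )
        print lines[i]
    }
  }
}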
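
The script above relies on the records for one empid being adjacent. If that can't be guaranteed, one possible alternative is to read the file twice. This is only a sketch, not something posted by anyone in the thread: the function name unmatched_salary_dates is made up for illustration, and it assumes the comma-separated layout of the reposted sample (empid in $1, dates in $4, $5, $6).

```sh
#!/bin/sh
# Hypothetical helper: print records whose 6th field never appears as
# the 4th or 5th field of any record with the same empid.
unmatched_salary_dates() {
  awk -F, '
    # Pass 1 (first read of the file): remember every (empid, date)
    # pair seen in fields 4 and 5.
    NR == FNR {
      seen[$1, $4] = 1
      seen[$1, $5] = 1
      next
    }
    # Pass 2 (second read): strip trailing blanks, then print the
    # record if its salary date was never recorded for this empid.
    {
      sub(/ *$/, "")
      if (!(($1, $6) in seen))
        print
    }
  ' "$1" "$1"
}
```

Called as unmatched_salary_dates file.txt, this should pick out the same records as the grouped version, at the cost of reading the file twice and needing a real file rather than a pipe.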

vgersh99 · Feb 9, 2005

futurelet,

I think the desired output is:

200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002

I think the point is that $6 should NOT appear as $4 or $5 in any of the records/lines for a GIVEN employee.

I haven't had enough time today to look into this closer.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

futurelet · Feb 9, 2005

Vlad,
I think you're right. Fortunately, that's also what I thought when I hacked out the code. My output is:

200095,BIB,STEVEN,5/1/2004,10/1/1993,5/1/2005
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002
200099,BEN,TAMMY,5/29/2000,5/29/2000,5/04/2002

You do a lot of C programming, right?

vgersh99 · Feb 10, 2005

futurelet,

I copied the sample file from the post and it did have some trailing blanks on some of the lines - that's what gave me the grief.
Gettin' rid of the blanks fixes the problem:

Code:

......
{
  dates[$4]++
  dates[$5]++
  sub(" *$","", $0)
  lines[++c] = $0
}
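
To see why a trailing blank matters here, a tiny demo may help. This is just an illustrative sketch (the function name trailing_blank_demo and the one-line input are made up): awk compares the date fields as strings, so "5/1/2002 " and "5/1/2002" are different values until the blank is stripped.

```sh
#!/bin/sh
# Hypothetical demo: the last field carries a trailing blank.
trailing_blank_demo() {
  printf '%s\n' '200095,5/1/2002,5/1/2002 ' |
  awk -F, '{
    before = ($3 == $2)   # string comparison, blank still attached
    sub(/ *$/, "")        # strip trailing blanks; awk re-splits $0
    after = ($3 == $2)    # the same comparison on the cleaned record
    print before, after
  }'
}
```

Running trailing_blank_demo prints 0 1: the fields differ before the sub() and match after it.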

You do a lot of C programming, right?

Yeah, used to. It shows, eh? [wink]

Good posts - keep it up!

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

futurelet · Feb 10, 2005

Vlad, the semicolons betray that you've been using C.

It's always good to see your posts instead of those of ***.
