breaking a file based on the Key column

bengalliboy · Aug 9, 2002

Hi,
What would be the best way to split a file into pieces based on the first column (say 8 characters, fixed length)? Something like this. The text file file1 looks like this:

ABCD001 asfafggfdggfgfg
ABCD001 hjhjhjh
ABCD001 hghgh
ABCD002 hjhjhj
ABCD002 jhhj

So I want: file2
ABCD001 asfafggfdggfgfg
ABCD001 hjhjhjh
ABCD001 hghgh

and file3:
ABCD002 hjhjhj
ABCD002 jhhj

TIA

vgersh99 · Aug 9, 2002

Using nawk: save the script below as myAwk.awk and run it as:

nawk -f myAwk.awk myTextFile.txt

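#----------------------------- myAwk.awk--------------------

{
# Find the first run of digits in column 1; match() sets RSTART and RLENGTH.
pos=match($1, "[0-9][0-9]*");
# No digits: use file0. Otherwise name the file after the digits found.
outFile= (!pos) ? "file0" : "file" substr($1, RSTART, RLENGTH);

# Append, so every line with the same key ends up in the same file.
print >> outFile;
}

#------------------------------------------------
vlad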
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+

CaKiwi · Aug 9, 2002

Use the following awk script

Code:

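{
  # Start a new output file whenever the first 8 characters change
  if (substr($0,1,8) != hldstr) {
    if (NR>1) close(fn)   # close the previous output file
    fnix++
    fn = "file" fnix      # file1, file2, ...
    hldstr = substr($0,1,8)
  }
  print > fn
}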

Put it in a file, say split.awk, and enter

Code:

awk -f split.awk inputfile

CaKiwi

vgersh99 · Aug 9, 2002

Oops, sorry, this is better:

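#----------------------------- myAwk.awk--------------------

{
# Start the match at the first non-zero digit, so ABCD001 gives file1.
pos=match($1, "[1-9][0-9]*");
# No match: use file0. Otherwise name the file after the digits found.
outFile= (!pos) ? "file0" : "file" substr($1, RSTART, RLENGTH);

# Append, so every line with the same key ends up in the same file.
print >> outFile;
}

#------------------------------------------------
vlad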
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
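
To see what the change of pattern does, here is a minimal sketch (not from the original posts) that prints the file name each version would choose for a few keys. ABCD010 and NOKEY are made-up keys added for illustration, and plain `awk` is assumed to behave like `nawk` for these functions.

```shell
#!/bin/sh
# Compare the file names chosen by the two patterns posted above.
# ABCD010 and NOKEY are hypothetical keys, not taken from the thread.
names=$(printf '%s\n' ABCD001 ABCD002 ABCD010 NOKEY |
awk '{
    # First version: leading zeros are part of the match.
    pos = match($1, "[0-9][0-9]*")
    v1 = (!pos) ? "file0" : "file" substr($1, RSTART, RLENGTH)
    # Corrected version: the match starts at the first non-zero digit.
    pos = match($1, "[1-9][0-9]*")
    v2 = (!pos) ? "file0" : "file" substr($1, RSTART, RLENGTH)
    print $1, v1, v2
}')
printf '%s\n' "$names"
```

For ABCD001 this prints `ABCD001 file001 file1`, and a key without digits falls back to `file0` in both versions.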

vgersh99 · Aug 9, 2002

CaKiwi,
I'm not sure your implementation works for input like this:

ABCD001 asfafggfdggfgfg
ABCD001 hjhjhjh
ABCD001 hghgh
ABCD002 hjhjhj
ABCD003 vlad
ABCD002 jhhj

The file number is encoded in the value of the first column.

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
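
The difference between the two readings can be checked with a small sketch that runs both scripts from the thread on the six-line sample above. The directory names `seq` and `num` and the temporary working directory are assumptions made for this illustration only.

```shell
#!/bin/sh
# Run both approaches on the sample with a non-contiguous key (ABCD002).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
mkdir seq num

cat > input.txt <<'EOF'
ABCD001 asfafggfdggfgfg
ABCD001 hjhjhjh
ABCD001 hghgh
ABCD002 hjhjhj
ABCD003 vlad
ABCD002 jhhj
EOF

# Sequential numbering: a new file every time the first 8 characters change.
(cd seq && awk '{
    if (substr($0,1,8) != hldstr) {
        if (NR>1) close(fn)
        fnix++
        fn = "file" fnix
        hldstr = substr($0,1,8)
    }
    print > fn
}' ../input.txt)

# Number taken from the key: lines with the same key share one file.
(cd num && awk '{
    pos = match($1, "[1-9][0-9]*")
    outFile = (!pos) ? "file0" : "file" substr($1, RSTART, RLENGTH)
    print >> outFile
}' ../input.txt)

seq_count=$(ls seq | wc -l | tr -d ' ')
num_count=$(ls num | wc -l | tr -d ' ')
echo "sequential: $seq_count files, key-numbered: $num_count files"
```

This prints `sequential: 4 files, key-numbered: 3 files`: the sequential script puts the second ABCD002 line into a separate file4, while the key-numbered script keeps both ABCD002 lines together in file2.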

CaKiwi · Aug 9, 2002

Vlad,

I took it to mean that whenever the first 8 characters of a record change, the data should be written to a new, sequentially numbered file, but your interpretation seems more likely.

CaKiwi

vgersh99 · Aug 9, 2002

whatever makes the customer happy

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+

bengalliboy · Aug 15, 2002

Thanks a lot... this is exactly what I am trying to do. CaKiwi, can I change the output file names from file1, file2, etc. to the first column itself? That is, file1 should be named ABCD001, file2 ABCD002, and so on.
Thanks again...

CaKiwi · Aug 15, 2002

Yes, like this:

Code:

{
  if (substr($0,1,8) != hldstr) {
    if (NR>1) close(fn)
    fn = $1
    hldstr = substr($0,1,8)
  }
  print > fn
}

CaKiwi
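
One caveat for unsorted input: with `print > fn`, a file that was closed and is later reopened gets truncated, so a key that reappears further down would lose its earlier lines. A possible variant, sketched here under the assumption that appending is acceptable and that old output files are removed before each run, uses `>>` instead. The temporary directory and the variable name `prev` are choices made for this example.

```shell
#!/bin/sh
# Key-named output files that also survive keys appearing out of order.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > input.txt <<'EOF'
ABCD001 asfafggfdggfgfg
ABCD001 hjhjhjh
ABCD001 hghgh
ABCD002 hjhjhj
ABCD003 vlad
ABCD002 jhhj
EOF

awk '{
    fn = $1                      # output file is named after the key
    if (fn != prev) {
        if (prev != "") close(prev)   # keep only one output file open
        prev = fn
    }
    print >> fn                  # append, so a reopened file keeps its lines
}' input.txt

for f in ABCD001 ABCD002 ABCD003; do
    echo "$f: $(grep -c '' "$f") lines"
done
```

On the sample this reports 3 lines for ABCD001, 2 for ABCD002 and 1 for ABCD003. Because `>>` appends, running it twice in the same directory would duplicate the lines, which is why the sketch starts in a fresh directory.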

vgersh99 · Aug 15, 2002

CaKiwi,

you're a better mind-reader

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
