Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Splitting a flat file

Status
Not open for further replies.

alan147

Technical User
Nov 15, 2002
128
GB
I have a flat file containing customer records, each of the fields has a defined length but the fields are of different lenghts. I need to be able to create a new delimited file form the old one for inputing into a database table. Can this be done using AWK or SED?

Thanks


Alan
 
Yes - awk or sed can do this - but we need sample data
of your defined length / variable length fields. Dickie Bird (:)-)))
 
If this is for an Oracle database then SQL*Loader will do all the above and the data tranfer for you. It will extract data from defined field widths or delimited fields, your choice. Windows folks do the same with Access - the DSN can point to ODBC or any database you like. Cheers,
ND [smile]

bigoldbulldog@hotmail.com
 
The file is of the form:

XXXXXXXXXXXXXXXXXXXXXXXX XXXXXX XX XXXXXX X XXXXXXXXXXXXX XXXXXXX XXXXXX XXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XX XX XX 0000000000000000XXXXXX XXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXX XXXXXXXXXXXXXXXXX

The real data has been removed to protect the innocent

As for the filed lenghts, the first few are defined as follows:

Field 1 1 Character
Field 2 8 Characters
Field 3 8 Characters
Field 4 9 Characters
Field 5 20 Characters

etc., etc., etc.

There are 38 fields in total!


Alan
 
In awk you have to use substr command for each field. Follow that with sub command to remove the trailing whitespace - for each field. You could use OFS with your new delimiter but a for loop for all 38 fields will save you some coding.

Please forgive mention of the MicroSlop product in earlier post. Cheers,
ND [smile]

bigoldbulldog@hotmail.com
 
awk '{print(substr($0,1,1),"|",substr($0,2,8),"|",substr($0,9,8),"|", etc etc }' infile > outfile
^ 9 = start pos in record
^ 8 = No. of chars to grab
+ I've used | (pipe) as my delimiter. Dickie Bird (:)-)))
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top