Basic code to create a csv file from txt file.

Status
Not open for further replies.

Diubidone

IS-IT--Management
Nov 3, 2009
13
IT
Hi all,

I'm trying to figure out a basic awk command to create a CSV file out of a very basic (but very large) txt file.

The txt file is formatted like this:

desc samid fn ln acctexpires disabled
description user name lastname never yes

and I need it as normal CSV:

desc,samid,fn,ln,acctexpires,disabled
description,user,name,lastname,never,yes

Can anyone help me?
 
Hi

You specified nothing about the text file's formatting, but this looks like it should be enough:
Code:
awk -vOFS=, '{$1=$1}1' /input/file > /output/file
Tested with gawk and mawk.
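
For readers wondering how the one-liner works, here is a sketch of the same logic written out with comments. Assigning a field to itself forces awk to rebuild the line using OFS as the separator, and the trailing 1 in the one-liner is an always-true pattern whose default action is to print. The function name to_csv is only an illustrative assumption, not something from the thread.

```shell
# Same logic as the one-liner, spelled out step by step.
# to_csv is a hypothetical helper name; it reads stdin and writes stdout.
to_csv() {
    awk -v OFS=, '
    {
        $1 = $1   # touching any field makes awk rebuild $0, joining fields with OFS
        print     # the trailing "1" in the one-liner is shorthand for this print
    }'
}

# Usage with the sample from the question:
printf 'desc samid fn ln acctexpires disabled\ndescription user name lastname never yes\n' | to_csv
```

Because the default field separator treats any run of blanks as one delimiter, this only gives the right result when no field contains a space itself.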

Because you mentioned that the file is very large, here is an alternative which should be faster:
Code:
tr -s ' ' ',' < /input/file > /output/file

Feherke.
 
It works, but with one small issue: every space is replaced by a comma.

Here is an example:

desc samid fn ln acctexpires disabled
Dip IBM temp.lastname John Doe 06/11/2009 no
Cosmopol temp.lastname2 Claire Madarena 14/12/2009 no

Is it possible to transform it into

desc,samid,fn,ln,acctexpires,disabled
Dip IBM,temp.lastname,John,Doe,06/11/2009,no

using the first line as an indication of which fields are separated by commas?

Thx!

PS: Could you explain the command to me?
 
Hi

Diubidone said:
Using the first line as an indication of which fields are separated by commas?
How? As a human, how are you guided by that first line? To me, it looks like a plain list of words.

(If spacing is relevant, please post the sample between [code] and [/code] or [tt] and [/tt] TGML tags to preserve the formatting.)

Feherke.
 
Code:
desc                                   samid                 fn                ln              acctexpires    disabled  
  Zone XIII  IBM                       temp.Doe        John          Doe       06/11/2009     no

Is this ok?
 
PS: Sorry, I forgot to explain the first line: those are the fields of an LDAP Organisational Unit describing user parameters.
 
Hi

Much better. Now it is clearly visible that the first line is of no help in identifying the data columns.

I would say the best approach is to define FS as 3 or more spaces:
Code:
awk -F'   +' -vOFS=, '{$1=$1}1' /input/file > /output/file
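
As a hedged illustration of the 3-or-more-spaces separator, the sketch below applies the same FS to a line shaped like the posted sample and additionally trims leading and trailing padding, which the command above leaves in place. The helper name split_wide and the trimming step are assumptions added for the example, not part of the original answer.

```shell
# split_wide is a hypothetical helper: fields are separated by 3 or more spaces,
# so single or double spaces inside a value (e.g. "Zone XIII  IBM") are kept.
split_wide() {
    awk -F'   +' -v OFS=, '
    {
        gsub(/^ +| +$/, "")   # strip leading and trailing padding from the whole line
        $1 = $1               # rebuild the line with commas between the fields
        print
    }'
}

printf '  Zone XIII  IBM      temp.Doe     John   Doe    06/11/2009     no  \n' | split_wide
# prints: Zone XIII  IBM,temp.Doe,John,Doe,06/11/2009,no
```

One limitation of this approach: if a column is empty on some line, the surrounding padding merges into one separator and the remaining fields shift left by one.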

Feherke.
 
Feherke said:
Much better. Now it is clearly visible that the first line is of no help in identifying the data columns.
Perhaps the file actually has "fixed-length" columns?


----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
Here's the original data without the real names (privacy ;) ):

Code:
 desc                                   samid                 fn                ln              acctexpires    disabled  
 Dip XIII soc IBM                       temp.lastname         Firstname         Lastname        06/11/2009     no

Could you explain the code to me so I can reuse it another time?
 
Hi

Here is the gawk version for now. It is pointless to work on a portable version too while the sample input may still change.
Code:
gawk 'NR==1{s="";w=substr($0,1,1)==" ";l=1;for(i=1;i<=length();i++){c=substr($0,i,1);if(w!=(c==" ")){if(c!=" "){s=s (s?" ":"")i-l;l=i}w=(c==" ")}};FIELDWIDTHS=s (i>l?" "i-l:"")}{for(i=1;i<=NF;i++){sub(/ +$/,"",$i);printf"%s%s",$i,i==NF?"\n":","}}' /input/file
Tested with gawk. It will not work with other awk implementations.
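
The command above relies on FIELDWIDTHS, which is a gawk extension. For comparison, here is a rough portable sketch of the same fixed-width idea that uses substr instead, so it should run in any POSIX awk. It is a different technique from the gawk version and rests on two assumptions: every column starts exactly where its header word starts, and header names contain no spaces. The helper name fixed_to_csv is made up for the example.

```shell
# fixed_to_csv is a hypothetical helper name. It assumes each column starts at
# the position where its header word starts and runs until the next header word.
fixed_to_csv() {
    awk '
    NR == 1 {
        # Record the start position of every header word.
        n = 0
        prev = " "
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c != " " && prev == " ") start[++n] = i
            prev = c
        }
    }
    {
        line = ""
        for (k = 1; k <= n; k++) {
            # The last column runs to the end of the line; the others stop
            # just before the next column starts.
            if (k < n) f = substr($0, start[k], start[k + 1] - start[k])
            else       f = substr($0, start[k])
            gsub(/^ +| +$/, "", f)   # trim the padding around the value
            line = line (k > 1 ? "," : "") f
        }
        print line
    }'
}

# Usage: printf pads each value to a fixed width to build aligned sample input.
printf '%-14s%-8s%-6s\n' desc samid fn 'Dip XIII IBM' temp.ln John | fixed_to_csv
```

Unlike splitting on runs of spaces, this keeps empty columns in place, but any text to the left of the first header word is dropped.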

Feherke.
 
Strange, I installed gawk and that command doesn't work... Why did you propose gawk instead of awk?
 