Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

manipulate column string in a csv file

Status
Not open for further replies.

subok

MIS
Feb 21, 2005
37
0
0
BE
Hi,
I've been struggling to write a simple script to manipulate a column in a large csv file but i can't get any further.
Any help is greatly appreciated!

[pre]
csv file
2013-12-6T02:05:08+01:00,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,false
2013-12-11T02:10:08+01:00,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,false
2013-12-11T02:15:08+01:00,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,false
2013-12-5T02:20:08+01:00,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,false
2013-12-11T02:25:08+01:00,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,false

In need to split the first column into 2 as date and time (remove the 'T' and '+01:00) and sort it so that :

2013-12-5,02:20:08,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,false
2013-12-6,02:05:08,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,false
2013-12-11,02:10:08,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,false
2013-12-11,02:15:08,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,false
2013-12-11,02:25:08,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,false

[/pre]
 
What have you tried so far and where in your code are you stuck ?
Tip: man sed (the s command)

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thanks for the tip.

I was able to manage with sed.

#Replace "T" with ',' in column 1
sed -i 's/T/,/1' subok.csv
sed -i 's/+01:00,/,/1' subok.csv
sed -i 's/ime/time/1' subok.csv

#sort column 1 with column 2
sort -t\; -k 1 -k 2 subok.csv

my only problem now, is to sort skipping the first row?
Any suggestion?

Is it possible to combine these 4 command line is a single liner or in a perl script?


 
Single line:
sed 's/T/,/;s/+01:00,/,/;s/ime/time/' subok.csv | sort -t\; -k 1 -k 2

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Code:
sed 's/T\(.*\)+01:00,/,\1,/' subok.csv | sort -t- -k 1,1n -k2,2 -k3,3n
 
Hello,

Following may help on same.

Code:
sed 's/\(.*T\)\(.*\+\)\(.*\:00\,\)\(.*\)/\1 \2\,\4/g;s/T//g;s/\+//g' check_data_specific_form

Where check_data_specific_form is the Input file name.
 
Sorry just want to add the sorting stuff as it is suggested in previous posts same we can add after running script as follows.


Code:
ksh check_data_specific_form.ksh | sort -t- -k 1,1n -k2,2 -k3,3n

Thanks,
R. Singh
 
Here is all in all coede now.

Code:
sed 's/\(.*T\)\(.*\+\)\(.*\:00\,\)\(.*\)/\1 \2\,\4/g;s/T//g;s/\+//g' | sort -t- -k 3n

Output will be as follows.


Code:
2013-12-5 02:20:08,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,0,0,0,0,0,254,false
2013-12-6 02:05:08,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,0,0,0,0,0,258,false
2013-12-11 02:10:08,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,0,0,0,0,0,283,false
2013-12-11 02:15:08,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,11,3662,0,9,3898,302,false
2013-12-11 02:25:08,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,0,0,0,0,0,1324,false

Thanks,
R. Singh
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top