ramneim
Programmer
- Aug 26, 2012
- 1
hi guys,
i have a script that basically just sums up the values of 2 particular columns in a file, grouped by the columns specified as well
The script is quite long, but it's basically just repetitive.. just for each condition.
This script only accepts one type of input file.
#!/usr/bin/sh
###############################################################################
# #
# Description: #
# This Shell Script sums the 2 columns as grouped by the the columns #
# specified #
# #
###############################################################################
FILENAME=$1
DELIMITER=$2
FIRST_COL=$3
SECOND_COL=$4
SALESDATE_COL=$5
STOREID=$6
UPC=$7
GTIN=$8
PROMOID=$9
echo ""
echo ".:Summation Tool:."
for FILE in ${FILENAME}
do
gzip -t ${FILE} 2>/dev/null
if [ $? -eq 1 ];
then
comm=cat
else
comm=gzcat
fi
if [ $# -eq 5 ]; then
$comm $FILE | awk -v col=$FIRST_COL -v col2=$SECOND_COL -v col3=$SALESDATE_COL -v sourcefile=$FILE -F "$DELIMITER" '
{
if(NR!=1)
{
if($col3!="" && $col!="" && $col2!="")
{
salesdate[$col3] = $col3
v[$col3] += $col
d[$col3] += $col2
}
}
}
END{
printf("\n%s%s%s%s%s%s\n","sales_date","|","sum(POS_QTY)","|","sum(POS_AMT)|","<source_file>")
for (i in v)
{
if(salesdate!=1)
{
printf("%s%s%d%s%10.4f%s%s\n",salesdate,"|",v,"|",d,"|",sourcefile)
}
}
}'
elif [ $# -eq 6 ]; then
$comm $FILE | awk -v col=$FIRST_COL -v col2=$SECOND_COL -v col3=$SALESDATE_COL -v col4=$STOREID -v sourcefile=$FILE -F "$DELIMITER" '
{
if(NR!=1)
{
if($col3!="" && $col!="" && $col2!="")
{
salesdate[$col3$col4] = $col3 "|" $col4
v[$col3$col4] += $col
d[$col3$col4] += $col2
}
}
}
END{
printf("\n%s%s%s%s%s%s\n","sales_date|cust_id","|","sum(POS_QTY)","|","sum(POS_AMT)|","<source_file> ")
for (i in v)
{
if(salesdate!=1)
{
printf("%s%s%d%s%10.4f%s%s\n",salesdate,"|",v,"|",d,"|",sourcefile)
}
}
}'
.
. (same pattern for [ $# -eq 7 ] up to [ $# -eq 9 ]
.
elif [ $# -gt 10 ]; then
echo "Too many parameters passed. Please pass only the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
elif [ $# -lt 5 ]; then
echo "Not enough parameters passed. Please pass the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
else
echo "No parameters passed. Please pass the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
fi
done
Usage of the script:
sh <script file name> <file or files> <delimiter> <column # to sum1> <column # to sum2> <column #s for grouping>
so basically sample execution would look like this
$ sh getsums.sh ft-gnct-3398-cd-2012-07-07-140112.txt.gz " " 22 23 1 2 4
.:Summation Tool:.
sales_date|cust_id|UPC|sum(POS_QTY)|sum(POS_AMT)|<source_file>
2012-07-05|PL_000000000034014|PL_2003007476012|75| 75.9200|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
2012-07-03|PL_000000000034012|PL_2003000322606|19| 19.5000|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
2012-07-01|PL_000000000034010|PL_2003005081201|38| 38.9800|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
THE MODIFICATION NEEDED:
1. there is an additional input file (2nd file)
- first, there should be a code added to check whether there is a 2ND file
- if there is no 2nd file, just execute the first original script as is
- if there is a 2nd file
- first column needed from this file needs to be
stored to an array to be compared with UPC=$7 which is
coming from the first file
- all those values that match between UPC=$7 and first column
of 2nd file should be "excluded in summation"
(entire line for that match)
- SO, basically, the array from first column of 2nd file needs to
be looped to UPC=$7, to compare each line if there's a match
- those that will match will be stored to a separate new array
- this new array contains the list of values that will be
excluded during summation
if you guys can come up/suggest codes for this, that would be awesome.
i would really appreciate it.
i would also like suggestions, and guide me through the whole process of building it, im quite new to this stuff, so i basically don't know anything,
but i'm willing to learn.
i hope you can all teach me and help me solve this and come up with a
correctly modified script.
Thanks a lot guys! hope to hear from you guys soon.
i really need this script done as soon as possible.
Thanks a lot.
i have a script that basically just sums up the values of 2 particular columns in a file, grouped by the columns specified as well
The script is quite long, but it's basically just repetitive.. just for each condition.
This script only accepts one type of input file.
#!/usr/bin/sh
###############################################################################
# #
# Description: #
# This Shell Script sums the 2 columns as grouped by the the columns #
# specified #
# #
###############################################################################
FILENAME=$1
DELIMITER=$2
FIRST_COL=$3
SECOND_COL=$4
SALESDATE_COL=$5
STOREID=$6
UPC=$7
GTIN=$8
PROMOID=$9
echo ""
echo ".:Summation Tool:."
for FILE in ${FILENAME}
do
gzip -t ${FILE} 2>/dev/null
if [ $? -eq 1 ];
then
comm=cat
else
comm=gzcat
fi
if [ $# -eq 5 ]; then
$comm $FILE | awk -v col=$FIRST_COL -v col2=$SECOND_COL -v col3=$SALESDATE_COL -v sourcefile=$FILE -F "$DELIMITER" '
{
if(NR!=1)
{
if($col3!="" && $col!="" && $col2!="")
{
salesdate[$col3] = $col3
v[$col3] += $col
d[$col3] += $col2
}
}
}
END{
printf("\n%s%s%s%s%s%s\n","sales_date","|","sum(POS_QTY)","|","sum(POS_AMT)|","<source_file>")
for (i in v)
{
if(salesdate!=1)
{
printf("%s%s%d%s%10.4f%s%s\n",salesdate,"|",v,"|",d,"|",sourcefile)
}
}
}'
elif [ $# -eq 6 ]; then
$comm $FILE | awk -v col=$FIRST_COL -v col2=$SECOND_COL -v col3=$SALESDATE_COL -v col4=$STOREID -v sourcefile=$FILE -F "$DELIMITER" '
{
if(NR!=1)
{
if($col3!="" && $col!="" && $col2!="")
{
salesdate[$col3$col4] = $col3 "|" $col4
v[$col3$col4] += $col
d[$col3$col4] += $col2
}
}
}
END{
printf("\n%s%s%s%s%s%s\n","sales_date|cust_id","|","sum(POS_QTY)","|","sum(POS_AMT)|","<source_file> ")
for (i in v)
{
if(salesdate!=1)
{
printf("%s%s%d%s%10.4f%s%s\n",salesdate,"|",v,"|",d,"|",sourcefile)
}
}
}'
.
. (same pattern for [ $# -eq 7 ] up to [ $# -eq 9 ]
.
elif [ $# -gt 10 ]; then
echo "Too many parameters passed. Please pass only the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
elif [ $# -lt 5 ]; then
echo "Not enough parameters passed. Please pass the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
else
echo "No parameters passed. Please pass the file name, delimiter, the number columns for POS_QTY and POS_AMT, and number of fields to GROUP BY."
fi
done
Usage of the script:
sh <script file name> <file or files> <delimiter> <column # to sum1> <column # to sum2> <column #s for grouping>
so basically sample execution would look like this
$ sh getsums.sh ft-gnct-3398-cd-2012-07-07-140112.txt.gz " " 22 23 1 2 4
.:Summation Tool:.
sales_date|cust_id|UPC|sum(POS_QTY)|sum(POS_AMT)|<source_file>
2012-07-05|PL_000000000034014|PL_2003007476012|75| 75.9200|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
2012-07-03|PL_000000000034012|PL_2003000322606|19| 19.5000|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
2012-07-01|PL_000000000034010|PL_2003005081201|38| 38.9800|ft-gnct-3398-cd-2012-07-07-140112.txt.gz
THE MODIFICATION NEEDED:
1. there is an additional input file (2nd file)
- first, there should be a code added to check whether there is a 2ND file
- if there is no 2nd file, just execute the first original script as is
- if there is a 2nd file
- first column needed from this file needs to be
stored to an array to be compared with UPC=$7 which is
coming from the first file
- all those values that match between UPC=$7 and first column
of 2nd file should be "excluded in summation"
(entire line for that match)
- SO, basically, the array from first column of 2nd file needs to
be looped to UPC=$7, to compare each line if there's a match
- those that will match will be stored to a separate new array
- this new array contains the list of values that will be
excluded during summation
if you guys can come up/suggest codes for this, that would be awesome.
i would really appreciate it.
i would also like suggestions, and guide me through the whole process of building it, im quite new to this stuff, so i basically don't know anything,
but i'm willing to learn.
i hope you can all teach me and help me solve this and come up with a
correctly modified script.
Thanks a lot guys! hope to hear from you guys soon.
i really need this script done as soon as possible.
Thanks a lot.