Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

hi can any one help me out in solving this

Status
Not open for further replies.

chika9643

Programmer
Sep 15, 2010
4
US
data

H|20070103
D|333|B|20070103|3000
D|111|C|Emily|Smart
D|222|C|Joe|Heart
D|333|C|Wendy|Goodwin
D|111|B|20070101|1000
D|222|B|20070102|2000
D|111|C|Emily|Smart
D|999|C|Teresa|Problem
D|333|A|Broad St|CT|06333
D|111|A|Washington St|CT|06453
D|222|A|Down St|CT|06666
T|20070103|11




BUSINESS EXPLANATION:
====================

Assume today's date is 20070104.

There is a program shell listed below that you must use for
your coding.

There is a daily input datafeed called "customer_datafeed.txt"
from an external vendor. The datafeed contains our customer
names, their balances and their addresses.
The following is a sample of the datafeed.
Please keep in mind that for our test purpose, we limited the
number of records to 13. In actual production, this same file
contains millions of records.

H|20070103
D|333|B|20070103|3000
D|111|C|Emily|Smart
D|222|C|Joe|Heart
D|333|C|Wendy|Goodwin
D|111|B|20070101|1000
D|222|B|20070102|2000
D|111|C|Emily|Smart <--- duplicate C record type.
D|999|C|Teresa|Problem <--- missing B and A record types.
D|333|A|Broad St|CT|06333
D|111|A|Washington St|CT|06453
D|222|A|Down St|CT|06666
T|20070103|11

Looking at the datafeed,
- the first record is the header record indicated by 'H' in
the first column.
- the last record is the trailer record indicated by 'T' in
the first column.
- the records between the header and trailer records are the
the detail record indicated by 'D' in the first column.

The header record contains the following info:
- first column = H = header record indicator
- second column = date = file creation-date or header-date

The trailer record contains the following info:
- first column = T = trailer record indicator
- second column = date = file creation date
- third column = 11 = the record count for detail records (important).



The detail record contains the following info:
- first column = D = detail record indicator
- second column = account number. The account number
is unique per customer.
- third column = information type which can be subdivided
into 3 types and they are:
A = address information
B = balance information
C = customer name information

The following sample information type "A" or
address information are as follows
D|111|A|Washington St|CT|06453
- "Washington St" is the street name.
- "CT" is the state name.
- "06453" is the zip code.

The following sample information type "B" or
balance information are as follows
D|333|B|20070103|3000
- "20070103" is the balance date.
- "3000" is the balance amount.

The following sample information type "C" or
name information are as follows
D|333|C|Wendy|Goodwin
- "Wendy" is the first name.
- "Goodwin" is the last name.



Note the following information regarding the detail records.
- there is at least one name information record type "C"
per customer.
- you can have duplicate customer record. Please see
datafeed example for duplicate customer record.
- you can have missing "B" and "A" records for a given customer
Please see datafeed example for missing record.





FIRST OBJECTIVE:
===============
Using the "DATA" step procedure, you must generate a flat file
called "customer_matched.txt". This file is used for reporting
and browsing purposes. This output file gives a different and
easier summary of the input datafeed content. Our users prefer
to browse this new output file over the raw input datafeed.

The new output file contains all the customer
information from the input datafeed. Please INCLUDE all
duplicates (for example Emily) or
unmatched customer information (for example account number 999).

The following is a sample of the output file.

20070103|111|Emily|Smart|Washington St|CT|06453|20070101|$10.00
20070103|111|Emily|Smart|Washington St|CT|06453|20070101|$10.00
20070103|222|Joe|Heart|Down St|CT|06666|20070102|$20.00
20070103|333|Wendy|Goodwin|Broad St|CT|06333|20070103|$30.00
20070103|999|Teresa|Problem|||||

The following describes each column for the first record:
- first column = 20070103 = file creation-date or header-date.
- second column = 111 = account number.
- third column = Emily = first name.
- fourth column = Smart = last name.
- fifth column = Washington St = street addres.
- sixth column = CT = state code.
- seventh column = 06453 = state zip code.
- eighth column = 20070101 = balance date
- ninth column = $10.00 = balance amount divided by 100.

Note that the account number "999" is listed as unmatched records
because of missing record type "A" and "B". It only has customer
information type "C". You can write out the literal "-> Unmatched records".




SECOND OBJECTIVE:
================
You will generate another output flat-file called "customer_for_gfs.txt".
The new output file share many similarities with the input datafeed
with one exception ... this new file is an out-bound datafeed
to different system(s) for reading and processing. The new datafeed,
like the input datafeed, contains header, trailer and detail records.

The new datafeed contains only unique and complete customer information.
Please do NOT include duplicate and un-matched records.

The following is a sample output file.

H|20070104|PBCT
D|20070104|20070103|111|Emily|Smart| Washington St |CT|06453|20070101|$10.00
D|20070104|20070103|222|Joe|Heart|Down St|CT|06666|20070102|$20.00
D|20070104|20070103|333|Wendy|Goodwin|Broad St|CT|06333|20070103|$30.00
T|20070104|PBCT|3



There are 3 different record types in the datafeed and they
are as follows:
- header record at the top.
- trailer record at the bottom.
- detail record in between the header and trailer.

The header record contains the following information:
- H = header indicator
- 20070104 = today's date
- PBCT = fixed bank code

The trailer record contains the following information:
- T = trailer indicator
- 20070104 = today's date
- PBCT = fixed bank code
- 3 = count for detail record ("D")

The detail record is self explanatory. The date in the
second column is today's date (20070104) and the date
in the third column is the header-date or creation-date.

NOTE: The output file contains only unique records based on
account number.




THIRD OBJECTIVE:
================
Your program will write out the record count statistics into the SAS log.
The following is an example of the record count statistics.

CUSTOMER1 >>> 1. REPORTS.
CUSTOMER1 >>> 2. detail record count read in = 11
CUSTOMER1 >>> 3. detail record count in trailer = 11
CUSTOMER1 >>> 4. Detail records read in EQUAL detail record count in trailer record
CUSTOMER1 >>> 5. first_output_record_count = 5
CUSTOMER1 >>> 6. second_output_record_count = 3


The following example explains the above prints:
- first line prints out the title "REPORTS";

- second line prints out the record count for "detail" (D) record read in.

- third line prints out the record count from the trailer record in the input datafeed.

- fourth line prints the following message if the both detail record counts matched

"Detail records read in EQUAL detail record count in trailer record"

or prints the following message if the detail record counts don't matched

"Detail records read in NOT EQUAL detail record count in trailer record"

- fifth line print out the record count for the first output.

- sixth line prints out the record count for detail record for the second output.




PROGRAM SPECIFICATION:
======================
Please note the following development structure, restrictions
and naming convention.

- the SAS program name is customer1.sas

- the file and path name for the input datafeed is:
h:\shared\customer\maestro\incoming\customer_datafeed.txt

- the file and path name for the output files is:
h:\shared\customer\maestro\flatfile\CUSTOMER1_customer_for_gfs.txt
h:\shared\customer\maestro\flatfile\CUSTOMER1_customer_matched.txt

- all SAS dataset and files must be prefixed by the program
name as shown below:
CUSTOMER1_customer_for_gfs.txt
customer1_customer_matched.txt

- You must use macro variables. Please see the list
of macro variables listed in the shell program below.

- You must NOT hard code any values, dates, filenames, etc.
You must use the macro variables.

- Please use the following shell program to build the complete
program. Replace all "xxx" with detail information.

- Add comments or assumptions if necessary.

- Please start coding at the block highlighted in bold face.

- Good luck and have a great time.

/*****************************************************************

SAS Program : xxx
SAS Author : xxx
Program Date : xxx

Description : xxx <--- please write good explanation
your logic.


*****************************************************************/

%macro rptshell;

%let thisPgm = CUSTOMER1;
%let bank_code_pbct = PBCT;
%let file_type_txt = txt;
%let input_file = customer_datafeed;
%let matched_file = customer_matched;
%let gfs_file = customer_for_gfs;
%let root_dir = h:\shared\customer\maestro;
%let input_dir = incoming;
%let output_dir = flatfile;
%let today_date = %sysfunc(today(), yymmddn8.);


<... begin your code from here ... >
<... begin your code from here ... >
<... begin your code from here ... >

%mend rptshell;
%rptshell;

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top