Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Create a Stored Procedure to load monthly text files 1

Status
Not open for further replies.

BxWill

MIS
Mar 30, 2009
367
0
16
US

Any insight as to how I can set up the monthly bulk inserts of text files whereby I do not have to keep modifying the as displayed below?

The code below is used to import the October 2017 text files. For the November 2017 monthly data, I would just copy the script below and paste at the end of my sql script, modify a few fields so that the November 2017 data is validated and imported into the database table. Then, I repeat the same process each time for each succeeding month.


Currently reading about the use of a stored procedure and SSIS. However, I would like to first attempt the use of a stored procedure then proceed to the use of SSIS only if necessary.

Based on the reading so far, it appears that I will need parameters for the stored procedure such as
@FilePath
@FileNameMask


As I am not familiar at all with stored procedures, I would appreciate any insight concerning best practices regarding the use of a stored procedure with several parameters. Are there any disadvantages or nuances/intricacies with the use of stored procedures that I should be aware of?



CODE
USE EquipmentDatabase

If object_id ('tempdb..#EquipmentData') is not null
BEGIN
DROP Table #EquipmentData
END
GO


Create Table #EquipmentData
.
.
.
GO


Bulk Insert #EquipmentData
From 'C:\Data\TextFilesToLoad\Equipmentdata\EquipmentData_201710.txt'
With
(
Fieldterminator = '\|',
Rowterminator = '\n'
Firstrow = 2
);
Go



if object_id('tempdb..#validate_EquipmentData' is not null
BEGIN
DROP table #validate_EquipmentData;
END
GO


Create Table #validate_EquipmentData
.
.
.
[Count_Dups] int
GO


Insert into #validate_EquipmentData
select rtrim(ltrim(ed1.JurisdictionCode))
, rtrim(ltrim(ed1.AccountNo))
, rtrim(ltrim(ed1.SequenceNumber))
...
...
...
, case
when len(ed1.SaleDate) = 8
and isdate(ed1.SaleDate) = 1
and rtrim(ltrim(ed1.SaleDate)) like '[0-9][0-9][0-9][0-9][0-1][0-2][0-3][0-9]'
then convert(date, ed1.SaleDate, 112)
else cast(null as date)
end as SaleDate
-- fields related to validation
, ed1.SaleDate as Input_SaleDate
, dup.count_dups
from #EquipmentData ed1
left outer join (select JurisdictionCode
, AccountNo
, SequenceNumber
, SaleDate
, count(*) as count_dups
from #EquipmentData
group by JurisdictionCode
, AccountNo
, SequenceNumber
, SaleDate
) dup
on dup.JurisdictionCode = ed1.JurisdictionCode
and dup.AccountNo = ed1.AccountNo
and dup.SequenceNumber = ed1.SequenceNumber
and dup.SaleDate = ed1.SaleDate



Insert into Equipment_2017
select
ve.JurisdictionCode,
ve.AccountNo,
ve.SequenceNumber,
.
.
.
from #validate_Equipment ve
where ve.SaleDate between convert(date, '20170101') and convert(date, '20171231')
and ve.JurisdictionCode is not null
and ve.AccountNo is not null
and ve.SequenceNumber is not null
and ve.count_dups = 0


if object_id('Equipment..Equipment_201710_Errors') is not null
BEGIN
DROP table Equipment_201710_Errors;
END
GO


Create Table Equipment_201710_Errors (
JurisdictionCode varchar (5) NULL,
AccountNo varchar (45) NULL,
SequenceNo int NOT NULL,
.
.
.
SaleDate date NULL,
SaleAmount int NULL
)
GO


insert into Equipment_201710_Errors
select ve.JurisdictionCode
, ve.AccountNo
, ve.SequenceNumber
...
...
...
, ve.SaleDate
from #validate_Equipment ve
where not (ve.SaleDate between convert(date, '20170101') and convert(date, '20171231')
and ve.JurisdictionCode is not null
and ve.AccountNo is not null
and ve.SequenceNumber is not null
and ve.count_dups = 0
)
[/Code]
 
2 questions:

Could you use the Preview before posting so you can see if your code is formatted properly?

"modify a few fields so that the November 2017 data is validated and imported into the database table." - could you highlight the code you modify?


---- Andy

There is a great need for a sarcasm font.
 
Did preview the code prior to posting.

In retrospect, I realize that I somewhat rushed in creating the initial post.

Up through last week, I would modify the filename in the Bulk Insert section and also initially setup a separate temporary table for each month to initially receive the data.

For example, I used numerous CREATE TABLE scripts to create a temporary table for each month such as
"CREATE TABLE #EquipmentData_201601", "CREATE TABLE #EquipmentData_201602", "CREATE TABLE #EquipmentData_201603", and so on.

I recently modified this approach just a few days ago by just creating a single temporary table (for example, "Create Table #EquipmentData"). Now, I probably need to incorporate several "TRUNCATE" statements so that I can re-use just one temporary table, "#EquipmentData" every month.


Note, I am interested in validating and then loading monthly text files from 'C:\Data\TextFilesToLoad\Equipmentdata\'
into the respective database tables (Equipment_2014, Equipment_2015, Equipment_2017, Equipment_2018, and so on via a
stored procedure.

Text files to load

EquipmentData_201410.txt
EquipmentData_201411.txt
EquipmentData_201412.txt
EquipmentData_201501.txt
.
.
.
EquipmentData_201712.txt

[Bold]
Therefore, any section of the script that contain date-related information or location of the text file was changed every month.[/Bold]

During my continued reading, several thoughts appear interesting such as creation of a dynamic stored procedure or the use of cursors or the use of SSIS, etc. etc.

As I am not familiar with the aforementioned concepts, I have begun exploring in detail. However, due to the need to setup the database and the monthly processing of the text files within the next week or so, I need to quickly get a process in place and then circle back to explore other options, if necessary.

Displayed below are some portions of the sql script that contains lines that have been changed on a monthly basis because of the specific text file that is loaded.

Code:
Insert into Equipment_2017  <------- This will change; dependent on text file that is loaded
 select
 ve.JurisdictionCode,
 ve.AccountNo,
 ve.SequenceNumber,
 .
 .
 .
 from #validate_Equipment ve
 where ve.SaleDate between convert(date, '20170101') and convert(date, '20171231')   <-- This will change
 and ve.JurisdictionCode is not null
 and ve.AccountNo is not null
 and ve.SequenceNumber is not null
 and ve.count_dups = 0


 if object_id('Equipment..Equipment_201710_Errors') is not null  <------ This will change 
 BEGIN
 DROP table Equipment_201710_Errors;  <------ This will change 
 END
 GO


 Create Table Equipment_201710_Errors (    <---- This will change 
 JurisdictionCode varchar (5) NULL,
 AccountNo varchar (45) NULL,
 SequenceNo int NOT NULL,
 .
 .
 .
 SaleDate date NULL,
 SaleAmount int NULL
 )
 GO


 insert into Equipment_201710_Errors  <--- This will change
.
.
.

 
So my initial guess would be something like:
[tt]
from #validate_Equipment ve
where ve.SaleDate between
convert(date, [blue]year(date) + '[/blue]0101') and convert(date, [blue]year(date) + '[/blue]1231')
and ve.JurisdictionCode is not null
and ve.AccountNo is not null
and ve.SequenceNumber is not null
and ve.count_dups = 0
[/tt]
Since beginning of the year is always 01/01 and ends on 12/31


---- Andy

There is a great need for a sarcasm font.
 
I would be very tempted to - instead of creating, dropping, recreating [tt]Equipment_201710_Errors[/tt] table with different name for each year and month - have a table named simply [tt]Equipment_Errors[/tt] and in it have a field indicating the date of the record and keep all the data for all years and months in this one table.
You can easily retrieve the records you want for any Year, Month, etc.


---- Andy

There is a great need for a sarcasm font.
 
I initially had just one table for the storage of errors. However, to readily isolate the errors, I decided to have a errors table for each month because it would also be less errors to review.

Maybe, I am making this more difficult than it need to be.

As I think more about this, reverting back to just one table for the storage of errors and then just using the two temporary tables - "#EquipmentData" and "#validate_Equipment" may be the way to go. Then, every month, I could just modify the text filename within the bulk insert portion of the sql script.

However, would I not still need the functionality of a stored procedure to loop through the directory and select the appropriate text file, validate the data and then save it to the respective database table if I do not want to constantly modify the bulk insert portion of the sql script?

For example, text file "EquipmentData_201410.txt" should be inserted into the table named "Equipment_2014", text file "EquipmentData_201602.txt" should be inserted into the table named "Equipment_2016", and so on.

Surely, there is a way to perform the objective. However, there may be caveats to using stored procedures and/or SSIS to perform text file imports. Considering the additional time needed to not only learn about stored procedures, dynamic sql, cursors, etc., I may need to continue with the present setup for now.

The bottom line, I need to load the data as quickly as possible and minimize the monthly maintenance of the data load and begin to analyze the data itself!

Any additional insight/recommendation as to a "must have" reference resource that explains stored procedures, dynamic sql, cursors, etc. is appreciated.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top