
2.5 million row insert 2


nuct

Programmer
Sep 5, 2001
103
Hi, I'm trying to find the right time to schedule a DTS package that will insert approx. 2.5 million rows from a text file. Does anyone have any idea how long it will take, and are there any things I should bear in mind?

Cheers

Simon
 
Assuming you will use BCP or BULK INSERT:
Drop the table indexes and re-create them after the load. Make sure the database option 'select into/bulkcopy' is set. If possible, break up the data set into smaller groups. I recently loaded 1.4 million rows: I broke the data up into 200K-row files and used the BULK INSERT command, and each file loaded in about 3-5 minutes. Your performance will vary depending on how wide the table is. A rough sketch of the sequence is below.
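
Something like this (untested as written; the database name, file path, and index name are placeholders):

-- Enable minimally logged bulk copies (SQL Server 2000 option).
EXEC sp_dboption 'MyDatabase', 'select into/bulkcopy', 'true'
GO

-- Drop nonclustered indexes before the load (placeholder index name).
DROP INDEX Accounts.IX_Accounts_Surname
GO

-- Load one of the 200K-row files; TABLOCK speeds up bulk loads.
BULK INSERT dbo.Accounts
FROM 'C:\loads\accounts_01.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK
)
GO

-- Repeat the BULK INSERT for each file, then re-create the indexes.
CREATE INDEX IX_Accounts_Surname ON dbo.Accounts (Surname)
GO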
 
Here is the table...


CREATE TABLE [dbo].[Accounts] (
[FTC_Ref_No] [varchar] (8) NOT NULL ,
[Home_Phone] [varchar] (64) NULL ,
[Client_Code] [varchar] (6) NULL ,
[Title] [varchar] (6) NULL ,
[Forename] [varchar] (20) NULL ,
[Surname] [varchar] (20) NULL ,
[Address1] [varchar] (25) NULL ,
[Address2] [varchar] (25) NULL ,
[Address3] [varchar] (25) NULL ,
[Address4] [varchar] (25) NULL ,
[Postcode] [varchar] (10) NULL ,
[Arrears] [real] NULL ,
[AmountToDate] [real] NULL ,
[Status] [varchar] (10) NULL ,
[Bank] [varchar] (32) NULL ,
[BankAccount] [varchar] (32) NULL ,
[Company] [varchar] (32) NULL ,
[ClientRef] [varchar] (32) NULL ,
[LatestPaymentAmount] [real] NULL ,
[LatestPaymentDate] [varchar] (32) NULL ,
[PaymentBookNo] [varchar] (6) NULL ,
[ActualWriteOffAmount] [real] NULL ,
[OriginalArrears] [real] NULL ,
[OtherPhone] [varchar] (16) NULL ,
[Updated] [timestamp] NULL
) ON [PRIMARY]
GO


Unfortunately, the text file that is the source is created by a Unix-type process that I don't really have time to look into, so I can't really break it up. How long do you reckon it might take, based on the table definition and assuming I do all the things you suggest?

Thanks

Simon.
 
BULK INSERT in BOL:
BATCHSIZE [ = batch_size ]
Specifies the number of rows in a batch. Each batch is copied to the server as one transaction. SQL Server commits or rolls back, in the case of failure, the transaction for every batch. By default, all data in the specified data file is one batch.


If I am reading this correctly, you should be able to specify BATCHSIZE = 200000 as a BULK INSERT parameter and see a performance gain for the import. Good luck!
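
Something along these lines (untested; the file path and terminators are just examples):

-- Load the whole file in one statement, committing every 200,000 rows.
BULK INSERT dbo.Accounts
FROM 'D:\data\accounts_full.txt'
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n',
    BATCHSIZE = 200000
)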

--Angel [rainbow]
-----------------------------------
Behold! As a wild ass in the desert
go forth I to do my work.
--Gurnie Hallock (Dune)
 
The danger of using smaller batch sizes on a single file is that if the load fails on the 3rd batch, the first 2 batches are already committed and will not be rolled back, so you have to build a system that lets you back out committed transactions in case of failure. One simple way is sketched below.
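
Just a sketch; it assumes the load goes into a dedicated staging table that nothing else writes to, and the table name and row count are placeholders:

DECLARE @expected int
SET @expected = 2500000  -- rows the source file should contain

IF (SELECT COUNT(*) FROM dbo.Accounts_staging) <> @expected
BEGIN
    -- Some batches committed before a later one failed: throw away
    -- the partial load and rerun the whole file.
    TRUNCATE TABLE dbo.Accounts_staging
END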


Memory, CPU, and drive speed and configuration will also impact processing time.

"Shoot Me! Shoot Me NOW!!!"
- Daffy Duck
 
Based on my experience, I would 'guess' the load would take no more than 2 hours. Your table is narrower than mine. As MDXER indicated, memory, CPU power, and disk speed will affect the outcome.
 
Ok, thanks for all the input so far, very helpful.

The table I am inserting into is just an out-of-date copy of the table that is actually in use. Once it is loaded with the new data, I am going to rename it to the same name as the original, which obviously means that no one will be accessing it at all during the load. Does this mean I will have no problems doing the BULK INSERT during business hours (I would quite like to watch over it), or should I schedule it?
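
Roughly what I have in mind for the swap (not tested; Accounts_new and Accounts_old are placeholder names):

BEGIN TRANSACTION
    -- Park the current live table, then swap in the freshly loaded copy.
    EXEC sp_rename 'dbo.Accounts', 'Accounts_old'
    EXEC sp_rename 'dbo.Accounts_new', 'Accounts'
COMMIT TRANSACTION

-- Once happy with the new data:
-- DROP TABLE dbo.Accounts_old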

Cheers

Simon
 