Fastest method to extract large quantities of data


dyarwood (Programmer)
Nov 3, 2003
We have a database which is very large (a couple of million records and a couple of thousand columns). The extraction methods we are using are quite slow, so my question is this.

Can anyone suggest a method I could go away and have a look at to try to speed up the extraction of data?

Any help would be gratefully appreciated.

dyarwood
 
carolynh:

We are currently using ODBC.

harebrain:

That is one table.

I tried transferring one table from Progress using ODBC, and in just under 3 hours it retrieved 6.8 million records.

I know that is a lot to retrieve, but I still have a problem: the transfer didn't complete.

I'm thinking that the answer is that nothing will be able to do that table quickly. (Just to add to things, that is just one of about 5 tables of that size, and the process needs to be done every night.)
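For what it's worth, here is a rough sketch of what a batched ODBC pull might look like, assuming Python with pyodbc; the DSN, credentials and table/column names are placeholders, not your actual setup:

    import pyodbc

    # Placeholder DSN, credentials and table/column names -- adjust to your setup.
    src = pyodbc.connect("DSN=ProgressDB;UID=extract;PWD=secret", autocommit=True)
    cur = src.cursor()

    # Pull only the columns you actually need; with a couple of thousand
    # columns per row, SELECT * is most of the cost of the transfer.
    cur.execute("SELECT order_id, cust_id, order_date, amount FROM PUB.big_table")

    cur.arraysize = 5000            # rows fetched per round trip
    total = 0
    while True:
        rows = cur.fetchmany()      # uses arraysize as the default batch size
        if not rows:
            break
        total += len(rows)
        # hand each batch off to whatever consumes it (flat file, staging DB, ...)

    print(total, "rows read")
    src.close()

Trimming the column list and fetching in batches won't work miracles on a table that wide, but it keeps memory flat and cuts the per-row round trips.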

Cheers for your replies.

dyarwood
 
With a schema like that, you have to be thrashing the, er, tar out of the DBMS. Bad design yields poor performance, this time in spades.

Your best bet might be to EXPORT from Progress to flat files, create a saner schema (perhaps on another platform), import the data, and work from that. That's going to be a lot of work, but any way you look at it, you're already toast.
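If the export is driven from the ODBC side rather than with the 4GL EXPORT statement, a minimal sketch of dumping one table to a delimited flat file (again assuming Python/pyodbc, with placeholder names) might be:

    import csv
    import pyodbc

    # Placeholder DSN and names -- stream one table out to a CSV over ODBC.
    src = pyodbc.connect("DSN=ProgressDB;UID=extract;PWD=secret", autocommit=True)
    cur = src.cursor()
    cur.execute("SELECT order_id, cust_id, order_date, amount FROM PUB.big_table")
    cur.arraysize = 5000

    with open("big_table.csv", "w", newline="") as f:
        out = csv.writer(f)
        out.writerow([col[0] for col in cur.description])   # header row
        while True:
            rows = cur.fetchmany()
            if not rows:
                break
            out.writerows(rows)                              # one batch at a time

    src.close()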
 
Unfortunately the database was inherited and we are not allowed to change the information in there, as that is the agreement with the vendor. So yes, pretty much toast.

Good news though: I managed to get the table out. Only took 7 hours to do 16 million records.

Oh to be able to change the database.
 
So we have 5 tables at 7.25 hours per table = 36.25 hours. And that must be done nightly, meaning you have about 12 hours each day to accomplish the task. Good luck.

Does your vendor agreement preclude you from reshaping the data *after* it's been extracted from the Progress DB?
 
We can reshape it. That part isn't a problem. We cannot alter keys, indexes or data within the database (obviously we can put stuff in). I managed to get all the other data, apart from 10 tables, in about 5 minutes, and those 10 tables take hours. And the window is around 4 hours.
 
Obviously, you can't even extract one table in the allotted time. This calls for radical measures.

Can you create a more normalized database schema to receive the data captive in the existing tables? If not, decomposing a larger table into a series of narrower tables whose natural join reconstructs it could improve export time. Your problem then would be to synchronize the data in the second schema with the first. This might or might not be workable, depending on transaction volatility.
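As a sketch of that decomposition idea, assuming a hypothetical staging database (SQL Server-style types here) and made-up column names, the split and the reassembling join might look like this, driven through Python/pyodbc:

    import pyodbc

    # Hypothetical staging database and column split: break the very wide table
    # into narrow tables that share the primary key, so each extract only moves
    # the columns it needs, and the original shape is recovered by a join.
    stage = pyodbc.connect("DSN=StagingSQL;UID=loader;PWD=secret")
    cur = stage.cursor()

    cur.execute("""
        CREATE TABLE order_core (
            order_id   INT PRIMARY KEY,
            cust_id    INT,
            order_date DATETIME
        )""")
    cur.execute("""
        CREATE TABLE order_detail (
            order_id INT PRIMARY KEY REFERENCES order_core(order_id),
            amount   DECIMAL(12, 2),
            notes    VARCHAR(255)
        )""")
    stage.commit()

    # Reassembling the original row layout is just the natural join on the key:
    cur.execute("""
        SELECT c.order_id, c.cust_id, c.order_date, d.amount, d.notes
        FROM order_core c
        JOIN order_detail d ON d.order_id = c.order_id
        """)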
 
I will have a look into the natural join thing. I think the eventual solution would be a data warehousing one. I did manage to get one table out a bit quicker by exporting to a text file and then importing into a SQL database (a bit of a workaround, but not suitable for everything).
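For the text-file-then-import route, a minimal sketch of the load side, assuming a SQL Server target and a hypothetical file path that the database server itself can see:

    import pyodbc

    # Hypothetical SQL Server target. BULK INSERT reads the flat file directly
    # on the server, which is usually much faster than pushing rows through
    # ODBC; the path must be visible to the SQL Server service, not this script.
    dest = pyodbc.connect("DSN=StagingSQL;UID=loader;PWD=secret", autocommit=True)
    cur = dest.cursor()
    cur.execute(r"""
        BULK INSERT dbo.big_table
        FROM 'D:\extracts\big_table.csv'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK)
        """)
    dest.close()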

Cheers for your help
 
Hello
I am new to this process. We have been given a system called MR2000, an electronic medical record system originally marketed by 3M and now abandoned by everyone. It runs on a Progress database which we do not support. Can anyone give me a lead on a simple and feasible way to convert this Progress database into something easier to handle (like Access)? Does anyone know who does this kind of stuff? The information in the database is for archival and medico-legal purposes only and has no current use.
Thanks.
bs11
 