Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Help iporting a .csv file to ORACLE 2

Status
Not open for further replies.

MikeTruscott

Technical User
Jun 26, 2003
35
GB
Hi everyone.....

I basically have a csv file with millions of rows in, which I want to stick into an oracle database. I'm hoping that, if I'm not fussy (see below) it's a one or two line job....but I'm not an ORACLE expert!

The csv file:

1:Has no headings
2:Is comma delimited
3:Has a couple of hundred columns
4:Has, on occasion, fields with commas in like "truscott,mike" for example. So I would want to add a text quantifier of "" if possible,

I believe I can either use SQL*Loader or an external table but don't really have the time to figure these out. So if anone could just tell me what I need to do to get the file into ORACLE I'd really appreciate it.

I'm not fussy as to what the columns are called.....I just want the data into a database. Ideally I'd like ORACLE to create the column headings and would like each column to be a text field, but I don't want to type every field detail in for 200+ fields!

Again, any help much appreciated.

Mike
 
Mike,

I think you're making life harder than it need be. I believe that I may have a way to make Oracle sweat instead of you.

Before doing this, check with your DBA that you have a tablespace which is able to cope with a sudden influx of several million rows of data. If not, get one set up in advance.

Let us suppose that you have 10 million rows in your whopping csv file, some with 200 fields, some with 100, 50, and finally 10, as mentioned above.

I presume that you have established correct working with one or two rows of fully populated 200 field data, 100, 50 etc. You should therefore have four sample data files (one for each row size) and four control files, one for each size, right? Do not proceed further until you have achieved this.

Use the control file with 200 fields to create an external table. Go into the database and issue the following command

Code:
CREATE TABLE MIKE_200_FIELDS_INTERNAL
AS SELECT * FROM EXT_MIKETRUSCOTT;

The valid 200 field data should be brought in ok, since you've already established correct working of the control file with the correct size data. MIKE_200_FIELDS_INTERNAL should contain all the 200 field data from the csv file. Your .bad file will be enormous, i.e. all the rows except those which loaded successfully.

Then issue the command
Code:
DROP TABLE EXT_MIKETRUSCOTT;

Then move the control file somewhere safe, delete the .bad file (it's got rubbish in it anyway), delete the .log file (do you really want a blow-by-blow account of this?) and then use the 100 field control file to create a new EXT_MIKETRUSCOTT table.

Go into the database, and issue the command
Code:
CREATE TABLE MIKE_100_FIELDS_INTERNAL
AS SELECT * FROM EXT_MIKETRUSCOTT;

Then drop the table and repeat.

Basically, because Oracle automatically rejects bad data, let it sieve the data for you, with no coding effort on your part.

This process should elegantly separate out all of the varying field length records into separate internal tables.

Let me know how you get on.

Regards

Tharg



Grinding away at things Oracular
 
Morning,

So I came in yesterday to try out my awesome function to pad out the lines, and received an email saying that I should just be able to use 'TRAILING FIELDS NULL' in my control file.

Did a google search for this and couldn't find anything, so thought he meant 'TRAILING NULLCOLLS', which we've already used. Anyway, during the search I found the command 'MISSING FIELD VALUES ARE NULL' which looked like it was worth a go.

So I stuck it in, and it worked! No bad file, a quick count of rows suggested that all of them had gone in. I created an Access database and linked it to the ORACLE database, ran the queries and got the same set of results as with SQL server...in about a fifth of the time.

That was with a quarter of the data, so I stuck all of the data in and although I got no BAD file when I tried to query the database I got the error:

KUP-04020 "Found record longer than buffer size supported"

I think this means that there was a number to big for the database or that there was a string that was too long? The problem is, if I don't get a BAD file then how do I find the rogue record?

Anyway, I'm carrying on with this for now (sticking a quarter of the data in at a time) and I'll see how I get on. I really appreciate your help on this....I'd really be struggling otherwise.

One more thing. Should it take longer to run Access queries with ORACLE linked tables than with tables in Access? I know it has the link to contend with but as the size of the data got bigger I was expecting using ORACLE to store the data to be an advantage?!

Thanks!

Mike
 
Mike,

well done for finding 'MISSING FIELD VALUES ARE NULL', you have just furthered my knowledge of external table control file options.

To answer your question about performance, as is usual with such things, it depends.

If you have a lot of data in Access, and query it, Access will usually beat oracle, simply because there is no network involved. The only limiting factor is how fast you can read the data from the local hard disk. Stick Access on a network drive and you will see the performance drop and/or get a visit from the network boys, asking why you're hogging bandwidth :)

In this way, I have seen Access appear to significantly outperform oracle. However, if you run the query in say sqlplus on the server, then Oracle might well outperform Access.

Also, once you hit the table size limit in Access, you're stuffed. Oracle on the other hand will quite happily deliver up terabytes of data, so for scalability and supportability, I'd go with Oracle.

Grinding away at things Oracular
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top