Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query Performance on Large Tables (15 million records)

Status
Not open for further replies.

bbanner

Programmer
Aug 3, 2001
5
US
Background:
1) I have a table that has 15.1 million records in it.
2) This is my first experience with Oracle, though I've used other database systems (mostly MySQL).
3) I currently have little knowledge of the actual hardware that Oracle is running on as I just replaced the person who set it all up. That person left the company on a bad note and left little behind in the way of documentation, etc.

The table contains the contact information for 15.1 million people. This table is queried (exact match) on the VARCHAR2 columns first_name and last_name. I've indexed these two columns, but Oracle is still taking 45 seconds to 1 minute to return the data (when it isn't already in memory).

Before I go buy all the Oracle performance tuning books I can find to try and speed this search up, I thought I'd ask if anybody out there has experience with such large data sets.

Is there any way to significantly speed up a query like this or should I be realistic and face the fact that a search across such a large data set has to take this long?

For those who have dealt with tables this size, what's the best query time you've seen and do you have any tips on how to begin addressing this problem (if it is an addressable problem)?

I appreciate any help.
 
Don't claim to be an expert, but a good first step is to run an explain plan on the query to see what it is using. Make sure the indexes are being used. Depending on search type(returns one or many rows, OR conditions, etc) the index type may need to be changed.

Partitioning the table is another possibility.
 
$% secs to 1 min is too much for an exact match, even on 15 million rows.
Did you build one index for both keys or one for each.

Getting the explain plan is a good idea.

here is some doc on how to get the explain plan :
How to use SQL_TRACE_IN_SESSION:
================================

The DBMS_SYSTEM.SQL_TRACE_IN_SESSION procedure enables/disables SQL tracing
for any user session identified by Session ID (SID) and Session Serial Number.
This procedure is the most effective way of analyzing SQL related performance
issues originating from SQL embedded in applications.

The package SQL_TRACE_IN_SESSION produces a trace file in user_dump_dest (same
format and content as 'ALTER SESSION SET SQL_TRACE = TRUE') that can be
formatted with TKPROF. This package can be enabled at both the SQL*Plus command
line and directly from stored procedures.

\Steps to Follow:
================

1. Setup

Unlike other packages created by catproc.sql, a public synonym is not
created for DBMS_SYSTEM and no privileges on the package are granted.
Thus, as initially created, only SYS can reference and use any component
in the DBMS_SYSTEM package.

SYS can grant access to DBMS_SYSTEM components to any user/role:

Example: GRANT EXECUTE ON DBMS_SYSTEM to username;

2. Obtain the session identifier (SID) and serial number from v$session.

SQL> select sid, serial#, osuser, username from v$session;

SID SERIAL# OSUSER USERNAME
----- -------- ---------- ----------------
12 16631 <osLogin> <oracleUserName>


3. Turn on SQL Trace for a selected session.

If from the command line:

exec sys.dbms_system.set_sql_trace_in_session (12,16631,TRUE);

Else, if from PL/SQL procedure:

begin
sys.dbms_session.set_sql_trace (<sid>,<serial>,TRUE);
end

4. Turn off SQL Trace for a select session. The resulting trace file can be
found in your directory defined by user_dump_dest.

If from the command line:

exec sys.dbms_system.set_sql_trace_in_session (<sid>,<serial>,FALSE);

Else, if from PL/SQL procedure:

begin
sys.dbms_session.set_sql_trace (<sid>,<serial>,FALSE);
end



After you have gathered the trace, you will have to translate it with the utility TKPROF, that is invoked from the prompt.

Hope this helps
 
A simpler way to get the plan from a SQL*Plus session would be:

SQL> SET AUTOTRACE TRACE
SQL> your query here
SQL> SET AUTOTRACE OFF

Your results might look like the following:

SQL> set autotrace trace
SQL> select * from dual; <====== Here's the Query.


Execution Plan <===== Here's the Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'DUAL'




Statistics <== And here are some stats for performance comparison
----------------------------------------------------------
0 recursive calls
12 db block gets
5 consistent gets
0 physical reads
0 redo size
363 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> SET AUTOTRACE OFF
 
You may have some problems both in the case index is not used and is used unproperly.

As I see your index size is quite large so even when used it may not fit to memory completely, causing swaping. Of course it dependes on hardware being used but 15 million is NOT TOO GREATE NUMBER for Oracle. We have double-head Sun with 512M memory and I get the result of query on 5 million table (index scan) within 300 msec. Though the index is built on a number column so has a small size.
If you do use index built on both first_name and second_name your index is probably both extra large and poor balanced, because the number of first names is much smaller than the number of second names. Try to use index based on the second name only, or at least change the order of columns.
Try to use partitioning, it not only may save your memory but also improves paralleling of your queries.

Try to ANALYZE your table, for cost-based optimizer does not work properly without statistics.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top