Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

challenge - quickly count distinct rows in large table

Status
Not open for further replies.

Advocate

IS-IT--Management
Oct 18, 2000
135
GB
Greetings all and here is the challenge ...

i want to write an sql statement to count the number of distinct rows in a large table of approx 500k records (oracle 7.x/8.x unix/nt)

i suspect that there must be an easy way and i am ready to kick myself (dont laugh too loud !)

the number of columns precludes naming them individually (too time consuming)

the best i can come up with that works is :
select count(*) from (select distinct * from table)
but this is slow due to retrieving all rows in inner loop i assume

does not work:
select count(distinct *) from table or
select count(distinct t.*) from table t

all ideas considered and thank you in advance ...
Best of Irish Luck, David.
djwilkes@hotmail.com
 
David,

Have you tried running analyze table beforehand?
Mike
michael.j.lacey@ntlworld.com
 
Mike, thanks for response.

No ... as a programmer and not oracle dba (but learning quickly - so whats new) i have not used it before. I ran away and did a little reading around the subject and assume this may help background optimisation of my working but slow query ?

Any suggestions on the actual sql syntax ? i would have thought that this was a common sql requirement and therefore common but elusive sql syntax was available ?

is something like this correct for what you had in mind ?
analyze table tocperson compute statistics for table ?
Best of Irish Luck, David.
djwilkes@hotmail.com
 
yep

analyze table tocperson compute statistics;

this *will* help for a straight count(*)

will it help for a count(*) distinct? errmmmm... dunno...

have you tried this? (I don't have a DB here....)

Select Count(*), Distinct <complete_field)_list>
From MY_TABLE;
Mike
michael.j.lacey@ntlworld.com
 
How about

SELECT COUNT(DISTINCT(col1,col2,col3...)) FROM my_table;

Hopefully you don't have hundreds of columns!
 
Nope, that won't work (should have tried it first!).
However, this will:

SELECT COUNT(DISTINCT(col1||col2||col3...)) FROM my_table;
 
thanks mike & carp

all your helpful comments are appreciated, but as i said originally, i am trying to avoid typing all column names (one reason is that the genius who named the columns used a ferengi dialect not recognised in this solar system)

looks like im stuck with : select count(*) from (select distinct * from table)
Best of Irish Luck, David.
djwilkes@hotmail.com
 
I would not abandon an approach just because of the difficulty of assembling a list of column names. One common approach is to let Oracle help you write your SQL. For example to get the column list separated by '||' that Carp suggests you could execute the following:

select column_name || '||' from dba_tab_columns
where owner='table_owner'
and table_name='your_table'

You could then paste the result into the desired select statement and execute it.

My guess is that this approach would be somewhat faster than your query. However it is by no means particularly efficient. It looks as if you will end up doing a disk sort on a result set that is the same size as your large table.
 
save the follwing to a file, replace <table_name> with the name of you table, run the following from SQL*Plus and subtract the sum of the counts from the total number of rows in the table:

set pagesize 0
set feedback off
spool subprog0.sql
select 'select <table_name>.*, count(*)-1' from dual;
select 'from <table_name>' from dual;
select 'group by ' from dual;
select column_name
||decode(column_id,
(select max(column_id)
from user_tab_columns
where table_name='<table_name>'
),' '
,',')
from user_tab_columns
where table_name='<table_name>'
order by column_id
;
select 'having count(*)>1;' from dual;
spool off;
set pagesize <whatever>
set feedback on
@subprog0.sql

karluk is right: there is no substitute for knowing SQL*Plus and your data dictionary/dynamic performance views!
 
David,

You don't know ferengi? What kind of DBA are you anyway? AS/400 or something?

(joke by the way - all of the AS/400 ppl out there)
Mike
michael.j.lacey@ntlworld.com
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top