Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Random Sample

Status
Not open for further replies.
Aug 29, 2002
14
0
0
US
I need to write SQL to pull every nth record from a data warehouse. How would I write this to give me a true random sample?

Thanks.
 
If you have a nice, numeric sequential ID field on your table, you could do something like this:

Select table1.*
from table1
where id like '%01'


That would give you every 100th record. '%001' would give you every thousandth record and so on. Without such a friendly ID field, you could do it by generating a count field in a temp table and then doing the above on that temporary table. But, your DBMS has to support that feature to do that.
 
you cannot get every nth row unless you specify a column which can be used for sequencing purposes

if the table has an autoincrement or identity column, with no gaps, call it pk, you could simply take the modulus of that column

select *
from yourtable
where MOD(pk,n)=0

assuming your database has a MOD function

if there are many gaps, or if the pk is not numeric, you could generate a rank, which will be incremental, but this query is probably inefficient for large tables --

select t1.pk
from yourtable as t1, yourtable as t2
where t1.pk >= t2.pk
group by t1.pk
having mod(count(*),n)=0

rudy
 
Or you may select, from a column with a unique constraint, every row that has the 4th digit = 4, or 5th = 5, either way you've got options. AA 8~)
 
If you are using Oracle, you can use the "sample" clause to select a random percentage of rows in a table. To select 5 percent of the rows, the query would be

select * from your_table sample (5);
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top