Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Pulling Random Records

Status
Not open for further replies.

boblang

Programmer
Jan 6, 2002
1
US
Does anyone know how to pull random records from a Teradata table with SQL?
 
select * from tablename sample n

You can substitute 'n' with the actual number of records you want the sample. Say if you want 5 records and the table name is emp...the sql will be

sel * from emp sample 5
 
The sample 5 example will give you the first 5 rows of the table, not the randomness that you are looking for.
 
You can also get a percentage of the records:
sel * from emp sample .05;

This will return 5% of the records from table emp. (if you have qualified records using where condition(s) the percent of records returned is based on total number of records meeting criteria).
 
Hello TheTeraKid

In Teradata there is a concept of rowid as in Oracle where in we know that the row has been inserted in which sequence. In TD rows are stored randomly on different AMPs and there is no way to know which one was inserted first except if you put in a time stamp in a record. So Sample 5 will not give you first 5 rows of a table.

e.g. A table t2 has d1,d2,d3 integer columns having data as
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24
25 26 27
28 29 30

Now fire statements like

sel * from t2 sample 4;

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
16 17 18
22 23 24
10 11 12
4 5 6

sel * from t2 sample 4;

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
16 17 18
19 20 21
7 8 9
22 23 24

You see different rows are returned even when you fire the same SQL and also keep the Sample as 4.

Also, you may try the new Teradata feature of RANDOM function.

sel * from t2 where random(1,100)<10; (Returns less than 10% of random rows)

*** Query completed. One row found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
19 20 21


sel * from t2 where random(1,100)<30; (Returns less than 30% rows)

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
28 29 30
19 20 21
25 26 27
4 5 6

I am trying to find out the exact difference between RANDOM and SAMPLE function in TD.
 
I wrongly gave the explanation of RANDOM fn in earlier example. Actually in the sql stmt below a random no between 1 and 100 is generated by the system and if it is less than 30 that row is returned to the user.

sel * from t2 where random(1,100)<30;

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top