Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Weighted Random Select 1

Status
Not open for further replies.

jeff5311

Programmer
Mar 1, 2002
31
US
I'm trying to retrieve a "random" row from a table where the randomness is "weighted" by a factor, let's call it "weight" (original, huh?).


the following (pseude-SQL) works but is EXTREMELY slow:

SELECT * FROM table
ORDER BY RAND() * ( 1 / weight)
LIMIT 1;


I've read that "ORDER BY RAND()" is a big no-no as far as optimization is concerned.

Is there another, more optimal way to construct a weighted random query?
 
Hi

Not tried, but should work faster :
Code:
[b]select[/b] * [b]from[/b] table [b]limit[/b] 1 [b]offset[/b] rand()*(1/weight);

[gray]-- the same with older syntax :[/gray]
[b]select[/b] * [b]from[/b] table [b]limit[/b] rand()*(1/weight),1;

Feherke.
 
Thanks for your help!!!

Unfortunately, MYSQL tells me this isn't valid syntax and won't run the query.

Any other suggestions??? Help!!!
 
The limit figures have to be constants. Looks like you're stuck with what you have.
 
i'm afraid i don't understand how "weighting" a random number works

let's say that weight=3

okay, so instead of getting a random floating-point number between 0.000 and 1.000, as you would with just RAND(), you are instead getting a random floating-point number between 0.000 and 0.333

and you're using this number in the ORDER BY

so all the rows with weight=3 are sorted ahead of all the rows with weight=2, and behind all the rows with weight=4

why couldn't you just do ORDER BY weight DESC?

r937.com | rudy.ca
 
Not wishing to put words into the OP's mouth, but it looks like he doesn't want the lowest-weighted record always coming first; he just wants it to have a better chance than any others of coming first.
 
Tony,

Yes you are correct!!! I want more probability of the higher-weighted records returned.

Again, the above query works probability wise, it just gets slower and slower as the table grows.

Does anyone know how to re-write this? Help!!!

Thanks,
Jeff
 
then this should perform just as poorly...?
Code:
select *
  from (
       select foo
            , bar
            , rand() / weight as weighted_rand
         from datable
       ) as d     
order
    by weighted_rand limit 1


r937.com | rudy.ca
 
You might get better results with:
[tt]
SELECT table.*
FROM
table
JOIN
(
SELECT keyfield,weight FROM table
ORDER BY RAND() * ( 1 / weight)
LIMIT 1
)
t2
USING (keyfield)
[/tt]
 
Thanks Rudy and Tony!

I'll try both methods, but I think any SQL statement with the "ORDER BY RAND()" clause in it will have poor performance... but I'm no guru, it's just what I've read.

Thanks again for your help!!!

Jeff
 
Thanks Tony!!!

That SQL significantly dropped the load on the server...

I'm not sure why (it still has the dreaded "ORDER BY RAND()" in there), but it's working great!!!

thanks againg for all your help.

jeff
 
I constructed a query which might return results faster, but it has all these nasty subqueries.

Is there a better way to write this?

SELECT * FROM
(
SELECT *,
(
SELECT SUM( weight ) FROM table t2 WHERE t2.id <= t1.id
) AS sum_weight
FROM table t1
) AS t3
WHERE t3.sum_weight >=
(
SELECT SUM( weight ) FROM table
) AS t4
ORDER BY t3.sum_weight ASC
LIMIT 1';
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top