
Weighted Random Selection

Status
Not open for further replies.

nvwildfire

Technical User
Aug 15, 2002
43
US
I have a dataset from which I need to randomly select x rows, but I want some rows to have a greater probability of being selected than others, based on a calculated weight value. I still want rows with a low weight to have a chance of being selected. Hope that made sense. Each row gets a weight based on numerous criteria. I don't know what the max value will be, and the min value could be 0. My plan is to normalize the weight values between 0 and 1, so that the normalized value for each row represents its probability of getting chosen. After this point I have no idea how to randomly select rows using this weight or probability. Has anyone else out there had to do a weighted random selection like this? Any ideas would be great. I am not looking for code specifically, more the concept.

thanks in advance.

kgk
 
A dataset?

Are you using visual basic .net?

One of the .net forums might be able to help more :)

Transcend
 
The one time I did it I was able to put the IDs in an array (not sure if your recordset is of a manageable size). Based on criteria in the record I would put it in the array different numbers of times -- by default every record ID went into the array once, but if they met the criteria they'd be put in 2, 3, as many as 5 times. Then I just generated a random number to the Ubound of the array each time I wanted a result.
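The repeated-ID idea above can be sketched in Python roughly as follows. The record IDs and copy counts are made up for illustration, and `build_pool` and `pick` are hypothetical names, not anything from the original post.

```python
import random

def build_pool(copies_by_id):
    """Return a list in which each record ID appears `copies` times."""
    pool = []
    for record_id, copies in copies_by_id.items():
        pool.extend([record_id] * copies)
    return pool

def pick(pool, rng=random):
    """Pick one ID by generating a random index into the pool."""
    return pool[rng.randrange(len(pool))]

# Hypothetical data: every ID goes in once by default, more often if it
# meets the criteria (up to 5 times in the post above).
copies_by_id = {"A": 1, "B": 3, "C": 5}
pool = build_pool(copies_by_id)
chosen = pick(pool)
```

With these counts, "C" occupies 5 of the 9 slots, so it should come up about 5 times in 9 draws, while "A" still has a 1 in 9 chance.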
 
Transcend,

What I meant by a dataset is a group of records in a database. I did not mean dataset in the programming sense, although I don't think there is a programming meaning for dataset.

Genimus,

I like your idea, I might try to incorporate some other things with the array idea. I'll post more when I figure something out.

thanks,

kgk
 
nvwildfire:

sorry about the confusion, a dataset is what visual basic .net uses rather than recordsets :)

Transcend
 
After you normalize the weight values between 0 and 1, so that each value represents the probability of getting chosen, don't forget to ensure that the probabilities add up to 1.
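One way to get values that lie between 0 and 1 and also add up to 1 is to divide each weight by the total of all weights. This is only a sketch under that assumption; the raw weights below are invented, and `normalize` is a hypothetical helper name.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be greater than 0")
    return [w / total for w in weights]

# Hypothetical raw weights for five rows.
raw_weights = [4.2, 3.6, 2.2, 6.0, 4.0]
probabilities = normalize(raw_weights)
```

Note that a row whose weight is exactly 0 ends up with probability 0 and can never be selected, so a small minimum weight may be needed if every row should keep some chance.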

Then, depending on your desired resolution, I would create an array of 100, 1000, 10K, ... elements and populate it with the appropriate row numbers. Then use the random number generator to generate a number between 1 and the upper limit; that number is the subscript into the array, which identifies the row to be chosen.

Simple example with 5 rows

row 1 has prob of 21% to be selected
row 2 has prob of 18%
row 3 has prob of 11%
row 4 has prob of 30%
row 5 has prob of 20%

Create the array with 100 elements:
entries 01 to 21 having the value of 1
entries 22 to 39 having the value of 2
entries 40 to 50 having the value of 3
entries 51 to 80 having the value of 4
entries 81 to 100 having the value of 5

Generate a random number from 1 to 100; the array entry at that subscript selects the corresponding row.
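The five-row example can be sketched in Python like this. The percentages are the ones from the example; `build_table` and `choose_row` are hypothetical names, and Python lists are 0-based, so entry n of the 1-based array described above is `table[n - 1]`.

```python
import random

def build_table(percent_by_row):
    """Build a lookup table with one slot per percentage point."""
    if sum(percent_by_row.values()) != 100:
        raise ValueError("percentages must add up to 100")
    table = []
    for row, percent in percent_by_row.items():
        table.extend([row] * percent)
    return table

def choose_row(table, rng=random):
    """Generate a number from 1 to len(table) and look up the row."""
    n = rng.randint(1, len(table))  # inclusive on both ends
    return table[n - 1]

# Percentages from the five-row example.
percent_by_row = {1: 21, 2: 18, 3: 11, 4: 30, 5: 20}
table = build_table(percent_by_row)
selected = choose_row(table)
```

For finer resolution, the same idea should work with 1000 or 10,000 slots, as long as each row's slot count is proportional to its probability.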


Good Luck
--------------
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein
 
CajunCenturion,

Thanks for the help. I think your idea will work perfectly. My only concern is whether the array should be shuffled so that the entries for a high-probability row are not all together. This may be an invalid concern, I don't know. Again many thanks for your input.

Thank you,

kgk
 
If the random number generator that you're using has a reasonable degree of randomness, and is producing at or near an even distribution, then you won't need to shuffle the array. Each number has an equal probability of being generated.
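A small check of this point, assuming every slot is equally likely to be drawn: a row's chance of selection is just its share of the slots, which shuffling does not change. The table reuses the 21/18/11/30/20 example, and `selection_probabilities` is a hypothetical helper name.

```python
import random
from collections import Counter

def selection_probabilities(slots):
    """Probability of each row when every slot is equally likely."""
    counts = Counter(slots)
    return {row: counts[row] / len(slots) for row in sorted(counts)}

# Ordered table from the five-row example, plus a shuffled copy.
ordered = [1] * 21 + [2] * 18 + [3] * 11 + [4] * 30 + [5] * 20
shuffled = ordered[:]
random.shuffle(shuffled)

# Same slot counts per row, so the same selection probabilities.
same = selection_probabilities(ordered) == selection_probabilities(shuffled)
```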

Good Luck
 
CajunCenturion,

Thanks again for the information. It really helped.

kgk
 