2 Answers

Answer by a contributor with 1813 experience points (2+ upvotes)
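Well, I think I can apply the inverse weighting I once implemented (see "How do I randomly equalize unequal values?") to your case.
Basically, the probability of sampling a value is inversely proportional to its population count. The initial population is your guidance parameter. If it is high, the inverse is low and the accumulated counter has almost no effect, so sampling stays very close to uniform. If the initial population is low (say, 1), the accumulated counter affects sampling much more.
The second parameter to consider is when to drop the accumulated probabilities and return to the original ones; otherwise the effect of a low initial counter fades over time.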
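As a compact reading of the description above, here is the weighting written out. The notation is mine, not the answer's: G is the guidance parameter, Δ is the amount added to a value's counter each time it is drawn, m_i is how often value i has been drawn since the last reset, and c_i is the resulting counter.

```latex
p_i = \frac{1 / c_i}{\sum_{j=0}^{5} 1 / c_j},
\qquad
c_i = G + \Delta \cdot m_i
```

With a large G the counters barely differ in relative terms, so every p_i stays near 1/6. With G = 1, a single draw already changes the drawn value's counter by a large factor, so its probability drops sharply until the counters are reset.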
Code, using Math.NET Numerics for categorical sampling in the [0...6) range, .NET Core 2.2, x64.
using System;
using System.Linq;
using MathNet.Numerics.Random;
using MathNet.Numerics.Distributions;

namespace EqualizedSampling
{
    class Program
    {
        static void Main(string[] args)
        {
            int increment = 10; // how much a value's inverse probability grows each time it is sampled
            int guidanceParameter = 1000000; // small: consecutive sampling is more affected by the outcome; large: closer to uniform sampling
            int[] invprob = new int[6];            // inverse weights, one per value
            double[] probabilities = new double[6];
            int[] counter = new int[] {0, 0, 0, 0, 0, 0}; // how often each value was drawn
            int[] repeat = new int[] {0, 0, 0, 0, 0, 0};  // how often each value directly followed itself
            int prev = -1;
            for (int k = 0; k != 100000; ++k) {
                if (k % 60 == 0) { // drop accumulation, important for low guidance
                    for (int i = 0; i != 6; ++i) {
                        invprob[i] = guidanceParameter;
                    }
                }
                // weights are the reciprocals of the counters; Categorical accepts unnormalized ratios
                for (int i = 0; i != 6; ++i) {
                    probabilities[i] = 1.0 / (double)invprob[i];
                }
                var cat = new Categorical(probabilities);
                var q = cat.Sample();
                counter[q] += 1;
                invprob[q] += increment; // make the value just drawn less likely next time
                if (q == prev)
                    repeat[q] += 1;
                prev = q;
            }
            counter.ToList().ForEach(Console.WriteLine);
            repeat.ToList().ForEach(Console.WriteLine);
        }
    }
}
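I counted the repeated pairs as well as the total number of appearances of each value. In each output below, the first six lines are the per-value counts and the last six are the per-value repeat counts. With a low guidance parameter, consecutive pairs appear less often: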
16670
16794
16713
16642
16599
16582
2431
2514
2489
2428
2367
2436
With a guidance parameter of 1000000, consecutive pairs are more likely to be selected:
16675
16712
16651
16677
16663
16622
2745
2707
2694
2792
2682
2847
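Update
We can add another parameter: the increment per sample. A larger increment makes consecutive repeats even less likely. The code above has been updated; output: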
16659
16711
16618
16609
16750
16653
2184
2241
2285
2259
2425
2247
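To get a feel for how the guidance parameter and the increment interact, the sketch below computes the chance of drawing the same value twice in a row right after a reset. It assumes all counters were equal to the guidance parameter before the first draw. The class and method names (RepeatProbability, AfterOneDraw) are hypothetical helpers for illustration, not part of the answer's code or of Math.NET.

```csharp
using System;
using System.Linq;

static class RepeatProbability
{
    // Probability that the value just drawn is drawn again on the next sample,
    // assuming every inverse weight equalled `guidance` before the draw and the
    // drawn value's inverse weight was then raised by `increment`.
    public static double AfterOneDraw(int n, double guidance, double increment)
    {
        double[] inverse = Enumerable.Repeat(guidance, n).ToArray();
        inverse[0] += increment; // by symmetry, say value 0 was just drawn

        // Sampling weights are the reciprocals of the inverse weights.
        double[] weights = inverse.Select(v => 1.0 / v).ToArray();
        return weights[0] / weights.Sum();
    }
}
```

For six values, `RepeatProbability.AfterOneDraw(6, 1000000, 10)` stays within about 0.00001 of 1/6, while `RepeatProbability.AfterOneDraw(6, 1, 10)` gives 1/56. This only describes the first draw after a reset; later draws depend on the whole history of counters.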

Answer by a contributor with 1863 experience points (2+ upvotes)
I ended up modifying Severin's solution to better fit my needs, so I thought I would share it here in case anyone runs into the same problem. What I did:
Replaced Categorical with my own code based on the Random class, because Categorical gave me strange results.
Changed the way probabilities are calculated.
Added more statistics.
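The key parameter to change is ratio:
The minimum value is 1.0, which makes it behave like a plain random number generator.
The higher the value, the more it resembles a shuffling algorithm: every number is practically guaranteed to appear in the near future and immediate repeats become very unlikely. The order is still unpredictable.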
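The effect of ratio can be written out from the update rule in the code further down (divide the drawn value's probability by ratio, then renormalize). The notation is mine: r is the ratio, n the number of values, and m_i the number of times value i has been drawn so far. Because renormalizing only rescales all entries by the same factor, each probability stays proportional to a power of 1/r:

```latex
p_i \propto r^{-m_i},
\qquad
P(\text{repeat right after the first draw})
  = \frac{\frac{1}{n r}}{\frac{n-1}{n} + \frac{1}{n r}}
  = \frac{1}{(n-1)\,r + 1}
```

For n = 6 this gives 1/6 for r = 1 (plain uniform sampling), 1/26 (about 0.038) for r = 5, and 1/5001 (about 0.0002) for r = 1000. A value that has been drawn one time fewer than another is r times more likely to come up next, which is why large ratios behave almost like a shuffle.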
Results for ratio 1.0:
This behaves just like plain pseudo-random number generation.
3, 5, 3, 3, 3, 3, 0, 3, 3, 5, 5, 5, 2, 1, 3, 5, 3, 3, 2, 3, 1, 0, 4, 1, 5, 1, 3, 5, 1, 5, -
Number of occurences:
2
5
2
12
1
8
Max occurences in a row:
1
1
1
4
1
3
Max length where this number did not occur:
14
13
12
6
22
8
Results for ratio 5.0:
My favorite. A good distribution, occasional repeats, and no overly long gaps in which some number fails to appear.
4, 1, 5, 3, 2, 5, 0, 0, 1, 3, 2, 4, 2, 1, 5, 0, 4, 3, 1, 4, 0, 2, 4, 3, 5, 5, 2, 4, 0, 1, -
Number of occurences:
5
5
5
4
6
5
Max occurences in a row:
2
1
1
1
1
2
Max length where this number did not occur:
7
10
8
7
10
9
Results for ratio 1000.0:
A very even distribution, but still with some randomness.
4, 5, 2, 0, 3, 1, 4, 0, 1, 5, 2, 3, 4, 3, 0, 2, 5, 1, 4, 2, 5, 1, 3, 0, 2, 4, 5, 0, 3, 1, -
Number of occurences:
5
5
5
5
5
5
Max occurences in a row:
1
1
1
1
1
1
Max length where this number did not occur:
8
8
7
8
6
7
Code:
using System;
using System.Linq;

namespace EqualizedSampling
{
    class Program
    {
        // Seeded automatically; DateTime.Now.Millisecond would allow only 1000 distinct seeds.
        static Random rnd = new Random();

        /// <summary>
        /// Returns a random int number from [0 .. numNumbers-1] range using probabilities.
        /// Probabilities have to add up to 1.
        /// </summary>
        static int Sample(int numNumbers, double[] probabilities)
        {
            // walk the cumulative distribution until it passes the random draw
            double r = rnd.NextDouble();
            double sum = 0.0;
            for (int i = 0; i < numNumbers; i++)
            {
                sum += probabilities[i];
                if (sum > r)
                    return i;
            }
            // fallback for floating-point rounding when the running sum never exceeds r
            return numNumbers - 1;
        }

        static void Main(string[] args)
        {
            const int numNumbers = 6;
            const int numSamples = 30;
            // low ratio makes everything behave more random
            // min is 1.0 which makes things behave like a random number generator.
            // higher ratio makes number selection more "natural"
            double ratio = 5.0;
            double[] probabilities = new double[numNumbers];
            int[] counter = new int[numNumbers];       // how many times number occurred
            int[] maxRepeat = new int[numNumbers];     // how many times in a row this number (max)
            int[] maxDistance = new int[numNumbers];   // how many samples happened without this number (max)
            int[] lastOccurence = new int[numNumbers]; // last time this number happened
            // init
            for (int i = 0; i < numNumbers; i++)
            {
                counter[i] = 0;
                maxRepeat[i] = 0;
                probabilities[i] = 1.0 / numNumbers;
                lastOccurence[i] = -1;
            }
            int prev = -1;
            int numRepeats = 1;
            for (int k = 0; k < numSamples; k++)
            {
                // sample next number
                //var cat = new Categorical(probabilities);
                //var q = cat.Sample();
                var q = Sample(numNumbers, probabilities);
                Console.Write($"{q}, ");
                // affect probability of the selected number
                probabilities[q] /= ratio;
                // rescale all probabilities so they add up to 1
                double sumProbabilities = probabilities.Sum();
                for (int i = 0; i < numNumbers; i++)
                    probabilities[i] /= sumProbabilities;
                // gather statistics
                counter[q] += 1;
                numRepeats = q == prev ? numRepeats + 1 : 1;
                maxRepeat[q] = Math.Max(maxRepeat[q], numRepeats);
                lastOccurence[q] = k;
                for (int i = 0; i < numNumbers; i++)
                    maxDistance[i] = Math.Max(maxDistance[i], k - lastOccurence[i]);
                prev = q;
            }
            Console.WriteLine("-\n");
            Console.WriteLine("Number of occurences:");
            counter.ToList().ForEach(Console.WriteLine);
            Console.WriteLine();
            Console.WriteLine("Max occurences in a row:");
            maxRepeat.ToList().ForEach(Console.WriteLine);
            Console.WriteLine();
            Console.WriteLine("Max length where this number did not occur:");
            maxDistance.ToList().ForEach(Console.WriteLine);
            Console.ReadLine();
        }
    }
}