Random number generation in C++11: how to generate, how does it work?

I recently came across new way to generate random numbers in C++11, but couldn't digest the papers that I read about it (what is that engine, maths term like distribution, "where all integers produced are equally likely").

So can anyone please explain

  • what are they?
  • what does they mean?
  • how to generate?
  • how do they work?
  • etc

You can call it all in one FAQ about random number generation.

94252 次浏览

随机数生成器是一个方程,给定一个数,就会得到一个新数。通常,您要么提供第一个数字,要么提供从系统时间之类的数字中提取的数字。

Each time you ask for a new number it uses the previous number to execute the equation.

A random number generator is not considered very good if it has a tendency to produce the same number more often than other numbers. i.e. if you wanted a random number between one and 5 and you had this distribution of numbers:

  • 1:1%
  • 2:80%
  • 3:5%
  • 4:5%
  • 5: 9%

2比任何其他数字更容易产生,所以它比其他数字更容易产生。如果所有的数字都一样,你每次都有20% 的机会得到每个数字。换句话说,上面的分布是非常不均匀的,因为2是有利的。20% 的分配是均等的。

通常,如果你想要一个真正的随机数,你可以从天气或其他自然资源中获取数据,而不是从随机数生成器中获取。

这个问题太宽泛了,无法给出一个完整的答案,但让我挑选一些有趣的观点:

为什么“同样可能”

假设您有一个简单的随机数生成器,它以相同的概率生成数字0,1,... ,10(将其视为经典的 rand())。现在你需要一个在0,1,2范围内的随机数,每个数的概率都相等。你下意识的反应就是服用 rand() % 3。但是等等,余数0和1比余数2出现得更频繁,所以这是不正确的!

This is why we need proper 分配, which take a source of uniform random integers and turn them into our desired distribution, like Uniform[0,2] in the example. Best to leave this to a good library!

引擎

Thus at the heart of all randomness is a good pseudo-random number generator that generates a sequence of numbers that uniformly distributed over a certain interval, and which ideally have a very long period. The standard implementation of rand() isn't often the best, and thus it's good to have a choice. Linear-congruential and the Mersenne twister are two good choices (LG is actually often used by rand(), too); again, it's good to let the library handle that.

它是如何工作的

简单: 首先,设置一个发动机并播种。种子完全决定了“随机”数字的整个序列,所以 a)每次使用不同的数字(例如从 /dev/urandom中提取) ,b)如果你想重新创建一个随机选择的序列,存储种子。

#include <random>


typedef std::mt19937 MyRNG;  // the Mersenne Twister with a popular choice of parameters
uint32_t seed_val;           // populate somehow


MyRNG rng;                   // e.g. keep one global instance (per thread)


void initialize()
{
rng.seed(seed_val);
}

现在我们可以创建分布:

std::uniform_int_distribution<uint32_t> uint_dist;         // by default range [0, MAX]
std::uniform_int_distribution<uint32_t> uint_dist10(0,10); // range [0,10]
std::normal_distribution<double> normal_dist(mean, stddeviation);  // N(mean, stddeviation)

然后用引擎创造随机数字!

while (true)
{
std::cout << uint_dist(rng) << " "
<< uint_dist10(rng) << " "
<< normal_dist(rng) << std::endl;


}

并发性

比起传统的 rand(),更喜欢使用 <random>的一个更重要的原因是,如何使随机数生成线程安全现在已经非常清楚和显而易见: 要么为每个线程提供自己的线程本地引擎,在线程本地种子上播种,要么同步对引擎对象的访问。

Misc

  • An 有趣的文章 on TR1 random on codeguru.
  • Wikipedia 有一个很好的总结(谢谢,@Justin)。
  • 原则上,每个引擎应该输入一个 result_type,这是用于种子的正确的整数类型。我想我曾经有一个错误的实现,迫使我强制种子为 std::mt19937uint32_t的 x64,最终这应该是固定的,你可以说 MyRNG::result_type seed_val,从而使引擎非常容易更换。