Binary random array with a specific proportion?

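What is an efficient (probably vectorized, in Matlab terms) way to generate a random array of zeros and ones with a specific proportion? Especially with NumPy?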

Since my case is specifically a proportion of 1/3, my code is:

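import numpy as np
size = 10  # example length
# Draw integers uniformly from {0, 1, 2} and map the value 0 to 1, so ones occur with probability 1/3.
a = (np.random.randint(0, 3, size) == 0).astype(int)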

However, is there a built-in function that handles this more efficiently, at least for the case K/N where K and N are natural numbers?


You can use numpy.random.binomial. E.g. suppose frac is the proportion of ones:

In [50]: frac = 0.15

In [51]: sample = np.random.binomial(1, frac, size=10000)

In [52]: sample.sum()
Out[52]: 1567
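The count of ones from the binomial approach fluctuates from run to run. As a rough sketch (the seed and the use of np.random.default_rng are my own choices, not part of the answer above), the expected count and its standard deviation can be computed alongside a sample:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
frac = 0.15                     # proportion of ones, as in the answer above
n = 10000

# Each element is an independent Bernoulli(frac) draw.
sample = rng.binomial(1, frac, size=n)

# The number of ones follows Binomial(n, frac).
expected_ones = n * frac
std_ones = np.sqrt(n * frac * (1 - frac))

print(sample.sum(), expected_ones, round(float(std_ones), 1))
```

With these numbers the expected count is 1500 and the standard deviation is about 36, so a result such as 1567 is not unusual.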

A simple way to do this would be to first generate an ndarray with the proportion of zeros and ones you want:

>>> import numpy as np
>>> N = 100
>>> K = 30 # K zeros, N-K ones
>>> arr = np.array([0] * K + [1] * (N-K))
>>> arr
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1])

Then you can just shuffle the array, making the distribution random:

>>> np.random.shuffle(arr)
>>> arr
array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,
       1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
       0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1,
       1, 1, 1, 0, 1, 1, 1, 1])

Note that this approach gives you the exact proportion of zeros and ones you request, unlike, say, the binomial approach, where the proportion only matches on average. If you don't need the exact proportion, the binomial approach will work just fine.
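To make the difference concrete, here is a small sketch that repeats both approaches a few times and compares the number of ones. The helper name exact_sample, the seed, and the use of a Generator from np.random.default_rng are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
N, K = 100, 30                  # K zeros, N - K ones, as above

def exact_sample():
    # Build the array with the exact counts, then shuffle it in place.
    arr = np.array([0] * K + [1] * (N - K))
    rng.shuffle(arr)
    return arr

exact_counts = [int(exact_sample().sum()) for _ in range(5)]
binom_counts = [int(rng.binomial(1, (N - K) / N, size=N).sum()) for _ in range(5)]

print(exact_counts)  # every entry is N - K = 70
print(binom_counts)  # entries scatter around 70
```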

If I understand your problem correctly, numpy.random.shuffle may help:

>>> def rand_bin_array(K, N):
...     arr = np.zeros(N)
...     arr[:K] = 1
...     np.random.shuffle(arr)
...     return arr


>>> rand_bin_array(5,15)
array([ 0.,  1.,  0.,  1.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
        0.,  0.])

Yet another approach, using np.random.choice:

>>> np.random.choice([0, 1], size=(10,), p=[1./3, 2./3])
array([0, 1, 1, 1, 1, 0, 0, 0, 0, 0])
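For the K/N case from the question, the probabilities can be derived from K and N directly. This is only a sketch: the variable names, the 2-D shape, and the Generator-based rng.choice are my assumptions, and here ones are drawn with probability K/N (the example above draws zeros with probability 1/3).

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
K, N = 1, 3
p_one = K / N

# p lists the probability of each entry of [0, 1], in that order.
grid = rng.choice([0, 1], size=(4, 5), p=[1 - p_one, p_one])
print(grid)
```

As with the binomial approach, the proportion of ones is K/N only on average.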

Simple one-liner: you can avoid using lists of integers and probability distributions, which are unintuitive and overkill for this problem in my opinion, by simply working with bools first and then casting to int if necessary (though leaving it as a bool array should work in most cases).

>>> import numpy as np
>>> np.random.random(9) < 1/3.
array([False,  True,  True,  True,  True, False, False, False, False])
>>> (np.random.random(9) < 1/3.).astype(int)
array([0, 0, 0, 0, 0, 1, 0, 0, 1])
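The same thresholding idea can be wrapped in a small function. The name random_bits and its parameters are hypothetical, and the uint8 output type is just one reasonable choice:

```python
import numpy as np

def random_bits(shape, p_one, rng=None):
    """Hypothetical helper: array of 0/1 where each entry is 1 with probability p_one."""
    rng = np.random.default_rng() if rng is None else rng
    # rng.random draws uniformly from [0, 1), so the comparison is True
    # with probability p_one.
    return (rng.random(shape) < p_one).astype(np.uint8)

bits = random_bits((3, 4), 1 / 3, rng=np.random.default_rng(3))
print(bits)
```

Skipping the astype call keeps the result as a bool array, which takes the same memory as uint8.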

Another way of getting the exact number of ones and zeroes is to sample indices without replacement using np.random.choice:

import numpy as np

arr_len = 30
num_ones = 8

# Start from all zeros, then set num_ones distinct random positions to 1.
arr = np.zeros(arr_len, dtype=int)
idx = np.random.choice(arr_len, num_ones, replace=False)
arr[idx] = 1

Output of arr:

array([0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1,
       0, 0, 0, 0, 0, 1, 0, 0])
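The index-sampling approach can also be packaged as a function and checked for the exact count. The name exact_ones and the optional rng argument are assumptions for this sketch:

```python
import numpy as np

def exact_ones(arr_len, num_ones, rng=None):
    """Hypothetical helper: 0/1 array of length arr_len with exactly num_ones ones."""
    rng = np.random.default_rng() if rng is None else rng
    arr = np.zeros(arr_len, dtype=int)
    # Sampling without replacement guarantees num_ones distinct positions.
    idx = rng.choice(arr_len, size=num_ones, replace=False)
    arr[idx] = 1
    return arr

arr = exact_ones(30, 8, rng=np.random.default_rng(4))
print(arr.sum())  # 8
```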