Always create the same random array

小开

Best answer

Simply seed the random number generator with a fixed value, e.g.

numpy.random.seed(42)

This way, you'll always get the same random number sequence.

This function will seed the global default random number generator, and any call to a function in numpy.random will use and alter its state. This is fine for many simple use cases, but it's a form of global state with all the problems global state brings. For a cleaner solution, see Robert Kern's answer below.
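To make the global-state caveat concrete, here is a minimal sketch. The helper name noisy_helper is hypothetical; it stands in for any other code that happens to draw from numpy.random between your seed call and your own draws.

```python
import numpy as np


def noisy_helper():
    # Hypothetical stand-in for code elsewhere that draws from the global generator.
    return np.random.random()


# Seeding the global generator makes the following draws reproducible.
np.random.seed(42)
first = np.random.random(5)

np.random.seed(42)
second = np.random.random(5)
print(np.array_equal(first, second))  # same seed, same sequence

# An unrelated draw in between advances the shared state,
# so the same seeded code now sees a shifted sequence.
np.random.seed(42)
noisy_helper()
third = np.random.random(5)
print(np.array_equal(first, third))
```

The third array is the same stream shifted by one value, which is exactly the kind of silent coupling that a private generator instance avoids.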

小开

Create your own instance of numpy.random.RandomState() with your chosen seed. Do not use numpy.random.seed() except to work around inflexible libraries that do not let you pass around your own RandomState instance.

>>> from numpy.random import RandomState
>>> prng = RandomState(1234567890)
>>> prng.randint(-1, 2, size=10)
array([ 1,  1, -1,  0,  0, -1,  1,  0, -1, -1])
>>> prng2 = RandomState(1234567890)
>>> prng2.randint(-1, 2, size=10)
array([ 1,  1, -1,  0,  0, -1,  1,  0, -1, -1])

小开

If you are using other functions that rely on a random state, you can't just set an overall seed. Instead, create a function that generates your random list of numbers and takes the seed as a parameter. This will not disturb any other random generators in the code:

import numpy as np

# Random states
def get_states(random_state, low, high, size):
    # Use a private RandomState so the global generator is left untouched
    rs = np.random.RandomState(random_state)
    states = rs.randint(low=low, high=high, size=size)
    return states


# Call function
states = get_states(random_state=42, low=2, high=28347, size=25)
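One way to check the claim that such a function does not disturb other generators is to compare draws from the seeded global generator with and without a get_states call in between. This is only a sketch; the seed 0 and the array sizes are arbitrary choices for illustration.

```python
import numpy as np


def get_states(random_state, low, high, size):
    # A private RandomState keeps this function independent of np.random's global state.
    rs = np.random.RandomState(random_state)
    return rs.randint(low=low, high=high, size=size)


# Reference: two consecutive draws from the seeded global generator.
np.random.seed(0)
ref_a = np.random.random(3)
ref_b = np.random.random(3)

# Same again, but with a get_states call squeezed in between.
np.random.seed(0)
got_a = np.random.random(3)
states = get_states(random_state=42, low=2, high=28347, size=25)
got_b = np.random.random(3)

print(np.array_equal(ref_b, got_b))  # global stream is unaffected
print(np.array_equal(states, get_states(42, 2, 28347, 25)))  # function is reproducible
```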

小开

It is important to understand what the seed of a random generator is, and when and how it is set in your code (check e.g. here for a nice explanation of the mathematical meaning of the seed).

To control it, create a generator with an explicit seed:

random_state = np.random.RandomState(seed=your_favorite_seed_value)

It is then important to generate the random numbers from random_state and not from np.random. I.e. you should do:

random_state.randint(...)

instead of

np.random.randint(...)

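which draws from NumPy's global RandomState instance instead. Unless you have seeded it with np.random.seed(), that instance is seeded from operating-system entropy (or from the clock as a fallback), so the results change from run to run.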
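A common way to apply this advice is to let your own functions accept either a seed or an existing RandomState. The sketch below assumes a hypothetical helper called draw_signs; the name and its parameters are illustrative and not part of NumPy.

```python
import numpy as np


def draw_signs(size, random_state=None):
    # Hypothetical helper: accept either a seed or an existing RandomState.
    if not isinstance(random_state, np.random.RandomState):
        random_state = np.random.RandomState(random_state)
    return random_state.randint(-1, 2, size=size)


# Passing the same seed reproduces the array.
a = draw_signs(10, random_state=123)
b = draw_signs(10, random_state=123)
print(np.array_equal(a, b))

# Passing one shared instance continues a single reproducible stream:
# the first draw matches the seeded result, later draws carry on from there.
rs = np.random.RandomState(123)
c = draw_signs(10, random_state=rs)
d = draw_signs(10, random_state=rs)
print(np.array_equal(a, c))
```

Calling draw_signs without random_state falls back to an unseeded RandomState, so only the seeded calls are reproducible.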

小开

I just want to clarify something about @Robert Kern's answer in case it is not clear. Even if you use RandomState, you have to re-create it with the same seed every time you want the same array, as in Robert's example. Repeated calls on the same instance keep advancing its state, so you get a different array each time:

Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> prng = np.random.RandomState(2019)
>>> prng.randint(-1, 2, size=10)
array([-1,  1,  0, -1,  1,  1, -1,  0, -1,  1])
>>> prng.randint(-1, 2, size=10)
array([-1, -1, -1,  0, -1, -1,  1,  0, -1, -1])
>>> prng.randint(-1, 2, size=10)
array([ 0, -1, -1,  0,  1,  1, -1,  1, -1,  1])
>>> prng.randint(-1, 2, size=10)
array([ 1,  1,  0,  0,  0, -1,  1,  1,  0, -1])
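If re-creating the instance is inconvenient, RandomState also lets you snapshot and restore its internal state with get_state() and set_state(). The following sketch shows both options side by side; the variable names are only for illustration.

```python
import numpy as np

prng = np.random.RandomState(2019)

# Snapshot the internal state before drawing anything.
saved = prng.get_state()
first = prng.randint(-1, 2, size=10)
second = prng.randint(-1, 2, size=10)  # the stream has advanced

# Option 1: build a fresh instance with the same seed.
again = np.random.RandomState(2019).randint(-1, 2, size=10)

# Option 2: rewind the existing instance to the snapshot.
prng.set_state(saved)
rewound = prng.randint(-1, 2, size=10)

print(np.array_equal(first, again))
print(np.array_equal(first, rewound))
```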

小开

According to the current NumPy Random sampling documentation, the preferred approach is to use a Generator instead of RandomState. Refer to What's new or different to compare both approaches. One of the key changes is that RandomState is tied to the slower Mersenne Twister pseudo-random number generator, whereas a Generator draws its stream of random bits from an interchangeable BitGenerator (PCG64 by default).

Otherwise, the steps for producing a random numpy array are very similar:

Initialize random generator

Instead of RandomState, you initialize a random Generator. default_rng is the recommended constructor for a Generator, but you can of course construct one in other ways.

import numpy as np


rng = np.random.default_rng(42)
# rng -> Generator(PCG64)
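As an illustration of the "other ways" mentioned above, a Generator can also be built from an explicit BitGenerator. This is a sketch based on my reading of the NumPy docs: default_rng(seed) wraps PCG64(seed), so the first two generators below should produce the same stream, while MT19937 selects the Mersenne Twister.

```python
import numpy as np
from numpy.random import Generator, PCG64, MT19937

rng_default = np.random.default_rng(42)
rng_explicit = Generator(PCG64(42))   # same BitGenerator and seed, spelled out
rng_mt = Generator(MT19937(42))       # Mersenne Twister behind the new Generator API

print(type(rng_default.bit_generator).__name__)
print(type(rng_mt.bit_generator).__name__)

# Same BitGenerator type and seed give the same stream.
same = np.array_equal(rng_default.random(5), rng_explicit.random(5))
print(same)
```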

Generate numpy array

Instead of the randint method, Generator provides the integers method, which is now the canonical way to generate random integers from a discrete uniform distribution (see the What's new or different summary mentioned above). Note that endpoint=True samples from the closed interval [low, high] instead of the default half-open interval [low, high).

arr = rng.integers(-1, 1, size=10, endpoint=True)
# array([-1,  1,  0,  0,  0,  1, -1,  1, -1, -1])
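To see the effect of endpoint, the following sketch draws a larger sample with each setting and prints the observed range. The sample size of 1000 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)

# Closed interval [low, high]: 1 is a possible value.
closed = rng.integers(-1, 1, size=1000, endpoint=True)

# Default half-open interval [low, high): values stop at 0.
half_open = rng.integers(-1, 1, size=1000)

print(closed.min(), closed.max())
print(half_open.min(), half_open.max())
```

With the default endpoint=False, rng.integers(-1, 2, ...) covers the same values as rng.integers(-1, 1, ..., endpoint=True).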

As already discussed, you have to re-initialize the random generator (or random state) with the same seed every time you want an identical array. The simplest approach is therefore to define a custom function similar to the one in @mari756h's answer:

def get_array(low, high, size, random_state=42, endpoint=True):
    # A fresh Generator per call makes the result depend only on the arguments
    rng = np.random.default_rng(random_state)
    return rng.integers(low, high, size=size, endpoint=endpoint)

When you call the function with the same parameters you will always get the identical numpy array.

get_array(-1, 1, 10)
# array([-1,  1,  0,  0,  0,  1, -1,  1, -1, -1])


get_array(-1, 1, 10, random_state=12345)  # change random state to get different array
# array([ 1, -1,  1, -1, -1,  1,  0,  1,  1,  0])


get_array(-1, 1, (2, 2), endpoint=False)
# array([[-1,  0],
#        [ 0, -1]])

And for your needs you would use get_array(-1, 1, size=(100, 2000)).
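A quick sanity check of that call might look like the sketch below, which repeats the get_array definition so it runs on its own and confirms the shape and reproducibility.

```python
import numpy as np


def get_array(low, high, size, random_state=42, endpoint=True):
    # A fresh Generator per call makes the result depend only on the arguments.
    rng = np.random.default_rng(random_state)
    return rng.integers(low, high, size=size, endpoint=endpoint)


a = get_array(-1, 1, size=(100, 2000))
b = get_array(-1, 1, size=(100, 2000))

print(a.shape)
print(np.array_equal(a, b))  # identical on every call
print(np.unique(a))          # distinct values that occur in the array
```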