If you want to successively pick n elements from the list and be able to do so without replacement over and over and over again, you are probably best of randomly permuting the elements, then taking chunks off in blocks of n. If you randomly permute the list you guarantee statistical randomness for each block you pick out. Perhaps the easiest way to do this would be to use Collections.shuffle.
List<Foo> list = createItSomehow();
Collections.shuffle(list);
Foo foo = list.get(0);
It enables you to get n unique elements by an incremented index (assuming that the list itself contains unique elements).
In case you're wondering if there's a Java 8 Stream approach; no, there isn't a built-in one. There's no such thing as Comparator#randomOrder() in standard API (yet?). You could try something like below while still satisfying the strict Comparator contract (although the distribution is pretty terrible):
List<Foo> list = createItSomehow();
int random = new Random().nextInt();
Foo foo = list.stream().sorted(Comparator.comparingInt(o -> System.identityHashCode(o) ^ random)).findFirst().get();
A fair way to do this is to go through the list, on the nth iteration calculating the probability of whether or not to pick the nth element, which is essentially the fraction of the number of items you still need to pick over the number of elements available in the rest of the list. For example:
public static <T> T[] pickSample(T[] population, int nSamplesNeeded, Random r) {
T[] ret = (T[]) Array.newInstance(population.getClass().getComponentType(),
nSamplesNeeded);
int nPicked = 0, i = 0, nLeft = population.length;
while (nSamplesNeeded > 0) {
int rand = r.nextInt(nLeft);
if (rand < nSamplesNeeded) {
ret[nPicked++] = population[i];
nSamplesNeeded--;
}
nLeft--;
i++;
}
return ret;
}
// define ArrayList to hold Integer objects
ArrayList<Integer> arrayList = new ArrayList<>();
for (int i = 0; i < maxRange; i++) {
arrayList.add(i + 1);
}
// shuffle list
Collections.shuffle(arrayList);
// adding defined amount of numbers to target list
ArrayList<Integer> targetList = new ArrayList<>();
for (int j = 0; j < amount; j++) {
targetList.add(arrayList.get(j));
}
return targetList;
import java.util.Enumeration;
import java.util.Random;
public class RandomPermuteIterator implements Enumeration<Long> {
int c = 1013904223, a = 1664525;
long seed, N, m, next;
boolean hasNext = true;
public RandomPermuteIterator(long N) throws Exception {
if (N <= 0 || N > Math.pow(2, 62)) throw new Exception("Unsupported size: " + N);
this.N = N;
m = (long) Math.pow(2, Math.ceil(Math.log(N) / Math.log(2)));
next = seed = new Random().nextInt((int) Math.min(N, Integer.MAX_VALUE));
}
public static void main(String[] args) throws Exception {
RandomPermuteIterator r = new RandomPermuteIterator(100);
while (r.hasMoreElements()) System.out.print(r.nextElement() + " ");
}
@Override
public boolean hasMoreElements() {
return hasNext;
}
@Override
public Long nextElement() {
next = (a * next + c) % m;
while (next >= N) next = (a * next + c) % m;
if (next == seed) hasNext = false;
return next;
}
}
Most of the proposed solutions till now suggest either a full list shuffle or successive random picking by checking uniqueness and retry if required.
But, we can take advantage of the Durstenfeld's algorithm (the most popular Fisher-Yates variant in our days).
Durstenfeld's solution is to move the "struck" numbers to the end of
the list by swapping them with the last unstruck number at each
iteration.
Due to the above, we don't need to shuffle the whole list, but run the loop for as many steps as the number of elements required to return. The algorithm ensures that the last N elements at the end of the list are 100% random if we used a perfect random function.
Among the many real-world scenarios where we need to pick a predetermined (max) amount of random elements from arrays/lists, this optimized method is very useful for various card games, such as Texas Poker, where you a-priori know the number of cards to be used per game; only a limited number of cards is usually required from the deck.
public static <E> List<E> pickNRandomElements(List<E> list, int n, Random r) {
int length = list.size();
if (length < n) return null;
//We don't need to shuffle the whole list
for (int i = length - 1; i >= length - n; --i)
{
Collections.swap(list, i , r.nextInt(i + 1));
}
return list.subList(length - n, length);
}
public static <E> List<E> pickNRandomElements(List<E> list, int n) {
return pickNRandomElements(list, n, ThreadLocalRandom.current());
}
Keep selecting a random element and make sure you do not choose the same element again:
public static <E> List<E> selectRandomElements(List<E> list, int amount)
{
// Avoid a deadlock
if (amount >= list.size())
{
return list;
}
List<E> selected = new ArrayList<>();
Random random = new Random();
int listSize = list.size();
// Get a random item until we got the requested amount
while (selected.size() < amount)
{
int randomIndex = random.nextInt(listSize);
E element = list.get(randomIndex);
if (!selected.contains(element))
{
selected.add(element);
}
}
return selected;
}
In theory this could run endlessly but in practise it is fine. The closer you get the whole original list the slower the runtime of this gets obviously but that is not the point of selecting a random sublist, is it?
The following class retrieve N items from a list of any type. If you provide a seed then on each run it will return the same list, otherwise, the items of the new list will change on each run. You can check its behaviour my running the main methods.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
public class NRandomItem<T> {
private final List<T> initialList;
public NRandomItem(List<T> list) {
this.initialList = list;
}
/**
* Do not provide seed, if you want different items on each run.
*
* @param numberOfItem
* @return
*/
public List<T> retrieve(int numberOfItem) {
int seed = new Random().nextInt();
return retrieve(seed, numberOfItem);
}
/**
* The same seed will always return the same random list.
*
* @param seed,
* the seed of random item generator.
* @param numberOfItem,
* the number of items to be retrieved from the list
* @return the list of random items
*/
public List<T> retrieve(int seed, int numberOfItem) {
Random rand = new Random(seed);
Collections.shuffle(initialList, rand);
// Create new list with the number of item size
List<T> newList = new ArrayList<>();
for (int i = 0; i < numberOfItem; i++) {
newList.add(initialList.get(i));
}
return newList;
}
public static void main(String[] args) {
List<String> l1 = Arrays.asList("Foo", "Bar", "Baz", "Qux");
int seedValue = 10;
NRandomItem<String> r1 = new NRandomItem<>(l1);
System.out.println(String.format("%s", r1.retrieve(seedValue, 2)));
}
}
As noted in other answers, Collections.shuffle is not very efficient when the source list is big, because of the copying. Here is a Java 8 one-liner that:
efficient enough on random access lists like ArrayList if you don't need many elements from the source
doesn't modify the source
DOES NOT guarantee uniqueness, if it's not super important for you. If you pick 5 from a hundred, there's a very good chance the elements will be unique.
Code:
private static <E> List<E> pickRandom(List<E> list, int n) {
return new Random().ints(n, 0, list.size()).mapToObj(list::get).collect(Collectors.toList());
}
However, for a list with no quick random access (like LinkedList) the complexity would be n*O(list_size).
This solution doesn't modify the original list or otherwise scale in complexity with the list size.
To get a sample of 4 from a list of 7, we just select a random element out of all 7, then select a random element out of the remaining 6, and so on. If we've already selected indices 4, 0, 3, next we generate a random number out of 0, 1, 2, 3, respectively representing index 1, 2, 5, 6.
static Random rand = new Random();
static <T> List<T> randomSample(List<T> list, int size) {
List<T> sample = new ArrayList<>();
for (int sortedSampleIndices[] = new int[size], i = 0; i < size; i++) {
int index = rand.nextInt(list.size() - i);
int j = 0;
for (; j < i && index >= sortedSampleIndices[j]; j++)
index++;
sample.add(list.get(index));
for (; j <= i; j++) {
int temp = sortedSampleIndices[j];
sortedSampleIndices[j] = index;
index = temp;
}
}
return sample;
}
All of these answers require a modifiable list or run into performance issued
Here's a swift snippet that required O(k) additional space and is guaranteed to run in O(k) time and doesn't need a modifiable array. (Performs shuffles in a map)
func getRandomElementsFrom(array: [Int], count: Int = 8) -> [Int] {
if array.count <= count {
return array
}
var mapper = [Int: Int]()
var results = [Int]()
for i in 0..<count {
let randomIndex = Int.random(in: 0..<array.count - i)
if let existing = mapper[randomIndex] {
results.append(array[existing])
} else {
let element = array[randomIndex]
results.append(element)
}
let targetIndex = array.count - 1 - i
mapper[randomIndex] = mapper[targetIndex] ?? targetIndex
}
return results
}
The following method returns a new List of Min(n, list.size()) random elements taken from the paramenter List list. Have in mind that the List list is being modified after each call. Therefore, each call will be "consuming" the original list returning n random elements from it: