Today, I had a chance to use Fisher Yates algorithm. Fisher Yates algorithm shuffles the items in a list.
Here is the problem statement:
Out of an array of N items, randomly pick K items where 0 < K < N.
My first naive solution of the problem was as follows:
IEnumerable<int> RandomSelect(int n, int k) { var random = new Random(); var set = new HashSet<int>(); while(set.Count < k) { set.Add(random.Next(n)); } return set; }
Pick an index at random. Add it to a set. Keep doing it till there are k items in the set.
The code has a problem. To pick all the items in a random order, it takes a long time for the loop to terminate.
After a bit of search, I landed up in Fisher Yates shuffle algorithm. Shuffle algorithm works similar. Move back from the last item in the array. Pick a random index. Shuffle the items in the random index and the current index. Keep repeating till all items in the array are shuffled. A variation of Fisher Yates shuffle algorithm is the random select solution. In random select, we shuffle K times where K < N. In that way, we terminate the loop earlier. Also, there is no problem with long loops we witnessed in the earlier solution.
IEnumerable<int> RandomSelect(int n, int k) { if(k > n) { throw new ArgumentException("k can't be greater than n"); } var random = new Random(); var array = Enumerable.Range(0, n); for(int i=n-1; i>n-k; i--) { int j = random.Next(i); int tmp = array[i]; array[i] = array[j]; array[j] = tmp; } return array.Skip(n-k).Take(k); }
The algorithm runs in O(n) when K is nearly equivalent to N. But, it runs much faster as O(1) when K is much smaller than N.
The last algorithm produces this compilation error: “Cannot apply indexing with [] to an expression of type ‘IEnumerable'”; I think you should add a ToArray() to Enumerable.Range(0, n);
‘j’ should be selected as such:
j ← random integer such that 0 ≤ j ≤ i
but maxValue passed to random.Next(maxValue) is _exclusive_ so you never have the chance of not swapping the element.
You need to use random.Next(i + 1) to fix this in your code.