Permutations and combinations
A string of length \( n \) has \( n! \) permutations without repetitions. A simple incremental algorithm to generate all permutations of a given string follows
A string also has
\[\binom n k = \frac{n!}{k!(n-k)!}\]combinations
Since the code above can also be used to generate all possible subsets, it can be used to generate k-permutations
Both the permutations and combinations algorithms presented make use of the backtracking paradigm and have complexity of \( O(n!) \) and \( O(2^n) \). They’re intended as naive algorithms to generate combinations and permutations. It has to be noted that other more efficient but more complex methods (e.g. Heap) are available.
Lexicographically sorted permutations
Generating lexicographically sorted permutations means generating permutations in increasing numerical order, e.g.
\[123, 132, 213, 231, 312, 321\]The algorithm is as follows (read the comments to understand the necessary steps)
Regarding the (*) asterisk-marked line, a possible optimization could be to just reverse the array instead of sorting it since it will always be in decreasing order. The complexity remains kind of daunting though: \( O(n \cdot n!) \).
The algorithm above also allows us to quickly find the next duplicate of a given
permutation (it is trivial to substitute the input permutation to the first result
of the std::sort
line).
Ranking and unranking permutations
A common problem when dealing with permutations is obtaining the rank of a permutation, i.e. its index in the list of permutations sorted lexicographically.
A simple solution could be to just generate all permutations with the code seen some paragraphs ago and check each time for a match with the given input permutation. Anyway the complexity would remain exponential.
A smarter approach would be to count the previous smaller permutations that came before the input one.
The key to understanding both ranking and unranking for permutations lies in the following lines so make sure to understand them.
As an example: suppose we were asked to rank the permutation \( { 3,4,1,2 } \). The first element is 3 and there are 2 elements in the set which are smaller than it: 1 and 2. This means there have to be other permutations of the form
\[\text{1 x x x} \\ \text{2 x x x}\]that came before our permutation (since it starts with 3). How many permutations are generated from the two lines above? Each one comprises \( 3! \) permutations since there are three elements ‘shuffling’, thus they yield a grand-total of \( 2 \cdot 3! \) permutations before ours.
Now let’s keep 3 fixed and consider the element 4: there are three elements less than it, but since we’re not considering 3 anymore (it is fixed) only 2 remain. They generate permutations that come before ours of the following form
\[\text{3 1 x x} \\ \text{3 2 x x}\]and this time the number of total smaller permutations is \( 2 \cdot 2! \). Let’s repeat this process for the last two elements and we have a total of
\[2 \cdot 3! + 2 \cdot 2! + 0 + 0 = 16\]If we’re numbering the permutations from 0, the permutation we were given has rank 16.
Thus the pseudocode for the ranking algorithm is
rank = 0
for each element i in the input permutation
smallerElements = find the number of smaller elements from i onward
rank += smallerElements * (number of elements from i+1 to the end)!
Unranking works similarly: we’re given the input number 16 and we want to know which permutation it corresponds to (in the lexicographic order list).
For this problem we’re also given the input set (i.e. the elements to permute): \( { 2,1,4,3 } \). Notice that they can be unordered.
We start by sorting the input set, this allows us to use each index to automatically know how many smaller elements are there for a given element: \( { 1,2,3,4 } \). Then per each element we calculate the number of permutations less than the one with that element as leading one, e.g. for the first element 1 we calculate \( 0 \cdot 3! = 0 \) permutations before it (1 is the smallest element). 0 permutations surely fits into the asked 16 rank thus we store it as a potential candidate. We then try to have 2 as leading element and calculate that there are \( 1 \cdot 3! = 6 \) permutations, this fits even better in the 16 rank and thus we store it as new candidate. We then try 3 as discover that it has 12 permutations less than the first permutation with it as leading element - even better. 4 doesn’t work (it yields 18 and that’s too much to fit into 16) and thus we stop searching for a first element.
The first element is therefore 3 and since there are 12 permutations starting with element less than 3, we still have \( 16-12 \) permutations to cover. The process then starts again to find the next digit (3 is eliminated by the set of potential candidates).
sort input set
while(there are elements in the input set) {
maximumFound = -1;
for each element i in the input set
smallerElements = find the number of smaller permutations before the one
starting with i
maximumFound = max(maximumFound, smallerElements) - make sure this fits into
the given rank
solution += add i corresponding to the maximumFound
delete i from the input set
}
The code to calculate ranking and unranking for permutations follows
Precomputing or using a table for the factorial calculations would reduce the runtime. The code above could also work for duplicate characters but it would need some adjustments, in particular:
-
The duplicate characters would have to be included as smaller elements
-
The total for an element would have to be divided by the number of duplicates of that element. This is because duplicates will generate the same permutation multiple times (in the same fashion as combinations without repetition need to get rid of the duplicates).
Lexicographically sorted combinations
Generating lexicographically sorted combinations proves to be easier than with permutations since for \( n = 4 \) and \( k = 2 \) we have
\[1 2 \\ 1 3 \\ 1 4 \\ 2 3 \\ 2 4 \\ 3 4\]and we can think of a recursive algorithm
generate_combination(set, k, start)
for each element i from start to the end of the set
store_element i
generate_combination(set, k-1, start+1)
drop_element i
Code follows
Ranking and unranking combinations
While generating lexicographically sorted combination was easier than its permutation counterpart, ranking and unranking combinations proves to be considerably more challenging since the reasoning we did can no longer be applied.
Before diving into the ranking/unranking algorithms two reminders are needed: the first one being the multiplicative formula for the binomial coefficient calculation. The formula assures that
\[\binom nk = \prod_{i=1}^k \frac{n+1-i}{i}\]or, put another way,
\[\binom {4}{2} = \frac{4 +1-1}{1} \cdot \frac{4 +1-2}{2} = 6\]In code it becomes
A second reminder is needed: the recursive formula
\[\binom nk = \binom{n-1}{k-1} + \binom{n-1}k\]In our previous example with \( n = 4 \) and \( k = 2 \) we had
\[1 2 \\ 1 3 \\ 1 4 \\ 2 3 \\ 2 4 \\ 3 4\]The first term of the recursive formula corresponds to the first interval of combinations starting with 1 and featuring 2,3 and 4 shuffling. The rest corresponds to three elements (every other element except 1 which has been already done) shuffling with \( k=2 \). The process restarts recursively. Another consideration to keep in mind is that the first lexicographic element is always the first \( k \) elements of the set. At the end of an interval the first combination is exactly the previous one with every element increased by one: \( {1,2} \dashrightarrow {2,3} \)
Understanding this reasoning is the key to understanding how ranking works for combinations (and also how k-subsets are generated). The pseudocode is
rankThisCombination(combination, n)
infer k from the combination's size
handle special k cases (e.g. n == k)
if k == 1 just return combination[0]
for every element in combination
--element; // Back out one interval
// this is only needed for the final adjustments
if we reached the first interval (i.e. no other intervals before this one)
return rankThisCombination(combination, n-1) // Recurse without the interval
return binomial(n-1,k-1) /* Skip one interval and continue */
+ rankThisCombination(combination, n-1)
Unraking works similarly although the code is a bit more involved:
unrankGetCombination(n, k, rank, start = 1)
comb = generate first combination from start with k elements
s = find the biggest smaller interval where our rank fits
if comb has more than one element
// recurse without the first element
comb = unrankGetCombination(n - 1, k - 1, rank - s, start = comb[1])
// If we finished our recursions or comb has one element, either case
// just return it. This assumes a [1;n] input array
return comb
The previous considerations assume a \( [1;n] \) interval input set although the same considerations can be applied to any input set.
The complete code for ranking and unranking combinations is the following