on November 16, 2024

Have you heard of Heap's algorithm

I recently encountered a problem on HackerRank named “Bigger is Greater”: https://www.hackerrank.com/challenges/bigger-is-greater/problem. The problem statement is as follows: given a string, the solution program needs to find the smallest of all the string permutations which is bigger than the string itself. In case no solution that match the criterias is found, no answer is to be returned. Generating permutations is a non trivial problem that has always kept mathematicians and computer scientist on their toes ¹. No wonder why the problem is given a difficulty level of Medium. I wanted to share my approach to solving the problem through this post.

Heap’s algorithm:

Finding permutations of a sequence is not a new problem, and therefore one would expect that many have already found a reasonable solution to the issue. According to Sedgewick (1977) ¹, there are over thirty algorithms for finding the N! permutations of N elements. By doing some basic research (source: https://en.wikipedia.org/wiki/Permutation), I found a couple of relatively simple algorithms based on exchanging elements:

Steinhaus–Johnson–Trotter algorithm: relies on shifting a predefined element (the largest of the sequence) left and right based on certain conditions until it fills all the positions ¹
Heap’s algorithm: keeps on swapping elements based on a cursor, and on whether their current position index is even or odd ²

I decided to give Heap’s algorithm a try, as it seemed more straightforward to implement. According to the Wikipedia page, the pseudo-code of the iterative version looks like:

procedure generate(n : integer, A : array of any):
    // c is an encoding of the stack state. c[k] encodes the for-loop counter for when generate(k - 1, A) is called
    c : array of int

    for i := 0; i < n; i += 1 do
        c[i] := 0
    end for

    output(A)
    
    // i acts similarly to a stack pointer
    i := 1;
    while i < n do
        if  c[i] < i then
            if i is even then
                swap(A[0], A[i])
            else
                swap(A[c[i]], A[i])
            end if
            output(A)
            c[i] += 1
            i := 1
        else
            c[i] := 0
            i += 1
        end if
    end while

Implementation in Rust

Based on the pseudo-code, the algorithm can be implemented as follows:

fn heaps_permutations(str: String) -> Vec<String> {
    let mut permutations: Vec<String> = vec![];
    let mut as_chars: Vec<char> = str.chars().collect();
    permutations.push(str.clone());
    let mut c: Vec<usize> = vec![0; str.len()];

    let mut i = 1;

    loop {
        if c[i] < i {
            if i % 2 == 0 {
                as_chars.swap(0, i);
            } else {
                as_chars.swap(c[i], i);
            }
            // 
            permutations.push(String::from_iter(as_chars.clone()));
            c[i]+=1;
            i = 1;
        } else {
            c[i] = 0;
            i+=1;
        }

        if i == str.len() {
            break
        }
    }

    return permutations
}

Applying Heap’s algorithm to “Bigger is Greater”:

Going back to the puzzle, an initial approach would be using Heap’s algorithm to generate all the possible permutations of the input string, sorting them in ascending order, and then finding the one which is right after the input string in the list(if duplicates are removed/ignored). However, there are a few things to consider before diving into the implementation:

The number of permutations for a sequence of length N is N!. This means that our algorithm has a complexity of at least O(N!), which is scary. Knowing that the input test cases can provide a string of up to 100 characters, it’s not feasible to calculate the permutations of the whole string simply because it will take a lot more than the allowed limit (100! will take literally forever). Even if the solution is written in Rust, which is known to be one of the fastest languages in terms of execution time (a benchmark can be found here: https://sites.google.com/view/energy-efficiency-languages/results), the running time grows exponentially as the length of the string grows. For example, I tried timing the algorithm for a sequence of 11 characters and I got the following result (output from the time Linux command):

real	0m3.155s
user	0m2.413s
sys     0m0.738s

Just by increasing the sequence length by 1, from 11 to 12, the execution time grew by almost 10 fold:

real	0m34.510s
user	0m26.425s
sys     0m8.019s

The HackerRank execution environment has limits on both the execution time and the memory usage. In the case of the Rust programming language, the time limit is 5s and the memory limit is 512 MB. As you can see we can easily exceed the limit if a string’s length exceeds 11.

Therefore we need to find somehow a way to avoid going beyond 10 characters. The solution can be deduced by a simple rule of thumb: if we can find the smallest permutation of a substring of the input starting from its rightmost side, and then replace the original substring with the permutation, that should in theory give you the smallest of all permutations which is bigger than the string itself. Illustration:

Let’s say we would like to solve the problem for the following string:

bvulomthrfugqfbaknxginokekuemgb

As you can see this string is 31 characters long, so calculating 31! permutations is not an option.

If we pick up the last two characters of the string gb, there is only one permutation which is bg. In terms of lexicographic order bg is obviously not bigger than gb, so let’s increase our substring size by 1 and try again.

Now mgb has 6 permutations (sorted):


bgm
bmg
gbm
gmb
mbg
mgb

However, none of them is lexicographically bigger than the original string mgb, so we need to try further and increase the size of our substring by 1.

The permutations of emgb are 24 (sorted):


begm
bemg
bgem
bgme
bmeg
bmge
ebgm
ebmg
egbm
egmb
embg
emgb
gbem
gbme
gebm
gemb
gmbe
gmeb
mbeg
mbge
mebg
megb
mgbe
mgeb

As you can see, gbem comes right after our initial sequence emgb in terms of lexicographic order. Therefore, if we replace emgb by gbem, we would have found the smallest of all the permutations that are lexicographically bigger than our string because the first 27 characters stay the same, and if we add a permutation of the remaining 4 characters which is bigger (and the smallest of the bigger permutations), then the result will keep this property. So the solution is:

bvulomthrfugqfbaknxginokekugbem

We have found a solution by calculating the permutations for only 4 characters instead of 31, which is a massive win.

We should also not forget that there are cases that have no solution. Finding a way to escape early will definitely save some CPU cycles, and lead to the solution faster. We have observed from the first two attempts with 2 and 3 characters of length that when the input sequence is reverse sorted, we could not find a bigger permutation simply because the input sequence is the biggest. Therefore, to avoid calculating the permutations for sequences that have no result, we can test if the input character sequence is reverse sorted, and directly move to the next attempt (increase the length by one), if that’s the case.

Based on the experiment, a reasonable algorithm to solve the problem can be:

start from substring (length, length-2) of the input string
check if the substring is reverse sorted, if yes, increase the length by 1.
If you have reached the beginning of the string, without reaching step 4, it means the whole string characters are reverse sorted. Return no answer
compute all permutations of the substring
sort the permutations in lexicographic order
find from the sorted permutations list the permutation which is right after the substring in order (duplicates should be ignored)
replace the substring with the permutation, and return the result.

Implementation:

// I know Rust naming convention for functions is snake case
// but this is the original naming from HackerRank folks
fn biggerIsGreater(w: &str) -> String {
    if w.len() == 1 {
        return String::from("no answer");
    }

    let mut i = w.len() - 2;

    loop {
        let (prefix, suffix) = w.split_at(i);
        println!("{}", suffix);

        if !is_reverse_sorted(String::from(suffix)) {
            let permutations = heaps_permutations(String::from(suffix));
            let mut great_permutations: Vec<String> = permutations.iter().
            filter(|permutation|permutation.cmp(&&String::from(suffix)) == Ordering::Greater).
            map(|permutation|permutation.clone()).
            collect();
            great_permutations.sort();
            let mut result = String::from(prefix);
            let res_suffix = &great_permutations[0];
            result.push_str(&res_suffix);
            return result;
        }
        
        if i == 0 {
            break
        }

        i-=1;
    }

    return String::from("no answer")
}

After running the solution against HackerRank test cases, I got some interesting results:

3 successful cases
one case that finished with an “Abort Called” message
another case that finished with “Time limit exceeded”

test result

More fine tuning:

By researching a bit, it seems like one of possible causes of the “Abort Called” error is memory exhaustion (source: https://www.quora.com/What-is-Abort-called-error-in-hackerrank). It was quite obvious that keeping the whole list of permutations stored in a vector is not feasible for long sequences. Rust stores a String as a vector of bytes or u8, so for a string of size n, we would need n! * n bytes of memory to store all the permutations. As I mentioned before, HackerRank sets the memory limit for the solution environment to 512 MB. Therefore, it is easy to hit the limit if the size of the string is above 11 characters long.

\[11! * 11 = 439084800 \ bytes \approx 439 \ MB \\ 12! * 12 = 5748019200 \ bytes \approx 5748 \ MB \\\]

12 permuations would timeout anyway. In the case of 11 characters, it must be that there are additional memory allocations causing our program to eat above the 512 MB limit, because 439 MB is only the size of the vector of permutations. Our program does more than that.

As we are only interested in one permutation (the min of the permutations that are bigger than the initial sequence), we can alter our implementation of Heap’s algorithm to keep only one value and whenever we calculate a new permutation we can do a comparison with the stored value. If the new permutation is bigger than the initial sequence and less than the stored value, we can assign the new permutation to the stored value. In this way, the memory usage can be drastically reduced as we will keep only one value instead of a vector of size n!.

fn heaps_permutations_min_from_bigger(str: String) -> Result<String, &'static str> {
    let mut as_chars: Vec<char> = str.chars().collect();
    let mut c: Vec<usize> = vec![0; str.len()];
    let mut bigger = str.clone();
    let mut min_from_max = str.clone();

    let mut i = 1;
    let mut first_iter = true;

    loop {
        if c[i] < i {
            if i % 2 == 0 {
                as_chars.swap(0, i);
            } else {
                as_chars.swap(c[i], i);
            }
            
            let current = String::from_iter(as_chars.iter());

            if current > str {
                if current > bigger {
                    bigger = current.clone();
                }
                if first_iter {
                    min_from_max = bigger.clone();
                    first_iter = false;
                }
                if current < min_from_max {
                    min_from_max = current.clone();
                }
            }
            c[i]+=1;
            i = 1;
        } else {
            c[i] = 0;
            i+=1;
        }

        if i == str.len() {
            break
        }
    }

    if min_from_max > str {
        return Ok(min_from_max);
    }

    Err("min from bigger not found")
}

As you can see, I have used the Result as a return type for the function to make sure I cover the case in which a result is not found. In this case, an error is to be returned. However, since we check if the input string is reverse sorted before calling the function, the Err will in principle never happen. Now our solution looks like the following:

fn biggerIsGreater(w: &str) -> String {
   let mut input_as_string = String::from(w);

    if w.len() == 1 {
        return String::from("no answer");
    }

    let mut i = w.len() - 2;

    loop {
        let (prefix, candidate) = w.split_at(i);

        let mut candidate_result = String::from(prefix);


    if !is_reverse_sorted(String::from(candidate)) {
            match heaps_permutations_min_from_bigger(String::from(candidate)) {
                Ok(min_from_bigger_perms) => {
                    candidate_result.push_str(min_from_bigger_perms.as_str());
                    return candidate_result;
                }
                _ => {}
            };
    }

        
        if i == 0 {
            break
        }

        i-=1;
    }

    return String::from("no answer")
}

An additional performance benefit is that we don’t have to do the sorting anymore because we get the only value that we look for.

Almost works:

After trying the improved version above, the case failing with “Abort Called” passed. However, the timeout was still happening:

test result2

I had no choice but to download the case and run it locally to see what was causing the program to take longer than the allowed time.

I found out that with the improved algorithm, the case finishes in about 11s, which is slightly more than 2 times the allowed limit. By digging further, I found that the input string from the case that was taking most of the time is xvkyxtrrrpobb. With this string, we need to go from the end all the way to the character k to be able to calculate permutations. As outlined earlier, if the substring is reverse sorted, there is no need to waste time calculating the permutations. This means that we have to calculate the permutation for a 11 characters sequence, which is definitely something we cannot achieve with this solution within the time frame. To see if the test would pass without this particular input sequence, I decided to hardcode the solution and to run the test again:

if w == "xvkyxtrrrpobb" {
    // 🤣 🤣 
    return String::from("xvobbkprrrtxy");
}

Guess what? The test case finally passed.

So our approach works, but takes more than the allowed time if the length of the substring for which we need to calculate permutations is above 10, which happened only once out of the 200111 input strings. A near success. I decided not to spend more time on this. HackerRank wins this time.

Any ideas to make the solution work with the single failing case? Please leave me a comment below.

References:

Sedgewick, R. (1977). Permutation Generation Methods. Computlng Surveys, Vol 9, No 2, June 1977. https://doi.org/10.1145%2F356689.356692 ↩ ↩² ↩³
https://en.wikipedia.org/wiki/Heap%27s_algorithm ↩

← → Top