January 8, 2011

Facebook Hacker Cup - Problem #3 - Studious Student

I’ve decided to start working problem #3 first…
Here is the problem statement Copy & Pasted directly from the site:

Studious Student

You’ve been given a list of words to study and memorize. Being a diligent student of language and the arts, you’ve decided to not study them at all and instead make up pointless games based on them. One game you’ve come up with is to see how you can concatenate the words to generate the lexicographically lowest possible string. 

Input

As input for playing this game you will receive a text file containing an integer N, the number of word sets you need to play your game against. This will be followed by N word sets, each starting with an integer M, the number of words in the set, followed by M words. All tokens in the input will be separated by some whitespace and, aside from N and M, will consist entirely of lowercase letters. 

Output

Your submission should contain the lexicographically shortest strings for each corresponding word set, one per line and in order. 

Constraints

1 <= N <= 100
1 <= M <= 9
1 <= all word lengths <= 10

Example input:

5

6 facebook hacker cup for studious students

5 k duz q rc lvraw

5 mybea zdr yubx xe dyroiy

5 jibw ji jp bw jibw

5 uiuy hopji li j dcyi

Example output:

cupfacebookforhackerstudentsstudious

duzklvrawqrc

dyroiymybeaxeyubxzdr

bwjibwjibwjijp

dcyihopjijliuiuy

—————————————————————————

This seems to be a pretty easy problem, until you look closely at the 4th result in the sample input:

jibw ji jp bw jibw

The given sample output for this line is:

bwjibwjibwjijp

After following the requirements of the problem statement, you may think that all you need to do is order sequentially by alphabetical value, every string in the list. If you’re writing your solution in Java, this is a trivial task as the String class already has the necessary alphabetical comparators built-in.

In fact, this approach gives a correct answer for everything in the sample except….uh oh….the fourth one from YOUR result set:

bwjijibwjibwjp

Look closely and you’ll notice that the Facebook approved answer has the ji token positioned BEFORE the two jibw tokens. The implication is that you cannot actually compare the values of the individual tokens and just concatenate them ascendingly.

So, how to compute the Answer, then? 

You must combine every possible permutation of the words in the list, and THEN do the alphabetical value comparison on the combined results which should solve for any case that Facebook will throw at you.

This is of course computationally non-trivial, and orders of magnitude more difficult. I’ve got a working solution that 
I’m currently trying to optimize from my best of 78sec on a dataset that meets the maximum constraints specified by the problem.