January 8, 2011

Hacker Cup - Studious Student Optimization

The way this round works is that once you’ve chosen to download the Facebook-provided data, you have 6 minutes from the time of download to paste the output from your program into the Hack Cup web form.

As I mentioned in my previous post, I’m able to execute the worst-case scenario - 100 rows of 9 words of 10 characters in length - in 78 seconds. That may be fast enough for the qualification round, but it is almost assuredly suboptimal.

Since I have 72 hours to solve 1 of the 3 presented problems, and I think I may functionally have this one licked, I’m going to spend some time optimizing as a practice exercise for the upcoming rounds. 

I’ve written my initial, suboptimal solution in Java and I’m thinking of two approaches to tighten things up:

  1. Multi-core Java: Let’s say I have for a single word set (1 line in the input file), a list of 400,000 permutations. I have 8 cores on my CPU, therefore I will split the sorting work among 8 different threads (1 per core) giving each one 50k to work on. The performance gain isn’t quite linear due to the overhead of recombining the lists afterwards and having to wait for all threads to complete, but I’d still expect to be able to finish in - ballpark - 25% of my original execution time. 
  2. Erlang: Considering I’ve never written a line of Erlang, I don’t expect this to be a big success, however high-concurrency, multi-processor friendliness is inherent in its design and maybe it’ll just be stupid easy. This is as good an excuse as any to dive head-first into a language I’ve always wanted to check out. And Facebook has a big old throbbing erection for Erlang, which can’t hurt if I make it to the later rounds :)