Google Code Jam 2012 Round 2

May 31, 2012 by xan

Again this year, I just missed the cut-off for moving past Round 2. I finished at position #623 with only the top 500 advancing. I was actually tied on points with about 400 other participants, starting at #315, but didn’t do well in the tie-breaker: time.

The first problem, Swinging Wild, involved crossing a jungle by swinging on vines with various positions and lengths. I found the description a bit long and confusing and skipped it at first, but I knew it must be relatively easy since it was the first problem, so I came back to it later. My first solution was rejected, and I realized my approach was bad and I had to start over. I considered moving to another problem, but at least now I understood this one. Eventually I got it by using dynamic programming and starting at the end vine, but it took way too long as I had several wrong submissions while working out bugs. For each vine, I worked out the minimum length of vine needed to reach the other side from that vine. Working backwards, the minimum for each vine depended on the values for the vines already computed. I still completely missed the possibility that you might want to swing backward.

The second problem, Aerobics, was to place circles in a rectangular area without any overlap. In general, that’s the famous packing problem with no efficient solution, but in this case you’re told that there is plenty of room: over five times the area of the circles. So it comes down to finding a heuristic for placing circles. I was lucky that mine worked, which was to sort the circles and place them in rows. The Google solution was even simpler: just place the circles randomly and pick a new position when there is overlap.
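
Here is a minimal sketch of that random-placement idea, not Google's actual code. The class and method names are made up, and it assumes that only the circle centers have to fall inside the rectangle. Placing the largest circles first is my own addition: it keeps the area blocked by already-placed circles bounded, so a free spot keeps turning up when there is as much spare room as the problem promises. Without that much room the retry loop could spin for a long time.

```java
import java.util.Arrays;
import java.util.Random;

class RandomCirclePlacement {
    /**
     * Returns one {x, y} center per radius (same order as radii) such that
     * every center lies in [0, width] x [0, length] and no two circles overlap.
     * Assumption: only the centers must be inside the rectangle.
     */
    static double[][] place(double[] radii, double width, double length, Random rng) {
        int n = radii.length;
        // Visit circles from largest to smallest radius (my addition, see lead-in).
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(radii[b], radii[a]));

        double[][] centers = new double[n][2];
        for (int k = 0; k < n; k++) {
            int i = order[k];
            boolean ok;
            do {
                // Pick a random spot, then reject it if it hits any placed circle.
                centers[i][0] = rng.nextDouble() * width;
                centers[i][1] = rng.nextDouble() * length;
                ok = true;
                for (int m = 0; m < k && ok; m++) {
                    int j = order[m];
                    ok = !overlaps(centers[i], radii[i], centers[j], radii[j]);
                }
            } while (!ok);
        }
        return centers;
    }

    /** True if the two circles overlap (touching circles do not count). */
    static boolean overlaps(double[] a, double ra, double[] b, double rb) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double minDist = ra + rb;
        return dx * dx + dy * dy < minDist * minDist;
    }
}
```

Passing a seeded `Random` makes a run repeatable, which helps when debugging a rejected submission.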

With less than 30 minutes left, I took a stab at the third problem, Mountain View. Given constraints about a set of mountain heights, you have to generate a set of heights consistent with the constraints. I didn’t have any code handy for solving a system of linear inequalities, but I tried a simple solution of adjusting each height to meet the constraints where it needed to be the tallest and repeating until all constraints were satisfied or some limit was reached. Unfortunately the contest ended before I finished. I did submit my solution 10 minutes later even though the site had gone into practice mode. My dumb solution was good enough to pass the small test set but not the large set.

I realized later that if I knew Mathematica better I could have used it to solve the inequalities. Later I hand-coded one of the simpler test cases (the last one on the problem page) in Mathematica:

FindInstance[{a >= 0, b >= 0, c >= 0, d >= 0,
(d - a) * 1 > (b - a) * 3, (d - a) * 2 > (c - a) * 3,
(c - b) * (3 - 1) >= (d - b) * (2 - 1)},
{a, b, c, d}, Integers]

It quickly returned a valid answer:

I guess I need to learn how to do file I/O in Mathematica before next year.

Update: I did get the general solution working in Mathematica (posted on StackExchange for comments), but it can’t handle the 1000+ variables and ~1M constraints of the large problem, though I guess the constraints could be pruned first since often one constraint is redundant with two others.

Google Code Jam 2012 Round 1

May 25, 2012 by xan

It’s been two weeks since Round 1 ended, and I was probably one of the few participants to compete in all three Round 1 sessions. The 2.5 hour sessions were staggered around the clock, and finishing in the top 1000 of any session put you through to Round 2. After coming up short in my first two attempts, I succeeded in the last chance even though it started at 5 a.m. local time. Here’s a rundown of each problem.

1A-A. Password Problem A relatively simple expected value problem. I think the only trick, besides translating the prose to math, was avoiding an O(n²) solution since n could go to 100,000.

1A-B. Kingdom Rush I skipped this one since the description was long and I was unfamiliar with the game.

1A-C. Cruise Control Given a bunch of cars traveling at given constant speeds on a two lane highway, how long can they avoid a collision by only changing lanes? I thought I had a good take on this one by ignoring lane and modeling each car’s path as a parallelogram in the dimensions of time and distance. Two sides measure the length of the car in the distance (vertical) dimension. The other two sides have slopes equal to the car’s speed. The interior of the parallelogram represents the path of the car over the entire time.

With two lanes to work with, it was OK for two cars’ paths to overlap, but not OK for three to overlap. The diagram shows the problem’s third test case where the cars can travel for 1.4 secs before a speed change is needed. Unfortunately, I didn’t have any handy code for computing shape or even line intersections, and I burned all my time trying unsuccessfully to write them from scratch.

Update: Later I realized this approach is bad. You can’t ignore the starting lanes in situations when cars start “entangled” like the upper two and lower two in the diagram above.

Round 1A Rank: #1915

1B-A Safety in Numbers The problem is to find a minimum score needed to avoid ending up in last place given a particular contest scoring system, given each contestant’s current score. I misread this one several times, but eventually solved it. My only consolation for spending extra time on it was that I found a better solution than Google’s solution. Their post-contest analysis suggested a binary search, but I found a summation formula. Google has since updated the analysis with the simpler solution after others reported finding it, too. I thought of using a binary search, but it felt like cheating.

1B-B Tide Goes In, Tide Goes Out I skipped this one at first because of the long description but came back to it. It basically involves trying to find the shortest path through a maze of caves with a twist. The water level is falling which affects your speed through caves of different elevation. I tried a depth first search with caching and pruning, figuring it would at least work for the small test cases for partial credit. I ran out of time coding it, but did finish it after the contest was over. Surprisingly, my solution solved the big and small cases instantly.

1B-C Equal Sums I recognized this problem as a variation of the famous subset-sum problem, which has no known polynomial-time solution. I did the small version with n=20 since even 2²⁰ is computable quickly, but the large version had n=500. I looked around for alternate algorithms before giving up and going back to problem B. It turns out the limited value of each set member makes the problem solvable since the number of subsets greatly exceeds the number of possible sums.
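
For the small case, a brute-force pass over all subsets can look something like the sketch below. It is not my contest code. The names are made up, and it assumes the task is to find two different subsets with the same sum and that all values are positive. Each subset is a bitmask, and a hash map remembers the first mask seen for each sum.

```java
import java.util.HashMap;
import java.util.Map;

class EqualSumsSmall {
    /**
     * Returns two disjoint, non-empty bitmasks over values[] whose selected
     * elements have equal sums, or null if every subset sum is distinct.
     * Assumes positive values and a small n (about 20), since it visits 2^n masks.
     */
    static int[] findEqualSubsets(long[] values) {
        int n = values.length;
        long[] sums = new long[1 << n];
        Map<Long, Integer> firstMaskForSum = new HashMap<>();
        for (int mask = 1; mask < (1 << n); mask++) {
            // Build this subset's sum from the subset without its lowest element.
            int low = Integer.numberOfTrailingZeros(mask);
            sums[mask] = sums[mask & (mask - 1)] + values[low];
            Integer earlier = firstMaskForSum.putIfAbsent(sums[mask], mask);
            if (earlier != null) {
                int prev = earlier;
                // Dropping shared elements keeps the sums equal and makes the sets disjoint.
                int common = mask & prev;
                return new int[] { prev & ~common, mask & ~common };
            }
        }
        return null;
    }

    /** Sum of the elements selected by mask. */
    static long sumOf(long[] values, int mask) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            if ((mask & (1 << i)) != 0) total += values[i];
        }
        return total;
    }
}
```

The same collision idea is why the large case is tractable: with far more subsets than possible sums, a repeat shows up long before all subsets have been tried.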

Round 1B Rank: #1431 (tied with about 800 people for #769)

1C-A Diamond Inheritance The problem is to look for diamond patterns in a large inheritance tree. I used a bitset for storing ancestor relations and a work-queue to avoid deep recursion.
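
A rough reconstruction of that bitset-plus-queue idea follows. It is not my original code, and the names and input layout are assumptions. Each class gets a bitset holding itself and all of its ancestors, classes are processed parents-first from a work queue, and a diamond shows up when two direct parents of the same class have overlapping sets. It assumes the graph has no cycles and that no parent is listed twice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class DiamondInheritance {
    /**
     * parents[i] lists the classes that class i inherits from directly.
     * Returns true if some class reaches another class by two different paths.
     */
    static boolean hasDiamond(int[][] parents) {
        int n = parents.length;
        List<List<Integer>> children = new ArrayList<>();
        for (int i = 0; i < n; i++) children.add(new ArrayList<>());
        int[] pending = new int[n]; // parents not yet processed
        for (int i = 0; i < n; i++) {
            pending[i] = parents[i].length;
            for (int p : parents[i]) children.get(p).add(i);
        }

        BitSet[] reach = new BitSet[n]; // the class itself plus all its ancestors
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            if (pending[i] == 0) queue.add(i);
        }
        while (!queue.isEmpty()) {
            int c = queue.poll();
            BitSet set = new BitSet(n);
            for (int p : parents[c]) {
                // A class shared between two parents' sets is reachable two ways.
                if (set.intersects(reach[p])) return true;
                set.or(reach[p]);
            }
            set.set(c);
            reach[c] = set;
            for (int child : children.get(c)) {
                if (--pending[child] == 0) queue.add(child);
            }
        }
        return false;
    }
}
```

For example, `hasDiamond(new int[][] {{1, 2}, {2}, {}})` returns true because class 0 reaches class 2 both directly and through class 1.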

1C-B Out of Gas You need to get downhill as quickly as possible using only gravity and brakes while staying behind any car in front of you moving at various speeds. I think I had this one figured out as computing a series of quadratic curves bounded by line segments, but I ran out of time after saving this problem for last.

1C-C Box Factory The problem is to optimally match up series of boxes and toys coming off separate assembly lines. It’s basically a Longest Common Subsequence problem with the twist that each box or toy will occur up to 10¹⁶ times in a row, which makes the usual O(n²) dynamic programming algorithm take too long. I coded a variant of that algorithm by handling the repeats as single chunks.
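
One way to treat the repeats as chunks is sketched below. It is a memoized recursion rather than the table-based variant I wrote in the contest, and the names are made up. When the current box run and toy run have the same type, it packs as many as the shorter run allows in one step. Otherwise it tries discarding the rest of either run. It assumes every run has a positive count, and I haven't checked that it is fast enough for the large input.

```java
import java.util.HashMap;
import java.util.Map;

class BoxFactorySketch {
    private final long[] boxCount;
    private final int[] boxType;
    private final long[] toyCount;
    private final int[] toyType;
    private final Map<String, Long> memo = new HashMap<>();

    private BoxFactorySketch(long[] boxCount, int[] boxType, long[] toyCount, int[] toyType) {
        this.boxCount = boxCount;
        this.boxType = boxType;
        this.toyCount = toyCount;
        this.toyType = toyType;
    }

    /** Maximum number of toys that can be boxed, given the runs of boxes and toys. */
    static long maxBoxed(long[] boxCount, int[] boxType, long[] toyCount, int[] toyType) {
        BoxFactorySketch s = new BoxFactorySketch(boxCount, boxType, toyCount, toyType);
        return s.best(0, runSize(boxCount, 0), 0, runSize(toyCount, 0));
    }

    private static long runSize(long[] counts, int index) {
        return index < counts.length ? counts[index] : 0;
    }

    // i, j: current run on each line; boxesLeft, toysLeft: items left in those runs.
    private long best(int i, long boxesLeft, int j, long toysLeft) {
        if (i >= boxCount.length || j >= toyCount.length) return 0;
        String key = i + "," + boxesLeft + "," + j + "," + toysLeft;
        Long cached = memo.get(key);
        if (cached != null) return cached;

        long result;
        if (boxType[i] == toyType[j]) {
            // Same type: pack a whole chunk at once instead of one item at a time.
            long used = Math.min(boxesLeft, toysLeft);
            int ni = i, nj = j;
            long nb = boxesLeft - used, nt = toysLeft - used;
            if (nb == 0) nb = runSize(boxCount, ++ni);
            if (nt == 0) nt = runSize(toyCount, ++nj);
            result = used + best(ni, nb, nj, nt);
        } else {
            // Different types: throw away the rest of one run or the other.
            long dropBoxes = best(i + 1, runSize(boxCount, i + 1), j, toysLeft);
            long dropToys = best(i, boxesLeft, j + 1, runSize(toyCount, j + 1));
            result = Math.max(dropBoxes, dropToys);
        }
        memo.put(key, result);
        return result;
    }
}
```

Because a matched chunk is consumed in a single step, the run lengths only show up as `long` arithmetic, so a count of 10¹⁶ costs no more than a count of 10.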

Round 1C Rank: #196 for getting full credit on problems A and C.

The final scoreboards show the country of each participant (at least the ones who got some positive score), and I was wondering how the staggered starting times affected participation around the globe. The result was a visualization of each session’s global participation I posted on the JMP Blog.

Google Code Jam 2012 Qualification Round

April 15, 2012 by xan

I’ve been inactive recently at Project Euler, so I’m not sure how sharp I’ll be for the puzzle programming problems in this year’s Google Code Jam. Yesterday was the qualifying round with 25 hours to earn 20 points on any of four problems, totaling 100 available points.

After earning over 20 points on the first two problems, I took a break for a while but not before taking a peek at the one hard problem of the group, Hall of Mirrors, so I could think about it during the day. The problem is to count your reflections in a mirrored room containing mirrored columns in given locations. Having done Project Euler’s Laserbeam problem, I recalled the simplification of reflecting the room instead of reflecting the ray of light so that the path is really a straight line. Here’s a pic of the idea from Google’s analysis:

The concept doesn’t easily map to having obstacles in the room, but it gave me enough of an idea to try the problem with a few hours remaining in the contest. It was quite a headache to think about all the reflections, but somehow I got it solved with an hour or so to go. That gave me just enough time to complete the remaining easy problem and score 100 points!

Fewer than 200 of the 20,000 entrants solved the mirrors problem and even fewer got all 100 points. I know most folks were sensible enough to just stop working after getting the needed 20 points, but I’ll still count it as an accomplishment.

Google Code Jam 2011 Round 2

June 4, 2011 by xan

My luck ran out in Round 2 of the Google Code Jam 2011. I placed 626th but needed to be in the top 500 to advance. At least I qualified for a T-shirt for being in the top 1000.

There were four problems this time, with less variation in difficulty than usual. I solved the first problem, Airport Walkways, completely and got partial credit for my correct-but-inefficient solutions to two other problems. Of course, I figured out how to make my solutions efficient shortly after the contest ended. I didn’t get to the fourth problem, A. I. War, but it looks more straightforward than I expected from a title alluding to artificial intelligence. Maybe I could have solved that one completely instead of one of the partial solutions.

The third problem, Expensive Dinner, involved finding the difference between the best and worst case for a particular problem involving least common multiples. I set about computing both cases separately and taking the difference. I thought I was doing well to reduce the naive O(n²) solution to O(n), but when n is 10¹², that’s not good enough (for full credit). I reduced the best case to be the number of primes less than or equal to n, aka π(n), and the worst case to be the number of prime powers less than n. That seemed like a dead-end, though, since it’s not so easy to calculate π(10¹²). What I didn’t see in time is that since I was going to be taking the difference of the two and they both included π(n), I didn’t have to compute that part at all. Counting the higher prime powers is much easier, and that’s all I really needed to do. Oh well.
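
To show why the higher prime powers are the easy part, here is a small sketch that counts the values p^k ≤ n with p prime and k ≥ 2. Any such p is at most √n, so a sieve up to 10⁶ covers n = 10¹². The class and method names are made up, and the sketch stops at the count: how it turns into the final answer, including any boundary cases, is not shown here.

```java
class PrimePowerCount {
    /** Counts the integers p^k <= n where p is prime and k >= 2. */
    static long countHigherPrimePowers(long n) {
        // Integer square root of n, nudged to correct floating-point error.
        int limit = (int) Math.sqrt((double) n);
        while ((long) limit * limit > n) limit--;
        while ((long) (limit + 1) * (limit + 1) <= n) limit++;

        boolean[] composite = new boolean[limit + 1];
        long count = 0;
        for (int p = 2; p <= limit; p++) {
            if (composite[p]) continue;
            // Sieve of Eratosthenes: cross off multiples of the prime p.
            for (long m = (long) p * p; m <= limit; m += p) {
                composite[(int) m] = true;
            }
            // Count p^2, p^3, ... Comparing with n / p avoids overflowing the product.
            long power = (long) p * p;
            while (true) {
                count++;
                if (power > n / p) break;
                power *= p;
            }
        }
        return count;
    }
}
```

For n = 100 the qualifying values are 4, 8, 16, 32, 64, 9, 27, 81, 25 and 49, so the method returns 10.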

The second problem, Spinning Blade, didn’t involve any fancy math, but required a clever data structure to reduce it from O(n⁴) to O(n²) or so. (n was limited to 500, so O(n²) is fine.) I didn’t find it in time, though, and went with my brute force solution for the partial credit.

Google Code Jam 2011 Round 1

May 24, 2011 by xan

Round 1 of the Google Code Jam 2011 contest was last week-end. I stayed up Friday night for the first of three sessions for round 1 qualification. The top 1000 in each two-and-a-half-hour session advance to round 2. I was fortunate to end up around 200th, so I didn’t have to try the other sessions after all. I don’t know how many competitors there were, but there were 3100 who solved at least one problem correctly.

There were three problems, each with a small and a large set of test cases. The first problem was relatively easy, and basically boiled down to computing greatest common divisors to reduce fractions. I used a fraction class I had from Project Euler work, which made things even simpler. That got me 20 points in under 20 minutes, which turned out to be already good enough to place around #640 in the final scores. Time is used for tie-breaking, and there were about 1500 participants with 20 points.

The second problem was worth 30 points but had a very long description, so I skipped it. It looked interesting reading it later, but I wasn’t confident about boiling down all that prose correctly under time pressure. The problem was basically to find the hardest word(s) for someone to guess in hangman, given the guesser’s letter-picking order.

The third problem was the hardest and worth 50 points. You had to basically find the maximum score for a solitaire card game, given the order of cards in the deck. I thought it would be a simple dynamic programming solution, but the parameterization I chose was obviously not optimal. I included the current cards in the hand as part of the state, but with up to 80 cards in play, there could be a disastrous 2⁸⁰ possible states. After several tries, I was able to add enough tree-pruning to my code to solve the small test cases and get 15 points, but I never solved the large test case which included more pathological card sequences, defeating my pruning logic.

As a consolation prize, my code did trigger an internal JVM error at one point:

*** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with message can't create name string at ../../../src/share/instrument/JPLISAgent.c line: 769

Update: Turns out my parameterization was OK, but I had bugs in the pruning logic.

Google Code Jam 2011 Qualification

May 9, 2011 by xan

I advanced through the qualification round for the Google Code Jam 2011 over the week-end. It was a relatively low stress format allowing 24 hours to solve 8 problems, with only about 3 successful answers needed to move to the next round. There were really 4 problems, but each one had two input sets to solve, one small and one large.

The first two problems were easy — basically just seeing if you could carefully convert the prose instructions into code. The third problem required a little bit of math insight to reduce the simplistic O(2ⁿ) solution to O(n). I got those three done, but it still took a few hours, which would not be good enough for future rounds where time is more constrained.

The fourth problem, GoroSort, was pretty fun but required a little more thinking than I could put together. Basically, the problem is to sort a list of the first n integers by randomly permuting a subset of your choice. Because of the randomness, the answer is the expected number of permute operations to six decimal places. Fortunately, the problem statement included an example that suggested cycles should be sorted separately. I worked on a recursive solution to iterate through the possible cycle combinations (equivalent to integer partitions) resulting from a random permutation while weighting each one by its probability, but I didn’t get a successful submission.

Turns out my logic was good, but my probability formula was bad (which I should have realized since the total for all partitions was greater than 1). The next day I posed the question to the Mathematics StackExchange site and got a quick answer, which at least let me verify my logic (and learn more about combinatorial counting).

Fast Factoring for 64-bit Integers

October 19, 2008 by xan

Some of the Project Euler problems involve factoring numbers which are either large or small depending on your perspective. Problems are generally limited to 64-bit integers (about 18 digits) which are big numbers for most of us, but in the field of factorization those numbers are terribly small compared to the 100+ digit numbers security protocols deal with. Most advanced methods deal with optimizing the factoring of those huge numbers and don’t mind a significant amount of overhead, but I want to know what’s fastest for 64-bit integers.

To find out, I ran some tests on some variations on three basic, low-overhead methods: Trial Division, Fermat’s method, and Pollard’s Rho method. All of these take a long time if the number being factored is actually prime, so it’s worthwhile to add in a fourth component which is a Miller-Rabin primality check. Here are my timing results for 400,000 random 64-bit integers. Actually, only the first test uses 400,000 numbers since I limited each test to 1 hour and extrapolated beyond that.

Seconds	Method
811	Rho + Trial Division + MR
6359	Fermat + Trial Division + MR
6393	Trial Division + MR after each factor found
29397	Trial Division + MR at start
71195	Trial Division without MR

I was really surprised at how well the Rho method worked in practice. It’s a probabilistic method that’s basically like trial division except it chooses numbers at random instead of sequentially. However, the “random” generator uses a polynomial such that lots of the values can be tested at once using some fancy number theory.

Fermat’s Method works best when there are two divisors near √n, which apparently doesn’t happen very often. Here is my Rho code, which is adapted from some pseudocode in a gamedev forum thread.

[sourcecode language='java']
// squareMod, multiplyMod and gcd are helper methods not shown in this post.
long rhoFactor(long n, int c, int max) {
    int check = 10;     // take a gcd after this many accumulated terms
    long x1 = 2;
    long x2 = 4 + c;
    long range = 1;
    long product = 1;
    int terms = 0;

    for (int i = 0; i < max; i++) {
        for (int j = 0; j < range; j++) {
            x2 = (squareMod(x2, n) + c) % n;
            long next = multiplyMod(product, Math.abs(x1 - x2), n);
            if (next == 0) {
                // The product collapsed to 0 mod n, so test its two parts separately.
                return Math.max(gcd(product, n), gcd(Math.abs(x1 - x2), n));
            }
            else if (++terms == check || j == range - 1) {
                // One gcd covers the whole batch of accumulated differences.
                long g = gcd(next, n);
                if (g > 1)
                    return g;
                product = 1;
                terms = 0;
            }
            else
                product = next;
        }

        x1 = x2;
        range *= 2;
        for (int j = 0; j < range; j++) {
            x2 = (squareMod(x2, n) + c) % n;
        }
    }
    return 1;   // no factor found; retry with a different c
}
[/sourcecode]

For the parameters, I used small odd numbers for c, the polynomial constant term, and 16 to 20 for max, which limits the number of generated values to around 2^max. If the factorization fails, I increase c by 2 and try again. For max = 16, it failed to find a factor about once for every 10,000 numbers and never failed twice in my tests. And those numbers had already had any small factors (less than about 50,000) removed with trial division.
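
The listing above calls three helpers that aren't shown: squareMod, multiplyMod and gcd. The sketch below is one possible way to write them, not the versions behind the timings. The class name is made up, and it assumes non-negative inputs and a positive modulus. Going through BigInteger keeps the multiplication from overflowing a long, at the cost of being slower than a hand-tuned version.

```java
import java.math.BigInteger;

class RhoHelpers {
    /** Greatest common divisor by Euclid's algorithm. */
    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /** (a * b) mod n without overflow, by way of BigInteger. */
    static long multiplyMod(long a, long b, long n) {
        return BigInteger.valueOf(a)
                .multiply(BigInteger.valueOf(b))
                .mod(BigInteger.valueOf(n))
                .longValue();
    }

    /** (x * x) mod n. */
    static long squareMod(long x, long n) {
        return multiplyMod(x, x, n);
    }
}
```

A plain `(a * b) % n` on longs would silently overflow once a and b get past about 3 × 10⁹, which is why the multiplication needs special handling for 64-bit inputs.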