Google Code Jam 2013 Round 1

Like last year, it took me three attempts to get through Round 1 of the Google Code Jam, having to get compete at 5am Sunday. For the first two attempts, I essentially skipped the easy (small) problems and went straight into solving the large versions of each problem, which would simultaneously solve the small problem, too. But that strategy didn’t make the cut (top 1000 scores), so I focused on solving the small problems first for Round 1C and finished in the top 300 to move on to Round 2.

I did get the large version of Problem A, Consonants, after working out a dynamic programming solution. And almost finished the large version of Problem C, The Great Wall, but it took me an extra 20 minutes after the contest ended to get it right. It’s nice that Google let’s you keep trying for practice even when the context is over.

In finishing the problem, I learned about a Java API I didn’t know about, NavigableMap. I was looking for something like C++’s std::map::lower_bound/upper_bound for iteration through a subrange of an ordered map, but I didn’t see anything in the Java Map API. Fortunately I found NavigableMap, whose floorEntry and higherEntry provide similar functionality. The problem called for keeping track of repeated attacks/fortifications to a wall. For the small solution I stored the wall as an array using an element for each unit-length segment, but that wasn’t efficient enough for the large solution. I needed each stored element to represent a run of similar segments and support splitting of segments as new attacks occurred while staying ordered for fast look-up. The balanced binary tree implementation of NavigableMap fit the bill.

A great feature of Code Jam is being able to examine other competitors’ entries after the contest is over. I checked one top scorer’s solution and he didn’t use such a data structure. Instead, he made two passes through all the wall attacks. On the first pass he recorded where all the splits might occur. Then he could allocate an array for those segments without needing to split elements on the second pass.

Posted in Code | Leave a comment

Unsophisticated Art Review: Suzanne Vega

For Valentine’s Day, Bonnie surprised me with tickets to a Suzanne Vega concert in Carrboro, which was especially nice since I’m a big fan but Bonnie has only known suffering due to my years of repeatedly humming Tom’s Diner.

In any case, we both thought the performance was outstanding. The word that keeps resonating with me is “authentic.” The important things were excellent: the lyrics, the singing, the stories, the music. The other things were appropriately less polished: a hat that fell off, a new song sung with the help of a lyric sheet, missed stage lighting. The small venue helped (300 – 400 people). Her only accompaniment was veteran musician Gerry Leonard, who handled a wide variety of music and even tried to play along when Suzanne Vega sang a verse of her thirty year old country music song (after an audience member request).

I’m sure it’s blasphemous to say so, but one thing I often dislike about live performances is that the music is not as finely mixed as in a studio recording, especially the vocals often getting overwhelmed by the instruments. Except for some bass than sounded clipped at times, the live performance was much better in this case, though perhaps more due of artistic refinement over time. The vocal inflections carried more feeling, and in Small Blue Thing the vocals somehow came out sounding round “like a marble or an eye.”

 

Posted in Art | 1 Comment

Graph Makeover Contest — Household spending

I tried Naomi Robbins’ Graph Makeover Contest recently. After trying a few such competitions, I find I learn more after seeing the other entries if I first make the effort of producing an entry myself. Robbins presented the following table from the Bureau of Labor Statistics report on prices and spending [pdf] and asked for a graphic version.

The first step was to get the data and understand it. Ignoring the percent change, which is computed, you can think of the data as one continuous variable, annual spending, by three categorical variables: year, housing tenure and expenditure. Expenditure is tricky though. It’s really expenditure and sub-expenditure. The food expenditure is completely broken down into sub-expenditures, but transportation and healthcare are only partially subdivided. And double checking the data, it seems that some expenditures are missing altogether because they don’t add up to the stated total.

The second step is to decide what message the graph should communicate, and that will lead into the third step of selecting an appropriate visualization for the data and the message. The table presentation, for example, shows all the data with good accuracy, but shows some messages better than others. It’s not too hard to see that the total expenditures stayed about the same from 1986 to 2010 or that health insurance has by far the biggest precent increase. However, it’s harder to rank expenditures or to see that renters pay relatively more on housing than homeowners.

If I were writing the report, I would probably include several graphs to make several points about the data. Here’s a collection I experimented with.

This area chart does a good job of showing that total expenditures stayed about the same and how the totals compare across housing tenure levels. Less effectively, it shows how each expenditure changed over time and their ranking (larger expenses are at the bottom).

Note that I added in an “Other” category for the missing expenditures and replaced the umbrella expenditures with their sub-expenditures so summing would work out. That meant, for instance, replacing “Transportation” with the new sub-expenditure “Transportation (non-fuel)”.

The area chart muddles the changes of the middle items though. This slope chart works better to seeing each expenditure change.

Coloring is another challenge with this many categories, and I didn’t find a great solution. Maybe only the interesting lines should have color…

We’ve lost the total, of course, and though we can see absolute amounts and changes, we still don’t see the giant relative increase in health insurance. For that I tried a log scale.

A log scale is may not be appropriate for a general audience, but it does show relative change well because equal vertical distances represent equal multipliers.

If you want to concentrate only on relative change, this “spoke” chart I made by accident makes the point even better about the health insurance cost change.

It took me a while decipher it, but it’s sort of like all the lines from the previous chart centered on the same location.

My actual entry used the log scaled slope lines, except I changed the vertical axis to relative values and added the total spending numbers at the bottom, which adds the message about stable spending and provides a basis for the percentages. Using relative spending amounts helps see that renters spend relatively more on housing, which was one point raised in the prose of the original report.

I’m still not happy with having so many colors. With more time, perhaps I could have found better colors, mixed in different line styles, tried putting the labels on the lines, combined similar categories, …

Posted in Graphs | Leave a comment

Low Water Immersion Dyeing

I tried three more shirts using the low water immersion dyeing method (mostly) as described by Paula Burch. The basic technique is

  • stuff a shirt in a jar
  • add dye(s) and just enough water to cover it
  • wait an hour for dye to move around
  • add soda ash to bind the dye to the fabric
  • wait another hour
  • done (remove shirt and rinse out dye)

Pretty simple. The results are much more random than when you use tying and folding to control the dye pattern.

The first shirt was made with a single dye called emerald green. The dye itself is a mix of other primitive colors, and this technique lets them separate a little before the fixer is applied.

For the second shirt, I put yellow dye in the bottom of the container and azure blue dye in the top. I had been working with color spaces and was imagining I would get something like the yellow-blue axis () in the CIE L*a*b* color space, but I forgot that blue and yellow would mix to make green instead.

For the third shirt, I included three colors in the immersion: magenta, yellow and lilac.

With the help of these three new shirts, I shattered last summer’s “record” of 18 by wearing a different tie-dye shirt for 25 days in a row. I guess I need to add 1.2 days to call it a marathon. Here’s a collage of the 25 shirts, in no particular order.

Posted in Art | 2 Comments

Google Code Jam 2012 Round 2

Again this year, I just missed the cut-off for moving past Round 2. I finished at position #623 with only the top 500 advancing. I was actually tied with #315 with about 400 other participants in points, but didn’t do well in the tie-breaker: time.

The first problem, Swinging Wild, involved crossing a juggle by swinging on vines with various positions and lengths. I found the description a bit long and confusing and skipped it at first, but I knew it must be relatively easy since it was the first problem, so I came back to it later. My first solution was rejected, and I realized my approach was bad and I had to start over. I considered moving to another problem, but at least now I understood this one. Eventually I got it by using dynamic programming and starting at the end vine, but it took way too long as I had several wrong submissions working out bugs. For each vine, I worked out the minimum length of vine needed to reach the other side from that vine. Working backwards, the minimum for each vine depended on the values for the vines already computed. I still completely missed the possibility that you might want to swing backward.

The second problem, Aerobics, was to place circles in a rectangular area without any overlap. In general, that’s the famous packing problem with no efficient solution, but in this case you’re told that there is plenty of room: over five times the area of the circles. So it comes down to find a heuristic for placing circles. I was lucky that mine worked, which was to sort the circles and place them in rows. The Google solution was even simpler: just place the circles randomly and pick a new position when there is overlap.

With less than 30 minutes left, I took a stab at the third problem, Mountain View. Given constraints about a set of mountain heights, you have to generate a set of heights consistent with the constraints. I didn’t have any code handy for solving a system of linear inequalities, but I tried a simple solution of adjusting each height to meet the constraints where it needed to be the tallest and repeating until all constraints were satisfied or some limit was reached. Unfortunately the contest ended before I finished. I did submit my solution 10 minutes later even though the site had gone in practice mode. My dumb solution was good enough to pass the small test test but not the large set.

I realized later that if I knew Mathematica better I could have used it to solve the inequalities. Later I hand-coded one of the simpler test cases (the last one on the problem page) in Mathematica:

FindInstance[{a >= 0, b >= 0, c >= 0, d >= 0,
(d - a) * 1 > (b - a) * 3, (d - a) * 2 > (c - a) * 3,
(c - b) * (3 - 1 ) >= (d - b) * (2 - 1)},
{a, b, c, d}, Integers]

It quickly returned a valid answer:

{{a -> 7, b -> 0, c -> 1, d -> 0}}

I guess I need to learn how to do file I/O in Mathematica before next year.

Update: I did get the general solution working in Mathematica (posted on StackExchange for comments), but it can’t handle the 1000+ variables and  ~1M constraints of the large problem, though I guess the constraints could be pruned first since often one constraint is redundant with two others.

Posted in Code | 1 Comment

Google Code Jam 2012 Round 1

It’s been two weeks since Round 1 ended, and I was probably one of the few participants to compete in all three Round 1 sessions. The 2.5 hour sessions were staggered around the clock, and finishing in the top 1000 of any session put you through to Round 2. After coming up short in my first two attempts, I succeeded in the last chance even though it started at 5 a.m. local time. Here’s a rundown of each problem.

1A-A. Password Problem A relatively simple expected value problem. I think the only trick, besides translating the prose to math, was avoiding an O(n2) solution since n could go to 100,000.

1A-B. Kingdom Rush I skipped this one since the description was long and I was unfamiliar with the game.

1A-C. Cruise Control Given a bunch of cars traveling at given constant speeds on a two lane highway, how long can they avoid a collision by only changing lanes? I thought I had a good take on this one by ignoring lane and modeling each car’s path as a parallelogram in the dimensions of time and distance. Two sides measure the length of the car in the distance (vertical) dimension. The other two sides have slopes equal to the car’s speed. The interior of the parallelogram represents the path of the car over the entire time.

With two lanes to work with, it was OK for two cars paths to overlap, but it’s not OK for three to overlap. The diagram shows the problem’s third test case where the cars can travels for 1.4 secs before a speed change is needed. Unfortunately, I didn’t have any handy code for computing shape or even line intersections, and I burned all my time trying unsuccessfully to write them from scratch.

Update: Later I realized this approach is bad. You can’t ignore the starting lanes in situations when cars start “entangled” like the uppe two and lower two in the diagram above.

Round 1A Rank: #1915

1B-A Safety in Numbers The problem is to find a minimum score needed to avoid ending up in last place given a particular contest scoring system, given each contestant’s current score. I misread this one several times, but eventually solved it. My only consolation for spending extra time on it was that I found a better solution than Google’s solution. Their post-contest analysis suggested a binary search, but I found a summation formula. Google has since updated the analysis with the simpler solution after others reported finding it, too. I thought of using a binary search, but it felt like cheating.

1B-B Tide Goes In, Tide Goes Out I skipped this one at first because of the long description but came back to it. It basically involves trying to find the shortest path through a maze of caves with a twist. The water level is falling which affects your speed through caves of different elevation. I tried a depth first search with caching and pruning, figuring it would at least work for the small test cases for partial credit. I ran out of time coding it, but did finish it after the contest was over. Surprisingly, my solution solved the big and small cases instantly.

1B-C Equal Sums I recognized this problem as a variation of the famous subset-sum problem, which is not solvable in polynomial time. I did the small version with n=20 since even 220 is computable quickly, but the large version had n=500. I looked around for alternate algorithms before giving up and going back to problem B. It turns out the limited value of each set member make the problem solvable since the number of subsets greatly exceeds the number of possible sums.

Round 1B Rank: #1431 (tied with about 800 people for #769)

1C-A Diamond Inheritance The problem is to look for diamond patterns in a large inheritance tree. I used a bitset for storing ancestor relations and a work-queue to avoid deep recursion.

1C-B Out of Gas You need to get downhill as quickly as possible using only gravity and brakes while staying behind any car in front of you moving at various speeds. I think I had this one figured out as computing a series of quadratic curves bounded by line segments, but I ran out of time after saving this problem for last.

1C-C Box Factory The problem is to optimally match up series of boxes and toys coming off separate assembly lines. It’s basically a Longest Common Subsequence problem with the twist that each box or toy will occur up to 1016 times in a row, which makes the usual O(n2) dynamic programming algorithm take too long. I coded a variant of that algorithm by handling the repeats as single chunks.

Round 1C Rank: #196 for getting full credit on problems A and C.

The final scoreboards show the country of each participant (at least the ones who got some positive score), and I was wondering how the staggered starting times affected participation around the globe. The result was a visualization of each session’s global participation I posted on the JMP Blog.

 

Posted in Code | Leave a comment

Google Code Jam 2012 Qualification Round

I’ve been inactive recently at Project Euler, so I’m not sure how sharp I’ll be for the puzzle programming problems in this year’s Google’s Code Jam 2012. Yesterday was the qualifying round with 25 hours to earn 20 points on any of four problems, totaling 100 available points.

After earning over 20 points on the first two problems, I took a break for a while but not before taking a peak at the one hard problem of the group, Hall of Mirrors, so I could think about it during the day. The problem is to count your reflections in a mirrored room containing mirrored columns in given locations. Having done Project Euler’s Laserbeam problem, I recalled the simplification of reflecting the room instead of reflecting the ray of light so that the path is really a straight line. Here’s a pic of the idea from Google’s analysis:

The concept doesn’t easily map to having obstacles in the room, but it gave me enough of an idea to try the problem with a few hours remaining in the contest. It was quite a headache to think about all the reflections, but somehow I got it solved with an hour or to go. That gave me just enough time to complete the remaining easy problem and score 100 points!

Less than 200 of the 20,000 entrants solved the mirrors problem and even fewer got all 100 points. I know most folks were sensible enough to just stop working after getting the needed 20 points, but I’ll still count it as an accomplishment.

 

Posted in Code | Leave a comment