Code Jam 2019 Round 1A

I tried the 2.5-hour Round 1A Friday night and just missed the cut-off for advancing. The round started a few minutes late because the problems site was overloaded at first. When it did respond, I got the problems in a different order, with the last problem listed first. Thinking it was the first (and therefore easiest) problem, I started on it without noticing the difficulty scores. That problem, “Alien Rhyme,” involved finding pairs of words with common suffixes. There were a few gotchas in the greedy approaches, so I was lucky to work out a correct algorithm, but it took me a few tries to find a decent data structure for organizing the words by common suffixes. I ended up with a vector of maps, one per suffix length, with each map sending a suffix to the set of words ending in it. Amazingly, it worked.
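For the curious, the shape of that structure was roughly the following (a fresh sketch, not the contest code itself):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One map per suffix length; each map sends a suffix to the set of
// words that end with it.
using BySuffix = std::vector<std::map<std::string, std::set<std::string>>>;

BySuffix groupBySuffix(const std::vector<std::string>& words, size_t maxLen) {
    BySuffix groups(maxLen + 1);
    for (const std::string& w : words) {
        for (size_t len = 1; len <= w.size() && len <= maxLen; ++len) {
            groups[len][w.substr(w.size() - len)].insert(w);
        }
    }
    return groups;
}
```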

Next I tried the Pylons problem, which boiled down to finding a path through a graph that visits every node. I couldn’t think of a good approach for the large 20×20 constraint, so I went for partial credit with a brute-force solution that would work for the small 5×5 constraint. After that was submitted, I noticed that the large graphs were dense enough to have many complete paths, so maybe a randomized brute-force approach would work. I added a little randomization and resubmitted. It turned out I had the right idea, but I didn’t randomize enough: I only randomized the starting points, not the node visitation order. So I still only got partial credit, and with a penalty for making a second submission.
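In hindsight, shuffling the visitation order at each step as well as the starting point was the missing piece. Something along these lines (a generic sketch of randomized path search, not my actual submission):

```cpp
#include <random>
#include <vector>

// Randomized brute force: from a random start, repeatedly jump to a random
// unvisited neighbor; retry until a path covering every node is found.
// This only works because the graphs in question are dense enough to have
// many complete paths.
std::vector<int> randomCompletePath(const std::vector<std::vector<int>>& adj,
                                    int maxTries = 100000) {
    const int n = static_cast<int>(adj.size());
    std::mt19937 rng(std::random_device{}());
    for (int attempt = 0; attempt < maxTries; ++attempt) {
        std::vector<bool> seen(n, false);
        std::vector<int> path;
        int cur = std::uniform_int_distribution<int>(0, n - 1)(rng);
        seen[cur] = true;
        path.push_back(cur);
        while (static_cast<int>(path.size()) < n) {
            std::vector<int> choices;
            for (int nxt : adj[cur])
                if (!seen[nxt]) choices.push_back(nxt);
            if (choices.empty()) break;  // dead end; start a new attempt
            cur = choices[std::uniform_int_distribution<int>(
                0, static_cast<int>(choices.size()) - 1)(rng)];
            seen[cur] = true;
            path.push_back(cur);
        }
        if (static_cast<int>(path.size()) == n) return path;  // success
    }
    return {};  // no complete path found within the try budget
}
```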

The remaining problem was an interactive one, and I had too little time left for the overhead of testing it locally, so I didn’t attempt it. I could see that the solution might require a Chinese Remainder Theorem calculation, and I did solve it the next day. It’s nice that the site enters practice mode when the competition is over, so we can test out ideas later.
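For reference, the Chinese Remainder piece boils down to combining pairs of congruences with the extended Euclidean algorithm. A minimal sketch of that step (assuming coprime moduli; this is the textbook routine, not my practice solution):

```cpp
#include <cstdint>
#include <tuple>

// Extended Euclid: returns (g, x, y) with a*x + b*y = g = gcd(a, b).
std::tuple<int64_t, int64_t, int64_t> extGcd(int64_t a, int64_t b) {
    if (b == 0) return {a, 1, 0};
    auto [g, x, y] = extGcd(b, a % b);
    return {g, y, x - (a / b) * y};
}

// Combine r ≡ r1 (mod m1) and r ≡ r2 (mod m2) into one residue mod m1*m2.
// Assumes m1 and m2 are coprime; __int128 (a GCC/Clang extension) guards
// the intermediate products against overflow.
int64_t crt(int64_t r1, int64_t m1, int64_t r2, int64_t m2) {
    auto [g, p, q] = extGcd(m1, m2);  // p*m1 + q*m2 == 1 when coprime
    (void)g;
    int64_t mod = m1 * m2;
    __int128 x = (__int128)r1 * q % mod * m2 + (__int128)r2 * p % mod * m1;
    int64_t out = (int64_t)(x % mod);
    return out < 0 ? out + mod : out;
}
```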

The good news is that I get to compete in Round 1B!

Google Code Jam 2019 Qualifying

This year’s Google Code Jam started yesterday with the 27-hour Qualifying Round. I managed to get all the problems correct for a score of 100/100 along with about 1000 of the 35,000+ participants. You only needed a score of 30 to advance, so many who could have done better probably stopped when they had enough points to qualify.

The problems always get harder with each round, but this year’s qualifying round seemed easier than usual. The first two problems only took a few minutes each, and I almost stopped there since I knew I already had 30 points.

The third problem, Cryptopangrams, was also easy to figure out, but the solution required doing math on 100-digit integers, which I couldn’t readily do in C++. I briefly looked around for a simple big-integer library to include but decided to relearn enough Python to use that instead. It felt silly googling things like whether array indices start at 0 or 1 (it’s 0 in Python), how to comment out a line (the ‘#’ character), and how to do integer division (the ‘//’ operator). I never did come across a good Python cheat sheet; instead I had to wade through various introductory teaching pages. Fortunately, there was no real time pressure, and it was a simple problem.

All the problems have at least one small/easy test case and one large/hard test case. For the fourth problem, I could figure out the easy case quickly but had no idea how to solve the hard case. After re-reading the problem statement, I realized the parameters were constrained enough (looking for at most 15 errors in a 1024-bit string) for me to solve it. These are the most fun problems to solve: you start with no clue and keep looking from different angles until you find a solution.

The other complication for the fourth problem was that it was interactive. Instead of the usual read-problem-input/write-solution sequence, an interactive problem requires multiple queries and responses before the final solution can be determined, and that makes testing your code much trickier. Google nicely provides a Python testing script, which I eventually got working in my CLion environment. I had to use chmod to make testing_tool.py executable and prefix it with #!python for it to run. Then I set up a CLion target for the interactive_runner.py script, which would launch the testing tool and my code and get them to talk to each other. Maybe I can remember enough of it to get through the next round when time is more constrained.
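The other detail interactive problems force on you is flushing output after every query; otherwise your program and the judge can deadlock waiting on each other. The skeleton of the exchange looks roughly like this (a generic sketch with placeholder query and answer strings, not this particular problem’s protocol):

```cpp
#include <iostream>
#include <string>

int main() {
    int testCases;
    std::cin >> testCases;
    for (int t = 0; t < testCases; ++t) {
        bool solved = false;
        while (!solved) {
            std::cout << "QUERY ..." << std::endl;  // endl flushes; required
            std::string response;
            std::cin >> response;
            // ... update state from the response, decide if we can answer ...
            solved = true;  // placeholder so the sketch terminates
        }
        std::cout << "ANSWER ..." << std::endl;
        std::string verdict;
        std::cin >> verdict;  // judge typically confirms or rejects
    }
    return 0;
}
```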

The Code Jam results list is a bit harder to navigate than it used to be. I wanted to look at other contestants’ solutions to the Cryptopangrams problem to see if any of them used C++. You can still look at submitted solutions, but when you do, you lose your place and have to start over at the top of the list with each try. I looked at a few entries in the top 20, and none used C++ for that problem. Most used Python, a few used Java, and one used Ruby.

2017 Highlights

Just to avoid the embarrassment of going a calendar year without a blog post, here is a summary of some of my 2017 highlights. It’s not that I’ve been silent, but I’ve mostly switched to microblogging on Twitter (@xangregg) for sharing updates.

Tie-dye marathon

Some summers I keep track of a tie-dye “marathon” where I see how many days in a row I can go wearing a different tie-dye each day. This year I made it to 32 days. Most of the images are on Twitter. Here are days 13 – 24.

Google Code Jam 2017

I tried both the code jam and the distributed code jam this year. In the regular code jam I made it to the round of 3000 before bombing out with a tied-for-last score. I keep forgetting that the later rounds often require ready-made advanced optimization algorithms. I did better in the distributed code jam, advancing to the round of 500, which was enough to win my third code jam T-shirt. The distributed code jam is tricky since you submit code that is run on a distributed system of 100 CPUs. This year it helped that I built a test harness that used 100 threads for a better simulation of the actual process-communication issues.

Packed Bars

Trying to find a decent way to show data with many categories in a skewed distribution, I created a new chart type called packed bars. Here’s an example showing costs of billion-dollar disasters before this year.

I presented packed bars in a poster at the IEEE VIS conference in Phoenix, and there are now implementations in JMP, R, Excel and D3.js. Ironically, the JMP script is the weakest implementation pending the arrival of JMP 14 in March 2018. Packed bars are only great for this kind of data and have some learning curve, but skewed data is pretty common, especially in quality control (defect counts) and finance. I hope packed bars can become useful to others.

Low Water Immersion Tie-Dyes

As a follow up to last week’s test dye run, I tried a few shirts with my newly certified dyes. I also wanted to try out a possible button-down shirt supply. It’s hard to find dress shirts that are both all cotton and cheap enough to experiment with. However, I found a 97% cotton dress shirt for $23 on Amazon and decided to give it a try.

I’m using the low water immersion technique from Paula Burch’s site. Basically, you cram the shirt into a jar, pour dye(s) on it, wait an hour, add concentrated fixer, wait another hour or so, and rinse. Very simple if you’re happy with a random pattern (which has a greater risk of flopping).

Here’s the dress shirt.

navyshirt - 1

I tried mixing two dye colors, Navy Blue and Camel (light brown), but I don’t see any trace of the Camel. Nonetheless, the shirt turned out pretty well. The 3% spandex doesn’t seem to have caused any dyeing issues. The seams are apparently nylon and didn’t take up any dye.

Each jar had about four cups of water and about eight teaspoons of dye, which seems to have been too much. The short-sleeve shirt was all Rust and doesn’t have as much variation as I was expecting based on the test sample.

rustshirt - 1

This mix of Jade Green and Deep Yellow is my least favorite. Looks more like a laundry accident.

greenshirt - 1

Oh well, at least the dress shirt looks promising. Will order more of those.

Tie-dye Test Shirt

I haven’t been tie-dyeing much lately, and last time I tried, my expected green shirt came out brownish yellow. I was afraid all my dyes had gone bad, which apparently happens over time as they are exposed to moisture in the air. This week, I set out to get a sample of every dye I have. Not wanting to actually mix up 50 bottles of dye, I used an old shirt and put small smudges of dry dye on it in a grid.

dyesamples - 1 copy

The grid was sized to match a wireframe shelf, which I set down over the shirt to help keep each dye within the lines. So far so good. Then I poured soda ash water over the shirt, and things got screwed up a bit. I should have been gentler with the water application or placed another layer of fabric on top. Instead I got some splatter, but most of the color swatches can still be made out.

dyesamples - 2 copy

So it looks like most of the dyes are good. Below are labeled excerpts of each dye. It’s about the same layout as the shirt but rotated 90° clockwise.

samples1

samples2

A few colors are duplicated, reflecting how the collection has grown by “inheriting” dyes from retired tie-dyers.

I guess I was unlucky last year in using Moss Green, which is about the only bad color in the bunch. Chartreuse also looks a bit brownish.

How old is that globe?

Bonnie gave me this old globe for my 50th birthday last year, hoping it was also about 50 years old. There’s no date on it, but the seller thought it was from the 1960s.

globe - 4a

It wasn’t until I saw this recent XKCD flow chart for dating maps that I sought to get a better estimate of the globe’s age.

thumb1

Simple as it is, I made a couple errors trying to traverse the diagram, beginning with the very first choice.

diagram closeup 1

My globe has Istanbul, but I went off on the left path, not noticing that the order of the labels was reversed from the order in the question. It didn’t take long to realize the error, but I almost didn’t catch my second error.

map closeup2

I was trying to follow the “no” branch from the Bangladesh node, and I mistakenly ended up at the United Arab Republic node. Apparently following a line is harder than I thought.

Once I had my errors fixed, the diagram dated my globe to the range 1961 to 1964. In the course of doing a little more exploring, I discovered the manufacturer of the globe, Replogle Globes, has a How Old is Your Globe? page. It’s a list of 100+ countries that have changed names since 1939.

Screen Shot 2016-06-05 at 2.18.33 PM

It’s not as fun as the XKCD chart, but it’s more complete, especially in the period of my globe, though I still had to supplement it with Wikipedia to get the exact dates of the name changes.

Interestingly, the results are both precise and slightly conflicting. My globe has Zambia, which was Northern Rhodesia until October 24, 1964, and it has the United Republic of Tanganyika and Zanzibar which only existed for six months ending on October 29, 1964. That gives an amazingly precise origin window of only five days. However, the globe also denotes Malta as being a British territory even though Malta gained independence on September 21, 1964. I’m guessing the globe is from late 1964 or early 1965 with the Malta indication (and possibly Tanzania) being out of date. Regardless, 50 years is about right for the age.

globeclose - 1 (2)

The globe is in pretty good condition and has two axes of rotation, making it easy to get a good look at Antarctica, for instance.

globe antarctica

Google Code Jam 2016

Another year, another Google Code Jam. Similar result this year: made it through qualification and Round 1, but got stymied in Round 2. The competition seemed tougher this year both in quantity and quality. Though the bar for qualification was raised this year, 22,000 coders made it to Round 1, up from 12,000 last year. There are three time slots for Round 1 with 1000 advancing from each slot. In the first slot, over 1000 coders had a perfect score (not me), and as a result 40+ perfect scores didn’t advance from that slot (time was the tie breaker).

Fortunately, the second slot was a little more forgiving, and I was able to advance with a score of 56/100, good for position 933. The problem I missed, Technobabble, needed a maximum-flow algorithm for an efficient solution, and I’ve since added one to my toolbox.
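A compact example of the kind of routine I mean (a generic Edmonds–Karp sketch on a capacity matrix, not necessarily what ended up in my toolbox):

```cpp
#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

// Edmonds–Karp maximum flow: repeatedly find the shortest augmenting path
// with BFS and push flow along it. The capacity matrix is taken by value
// so the caller's copy is left untouched.
int maxFlow(std::vector<std::vector<int>> cap, int source, int sink) {
    const int n = static_cast<int>(cap.size());
    int flow = 0;
    while (true) {
        std::vector<int> parent(n, -1);
        parent[source] = source;
        std::queue<int> q;
        q.push(source);
        while (!q.empty() && parent[sink] == -1) {
            int u = q.front();
            q.pop();
            for (int v = 0; v < n; ++v) {
                if (parent[v] == -1 && cap[u][v] > 0) {
                    parent[v] = u;
                    q.push(v);
                }
            }
        }
        if (parent[sink] == -1) break;  // no augmenting path left
        int bottleneck = INT_MAX;
        for (int v = sink; v != source; v = parent[v])
            bottleneck = std::min(bottleneck, cap[parent[v]][v]);
        for (int v = sink; v != source; v = parent[v]) {
            cap[parent[v]][v] -= bottleneck;  // use up forward capacity
            cap[v][parent[v]] += bottleneck;  // add residual capacity
        }
        flow += bottleneck;
    }
    return flow;
}
```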

In Round 2, I got stuck on the first problem which involved a giant rock-paper-scissors tournament. I did the small version with brute force, and tried multiple algorithms to increase efficiency for the large problem, mostly relying on memoization and symmetry. However, I needed to step back and rethink the problem instead of blasting ahead with code. I eventually did figure out the efficient solution just after time expired. The time pressure of the competition has an effect on thinking that needs to be overcome.

I’ve been using a C++ parallel processing harness for these problems, and that may have made me less concerned about finding the very best algorithm. When you have the right solution, it usually takes less than a second to execute even though you’re given eight minutes to submit results. If a subpar algorithm takes, say, 30 minutes for all the test cases, there is a chance it could still finish in time when the test cases are solved in parallel. But this time, my subpar algorithm would have taken many hours to execute, so the parallelization couldn’t save me.

With that ringing endorsement, I’ve put my parallel solver harness on Github as parallelcodejam. I’ve used it a few years, and I think it was only beneficial once.

The parallel code does have one nice idea, I think, which is to use an atomic integer as an index into a work queue. That way, the queue itself doesn’t need to be thread-safe, and the synchronization overhead is minimal.
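Stripped down, the idea looks something like this (a sketch of the technique, not the repo code itself):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each worker claims the next unclaimed case index with fetch_add, so the
// vector of cases is never written to concurrently and needs no locking.
template <typename Case, typename Solver>
void solveAll(std::vector<Case>& cases, Solver solve, unsigned numThreads) {
    std::atomic<size_t> next{0};
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < numThreads; ++i) {
        workers.emplace_back([&]() {
            while (true) {
                size_t idx = next.fetch_add(1);  // claim one case atomically
                if (idx >= cases.size()) break;  // all cases claimed
                solve(cases[idx]);               // result stored in the case
            }
        });
    }
    for (std::thread& w : workers) w.join();
}
```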