Project Euler hosts a growing collection of mathematical programming problems. I started participating earlier this year, while it was still on the mathschallenge.net web site. Typically, you write a program to solve a math problem and then enter your answer to get credit for it. It took me a couple of months to solve all the existing problems, and since then I’ve been solving new problems as they come out, every few weeks.
There are about 50 participants who solve all the problems as they come out and maintain their status as “100% genius” at the site. Among those, there are about a dozen who compete to solve new problems as quickly as possible since ties in the rankings are broken by time of submission of the solution. New problems always come out on Fridays at 1pm (EST) so I usually don’t get to work on them until the evening, which typically results in a “ranking” of around #15.
This week I was off work for the holidays and delayed our trip up to Hyco Lake for a few hours so I could take a crack at the new problems coming out. There were two new problems this time, both involving finding solutions to the same Diophantine equation for certain values of n. In the first problem n was bounded by 1 million, and a simple brute-force search worked fine. The second problem allowed n to go up to 50 million. I started a similar brute-force search running in case it finished in time, though I knew there must be a simpler solution (all problems are designed to take less than a minute to execute). Being in a hurry, I did only a little analysis, then studied the first hundred or so brute-force solutions to find a pattern and guessed that the pattern held for the whole range. It worked, and I ended up in first place on the scores list, at least until the next problem comes out.
Later, I let the brute-force search run to completion, which took about an hour. I also took the time to figure out why the pattern I observed held.

Not to be deterred, I reverse-engineered the data from the graph and regraphed it as a histogram, a boxplot and a smoothed density curve, all of which are better than a scatterplot for analyzing the distribution of a single variable. Unfortunately,
The paper next shows a similar scatterplot (not shown here) of LOC and argues that the similarity of the plots verifies the high correlation between KB and LOC. Not that the conclusion is bad, but why not plot them against each other to show a correlation? The graph at right does just that, showing the fitted line on a log-log scale. Once again, it’s from the reconstituted data.
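For anyone who wants to try the same kind of check, here is a minimal sketch of fitting a straight line on a log-log scale with ordinary least squares. The function name `fit_loglog` and the sample values are my own illustration; the numbers are made up and are not the paper's data or my reconstruction of it.

```python
import math


def fit_loglog(xs, ys):
    """Least-squares fit of log10(y) = a + b * log10(x); returns (a, b)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope is the covariance of the logs divided by the variance of log10(x).
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical sample: LOC values and KB values that follow an exact power law
# (KB = 3 * LOC ** 0.9). Invented purely to exercise the function.
loc = [100, 1_000, 10_000, 100_000]
kb = [3 * x ** 0.9 for x in loc]

intercept, slope = fit_loglog(loc, kb)
print(f"log10(KB) = {intercept:.3f} + {slope:.3f} * log10(LOC)")
```

On real data the points scatter around the line, and the slope and intercept are only as trustworthy as the data behind them.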
I finished the last of the