Having Your Algorithm's Ass Kicked by the Internet

Having Your Algorithm's Ass Kicked by the Internet

TL;DR: Once upon a time, there was a Problem. And lo! I did create an Algorithm to solve it. ‘Twas an imperfect hero, greedy and brutal at heart, yet virtuous in its straightforward character. I spent, like, two hours on it, and I was well-pleased. Then came a hundred hundred Programmers from the Internet to annihilate the Problem and make my poor Algorithm look dumb.

The Legend of Monetization

In CodeCombat’s original pitch that got us into Y Combinator on stage, we said we could make money from employers by finding jobs for our most talented players. That way, we can keep the game free to play for everyone. Last month, the YC partners challenged us to test our assumption and find out whether we could actually get hirable players interested. We set out to make a hard developer challenge level as a proof of concept, and eventually came up with last month’s Gridmancer challenge.

The Wicked Problem

We wanted something we could put together quickly with the CodeCombat level editor that would be hard enough to test players’ programming fu, not just their Stack Overflow fu. I thought back to a problem I had struggled with while working on the CodeCombat world simulation performance. We’d have these dungeon levels with hundreds of Dungeon Wall Thangs, and even though each wall would hardly interact with any of the other Components, there were still places in the code with O(walls * frames * properties per wall) or O(walls * frames * thangs) interactions. Box2Dweb’s collision handling, though smart, also started to slow down with so many walls.

What if we could merge all the little walls into bigger walls, I thought? Collisions would be unaffected and everything would get much faster. And since all the walls are arranged on a grid, we can just turn the walls into the minimum number of rectangles needed to cover the walls:

Consolidating walls into larger rectangles led to 300% - 2000% speedups on levels with many walls. Nice.

The same problem came up again while recording my epic 120-hour workweek, this time in the pathfinding system I was creating. In order to do pathfinding with A* or similar algorithms, you need a graph representing the passable space. We already have a grid of all the places where there aren’t walls, but grid-based A* becomes slow with many hundreds of nodes. Game designers usually manually specify waypoint graphs, but those also suck–what you really want is nav meshes:

(Read Paul Tozour’s nav mesh manifesto to fall in love with nav meshes.)

Not only did we want nav meshes instead of waypoint graphs or grids, but we also wanted them to be automatically generated so that Artisans didn’t have to think about making pathfinding work. Now, I wasn’t about to implement automatic nav mesh generation from arbitrary convex polygons like in the Unreal Engine or NavPower, but since CodeCombat walls are all rectangular so far anyway, I realized that we could invert the collision rectangles and run the rectangle merging on the empty space instead of the filled space and extruding all the walls by the radius of the Thang needing pathfinding. It’s the same problem!

Now we could do A* on the rectangle graph and use a string pulling funnel algorithm to find the path, but instead we just do A* on all the connected vertices, which is even simpler.

Looks a lot better than my original pathfinding diagram:

Hero Algorithm Saves the Day

That was the Problem, and I needed an Algorithm. I started Googling for things like “minimum rectangle covering” and, amidst a bunch of comments like “I think it’s NP-complete when the polygon can have holes”, I found the proper name for this problem: the minimum dissection into rectangles of a rectilinear polygon. The answer came with a description of a complicated-looking computational geometry algorithm I didn’t understand and some nice pictures showing the process for what we’ll call the Eppstein Polygon:

Too hard, I thought. There was no actual code anywhere on the internet, for this or any other algorithm, and I wanted a short solution that wouldn’t be too slow, even if it wasn’t optimal. I decided to just scan through the level once from left to right, bottom to top, at each unoccupied space placing the largest possible rectangle starting from that space. I hacked together a quick method for visualizing my code by printing 1s and spaces to the Chrome debug console, spent a couple hours getting it right, and was quite happy with the algorithm:

Here’s the relevant code in GitHub in CoffeeScript as it’s integrated into the game engine, which is of course totally open source. It’s short, about fifty lines with comments and debugging statements. Not bad, right?

Wicked Problem Looking for Crusade

Back to our developer challenge level idea, having found the right Problem and knowing roughly how difficult an Algorithm would be. I spent a day or so creating the level, adding support for visualizing the rectangles and decorating the place with glittering coins, torches, and a few cozy dungeon shackles, and started writing the blog post.

We can get maybe a dozen players, we thought, and then we can reach out to employers and try to make a placement or two. A dozen. That’s what we were expecting when we posted our Gridmancer post three weeks ago. Heh. You’d think we’d have learned our lesson about what happens when CodeCombat gets posted to Hacker News [1][2][3][4][5] and reddit [1][2][3][4][5][6][7][8][9][10][11].

27,000 views, 3300 unique solution attempts, and 200 email submissions later, we’re still shoveling our inboxes each morning and trying to perform an O(m*n) employer-candidate matching algorithm. There were so many different solution techniques that I just had to write a blog post digging into them all and trying them on other map configurations.

Other Algorithms Absolutely Murder Wicked Problem

We categorized submissions into common techniques, for each picking a representative solution and measuring its total rectangle score and, as a proxy for runtime complexity and efficiency, their statement count. We also created a new fork of Gridmancer that uses the Eppstein Polygon to see who could still get the optimal solution in a harder test case. Read-only links to each example are included, so you can check out the code in action. (We may have missed some interesting variants; there were a lot to go through.)

My Original Algorithm

Gridmancer: 33 rectangles in 82,793 statements

Eppstein: 17 rectangles in 155,878 statements

Left to right, bottom to top, greedily grow rectangles as large as possible. We saw dozens of submissions that took the same approach.

For example, this one by Tom Steinbrecher:

Gridmancer: 33 rectangles in 131,740 statements

Eppstein: 17 rectangles in 212,469 statements

Expander Algorithm, example by Steffen

Gridmancer: 30 rectangles in 12,423 statements

Eppstein: 17 rectangles in 21,505 statements

Bottom to top, left to right, this is the inverse of my algorithm. It is also smarter about calculating whether to expand up or down for a given rectangle.

Balloon Algorithm, example by Alec Larsen

Gridmancer: 30 rectangles in 179,998 statements

Eppstein: 18 rectangles in 1,072,641 statements

Bottom to top, left to right, greedily grow rectangles as large as possible. Very similar to mine except in the order it scans through the grid, which happens to be better for Gridmancer but not Eppstein. It grows the rectangles slightly differently, too.

Slurper Algorithm, example by Anh Nguyen

Gridmancer: 29 rectangles in 138,091 statements

Eppstein: 17 rectangles in 998,487 statements

Greedy bottom to top, left to right, but combines rectangles as it goes along.

Recursive Algorithm, example by prs82

Gridmancer: 29 rectangles in 12,732 statements

Eppstein: 17 rectangles in 24,441 statements

Not-entirely greedy left to right, bottom to top, using some recursive elements, at each point expanding first to the right, then to the top. So fast, and so concise!

Naive Optimizer Algorithm, by Keith Tom

Gridmancer: 29 rectangles in 698,392 statements

Eppstein: 17 rectangles in 257,232 statements

This one is pretty unique. It first does a naive pass, then uses a few heuristics to merge any rectangles it can find as merge-worthy. Nice.

Joke Algorithm, example by Sergiu Petrila

Gridmancer: 29 rectangles in 33 statements

Eppstein: Nope!

A few players just hard-coded the coordinates of all the rectangles. Ha, ha. Very fast!

Top-Down Expander Algorithm, example by Raphael Seydalinov

Gridmancer: 32 rectangles in 9,980 statements

Eppstein: 17 rectangles in 17,877 statements

This one is similar to others but goes top to bottom, left to right. Quite fast.

The Fast Splicer Algorithm, example by Allan Chavez

Gridmancer: 30 rectangles, 8,825 statements

Eppstein: 17 rectangles, 15,292 statements

Out of the solutions we looked at that approach 29 rectangles, this one has the lowest statement count for both metrics.

The Fast Sweeper Algorithm, by Mark Maxham

Gridmancer: 29 rectangles in 11,193 statements

Eppstein: 16 rectangles in 25,366 statements

This one is quite impressive due to its brevity, speed, and simplicity. It’s not too far a departure from many other approaches; it sweeps out rectangles from first the y-axis, then on the x-axis.

Marius Popa also implemented a similar, less efficient implementation.

Gridmancer: 29 rectangles, 116,580 statements

Eppstein: 16 rectangles, 129,431 statements.

There were also a couple hardcore programmers who, because of bugs in our editor, decided to implement their own Gridmancer challenge visualizations!

Carl Roett’s advanced, fast, optimal solution:

Arn-O implemented one in Python:

Richard Vasquez’s full geometrically optimal solution:

Ron Ilan’s “Rectangles” art project:

So not only did most of the submissions beat mine in terms of Gridmancer score, Eppstein score, runtime complexity, or simplicity, but many beat them on all counts! Nice job, swarm of internet programmers. My once-proud Algorithm has been humbled to the max.

Crowning a King

Until we get to translating the algorithms of Richard Vasquez or Carl Roett to see if their David-Eppstein-paper-inspired algorithms can come up with faster or more optimal solutions in some other test cases, it looks like Mark Maxham’s Fast Sweeper algorithm is the winner, finding optimal solutions to both Gridmancer and the Eppstein polygon. It’s also almost the fastest and shortest algorithm amongst them all. I can’t wait to implement this in CoffeeScript and get it into the CodeCombat repository!

Triumphant Programmers Looking for Jobs

These are not the student programmers from last month. Some have 30 years of programming experience; others are just starting out and haven’t been discovered yet. With so many players trouncing my solution and interested in developer opportunities, high interest from the employers we’ve talked to so far, and some candidates already passing their second technical interviews today, things are looking good for CodeCombat’s recruitment strategy. If you are an employer, or if you know any employers looking for talented developers, then get in touch and let us know who you’re looking for.

Because I just posted all the major solutions, the Gridmancer recruiting challenge offer is now closed, but stay tuned for much more powerful (and fun) challenge levels to demonstrate your programming powers.

StarCraft for Coding

You didn’t read all that, did you? You did? No? It was terrifically geeky, I admit. But you won’t have to be into algorithms to enjoy what we’re cooking up next: StarCraft for coding. The next level will let you pit your armies against all challengers in strategic combat. And if you played Gridmancer and are wondering if we’ve improved the debugging experience–yes. Try out the prototype of our new time-travel debugger!

Nick Winter

Nick Winter

San Francisco