The Fastest Growing Open Source Project

After three months, CodeCombat has graduated from Y Combinator. The journey that began unexpectedly at Startup School went better than any of us had hoped.

We open-sourced everything at the start of YC. Over the course of three months, 120+ Archmages made over 2000 commits to the codebase, ranging from small tweaks to refactoring the entire server to adding a new programming language to our transpiler.

Had any other open source project grown as fast?

To find the answer, we analyzed the GitHub Archive's public timeline, which contains 185,000,000 GitHub events after February of 2011.

Fastest Growing Open Source Repositories

(We merged in a few pull requests yesterday to break the tie; sorry Jeff, we really love Discourse)

We found that the fastest growing repositories were:

  1. CodeCombat
  2. Discourse
  3. Laravel
  4. Pullup
  5. Popcorn Time

We defined a repository's growth as the number of unique contributors added in the first 86 days after the repository was created or launched, where a contributor is the creator of the head branch of a merged pull request. (At the time of writing, it has been 86 days since CodeCombat's open-source launch.) We assumed that, because of the rise of GitHub, the fastest growing open-source repositories were created after February 2011, the earliest date for which we have data. Given the growth of the open-source movement, this assumption is likely correct, but we don't have the data to make a more definitive claim.
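To make the metric concrete, here is a minimal Python sketch of how it could be computed for one repository. This is not the code we ran; the function name, the tuple layout, and the sample user IDs and dates are all made up for illustration.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 86  # window length from the post


def unique_contributors_in_window(repo_created_at, merged_prs, window_days=WINDOW_DAYS):
    """Count distinct head-branch creators among pull requests merged in the window.

    merged_prs is an iterable of (head_user_id, merged_at) tuples.
    That shape is an assumption made for this sketch.
    """
    cutoff = repo_created_at + timedelta(days=window_days)
    return len({
        user_id
        for user_id, merged_at in merged_prs
        if repo_created_at <= merged_at <= cutoff
    })


# Invented sample data: three contributors, one of them outside the window.
created = datetime(2014, 1, 1)
sample = [
    (101, datetime(2014, 1, 5)),
    (102, datetime(2014, 2, 1)),
    (101, datetime(2014, 2, 20)),  # repeat contributor, counted once
    (103, datetime(2014, 6, 1)),   # merged after the 86-day window
]
count = unique_contributors_in_window(created, sample)
print(count)
```

Using a set means a contributor with many merged pull requests still counts only once, which is the point of measuring contributors rather than commits.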

To analyze such a large dataset, we used Google BigQuery. First, we grouped and sorted repositories based on the number of merged pull request events. This step yields 9,455,755 unique repositories.

    SELECT repository_url, repository_created_at,
           COUNT(repository_url) AS number_of_events
    FROM [GitHub.timeline]
    WHERE repository_created_at > DATE(TIMESTAMP("2011-02-11 00:00:00"))
      AND type="PullRequestEvent"
      AND payload_pull_request_merged="true"
    GROUP EACH BY repository_url, repository_created_at
    ORDER BY number_of_events DESC

Our second step cut this list down by filtering out repositories with 100 or fewer merged pull request events. This leaves 2,063 repositories.

    SELECT *
    FROM [GitHub.pullrequest_filter]
    WHERE number_of_events > 100
    ORDER BY number_of_events DESC
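For readers who think better in code than in SQL, the two filtering steps above amount to roughly the following. This is a Python sketch over a list of event dictionaries, not something we ran against the full dataset; the dictionary keys mirror the column names in the queries, and the function name and sample events are hypothetical.

```python
from collections import Counter

MIN_EVENTS = 100  # threshold from the second query (number_of_events > 100)


def rank_repositories(events, min_events=MIN_EVENTS):
    """Count merged pull request events per repository and keep the busy ones.

    events is an iterable of dicts keyed by the timeline column names.
    Returns (repository_url, number_of_events) pairs, most events first.
    """
    counts = Counter(
        e["repository_url"]
        for e in events
        if e["type"] == "PullRequestEvent"
        and e["payload_pull_request_merged"] == "true"
    )
    return [(url, n) for url, n in counts.most_common() if n > min_events]


def _event(url, merged="true", kind="PullRequestEvent"):
    # Helper that builds one invented event row for the example.
    return {"repository_url": url, "type": kind, "payload_pull_request_merged": merged}


# Invented sample: repo/a has three merged pull requests, repo/b has one.
events = [_event("repo/a")] * 3 + [_event("repo/b"), _event("repo/b", merged="false")]
ranked = rank_repositories(events, min_events=2)
print(ranked)
```

The example lowers the threshold to 2 so the tiny sample has something to show; the real cutoff was 100.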

For each of the 2,063 repositories, we queried all of its merged pull request events and, for each event, looked at the ID of the contributor who created the head branch. From this data, we generated a time series of the number of unique contributors in each project.

    SELECT payload_pull_request_head_user_id, created_at
    FROM [GitHub.timeline]
    WHERE payload_action="closed"
      AND repository_url=[Some Repository URL]
      AND type="PullRequestEvent"
      AND payload_pull_request_merged_at IS NOT NULL
    ORDER BY created_at
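Turning the rows from that query into a time series is a short step. The sketch below shows one way to do it in Python, assuming each row is a (head user ID, event timestamp) pair; the function name and the sample rows are invented, and this is an illustration rather than the exact script we used.

```python
from datetime import datetime


def contributor_time_series(rows):
    """Build a cumulative count of unique contributors over time.

    rows is an iterable of (head_user_id, created_at) tuples, as the
    per-repository query would return them (assumed shape).
    Returns a list of (created_at, unique_contributors_so_far) points,
    one for each first-time contributor.
    """
    seen = set()
    series = []
    # Sort defensively in case the rows do not arrive in time order.
    for user_id, created_at in sorted(rows, key=lambda row: row[1]):
        if user_id not in seen:
            seen.add(user_id)
            series.append((created_at, len(seen)))
    return series


# Invented sample rows: user 7 contributes twice and is counted once.
rows = [
    (7, datetime(2014, 1, 3)),
    (8, datetime(2014, 1, 2)),
    (7, datetime(2014, 1, 10)),
    (9, datetime(2014, 1, 12)),
]
series = contributor_time_series(rows)
for when, total in series:
    print(when.date(), total)
```

Plotting each repository's series against days since its creation puts projects that launched at different times on the same axis.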

We then filtered out the non-software repositories and graphed the top five. As far as we know, this analysis has never been done before, so if you'd like to verify these statistics, please do so! We're very interested in seeing more big-data style open-source analytics.

Some more statistics

During the three months of the Y Combinator program:

  • Over 380,000 players from 209 of 211 countries spent a total of 6.6 million minutes on the site, with playtime growing 16% every week
  • Our blog was read 176,000 times
  • Diplomats submitted translations for 38 different languages
  • The CodeCombat team worked an average of over 60 hours a week
  • Our two servers handled peak traffic of 15,000 requests per minute while maintaining an average response time of 12ms

Multiplayer

During Y Combinator, we also launched our multiplayer mode for more experienced developers! In this new mode, players write code controlling a hero and their minions. Each solution battles against thousands of others and is ranked on the leaderboard.

So far, over 1500 players are fighting for the top of the Dungeon Arena leaderboard. Multiplayer has tripled returning visitor count and increased average session length by 64% compared to single player.

DeathScythe stands victorious after an arduous battle

We also added a spectate view so that we could replay some of the awesome matches, as well as watch battles between random sessions. For instance, here's an exciting match between two of the top players, DeathScythe and tedshot.

What's next for CodeCombat

We're more excited than ever about the future. For the next few weeks, we'll focus on creating new content, adding new features, and improving game performance and experience on a wide range of hardware configurations.

CodeCombat was also selected to participate in the Google Summer of Code, and the interest has been astounding; we received 70 applications! In April, we'll be selecting a few students who will spend their summer doing tasks such as creating new levels, implementing video chat, or adding more languages to the game. We're excited to work with them and see what they can achieve!

CodeCombat has come so far in three months; we can't wait to see what the next three bring. Join us!

Michael (evil_florist)
