Thursday, October 28, 2010

The elephant in the room: source control for IronPython

Currently, IronPython is hosted in a TFS repository on CodePlex (http://ironpython.codeplex.com/), which is a copy of MS's internal TFS repository. CodePlex also provides Subversion access, which makes it much more bearable, and it hosts our issue tracking and wiki pages, which probably won't change any time soon.

IronRuby's source code is hosted on github (http://github.com/ironruby/ironruby). It's also a copy of MS's internal TFS repository, but in git.

The interesting part is that IronRuby, IronPython, and the DLR are hosted in the *same* repository, since they evolved together. Thus, both the IronPython CodePlex repo and the IronRuby github repo are basically the same.
</history-lesson>

What this is going to look like in the future is an open question, as is the timeline. Originally, I wanted to focus on the 2.7 release and deal with the source control question later. However, it's been raised in a few places, so I think it's better to get some more feedback on whether we should switch source control methods (and if so, to what?) or just stay on TFS/SVN for the time being. Also up for consideration is whether you consider sharing a repo with IronRuby valuable, or whether IronPython should split out on its own.

We could, for example, drop the source control from CodePlex and just use the IronRuby github repo - it's already set up and we could start developing tomorrow (although it would probably be renamed 'ironlanguages' or something like that). It's also probably the only option if IronPython and IronRuby are to share a repo, since, as far as I know, the IronRuby guys have no plans to leave github. That makes sense for them - git is the de facto choice in the Ruby community.

In Python, however, it's not so clear-cut - Python itself will be moving to Mercurial soon, and there are plans afoot to eventually put the Python stdlib in a separate repo from Python itself, which will likely also be a Mercurial repository. Thus there are advantages (subrepos, in particular) to being on the same DVCS. On top of that, both Michael Foord and I strongly dislike git - I prefer Mercurial, and I imagine the coffee at Canonical will have Michael singing the praises of bzr fairly soon :). Finally, CodePlex supports Mercurial, and thus everything could remain there if we so wish.

However, converting the repo to Mercurial could be a difficult task - the fate of the 1.1, 2.0, and 2.6 branches would have to be decided (include them in the repo, or not? Their structure is radically different from the Main branch). There are folders that could very well be stripped (WiX, Ruby, and *3* CPython installations, not to mention IronRuby) to save space. With a DVCS, once they're in the history, everyone has to pay that cost in disk space, forever, even if we later remove them. The fate of the DLR would need to be decided - do we keep a local copy, pull from IronRuby's copy, or make it a third repo altogether?

My preference is to stick with TFS/SVN for the time being, get 2.7 out the door (manually syncing up the DLR sources with IronRuby in the meantime), and then look at converting to Mercurial. My second choice would be to work out of IronRuby's git repository, get 2.7 released, and then look at converting to Mercurial. Anything that doesn't eventually involve Mercurial is a lot further down my list :).

I would like to see the DLR become a separate project, of which IronRuby and IronPython are just clients, along with IronJS, Clojure-CLR, and any others. I don't think the DLR will change too drastically, but the MS guys who are more familiar with it might have other plans, and Tomas has said he would prefer to keep them together for ease of testing.

While the coordinators have discussed this already, I think we need more feedback to get an idea of what we should do, so please share your thoughts. This has a direct bearing on how you will be contributing to IronPython.