Tuesday, November 23, 2010

Developing IronPython with Mercurial

While the informal poll of the IronPython community favoured Mercurial, Miguel de Icaza had some good arguments for using github to develop IronPython and IronRuby (and I figured it was best to heed the advice of someone who's been running OSS projects since I was in elementary school). In particular, there was already an up-to-date git repo available.

However! All is not lost, thanks to the wonderful hg-git plugin. With it, you can push and pull from a git repository, but still use the Mercurial commands you know and love. To save everyone the hassle, though, I've set up a read-only mirror on bitbucket that is synchronized with the github repo. You can find it at http://bitbucket.org/ironpython/ironlanguages.

I assume you have some familiarity with Mercurial. If not, take a look at Hg Init and then give the Mercurial book a quick read to get a handle on the basic concepts.

In this post I'll show how to work with the IronPython repository on bitbucket. In a later post, I'll cover how to work directly with the github repository as well as a more advanced use of bitbucket.

Initial Setup

Before doing anything else, you'll need the latest version of Mercurial. On Windows, I strongly recommend TortoiseHG because it already includes almost everything you'll need.

Now, make sure the following sections are in your Mercurial.ini file:

[extensions]
mq=
rebase=
bookmarks=

[diff]
git=1

While none of these are strictly required, the bookmarks extension is needed if you want to work with git directly, and rebase and mq are just really nice to have. You'll want them later anyway, so you might as well turn them all on now. Git diffs are more useful than the unified diffs Mercurial generates by default, so turn those on as well.
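If you want to double-check an existing Mercurial.ini against the settings above, here is a minimal sketch in Python. It assumes the file is simple enough for the standard configparser module to read (Mercurial's own parser accepts a few things, such as %include lines, that configparser does not), and the names WANTED and missing_settings are purely illustrative.

```python
import configparser

# Settings the post asks for, as {section: {option: expected value or None}}.
# None means "any value is fine", which is how extensions are enabled.
WANTED = {
    "extensions": {"mq": None, "rebase": None, "bookmarks": None},
    "diff": {"git": "1"},
}


def missing_settings(ini_text):
    """Return a sorted list of 'section.option' names that are absent or wrong."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    missing = []
    for section, options in WANTED.items():
        for option, expected in options.items():
            if not parser.has_option(section, option):
                missing.append(f"{section}.{option}")
            elif expected is not None and parser.get(section, option) != expected:
                missing.append(f"{section}.{option}")
    return sorted(missing)


# A sample file that forgot to enable the bookmarks extension.
sample = """
[extensions]
mq=
rebase=

[diff]
git=1
"""
print(missing_settings(sample))
```

To check a real file, read its contents and pass the text to missing_settings; an empty list means everything above is already turned on.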

Working With a Bitbucket Fork

To use bitbucket forks, you'll need a bitbucket account. If you don't already have one, go to bitbucket.org and create an account. For all of the details on creating a fork, I'll just point at Bitbucket's own documentation, which is very good. In short, go to http://bitbucket.org/ironpython/ironlanguages, click on "fork", and give the fork a descriptive name and create it.

Now, create a clone of your fork:

> hg clone https://yourname@bitbucket.org/yourname/yourfork

With your own fork you can push to it so that others can see your changes. However, pulling will (by default) pull from your fork, which is not automatically kept in sync with the original repository. To get any changes from the original repository, you'll need to pull from it directly:

> hg pull --update http://bitbucket.org/ironpython/ironlanguages

To save a bit of typing, add an alias to the [paths] section of the .hg\hgrc file of your clone:

[paths]
ironlanguages=http://bitbucket.org/ironpython/ironlanguages

And then you can do a pull from the alias instead of the URL:

> hg pull --update ironlanguages
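If you would rather script the alias than edit the file by hand, the following is a rough sketch using Python's configparser. The function name add_path_alias is hypothetical, and one caveat applies: configparser drops comments when it writes a file back out, so for a hand-annotated hgrc a text editor is still the safer choice.

```python
import configparser
import io


def add_path_alias(hgrc_text, name, url):
    """Return hgrc_text with name=url added to (or updated in) the [paths] section."""
    # Only '=' is treated as a delimiter so that the ':' in URLs is left alone.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(hgrc_text)
    if not parser.has_section("paths"):
        parser.add_section("paths")
    parser.set("paths", name, url)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# What a fresh clone of a fork typically contains: just the default path.
existing = "[paths]\ndefault=https://yourname@bitbucket.org/yourname/yourfork\n"
print(add_path_alias(existing, "ironlanguages",
                     "http://bitbucket.org/ironpython/ironlanguages"))
```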

You'll probably need to do a merge at this point. If you haven't pushed any changes publicly, you could use hg rebase as well, but in this case I highly recommend using hg merge because it's safer. You should merge from upstream only as often as you need to, as too many merge commits can clutter the history. However, always merge in the upstream changes just before you send a pull request. It will make my life much easier.

Now you can use Mercurial as you normally would to track whatever changes you are making.

Once your changes are done, you'll need to send a pull request. First, if you haven't already, commit and push your changes to your fork:

> hg commit
> hg push

Then, go to the original repository (http://bitbucket.org/ironpython/ironlanguages), click "pull request", and enter a quick explanatory message. One of the IronPython coordinators (most likely me) will pull the request into our local repo, review it, and then push it to the main repo on github; this will trigger a synchronization, which will then pull your changes into the bitbucket mirror. Phew.

Working Directly With the Bitbucket Mirror

While using your own fork is highly recommended (because pull requests make integrating the changes very easy), you may want to work with the mirror repository directly. In particular, you don't need a bitbucket account to work with the mirror directly. However, the mirror is read-only, so if you choose to work this way, you'll need to submit patches (described below). For minor changes -- basically, a single commit -- this is fine, but for larger changes I strongly recommend using a fork or MQ, or it will make creating the patch extremely difficult.

First, clone the mirror repository:

> hg clone http://bitbucket.org/ironpython/ironlanguages

You can now use hg as usual to track your changes. I recommend using rebase to stay up to date, to keep the history free of merges:

> hg pull --rebase

Rebase Warning

For the most part, the rebase extension is safe to use, with one exception: do not use rebase on a repo that you've made public unless you want to break every other clone of that repo. The only time I recommend using it is when you only have a local clone, because it will make the resulting patches cleaner than repeated merges.

If you want rebase-like functionality that is safe to push publicly, you should use MQ.

When your changes are complete, you'll need to create a patch and submit it.

Submitting a Patch

First, either find the corresponding issue that your patch is resolving or create a new one. Next, do an hg pull --rebase to make sure that you're up-to-date with the latest changes, which will make the patch easier to apply.

Now you need to determine which revisions are needed for the patch using hg log. Ideally, this is only the most recent revision (tip), or the last few revisions. Once you know what revisions comprise your changes, you can use hg export to create a patch (which can then be applied using hg import – funny, that).

If your fix is only the most recent commit, it's fairly easy:

> hg export tip --output 12345-fix-broken-foo.patch

When naming the patch file, please include the issue number and a very brief description in the name of the file.

If the patch has more than one commit, you'll need to specify the revisions to export. In this example, we take all of the commits from revision 123 to the tip:

> hg export 123: --output 12346-fix-broken-bar.patch
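The naming convention (issue number plus a very brief description) is easy to automate. Here is a small Python sketch that builds the file name and the matching hg export argument list; the helper names patch_filename and export_command are made up for illustration, and the script only constructs the command rather than running it.

```python
import re


def patch_filename(issue_number, description):
    """Build '<issue>-<short-description>.patch' from an issue number and a title."""
    # Lowercase the description and collapse anything non-alphanumeric into '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{issue_number}-{slug}.patch"


def export_command(revisions, filename):
    """Return the hg export argument list for the given revisions and output file."""
    return ["hg", "export", revisions, "--output", filename]


print(patch_filename(12345, "Fix broken foo"))
print(" ".join(export_command("123:", patch_filename(12346, "Fix broken bar"))))
```

The resulting list can be handed to subprocess.run if you want the script to invoke hg itself.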

If there are other revisions (such as merge commits) intermixed with work commits, it is still possible to produce a clean patch, but you'll need to use the full revision set query language to do it. In that case you're probably better off creating a fork, merging it in, pushing to it, and sending a pull request.

Once the patch is created, attach it to the appropriate issue and someone should pick it up. If we miss it (sorry!), just send an email to the mailing list to remind us.

Sunday, October 31, 2010

Combining stdout, stderr, and pipes on Windows

This post is as much for my own reference as anyone else's. I need to capture the output of a program that outputs to both stdout and stderr in a file, but I also want it to display on the console so I can track its progress. The trick, of course, is to use tee (in this case, from UnxUtils). The only issue is how to combine the streams, which cmd.exe is perfectly capable of doing, if you get the incantation correct:

runtests.cmd 2>&1 | tee django-tests-201010301352.log

What "2>&1" does is redirect stderr (2) into stdout (1), which is then piped into tee, which will write its input to the specified file as well as the console. With this, I can monitor the Django test runs while still keeping a log to review later.
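For cases where tee isn't available, roughly the same effect can be had from Python. This is only a sketch, not what I actually use: run_and_tee is a hypothetical helper that merges stderr into stdout (the equivalent of 2>&1) and copies every line to both the console and a log file.

```python
import subprocess
import sys


def run_and_tee(command, log_path):
    """Run command, echoing its combined stdout/stderr to the console and log_path.

    Returns the exit code of the command.
    """
    with open(log_path, "w", encoding="utf-8") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same idea as 2>&1
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)  # console copy
            log.write(line)         # file copy
        return process.wait()
```

Something like run_and_tee(["runtests.cmd"], "django-tests.log") would be the intended use. Because the two streams share one pipe, the exact interleaving of stdout and stderr lines depends on how the child process buffers its output.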

Friday, October 29, 2010

Running the Django Test Suite on IronPython

I know, I know, I've written this post before. However, it's a lot easier now than it was back then, and as a bonus it actually runs to completion! This is, by far, the best way to gauge the status of Django on IronPython. Once the test suite passes (except for stuff that cannot be supported, like GeoDjango), then this project will be basically complete.

There is one major hitch: doctest. Doctest is an interesting Python library that tests code by comparing it to expected output (by converting the result to a string). The problem is that certain constructs (especially dictionaries) will output differently in different implementations of Python. Thus, because Django relies on doctest, many tests would fail even though they were actually correct, just because the output was different. Thankfully, the Django project is working to get rid of doctest (I smile every time I see one of Alex Gaynor’s “we have always been at war with doctest” commits go in).
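To make the problem concrete, here is a small illustrative doctest (not taken from Django). Because doctest compares printed text, an example that prints a dict directly depends on how the implementation orders its keys; sorting the items first makes the expected output stable. The function word_counts and the helper run_docstring are hypothetical names for this sketch.

```python
import doctest


def word_counts(text):
    """Count how often each word appears in text.

    The example sorts the items, so the expected output does not depend on
    the order in which a particular Python implementation prints a dict.

    >>> sorted(word_counts("spam eggs spam").items())
    [('eggs', 1), ('spam', 2)]
    """
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def run_docstring(func):
    """Run the doctest examples in func's docstring; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


print(run_docstring(word_counts))
```

Had the example been written as a bare word_counts("spam eggs spam") call, its expected output would have baked in one implementation's key order.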

I covered the setup of the environment in my post on running the Django tutorial on IronPython, so this may be familiar.

First, you'll need to install IronPython. Using the latest (IronPython 2.7 Beta 1 as of this writing) is recommended. The MSI installer is best, but you can use the zip as well. You'll just need to remember where it's unpacked, as you'll need that path later.

Next, you'll need to get Django/IronPython. Because Django requires some IronPython-specific patches, you'll need to download the entire source tree. To do this, you'll need Mercurial (I recommend TortoiseHG). Create an empty directory to work in, as well. I find Mercurial easier to work with from the command line, even with TortoiseHG installed, but you can do all of this from TortoiseHG as well. You’ll also need to enable MQ and, if you’re new to Mercurial, check out this quick tour of bitbucket and this great Mercurial tutorial.

To get the Django/IronPython sources, from your empty directory:
> hg qclone http://bitbucket.org/jdhardy/django-ipy-patches django-ironpython
> pushd django-ironpython && hg qpush --all && popd
> hg clone http://bitbucket.org/jdhardy/django-ironpython-tests
> cd django-ironpython-tests

The first line pulls the sources from bitbucket into the 'django-ironpython' folder, the second line applies all of the Django/IronPython patches, and – here’s where it’s different from the tutorial! – the next-to-last line clones the django-ironpython-tests repository, which contains a bunch of helpers for running the Django tests. The last line just switches to the django-ironpython-tests folder.

Now you’ll need to open testenv.cmd in a text editor and possibly make some changes. In particular, change _ipy_root to point to your IronPython installation if you didn’t use the MSI installer.
Now you can run the runtests.cmd file, which will run the entire Django test suite. There will be errors and failures, but the results are promising, with about 66% of the tests passing:

Ran 2595 tests in 3022.324s

FAILED (failures=405, errors=468, skipped=16, expected failures=1)
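For tracking progress between runs, the pass rate can be computed straight from that summary. The sketch below is one way to do it; it assumes "passing" means everything that is not a failure, an error, a skip, or an expected failure, and the function name summarize is just for illustration.

```python
import re


def summarize(report):
    """Return (passed, pass_rate_percent) from a unittest-style summary."""
    total = int(re.search(r"Ran (\d+) tests", report).group(1))
    # Collect every 'name=count' pair from the FAILED (...) line.
    counts = {
        name.strip(): int(value)
        for name, value in re.findall(r"([a-z ]+)=(\d+)", report)
    }
    passed = total - sum(counts.values())
    return passed, 100.0 * passed / total


report = """Ran 2595 tests in 3022.324s

FAILED (failures=405, errors=468, skipped=16, expected failures=1)"""

passed, rate = summarize(report)
print(f"{passed} passed ({rate:.1f}%)")  # prints "1705 passed (65.7%)"
```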

Thursday, October 28, 2010

The elephant in the room: source control for IronPython

Currently, IronPython is hosted in a TFS repository on CodePlex (http://ironpython.codeplex.com/), which was a copy of MS's internal TFS repository. CodePlex also provides Subversion access, which makes it much more bearable. CodePlex also hosts our issue tracking and wiki pages, which probably won't change any time soon.

IronRuby's source code is hosted on github (http://github.com/ironruby/ironruby). It's also a copy of MS's internal TFS repository, but in git.

The interesting part is that IronRuby, IronPython, and the DLR are hosted in the *same* repository, since they evolved together. Thus, both the IronPython CodePlex repo and the IronRuby github repo are basically the same.
</history-lesson>

What this is going to look like in the future is an open question, as is the timeline. Originally, I wanted to focus on the 2.7 release and deal with the source control question later. However, it's been raised in a few places, so I think it's better to get some more feedback on whether we should switch source control methods (and if so, to what?) or just stay on TFS/SVN for the time being. Also up for consideration is whether you consider being part of the same repo as IronRuby is valuable, or whether IronPython should split out on its own.

We could, for example, drop the source control from CodePlex and just use the IronRuby github repo - it's already set up and we could start developing tomorrow (although it would probably be renamed 'ironlanguages' or something like that). It's also probably the only option if IronPython and IronRuby are to share a repo, as, so far as I know, the IronRuby guys have no plans on leaving github, which makes sense for them - git is the de facto choice in the Ruby community.

In Python, however, it's not so clear-cut - Python itself will be moving to Mercurial soon, and there are plans afoot to eventually put the Python stdlib in a separate repo from Python itself, which will likely also be a Mercurial repository. Thus there are advantages (subrepos, in particular) to being on the same DVCS. On top of that, both Michael Foord and I strongly dislike git - I prefer Mercurial, and I imagine the coffee at Canonical will have Michael singing the praises of bzr fairly soon :). Finally, CodePlex supports Mercurial, and thus everything could remain there if we so wish.

However, converting the repo to Mercurial could be a difficult task - the fate of the 1.1, 2.0, and 2.6 branches would have to be decided (include them in the repo, or not? Their structure is radically different from the Main branch). There are folders that could very well be stripped (WiX, Ruby, and *3* CPython installations, not to mention IronRuby) to save space, and with a DVCS once they're in the history everyone has to pay that cost in disk space, forever, even if we later remove them. The fate of the DLR would need to be decided - do we keep a local copy, pull from IronRuby's copy, or make it a third repo altogether?

My preference is to stick with TFS/SVN for the time being, get 2.7 out the door (manually syncing up the DLR sources with IronRuby in the meantime), and then look at converting to Mercurial. My second choice would be to work out of IronRuby's git repository, get 2.7 released, and then look at converting to Mercurial. Anything that doesn't eventually involve Mercurial is a lot further down my list :).

I would like to see the DLR become a separate project, of which IronRuby and IronPython are just clients, along with IronJS, Clojure-CLR, and any others. I don't think the DLR will change too drastically, but the MS guys who are more familiar might have other plans, and Tomas has said he would prefer them to be together for ease of testing.

While the coordinators have discussed this already, I think we need more feedback to get an idea of what we should do, so please share your thoughts. This has a direct bearing on how you will be contributing to IronPython.

The Road to 2.7 (Or: The Future of IronPython)

There have been a few people asking what they can contribute to IronPython. Right now, we need to identify what's not working in 2.7B1 that needs to be in 2.7 final. The best thing to do would be to identify any issues that are causing you pain and bring them up on the list. Then we can decide what meets the bar for a 2.7 release (with issues that have patches getting priority, of course!). Dino, are there any issues that you know are in 2.7B1 that must be fixed for 2.7 final?

Dino has provided some instructions on contributing to IronPython. We need people to run through them and report anything they don't cover (I do intend to add subversion instructions directly at some point), and to run the test suite as well. It would also help to do all of that on Mono, to see what work needs to be done there.

Besides knowing what needs to be done, we need a rough timeline. I would like to see a release before the end of the year, or at least a solid release candidate with a possible release early next year. The idea behind an aggressive schedule is to focus on getting the features we have solid and not worry about adding new features or libraries (with the possible exception of zlib). That said, this is all just my desires, and I really want to get an idea of what everyone else is thinking. Please send feedback to the IronPython list on what you want to see.

Sunday, October 24, 2010

Running the Django Tutorial on IronPython

The Django tutorial is fantastic. It's a great way to get a feel for Django, and a great test for how it will work on IronPython. Getting it working will give you a pretty good idea of how to set up Django/IronPython for most things.
First, you'll need to install IronPython. Using the latest (IronPython 2.7 Beta 1 as of this writing) is recommended. The MSI installer is best, but you can use the zip as well. You'll just need to remember where it's unpacked, as you'll need that path later.
Next, you'll need to get Django/IronPython. Because Django requires some IronPython-specific patches, you'll need to download the entire source tree. To do this, you'll need Mercurial (I recommend TortoiseHG). Create an empty directory to work in, as well. I find Mercurial easier to work with from the command line, even with TortoiseHG installed, but you can do all of this from TortoiseHG as well. You’ll also need to enable MQ and, if you’re new to Mercurial, check out this quick tour of bitbucket and this great Mercurial tutorial.
To get the Django/IronPython sources, from your empty directory:
> hg qclone http://bitbucket.org/jdhardy/django-ipy-patches django-ironpython
> pushd django-ironpython && hg qpush --all && popd
> md django-tutorial && cd django-tutorial
The first line pulls the sources from bitbucket into the 'django-ironpython' folder, the second line applies all of the Django/IronPython patches, and the last line creates a directory ('django-tutorial') to hold the tutorial files, and switches into it. If you're not already using the command line, open a command window and switch to the 'django-tutorial' directory.
Now we need to set up some environment variables, which are a simple way to configure some settings for IronPython and Django. I usually copy these into a file called 'tutenv.cmd', which is easier than typing them each time (the variables are lost when the command window is closed):
@echo off

rem Get the folder this file is in
set _root=%~dp0

rem Add the IronPython installation paths to PATH; change these if you used
rem the .zip version of IronPython
set PATH=C:\Program Files\IronPython 2.7;C:\Program Files (x86)\IronPython 2.7;%PATH%

rem Add the django path and current paths to IronPython's search list
set IRONPYTHONPATH=%_root%..\django-ironpython;%_root%;%_root%deps

rem Tell Django where its settings are
set DJANGO_SETTINGS_MODULE=mysite.settings
If you created a file, run it to set up the tutorial environment. Now, do a quick sanity check:
> ipy
IronPython 2.7 Beta 1 (2.7.0.10) on .NET 4.0.30319.1
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> ^Z
If that doesn't work, you need to make sure that the PATH variable includes your IronPython installation, and that IRONPYTHONPATH includes the path to the 'django-ironpython' folder created back at the beginning of the post.
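If the import fails and it isn't obvious why, a short script can tell you which entry of IRONPYTHONPATH (if any) actually contains a django package. This is a sketch in plain Python; find_django is a hypothetical helper, and it only checks for a 'django' directory rather than trying to import anything.

```python
import os


def find_django(search_path, separator=os.pathsep):
    """Return the first entry of search_path that contains a 'django' folder."""
    for entry in search_path.split(separator):
        if entry and os.path.isdir(os.path.join(entry, "django")):
            return entry
    return None


# Prints the matching entry, or None if no entry contains django.
print(find_django(os.environ.get("IRONPYTHONPATH", "")))
```

A result of None means none of the listed folders contains django, which usually points at a wrong relative path in the environment setup.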
Now we need to add in some dependencies (namely, sqlite and zlib support). These are in another bitbucket repository, django-ipy-tutorial-deps:
> hg clone http://bitbucket.org/jdhardy/django-ipy-tutorial-deps deps
Now you can start the tutorial. The first step, creating the project, requires using django-admin.py; you can run it with IronPython like so:
> ipy ..\django-ironpython\django\bin\django-admin.py startproject mysite
Besides that, you can follow the tutorial almost exactly as written – just remember to substitute 'ipy' for 'python'! If you have any issues, please file them in the issue tracker.

Friday, October 22, 2010

Contributing to Django/IronPython

Since the new Django/IronPython repository (django-ipy-patches) is based on MQ, I thought I'd give a quick intro into what MQ is and how to work with it on bitbucket. First off, MQ is short for "Mercurial Queues". The queue, in this case, is a queue of patches to be applied to a repo that can be versioned separately from the repo itself. These patches live independent of the history and can be easily added and removed as changes are made. You can find a lot more detailed information in the Mercurial book.

Using Bitbucket

The initial setup is a bit weird, but it works quite well after that. First off, get Mercurial – I prefer TortoiseHG, myself. Next, get a bitbucket account. Finally, go to the django-ipy-patches page and click the "fork" button. You can name your fork whatever you want. Once it's created, make a note of the URL in the "hg clone" line. For this example, it's https://bitbucket.org/jdhardy/django-ipy-patches-test.

Next, clone the django-trunk repository, but name it after your fork, and then switch to that directory:

> hg clone http://bitbucket.org/jdhardy/django-ipy-patches-test
> cd django-ipy-patches-test

Now, clone the fork you created earlier and pull in all of the patches. Note that we're pulling it into a specific directory (.hg\patches):

> hg clone http://bitbucket.org/jdhardy/django-ipy-patches-test .hg\patches
> hg qpush --all

You can now use MQ to manage patches as usual.

> hg qnew fix-1 -m "Fix issue #1"
> hg ci --mq
> hg push --mq

Now you can go back to the bitbucket page for your fork, click "pull request", and send me a request to pull your changes back into django-ipy-patches. Please do a pull from django-ipy-patches first to sync things up as much as possible.

Without Bitbucket

If you don't want to use bitbucket, you can get the django-ipy-patches queue directly:

> hg qclone http://bitbucket.org/jdhardy/django-ipy-patches

Make your changes using MQ as usual, but instead of doing a push, export the changes instead:

> hg log --mq
> hg export --mq -r ...

The trick here is using hg log to figure out which changes you need and then exporting them. You should then post the resulting patch to the issue tracker, and I'll apply it as soon as I can.

MQ Shortcut

You may have noticed that most of the normal hg commands take an --mq parameter that causes them to operate on the MQ repository instead of the actual repository. I use a simple batch file as a shortcut so that I can use `mq push` instead of `hg push --mq`:

@hg %1 --mq %2 %3 %4 %5 %6 %7 %8 %9

Put that line in a file named "mq.cmd" somewhere in your PATH and you can save yourself a little bit of typing.
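For anyone not on Windows, the same argument shuffling can be sketched in Python: the first argument is the hg command, --mq goes right after it, and everything else follows unchanged. The function name mq_command is made up for this example, and it only builds the argument list rather than invoking hg.

```python
def mq_command(args):
    """Turn ['push', ...] into ['hg', 'push', '--mq', ...], like mq.cmd does."""
    if not args:
        raise ValueError("expected an hg command, such as 'push'")
    return ["hg", args[0], "--mq"] + list(args[1:])


print(mq_command(["push"]))
print(mq_command(["ci", "-m", "Fix issue #1"]))
```

Passing the result to subprocess.call would complete the wrapper.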

Thursday, October 21, 2010

The End of the Beginning

Well, the shoe has finally dropped: IronPython and IronRuby have been axed. Jimmy Schementi spilled the beans back in August, and despite my thoughts to the contrary, the fate of the DLR team was already sealed. As best I can gather, the team was officially kaput on September 1st, or thereabouts.  I found out in late September, and since then the former DLR team has been working to make the handoff as smooth as possible.

As of today, myself, Jimmy Schementi, Michael Foord, and Miguel de Icaza have become the coordinators of the IronPython, IronRuby, and DLR projects. Bill Chiles and Dino Viehland, who worked on IronPython, will continue to work in an unofficial (i.e. spare time) role as well. Some of the old IronRuby team will probably continue to run the show over there as well. But officially, Microsoft is completely hands-off.

Why?
Why end one of the few teams that was actually doing something new and different and interesting at Microsoft? The official word is that it's because of "resource constraints" and "a shift in priorities". Now, I realize that even Microsoft has to choose where to spend their dollars, but I have a hard time believing that the half-dozen staff on the DLR team were that big a deal compared to the 200+ working on WPF and SL, or the billion-dollar KIN debacle. Maybe that's why I only make the small-to-medium dollars instead of the big bucks.

It's also a bit mystifying that Microsoft would do this after promoting the dynamic keyword so heavily in .NET 4.0. That part of the DLR (the "inner ring") isn't going anywhere – it's in the .NET framework, which means it will be supported more or less forever. The hosting APIs, or the "outer ring", are now in the hands of the community. Hopefully the people working on projects like IronScheme, IronJS, and Clojure-CLR will contribute what they want to see out of the DLR, although getting changes into the "inner ring" is likely to be impossible.

The Future
I know there are companies and software using IronPython and (I believe) IronRuby in production: Resolver One is completely built around IronPython; it's built into Umbraco; Jimmy has said that it's in use at Lab49. I hope that these companies will step up by offering some of their employees' time to help a project that they use. Other than that, there are at least a few Python programmers like myself who have to work in .NET, and hopefully they will also help out. The future of IronPython and IronRuby is entirely in the hands of those who use it, which is a new experience for those used to Microsoft calling all the shots.

So what's next? As a group, we're still hashing that out. Now that the cat is out of the bag, we're going to involve the community as well. My goal is to get a production-ready IronPython 2.7 released before the end of the year. To do that, we'll need continuous integration infrastructure. After that, all sorts of decisions will need to be made, from boring stuff like managing infrastructure, to setting a roadmap for 3.2, to deciding what cool .NET stuff we want to include (such as LINQ support or static compilation).

I want to see Django running on IronPython. I want to see the DLR be the system for embedding scripting into .NET applications, supporting a multitude of languages. I want the Iron* languages to become so popular that Microsoft regrets ever cutting them off. But I and other coordinators can't do it alone. We need help. We need people to contribute code, libraries, documentation - anything. From this point on, IronPython and IronRuby will live or die by their communities.

Come join the mailing lists (IronPython, IronRuby) and help us decide where we are going to go from here. This isn't the beginning of the end. Far from it.

Wednesday, October 20, 2010

Restarting Django/IronPython

Once again, I'm trying to work on Django/IronPython. This time around, I've finally found a workflow I'm happy with (for now) using MQ. Using MQ is a bit tricky to understand, but it has the distinct advantage that Django/IronPython patches are distinct from normal development and thus much easier to submit back to the Django project – which I also plan to start doing.

Of course, the repo has changed again – it's now django-ipy-patches. It's a patch queue (which bitbucket handles quite nicely) instead of a fork because using a fork mixed the IronPython changes in with normal development and made them hard to find amongst all the merge changesets (and you can't rebase a public repository). Forks are better for very short-lived branches, but I think patch queues are the better option for long-lived projects like this. I'll move the issues from the old repository over and then shut it down fairly soon.

I'll post some instructions soon on how to contribute and use MQ, and an update on running the tests.

Saturday, October 16, 2010

Using Extensions with a Custom IronPython Build

I tripped over this today – if you're using a custom build of IronPython (such as one built in debug mode), then any extensions have to be rebuilt against that build of IronPython.

This occurs because the references in the extensions are against the official builds, which are signed (strong-named), but a custom IronPython build is not signed (or signed with a different key).

Another option would be to build the extensions with a non-strong-named reference to IronPython, but I don't know if this is possible or how to do it. If anyone else does, please let me know!

Thoughts on Diversification

I recently watched Rob Conery's talk from NDC2010 on his concerns about Microsoft. It's a good talk, and if you work with Microsoft tech at all (especially if it's your main technology, as it is mine), then you owe it to yourself to watch it.

Done? Good.

Rob describes the talk as 'incendiary'; I disagree. Personally, I didn't think anything in there was off the mark (although he should have left out his Twitter fight over Azure pricing, but whatever). He's right – Microsoft, overall, is not nearly as interesting as they used to be. He implores 'Microsoft developers' to look outside of Microsoft at what else is out there, and wants Microsoft to start pushing boundaries again.

None of this is really a surprise to those of us who have always had a foot outside of the Microsoft world; hell, the only reason I got a foot in the Microsoft world is because it paid the bills. I've been a Python fan since the first time I saw it, 7 or 8 years ago; one of my co-op jobs in University was working for a (sadly, now defunct) company that integrated Jython into the management software for their hardware platform. Before that I taught myself C++ after cutting my teeth on QBasic. I used Java and Ruby in University courses; the OS of choice was mainly Linux.

Knowing only one tech stack is extremely limiting and, honestly, foolish. Focusing on one is often a necessity dictated by needing to eat, but you should be keeping an eye on what other communities are doing, because you never know when it might be useful. It's truly unfortunate that many programmers didn't know about the beauty of functional programming until Microsoft introduced LINQ and F#; I've been using those techniques in Python for years and missed them horribly when working in pre-3.5 C#.

This doesn't just apply to programming. Some people recommend learning one new programming language a year; I agree, almost. I think you should learn one new skill a year. Last year I built myself an office, doing all of the carpentry, drywall, and electrical myself (with a little help from friends and family, of course). This year I'm learning how to cook properly (and have developed an unhealthy obsession with Alton Brown). Next year I plan to finally learn how to play the guitar and understand music theory (we'll see how that goes). Someday I hope to rebuild a car.

Step outside your comfort zone. Learn something new. What you find just might surprise you.

Sunday, September 12, 2010

Using Downloaded IronPython Modules

One of Internet Explorer’s many “helpful” features is one that will “taint” any downloaded files so that the system knows they are from the internet. Honestly, I can’t see what value this feature adds other than breaking CHM files, and preventing IronPython from using downloaded modules.
This was brought to my attention by Shay Friedman, who was trying to use IronPython.Zlib but couldn’t get it to work. In particular, the error message was misleading:
IronPython 2.6.1 (2.6.10920.0) on .NET 4.0.30319.1
Type "help", "copyright", "credits" or "license" for more information.
>>> import clr
>>> clr.AddReferenceToFileAndPath('C:\Users\Jeff\Downloads\IronPython.Zlib-2.6-clr4\IronPython.Zlib.dll')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: System.IO.IOException: file does not exist: C:\Users\Jeff\Downloads\IronPython.Zlib-2.6-clr4\IronPython.Zlib.dll
   at Microsoft.Scripting.Actions.Calls.MethodCandidate.Caller.Call(Object[] args, Boolean& shouldOptimize)
...
>>>
The file, of course, does exist, so why can’t IronPython find it?
There are actually a few things that interplay here: first, it must be downloaded with a browser that taints the file (which I believe are just IE and Chrome), and second, it must be unzipped with Windows’ built in unzipping tools. The built in tools have the interesting property that when unzipping a tainted zip file, they will also taint all of the unzipped files. Finally, the punchline: .NET will not load an assembly that is tainted.
So how do we get around this? Well, you can:
  • use a different browser
  • use a different unzipping tool (I highly recommend 7-zip)
  • unblock the zip file prior to unzipping
To unblock the file, just right click on the zip file, click “Properties”, and click “Unblock”:
[Screenshot: unblock-file]
If you’ve already unzipped the file, you can just unblock the DLL. Depending on where you unzipped the file to, you may need to use an elevated Explorer window. You can also unblock multiple files from the command line.
This may well affect applications other than IronPython, so it’s just one more thing to watch for.
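For unblocking many files at once, here is a rough Python sketch. It relies on something the post doesn't spell out: to my knowledge the "taint" is stored in an NTFS alternate data stream named Zone.Identifier, and deleting that stream is what the "Unblock" button does. The function names and the default extension list are made up for illustration, so treat this as a starting point rather than a tested tool.

```python
import os

# Assumption: the download marker lives in this NTFS alternate data stream.
ZONE_STREAM = ":Zone.Identifier"


def unblock(path):
    """Remove the zone marker from one file; return True if one was removed."""
    try:
        os.remove(path + ZONE_STREAM)
        return True
    except OSError:
        # No marker present, or the file system has no such stream.
        return False


def unblock_tree(root, extensions=(".dll", ".zip")):
    """Unblock every matching file under root; return the paths that changed."""
    changed = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                full = os.path.join(folder, name)
                if unblock(full):
                    changed.append(full)
    return changed
```

Calling unblock_tree on the unzipped folder returns the list of files it touched, which is handy for checking that the DLL you care about was actually tainted in the first place.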

Wednesday, August 11, 2010

NWSGI 3.0 Plans

It looks like it's just about time for another major release of NWSGI - the last two Decembers have had major releases, and I see no need to break the trend this year. There are going to be a few major changes this time around, so if anyone has any objections, let me know as soon as possible.

Most importantly, it will only support IronPython 2.7 and thus will require .NET 4.0. Like the IronPython team, I'm not going out of my way to break compatibility with .NET 2.0 (or IronPython 2.6, for that matter), but I won't be distributing anything but IronPython 2.7/.NET 4.0 binaries.

Similarly, I will not be supporting IIS 6 (Windows Server 2003) anymore. Again, I won't go out of my way to break it, but I won't be testing against it either. This means I can get rid of the wildcard handling for good, since IIS 7 has a good URL rewriter available.

The biggest change is that I am decoupling the WSGI processing from the ASP.NET pipeline. All of the functionality is currently part of an IHttpHandler implementation, which restricts it to be used with IIS (and Cassini) only. NWSGI 3.0, on the other hand, will allow NWSGI to be used with HttpListener or other servers such as Kayak by moving all WSGI processing into a separate class with no ASP.NET dependencies. The redesign will also allow me to improve the test coverage from zero to, well, something.
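To make the decoupling idea concrete, here is a small sketch in plain Python (not NWSGI code). A WSGI application is just a callable, so anything that can build an environ dictionary and supply a start_response function can host it. The names hello_app and run_once are hypothetical; only wsgiref's setup_testing_defaults comes from the standard library.

```python
from io import BytesIO
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A plain WSGI application; it knows nothing about the server hosting it."""
    body = b"Hello from " + environ["PATH_INFO"].encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def run_once(app, path="/"):
    """Hypothetical host: build an environ, call the app, collect the response."""
    environ = {"PATH_INFO": path, "wsgi.input": BytesIO()}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

A host like run_once is also what makes testing possible: the application can be exercised without IIS, Cassini, or any other server in the loop.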

Finally, the licence will change to the Apache Licence 2.0 to match IronPython. The basic terms are identical to the Ms-PL licence that was used previously; the Apache licence is just more explicit and also more widely used.

As with the previous versions, I expect to release the final version shortly after IronPython 2.7 is released.

Monday, August 9, 2010

The fate of IronPython?

It appears that Microsoft will not continue to fund IronRuby. Hopefully it will continue to flourish as a community project; I wish them luck. This does raise the question of whether IronPython will meet the same "fate"; in the absence of word from the IronPython team (it is the weekend, after all), I think I'll indulge in some wild speculation.

Mr. Schementi's last day at MS was July 23, meaning he probably gave his two weeks' notice on July 9. Thus, the writing was on the wall by the end of June/beginning of July. That's my rough timeline. But something doesn't fit.

On July 1, Enthought announced that they would be porting NumPy and SciPy to IronPython and .NET. I would imagine that they would have gotten some assurance from Microsoft that the IronPython project would continue. Or, they've gotten shafted - it happens.

My hypothesis - that IronPython will continue to be funded, for now. The team has said that there are/were potential customers within Microsoft (the Dynamics team was one, I believe), which is critical for continued support. However, I believe there may be one other unexpected saviour - the Windows High-Performance Computing (HPC) team.

Python is the scripting language of choice in the HPC community, largely because of the NumPy/SciPy libraries mentioned earlier. I wouldn't be surprised if Microsoft was making a push to get Windows and .NET deeper into that space (it's small but profitable), and IronPython with NumPy/SciPy support could be a key part of that play.

Also, both teams had releases on July 16th – IronRuby 1.1 and IronPython 2.7 Alpha 1. I think that Alpha is an important signal of the team's expectations.

This is what happens when I have insomnia. Hopefully this will still make sense in the morning.

Sunday, June 27, 2010

Changes to builds for IronPython.SQLite and IronPython.Zlib

I've done some work on the builds for both IronPython.SQLite and IronPython.Zlib; with IronPython 2.7 on the way, the number of variants I need to build is going up. IronPython 2.7 will require .NET 4.0, so that saves me having to build an IronPython 2.7/.NET 2.0 version of everything as well.

Still, three variants is too many to build manually through Visual Studio, which is how I've built everything up to this point. The packaging into a zip file was also done manually. To save the effort, I've adopted psake as a build automation tool. It's based on PowerShell, which is a very nice administrative language (I actually prefer it to Python for quick & dirty admin tasks – the horror!). The actual builds are still handled by MSBuild so that I can manage the files in Visual Studio; the psake script calls msbuild to build the library. I also have tasks to clean the source tree and build the .zip packages.

The MSBuild Files

The real trick in this is building a .csproj file that can handle being built for both .NET 2.0 and .NET 4.0. The key to it all is the TargetFrameworkVersion property; by default, it is hardcoded to a specific .NET version (v2.0, v3.5, or v4.0) in the .csproj file. To change it during the build, msbuild needs to know that we might provide it on the command line. If you've ever opened a .csproj file, you may recognize these lines:

<Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
<Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>

These lines will set the Configuration and Platform properties if they are not already set from the command line; that's what that Condition attribute does. We just need to do the same thing for the TargetFrameworkVersion property:

<TargetFrameworkVersion Condition=" '$(TargetFrameworkVersion)' == '' ">v2.0</TargetFrameworkVersion>

This allows us to change the TargetFrameworkVersion from the command line. For a little future proofing, I add a property for the IronPython version as well:

<IronPythonVersion Condition=" '$(IronPythonVersion)' == '' ">2.6</IronPythonVersion>
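If the Condition syntax looks opaque, the rule it encodes is simply "use this value only when the property is still empty". Here is that rule modelled in Python, purely as an illustration; resolve_properties and DEFAULTS are hypothetical names, and the default values are the ones from the snippets above.

```python
# Defaults taken from the .csproj snippets shown above.
DEFAULTS = {
    "Configuration": "Debug",
    "Platform": "AnyCPU",
    "TargetFrameworkVersion": "v2.0",
    "IronPythonVersion": "2.6",
}


def resolve_properties(command_line, defaults=DEFAULTS):
    """Apply each default only when the property is unset or empty,
    mirroring Condition=" '$(Name)' == '' " on a property element."""
    resolved = dict(command_line)
    for name, value in defaults.items():
        if resolved.get(name, "") == "":
            resolved[name] = value
    return resolved
```

For example, resolve_properties({"TargetFrameworkVersion": "v4.0"}) keeps v4.0 and fills in the other three properties from the defaults.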

The psake file

The psake build script is (by convention) default.ps1; I would have preferred build.ps1, but whatever. The core task is the Build task; all it does is call MSBuild:

task Build -Depends GenVersion {
    exec { 
        msbuild /nologo /verbosity:minimal "$ProjectPath" `
            /t:Build `
            /p:Configuration=$Configuration `
            /p:TargetFrameworkVersion=$TargetFrameworkVersion `
            /p:OutputPath="..\$OutputPath" `
            /p:IronPythonVersion=$IronPythonVersion
    }
}
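For anyone who would rather drive MSBuild from Python than from psake, the same invocation could be assembled roughly like this. It is only a sketch: msbuild_args is a hypothetical helper, it builds the argument list without running anything, and the switches are copied from the Build task above.

```python
def msbuild_args(project_path, configuration, framework, output_path,
                 ironpython_version):
    """Build the msbuild argument list used by the Build task above."""
    return [
        "msbuild", "/nologo", "/verbosity:minimal", project_path,
        "/t:Build",
        "/p:Configuration=" + configuration,
        "/p:TargetFrameworkVersion=" + framework,
        "/p:OutputPath=..\\" + output_path,  # relative to the project, as above
        "/p:IronPythonVersion=" + ironpython_version,
    ]
```

The resulting list can be handed to subprocess.check_call, which raises an error on a non-zero exit code, much as psake's exec does.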

There are also tasks to build the zip files (requires 7-zip on the build system) and run the tests for the given project. All in all, using psake has been a great move, and while I have a few minor issues with it, I highly recommend it.
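The packaging step is not shown here, and the real task shells out to 7-zip. As a stand-in, this is roughly what the same idea looks like with Python's zipfile module instead; the package function is hypothetical, and it assumes the zip file is written outside the folder being packaged.

```python
import os
import zipfile


def package(source_dir, zip_path):
    """Zip every file under source_dir, storing paths relative to it."""
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full = os.path.join(folder, name)
                arcname = os.path.relpath(full, source_dir)
                archive.write(full, arcname)
                names.append(arcname)
    return names
```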

Building the Damn Thing

This is all well and good, but how do you actually use it to build the library? Switch to the build directory and, in a PowerShell window, run:

.\psake

That's it, if a default .NET 2.0 Debug build is what you want; it will also run the tests. To build a Release build for .NET 4:

.\psake build -framework 4.0 -parameters @{config='Release'}

Finally, you can build the zip packages:

.\psake package

It's all much simpler than firing up Visual Studio and building zip files by hand.

Visual Studio 2010 Upgrade

I've also upgraded the IronPython.SQLite and IronPython.Zlib solution files to Visual Studio 2010. You can use the free Express editions to build them. Also, none of these changes affect NWSGI, yet.

Saturday, May 1, 2010

IronPython: SQLite and Zlib

Wow, it’s been a long time – I wanted to manage at least one post per month, but it’s been three months. Ah well, I’ve been busy working on filling in a couple of holes in the IronPython standard library: sqlite3 and zlib.

IronPython.Zlib

IronPython.Zlib implements the zlib module for IronPython using ComponentAce’s zlib.net, which is a purely managed implementation of the zlib library. IronPython.Zlib is entirely managed code and works with both 32-bit and 64-bit IronPython. It passes all of the Python 2.6 zlib and gzip tests and most of the zipfile tests.

IronPython.SQLite

IronPython.SQLite is a port of pysqlite to IronPython using C#-SQLite, which, similar to zlib.net, is a managed implementation of SQLite. Thus, IronPython.SQLite is also 100% managed code. It passes about 87% of the Python 2.6 sqlite3 tests; the remaining ones are mostly corner cases or rarely used functionality.

.NET 4 Support

Neither of the above libraries have been built or tested with IronPython for .NET 4.0. I don’t see any reason they shouldn’t work, but they’ve never been tried. If anyone does try, let me know how it goes!

Sunday, January 31, 2010

Debugging Techniques for IronPython: Breakpoints

Currently, IronPython programs are not particularly easy to debug – the Python code gets mixed in with the code for IronPython itself and can be hard to follow. This is the first of a few tricks to make debugging easier.

Trigger a Breakpoint

This is my favourite and most used technique – I sometimes wish Python had the equivalent of JavaScript's debugger keyword to make it easier to use. The trick is to use the System.Diagnostics.Debugger.Break() function to insert a breakpoint in your code – when the breakpoint is hit, Windows will offer to launch a debugger for you. Also make sure that you launch IronPython with the -X:Debug switch, or it will be next to impossible to debug.

Insert the following line where you want to trigger the breakpoint:

import System.Diagnostics; System.Diagnostics.Debugger.Break()

Windows will then open the following dialog (if you have Visual Studio installed; I don't have a machine without VS to see what it looks like otherwise):

[Screenshot: vsjitbdgr]

I prefer to use VS for debugging, so I'll open a new instance of Visual Studio 2008.
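If you find yourself pasting the breakpoint line often, it can be wrapped in a small helper. This is just a sketch; debug_break is a hypothetical name. It does nothing on interpreters where the System.Diagnostics namespace can't be imported (CPython, for example), so the call can be left in code that is shared between the two.

```python
def debug_break():
    """Break into the debugger when running on IronPython.

    Returns False without doing anything on interpreters that cannot
    import System.Diagnostics (for example CPython).
    """
    try:
        from System.Diagnostics import Debugger  # .NET type, IronPython only
    except ImportError:
        return False
    Debugger.Break()
    return True
```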

Browsing the Call Stack

Now, we'll need to go looking for the actual Python code in the call stack. Open the call stack browser, and you'll see something like this:

[Screenshot: dbgrbrk]

There are a few things to note here, especially if you're not familiar with the VS debugger:

  • Each line represents a call stack frame (a function call, basically).
  • The yellow arrow is the current execution location; this is our breakpoint.
  • Faded-out lines are frames that do not have any debug information available. They can be ignored.
  • Lines beginning with "Snippets.debug" are the actual Python stack frames we are looking for. Double click on those lines to open up the source for those files in the Visual Studio editor.

Once the Python source is loaded in the editor, you can inspect variable values by hovering over them with the mouse, just like in any other Visual Studio language. To find the calling stack frame, you'll have to skip over several IronPython frames to find the next Python frame.

A better experience would elide the IronPython frames; perhaps Harry Pierson's debugger work could be applied to improve it. That, and other Visual Studio improvements for IronPython, are another project entirely.

Saturday, January 30, 2010

Running the Django Test Suite On IronPython

UPDATE: A simpler version of these instructions is available on the django-ironpython Bitbucket page.

This guide will explain how to set up and attempt to run the Django test suite on IronPython. Once the test suite runs, it should be much easier to fill in the parts of Django that don't work properly.

What You'll Need

It's not terribly difficult to set this up, but there are quite a few pieces.

  • IronPython 2.6 – use the installer so that you get the standard library as well.
  • Django trunk – an SVN checkout for now; if you have Mercurial, you can get it from django-ironpython. Following the hg repo will get you my IronPython fixes.
  • adonet-dbapi – For the sqlite3 module implementation – MS SQL is a future target, but not right now (use the "get source" link if you don't have hg installed).
  • System.Data.SQLite – used by the sqlite3 module (just download the binary zip; no need to install)
  • 1 cup flour…

Getting Started

First up, install IronPython and check out the Django trunk and adonet-dbapi somewhere. I'll use %USERPROFILE%\Documents\Repositories\django\ and %USERPROFILE%\Documents\Repositories\adonet-dbapi\ in these examples. Next, create a "DLLs" folder in the IronPython install folder and drop System.Data.SQLite.dll into it (this way IronPython will automatically reference it).

The next step is to prepare the Django test suite. This requires you to create a small Django app that contains things like database settings. The full instructions can be found on Alex Gaynor's blog, or you can download it and unzip it (I'll assume %USERPROFILE%\Documents\Repositories\django-test\).

Running the Tests

OK, command prompt time – if you're not comfortable with the command prompt, this won't be for you.

set PATH=%PATH%;C:\Program Files (x86)\IronPython 2.6
set IRONPYTHONPATH=%USERPROFILE%\Documents\Repositories\django\;%USERPROFILE%\Documents\Repositories\adonet-dbapi\abapi;%USERPROFILE%\Documents\Repositories\django-test\
set DJANGO_SETTINGS_MODULE=django_test.settings

Now, make sure you're in the django directory and run the test cases:

cd %USERPROFILE%\Documents\Repositories\django\
ipy tests\runtests.py -v 1

So far, so good.

The Problems

It'll bomb immediately on an assertion failure. Django does not like the fact that on IronPython str == unicode and thus their lazy evaluation doesn't work immediately (see issue #1 for details). Comment out that assertion, and it fails again – and I haven't fixed this one yet. Stay tuned.
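To see why str == unicode trips up code written for CPython 2, here is a small illustration. It is not Django's code; result_kinds is a hypothetical stand-in for any check that treats "wants str" and "wants unicode" as two separate requests.

```python
try:
    text_type = unicode  # exists on Python 2, including IronPython 2.x
except NameError:
    text_type = str      # Python 3 has a single text type

# True on IronPython 2.x (and on Python 3), False on CPython 2.
STR_IS_UNICODE = str is text_type


def result_kinds(*resultclasses):
    """Report whether 'str' and 'unicode' results were each requested."""
    wants_str = str in resultclasses
    wants_unicode = text_type in resultclasses
    return wants_str, wants_unicode
```

Where the two names refer to the same type, result_kinds(str) reports both kinds as requested, so an assertion that forbids asking for both at once fails even though the caller asked for only one.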

Thursday, January 28, 2010

LINQ Support for IronPython

This is a collection of thoughts and a summary of a mailing list conversation about LINQ support in IronPython. The IronPython team (which seems to be just Dino & Dave at this point) seemed receptive to the idea; it's just a matter of resources. Therefore, please vote on the linked issues so that they can get their priorities straight.

There are two key parts to LINQ support: extension methods and expression trees. Each is useful on its own, but both are required to really take advantage of LINQ.

Extension Methods

An extension method is a way of adding a function to a class without editing the class, or providing a default implementation for a class implementing an interface. LINQ is almost entirely composed of extension methods on the IEnumerable<T> and IQueryable<T> interfaces, so supporting them in IronPython is critical to supporting LINQ and many other interfaces. In this case I'm only talking about IronPython being able to consume extension methods, not create them.

In C#, an extension method is made available by a using directive for the namespace containing a static class that contains the method. For example, the Enumerable class is in the namespace System.Linq and contains about half of LINQ's extension methods. To use this class from C# requires only:

using System.Linq;

The Python equivalent of this would be:

from System.Linq import *

Now, this style is frowned upon in Python circles because it pollutes the namespace unnecessarily. A more Pythonic equivalent would be:

from System.Linq import Enumerable

When doing an import, IronPython would have to check if Enumerable contains any static methods marked with ExtensionAttribute and add them to the list of possible methods to resolve for the applicable type. I actually tried to implement this at one point, but haven't had the time to finish it up.

The issue for this is CodePlex #17250 - Support for LINQ extension methods.
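The import-time scan described above can be modelled in pure Python to show the moving parts. This is an analogy only, not how IronPython does or would implement it: the extension decorator stands in for ExtensionAttribute, Registry for the method binder, and the Enumerable class here is a hypothetical toy rather than System.Linq.Enumerable.

```python
import inspect


def extension(target_type):
    """Mark a function as an extension method for target_type
    (a stand-in for .NET's ExtensionAttribute)."""
    def mark(func):
        func.extends = target_type
        return func
    return mark


class Registry(object):
    def __init__(self):
        self._methods = {}  # maps type -> {method name: function}

    def import_class(self, cls):
        """Scan cls for marked static methods and register them."""
        for name, member in inspect.getmembers(cls, inspect.isfunction):
            target = getattr(member, "extends", None)
            if target is not None:
                self._methods.setdefault(target, {})[name] = member

    def call(self, obj, name, *args):
        """Resolve name against extensions registered for obj's type or bases."""
        for klass in type(obj).__mro__:
            method = self._methods.get(klass, {}).get(name)
            if method is not None:
                return method(obj, *args)
        raise AttributeError(name)


class Enumerable(object):
    """Toy stand-in for a static class full of extension methods."""
    @staticmethod
    @extension(list)
    def where(source, predicate):
        return [item for item in source if predicate(item)]
```

Until import_class(Enumerable) is called, registry.call(items, "where", ...) raises AttributeError; afterwards it dispatches to Enumerable.where with the list as the first argument, which is the behaviour the import would need to switch on.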

Expression Trees

An expression tree is an abstract, language-independent representation of a piece of code that can be more easily parsed and transformed than the raw code it was generated from. From the expression tree, the LINQ provider (such as LINQ to SQL) determines how to convert it into a query in its target language (such as SQL). The DLR actually uses "expression" trees as well (a superset of the LINQ classes that support statements as well as expressions), which are compiled into IL code and then executed.

In C# (and VB), lambda expressions are convertible to expression trees. Python also has lambda expressions, and these should also be convertible to expression trees – in particular, expression trees that exactly match what would be created by the C# compiler for an equivalent lambda, which is what every existing LINQ provider would expect.

The issue for this is CodePlex #26044 - lambda should be convertible to Expression<...>.
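For a feel of what a LINQ provider does with an expression tree, here is a toy version using CPython's own ast module (Python 3.8 or later is assumed). It parses the source text of a lambda and turns a very small subset of it into a SQL-like condition. This is only an analogy: a real provider walks System.Linq.Expressions nodes, and lambda_to_where and its supported subset are made up for this sketch.

```python
import ast

OPERATORS = {ast.Gt: ">", ast.Lt: "<", ast.Eq: "=", ast.GtE: ">=", ast.LtE: "<="}


def lambda_to_where(source):
    """Translate 'lambda row: row.col OP constant' into a SQL-like condition."""
    func = ast.parse(source, mode="eval").body
    if not isinstance(func, ast.Lambda):
        raise ValueError("expected a lambda expression")
    param = func.args.args[0].arg  # name of the lambda's first parameter
    return _visit(func.body, param)


def _visit(node, param):
    if isinstance(node, ast.BoolOp):
        joiner = " AND " if isinstance(node.op, ast.And) else " OR "
        return "(" + joiner.join(_visit(v, param) for v in node.values) + ")"
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = _visit(node.left, param)
        right = _visit(node.comparators[0], param)
        return "%s %s %s" % (left, OPERATORS[type(node.ops[0])], right)
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == param):
        return node.attr  # row.col becomes the column name
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError("unsupported expression: " + ast.dump(node))
```

Given "lambda c: c.age > 21 and c.city == 'Calgary'", the walker produces (age > 21 AND city = 'Calgary'). The point is that the translation works on the tree's structure, never on compiled code, which is why a Python lambda would have to be available as a tree for LINQ providers to use it.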

Tuesday, January 26, 2010

IronPython Web Roles for Windows Azure

Following up to my previous post on IronPython worker roles, I'm going to discuss how to implement web roles using IronPython. The code discussed here is on bitbucket, as usual.

There are currently two ways to implement IronPython web roles on Azure: using the ASP.NET Dynamic Language Support or using NWSGI. Someday I hope the IronRubyMVC project matures to be an option as well.

Using ASP.NET Dynamic Language Support

This example can be found in the PyWebRole folder of the project.

First, you'll need to download the ASP.NET Dynamic Language Support (ASP.NET DLS) files. Next, create a standard C# web role and delete all of the .cs files except WebRole.cs. After that, there's really not much to it – drop the assemblies from the ASP.NET DLS package into the project's bin folder and add the following bits to the web.config file (alternatively, just copy the Web.config from the project):

<configSections>
    <!-- Add this to the existing <configSections> element -->
    <section name="microsoft.scripting" type="Microsoft.Scripting.Hosting.Configuration.Section, Microsoft.Scripting, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" requirePermission="false"/>
</configSections>

<system.web>
    <!-- Replace the existing <pages> element (if any) -->
    <pages compilationMode="Auto" pageParserFilterType="Microsoft.Web.Scripting.UI.NoCompileCodePageParserFilter" pageBaseType="Microsoft.Web.Scripting.UI.ScriptPage" userControlBaseType="Microsoft.Web.Scripting.UI.ScriptUserControl">
        <controls>
            <add tagPrefix="asp" namespace="System.Web.UI" assembly="System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/>
            <add tagPrefix="asp" namespace="System.Web.UI.WebControls" assembly="System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/>
        </controls>
    </pages>
</system.web>

<system.webServer>
    <modules>
        <!-- Add this to the existing <modules> element -->
        <add name="DynamicLanguageHttpModule" preCondition="integratedMode" type="Microsoft.Web.Scripting.DynamicLanguageHttpModule"/>
    </modules>
</system.webServer>

<!-- Add this whole section -->
<microsoft.scripting debugMode="true">
    <languages>
        <language names="IronPython;Python;py" extensions=".py" displayName="IronPython 2.6" type="IronPython.Runtime.PythonContext, IronPython, Version=2.6.10920.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
    </languages>
</microsoft.scripting>

From here, just follow the directions on the ASP.NET DLS page and in the package to create an ASP.NET Dynamic Language site.

NWSGI

First, download NWSGI 2.0. Next, deploying NWSGI to Azure is exactly the same as deploying it to any other server using xcopy. That's about it. You can see this example in the PyHelloWorld folder of the project.

Now What?

Unfortunately, now you're pretty much on your own. So far, none of the big Python web frameworks work 100% on IronPython, there is no Python library to access Azure's storage tables, blobs, or queues, and the .NET storage client library relies on LINQ support (which IronPython doesn't have, yet).

Personally, I plan to implement the data access in C# and call that from IronPython, running my own web framework. We'll see how it goes.

Sunday, January 10, 2010

IronPython Worker Roles for Windows Azure

Curiosity has finally got the better of me and I've started looking into Windows Azure again. It's matured quite a bit since I looked at it last year and now looks like a pretty solid platform to work with.

Aside from using NWSGI to write a web role (which I'll show later), I wanted to see if it was possible to write a worker role in Python. Happily, it is, and it's not that complicated. In fact, it's pretty similar to how NWSGI works – load up a Python file and run some functions.

Worker Role Requirements

A standard C# worker role requires three methods: OnStart, OnStop, and Run:

public class PyWorkerRole : RoleEntryPoint
{
    public override bool OnStart() { /* ... */ }
    public override void OnStop() { /* ... */ }
    public override void Run() { /* ... */ }
}

This can be mapped to a Python module in a fairly straightforward fashion:

def start():
    return True

def run():
    pass

def stop():
    pass

This has some advantages and disadvantages compared to using a class, but I like it for its simplicity.

The Implementation

Azure requires an actual .NET class to implement a worker role, so we create one that hosts the IronPython engine. This is a good example of how to embed IronPython to run very simple scripts. The core IronPython hosting function is shown here; for the rest, see the files linked below.

private void InitScripting(string scriptName)
{
    this.engine = Python.CreateEngine();
    this.engine.Runtime.LoadAssembly(typeof(string).Assembly);
    this.engine.Runtime.LoadAssembly(typeof(DiagnosticMonitor).Assembly);
    this.engine.Runtime.LoadAssembly(typeof(RoleEnvironment).Assembly);
    this.engine.Runtime.LoadAssembly(typeof(Microsoft.WindowsAzure.CloudStorageAccount).Assembly);                 
    
    this.scope = this.engine.CreateScope();
    engine.CreateScriptSourceFromFile(scriptName).Execute(scope);             
    
    if(scope.ContainsVariable("start"))
        this.start = scope.GetVariable<Func<bool>>("start");
    
    this.run = scope.GetVariable<Action>("run");
    
    if(scope.ContainsVariable("stop"))
        this.stop = scope.GetVariable<Action>("stop");
}

First, we create a ScriptEngine and add some useful assemblies; then we create a Scope to execute in; then we actually execute the script. Finally, we try to pull out the functions and convert them to C# delegates; run is required but start and stop are optional. Those delegates are called from the C# wrapper (from Run, OnStart, and OnStop, as appropriate).

The rest of the file is pretty much taken from the worker role template, so I'll leave it out.
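The same load-and-look-up pattern can be sketched in plain Python, which may help if the hosting API is unfamiliar. This is an analogy to the C# above, not a replacement for it: load_worker is a hypothetical name, and the no-op fallbacks for start and stop are my own choice for the sketch.

```python
def load_worker(script_path):
    """Execute a script in a fresh namespace and pull out its worker functions.

    'run' is required; 'start' and 'stop' are optional, matching the rule
    described above.
    """
    namespace = {"__name__": "worker_script"}  # plays the role of the scope
    with open(script_path) as handle:
        code = compile(handle.read(), script_path, "exec")
    exec(code, namespace)

    if "run" not in namespace:
        raise KeyError("script must define run()")
    start = namespace.get("start", lambda: True)  # assumed fallback
    stop = namespace.get("stop", lambda: None)    # assumed fallback
    return start, namespace["run"], stop
```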

Doing Actual Work

Now, a worker role needs some actual work to do – usually, reading items from a queue and processing them. Happily, the Azure StorageClient library is perfectly usable from IronPython.

from System.Diagnostics import Trace
from System.Threading import Thread
from Microsoft.WindowsAzure import CloudStorageAccount
from Microsoft.WindowsAzure.StorageClient import CloudQueueMessage, CloudStorageAccountStorageClientExtensions

def run():
    account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString")
    # CreateCloudQueueClient is an extension method, so call it on its static class.
    queueClient = CloudStorageAccountStorageClientExtensions.CreateCloudQueueClient(account)
    queue = queueClient.GetQueueReference("messagequeue")

    while True:
        # Poll the queue every 10 seconds.
        Thread.Sleep(10000)

        if queue.Exists():
            msg = queue.GetMessage()
            if msg:
                Trace.TraceInformation("Message '%s' processed." % msg.AsString)
                queue.DeleteMessage(msg)

The only catch is that CreateCloudQueueClient is an extension method, so it must be called as a static method on the CloudStorageAccountStorageClientExtensions class.

Using the Code

To actually use the code, create a C# worker role as per usual, but replace the generated class file with PythonWorkerRole.cs (see below). Next, add the IronPython assemblies as references to the project. Then, create a string setting for the role (under the Cloud project's Roles folder) called ScriptName and set it to the name of the script file. Finally, add a .py file to the worker role and ensure that (under 'Properties') its 'Build Action' is 'Content' and 'Copy to Output' is 'Copy if Newer'.

The code can be downloaded from my PyAzureExamples repository, including zip archives of it. It includes the PyWorkerRole project and the Cloud Service project.