Thursday, July 2, 2009

NWSGI 2.0: Removing URL Warts

The normal procedure for configuring NWSGI leaves an unsightly URL wart: the URL must contain the .wsgi file. There are two ways to get rid of that wart: URL rewriting and wildcards.

URL Rewriting

When using URL rewriting, the web server changes ("rewrites") the incoming URL into something NWSGI can understand. IIS 7 has a URL rewriting extension to do just that. It's quite easy to use, too:

<rewrite> 
    <rules> 
        <clear /> 
        <rule name="Redirect Warty Requests" stopProcessing="true"> 
            <match url="simple.wsgi/(.*)" /> 
            <conditions logicalGrouping="MatchAll" /> 
            <action type="Redirect" url="{R:1}" redirectType="Permanent" /> 
        </rule> 
        <rule name="Remove WSGI Wart"> 
            <match url="^(.*)$" /> 
            <conditions logicalGrouping="MatchAll" /> 
            <action type="Rewrite" url="simple.wsgi/{R:1}" appendQueryString="true" /> 
        </rule> 
    </rules> 
</rewrite>

Replace simple.wsgi with the name of the .wsgi file you would normally use. With these rules in place, http://example.com/simple/simple.wsgi/Products/Widgets can be accessed as http://example.com/simple/Products/Widgets. The second rule ("Remove WSGI Wart") does the actual rewriting. The first rule ("Redirect Warty Requests") permanently redirects any request for http://example.com/simple/simple.wsgi/Products/Widgets to http://example.com/simple/Products/Widgets, so that only one set of URLs is in use (which is important for search engines). The application also has to generate the "clean" URLs in its own links; most modern applications expect to have their URLs rewritten and can be configured to do so.
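To make the interplay of the two rules concrete, here is a rough Python simulation of them. It is only a sketch: apply_rules is a made-up helper, the paths are relative to the application root, and it assumes URL Rewrite's default case-insensitive matching. It is not how IIS evaluates rules internally.

```python
import re

# Patterns copied from the two <match> elements above.
REDIRECT_PATTERN = re.compile(r"simple.wsgi/(.*)", re.IGNORECASE)
REWRITE_PATTERN = re.compile(r"^(.*)$", re.IGNORECASE)


def apply_rules(path):
    """Return (action, new_path) for a path relative to the application root."""
    # Rule 1: a warty URL is redirected to its clean form. Because the rule
    # has stopProcessing="true", rule 2 is not evaluated for this request.
    match = REDIRECT_PATTERN.search(path)
    if match:
        return ("redirect", match.group(1))
    # Rule 2: every other URL is rewritten internally so that NWSGI still
    # sees the .wsgi file in the path.
    match = REWRITE_PATTERN.search(path)
    return ("rewrite", "simple.wsgi/" + match.group(1))


print(apply_rules("simple.wsgi/Products/Widgets"))  # ('redirect', 'Products/Widgets')
print(apply_rules("Products/Widgets"))  # ('rewrite', 'simple.wsgi/Products/Widgets')
```

The browser only ever sees the result of the first branch; the second branch happens entirely inside the server.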

If you are using IIS 6, you'll need something like ISAPI_Rewrite instead. The following rules should have the same effect as the ones above:

RewriteEngine On
RewriteRule ^simple\.wsgi/(.*) $1 [NC,R=301,L]
RewriteRule ^(.*)$ simple.wsgi/$1 [NC]

Wildcards

Normally, IIS dispatches requests by extension; this is why we have the .wsgi file in the first place. However, it can be configured to pass all requests to a specific handler by configuring a wildcard extension. URL rewriting is transparent to NWSGI, but a wildcard mapping is not: NWSGI needs to know that it has been configured as a wildcard handler. To do this, add a <wildcard> element to the configuration:

<wsgi>
    <wildcard physicalPath="C:\simple\simple.wsgi" callable="simple_app" />
</wsgi>

<system.web>
    <httpHandlers>
        <add verb="*" path="*" type="NWSGI.WsgiHandler, NWSGI, Version=2.0.0.0, Culture=neutral, PublicKeyToken=41e64ddc1bf1fc86" />
    </httpHandlers>
</system.web>
<system.webServer>
    <handlers>
        <add name="WsgiHandler" path="*" verb="*" type="NWSGI.WsgiHandler, NWSGI, Version=2.0.0.0, Culture=neutral, PublicKeyToken=41e64ddc1bf1fc86" resourceType="Unspecified" />
    </handlers>
    <validation validateIntegratedModeConfiguration="false" />
</system.webServer>

The physicalPath and callable attributes have the same meaning as on the <scriptMapping> element. If the configuration includes a <wildcard> element, any <scriptMapping> elements are ignored. Also, make sure that the path attribute of the handler mappings is set to *.
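For reference, the callable named in the configuration is an ordinary WSGI application. A minimal sketch of what C:\simple\simple.wsgi might contain is below; the body is invented for illustration and simply echoes the path variables it receives. What those variables contain under a wildcard mapping depends on how NWSGI fills in the environment, which is not covered here.

```python
def simple_app(environ, start_response):
    """Report the path components the server passed to the application."""
    body = "SCRIPT_NAME={0!r}\nPATH_INFO={1!r}\n".format(
        environ.get("SCRIPT_NAME", ""), environ.get("PATH_INFO", ""))
    data = body.encode("utf-8")
    # A WSGI response is a status line, a list of headers, and an iterable body.
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(data)))])
    return [data]
```

Requesting a few different URLs against an app like this is a quick way to check that the wildcard mapping is being picked up.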

Ultimately, the effect is the same as URL rewriting: http://example.com/simple/simple.wsgi/Products/Widgets can be accessed as http://example.com/simple/Products/Widgets.

Which to Choose

If possible, prefer URL rewriting, as it should be faster than wildcard mappings; however, it requires support from the application. If you're on IIS 6 or IIS 7 without a URL rewriting extension installed, or your application doesn't support rewritten URLs, use wildcard mappings instead.
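If an application has no setting for rewritten URLs, one possible workaround is a small piece of WSGI middleware that hides the wart before the application sees it. The sketch below is hypothetical: StripWsgiWart and demo_app are made-up names, and it assumes that under URL rewriting the server reports a SCRIPT_NAME ending in /simple.wsgi, which you should verify for your own setup.

```python
class StripWsgiWart(object):
    """Hypothetical middleware: drop a trailing '/<wsgi file>' from SCRIPT_NAME."""

    def __init__(self, app, wsgi_file="simple.wsgi"):
        self.app = app
        self.suffix = "/" + wsgi_file

    def __call__(self, environ, start_response):
        script_name = environ.get("SCRIPT_NAME", "")
        # Assumption: the wart, if present, is the last segment of SCRIPT_NAME.
        if script_name.lower().endswith(self.suffix.lower()):
            environ["SCRIPT_NAME"] = script_name[:-len(self.suffix)]
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Placeholder application that builds a link from SCRIPT_NAME."""
    link = environ.get("SCRIPT_NAME", "") + "/Products/Widgets"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [link.encode("utf-8")]


# The wrapped object is what the .wsgi file would expose as its callable.
application = StripWsgiWart(demo_app)
```

Applications that build links from SCRIPT_NAME would then produce clean URLs without further changes; applications that build links some other way would not benefit from this.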