xhtml done right?

My working Apache configuration for serving application/xhtml+xml to Firefox and text/html to IE.

Update 20060418: Okay, so a lot of the stuff written below is no longer true. See the updates at the bottom of the article.

This page, and pages like it on this website, are served up by Apache as application/xhtml+xml to Firefox, but as text/html to Internet Explorer, because IE can't handle application/xhtml+xml. The reasons for doing so are, um, well, let's see, this guy has some good reasons for serving xhtml as application/xhtml+xml. Others say application/xhtml+xml should die [well, that link's down already. I'll send him an email.], or just say an outright no to xhtml. Well, one neat thing you can do is include inline svg:

[inline SVG image, type image/svg+xml]

Look at the source and you'll see the xml. With a graphic that large, though, it makes for a big old clunky xhtml page. Also, the w3c validator doesn't like it (check by clicking the XHTML link at the bottom of the page). Better to put it in an object tag, as follows:

  <object data="/img/cecily1.svg" type="image/svg+xml" width="750" height="600"></object>

IE doesn't seem to like svg in any shape, even when pulled in with an object tag, so I'm not sure if that's a good idea either. But I digress. For better or worse (and with an eye toward the ever-glimmering future), I decided to serve it up with Apache's content negotiation. Oh, and one more reason to serve application/xhtml+xml is that Firefox tells me if I've missed a closing tag.
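
That missing-closing-tag check is just XML well-formedness, so you can get roughly the same feedback without a browser. Here is a minimal sketch using Python's standard library XML parser. The function name and the two sample strings are made up for illustration, and a real XHTML page that uses named entities like &nbsp; would need more setup than this.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return (True, "") if markup parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # The parser reports the first well-formedness error it hits.
        return False, str(err)
    return True, ""


# Hypothetical sample pages: the second one never closes its <p>.
GOOD = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hello</p></body></html>'
BAD = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hello</body></html>'

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```

The second call comes back False along with the parser's complaint, which is more or less what Firefox shows when it refuses to render a broken application/xhtml+xml page.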

A quick Google search on some of this page's keywords should bring up most of the pages I looked at. I got the most help from the official w3c page on serving via content negotiation, but it took a little experimentation. First, as per the w3c advice, all of my files end in .xhtml, and each has a symbolic link from a .html file of the same name. In the Apache sites-available file, in the Directory section, I needed the MultiViews and FollowSymLinks options. That works as long as you point at http://un.ctuo.us/xhtml (without the .html, so that Apache decides which way to send it), or at http://un.ctuo.us/index. But if it's just http://un.ctuo.us/ then Firefox defaults (in error, in my opinion, though I'm not sure it's Firefox's fault) to the .html, so I had to add "DirectoryIndex index" to the Apache config. That works! So my Directory section looks like this:

<Directory /var/www/un.ctuo.us/>
    Options Indexes FollowSymLinks MultiViews
    DirectoryIndex index
    AllowOverride None
    Order allow,deny
    allow from all
</Directory>
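
To make the idea of content negotiation a little more concrete, here is a rough Python sketch of the decision the server has to make from the browser's Accept header. This is not Apache's actual algorithm (mod_negotiation weighs more factors than this); it is a simplified rule assumed for illustration: send application/xhtml+xml only when the browser lists it explicitly with a quality value at least as high as the one for text/html. The function names and the sample header strings are made up, loosely modeled on what browsers of the time sent.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header):
    """Parse an Accept header into a dict of {media range: q-value}."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0  # a media range without an explicit q counts as 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media] = q
    return prefs


def choose_type(accept_header):
    """Pick the Content-Type to serve under the simplified rule above."""
    prefs = parse_accept(accept_header or "")
    # Wildcards are deliberately ignored for XHTML: "*/*" alone is not
    # taken as proof that the browser can render application/xhtml+xml.
    xhtml_q = prefs.get(XHTML, 0.0)
    # text/html may be matched explicitly or through a wildcard.
    html_q = prefs.get(HTML, prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML


# Illustrative headers (assumptions, not captured traffic).
FIREFOX_LIKE = ("text/xml,application/xml,application/xhtml+xml,"
                "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5")
IE_LIKE = "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*"

if __name__ == "__main__":
    print(choose_type(FIREFOX_LIKE))
    print(choose_type(IE_LIKE))
```

With these sample headers the Firefox-like request gets application/xhtml+xml and the IE-like one falls back to text/html, which is the behavior the MultiViews setup above is after.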

Update 20060403: As per these Apache performance tuning tips, I've changed

  DirectoryIndex index

to

  DirectoryIndex index.xhtml index.html

I also played with the creation of type-map files, as the tips suggest, but those extra files further cluttered up directories that are already full of redundant files (more about that later in "about this site"), and as far as I can tell I'd have to end the URLs on my site in ".var" for that to work (ugly!), so it's out. While I might take a performance hit for using MultiViews, it's a simpler, more elegant solution.

Update 20060411: Oops, just noticed that

  DirectoryIndex index.xhtml index.html

doesn't take advantage of content negotiation. Internet Explorer isn't served the .html page automatically when visiting http://un.ctuo.us/, so it's back to

  DirectoryIndex index

No performance tuning for my Apache, then. Oh well, I tried.

Update 20060418: I've just realized that Google doesn't recognize the file format of my site, as you can see by the un.ctuo.us search results. Looks like Google doesn't recognize sites sent as application/xhtml+xml, and apparently it's not smart enough to content negotiate with Apache. I'll have to think about this one and get back to you.

Update 20060418: Okay, so after the trouble with Google I flirted with the idea (after looking at Tampering around with XSLT) of sending pages as XML with links to the XSLT stylesheets I'm using, but then I realized I'd have a lot of the same problems. I did some research and found the very informative Perils of using XHTML properly, and I'm convinced to just serve everything up as HTML 4 Strict. The back end will still be XML, as will be described in the page about this site, but all the browsers will see is straight HTML. No MultiViews, no content negotiation, just HTML.

Another lesson I just learned (from this Ampersand for URLs post) is that in order for these pages to run through the XSLT correctly all of the ampersands in my hrefs need to be referred to by the character entity "&amp;". So, for example, instead of

  <a href="http://www.google.com/search?hl=en&lr=&q=un.ctuo.us&btnG=Search">

I need

  <a href="http://www.google.com/search?hl=en&amp;lr=&amp;q=un.ctuo.us&amp;btnG=Search">

Fun!
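
If you would rather not fix every href by hand, the escaping can be scripted. A small sketch with Python's standard html.escape, assuming you have the raw URL as a string; the helper name href_attr is made up for illustration.

```python
import html


def href_attr(url):
    """Return url escaped for use inside a double-quoted href attribute."""
    # quote=True also escapes quote characters, so the value cannot
    # break out of the attribute it is placed in.
    return html.escape(url, quote=True)


RAW_URL = "http://www.google.com/search?hl=en&lr=&q=un.ctuo.us&btnG=Search"

if __name__ == "__main__":
    print('<a href="%s">' % href_attr(RAW_URL))
```

Only run this on raw URLs. Feeding it an href that is already escaped would escape the ampersands a second time.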

So now that these pages are HTML (I learned a little from HTML and XSLT), the SVG obviously no longer works, and it messes up validation like crazy, but I'm going to leave it in anyway, for posterity and so that others may benefit, though I'm not sure how.

Update 20060420: On second thought (or, like, seventh thought), pages will be XHTML 1.0 Strict, but served up as text/html. Seems to be the best compromise. I guess that's why everyone else does it that way.

keywords: tech, xhtml, html, content negotiation, Apache, multiviews, type-map
created 2006-03-13, last modified 2010-09-13