XHTML Versions and Document Format

By Justin Poirier

The W3C drafts new versions of its standards over time, often working on consecutive versions simultaneously to stay ahead. It declares that a version has reached "recommendation status" when the version is complete and the W3C suggests that all developers use it. This does not necessarily correspond to when the version enters widespread use; some versions are not widely used until years after they reach recommendation status.

XHTML 1.0 was the first version of XHTML. It reached recommendation status in January 2000, and "XHTML 1.0 Transitional" (one of three types of XHTML 1.0, along with "Strict" and "Frameset") is still the most widely used version. Most successful websites use XHTML 1.0 Transitional in a way that ensures backwards compatibility with HTML 4.01 (the last version of HTML that the W3C created before switching to XHTML). This is accomplished by following the W3C guidelines for such backwards compatibility (at http://www.w3.org/TR/xhtml/#guidelines), and by having web servers return XHTML 1.0 Transitional documents with the MIME type "text/html" instead of the proper MIME type for XHTML, "application/xhtml+xml". The latter practice works around the fact that IE, as of the 2nd beta version of IE 8, does not support the MIME type "application/xhtml+xml". Using "text/html" also causes browsers that do support "application/xhtml+xml" to treat these pages as "text/html" and render them as HTML 4.01. However, developers still write this code to be both proper HTML 4.01 and proper XHTML 1.0 Transitional, with the expectation that some day all browsers will support XHTML.

Web servers often determine the MIME type of a document by analyzing the document's actual contents and guessing the type using heuristics. In the case of XHTML 1.0 Transitional documents, web servers typically see the first few lines of code (described below), determine the page to be XHTML 1.0 Transitional, and then return it with MIME type "text/html" for the reasons we've discussed. The web servers of popular web host Netfirms determine document types in this way.
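The kind of content sniffing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server (Netfirms included) actually works: the function name `sniff`, the labels it returns, and the fallback type are all made up for this example.

```python
import re
from typing import Tuple

# Public identifier that marks a document as XHTML 1.0 Transitional.
XHTML_TRANSITIONAL_FPI = b"-//W3C//DTD XHTML 1.0 Transitional//EN"

# A DOCTYPE declaration whose root element is html (any letter case).
DOCTYPE_HTML = re.compile(rb"<!DOCTYPE\s+html\b", re.IGNORECASE)


def sniff(head: bytes) -> Tuple[str, str]:
    """Guess (document kind, MIME type) from the first bytes of a file.

    Hypothetical heuristic: real servers use their own, more thorough rules.
    """
    if DOCTYPE_HTML.search(head) and XHTML_TRANSITIONAL_FPI in head:
        # Recognized as XHTML, but still returned as text/html for compatibility.
        return ("XHTML 1.0 Transitional", "text/html")
    if DOCTYPE_HTML.search(head) or b"<html" in head.lower():
        return ("HTML", "text/html")
    # Nothing recognizable: fall back to a generic binary type.
    return ("unknown", "application/octet-stream")
```

Note that the first branch identifies the document as XHTML and still answers with "text/html", which is exactly the compromise discussed earlier.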

A few versions of XHTML have been released since 1.0, all of which are either meant only for certain device types or not yet in widespread use.

One way that the elements used in an XML document can be defined is by using a Document Type Definition (DTD), which is a listing of all such definitions. This listing can be internal, appearing within the XML file, external, appearing in a separate file, or split into two parts using both types. The DTD used by an XML file is specified in a line called the Document Type Declaration (or DOCTYPE) with the following format:


<!DOCTYPE    Root element of document    SYSTEM (1) or PUBLIC (2)    If PUBLIC DTD used, its FPI (3)    URI of external DTD    [
Internal portion of DTD contents
]>

(1) if using an external DTD specific to this application
(2) if using an external DTD that has been created by some authority for repeated use
(3) Formal Public Identifier, i.e. the industry code by which the DTD is known to user agents

Sections of this line referring to either the external or internal portion of the DTD can be omitted if no such portion exists. If a PUBLIC DTD is used, the user agent will first try to locate it using only the FPI, resorting to the given URI only if it is unable to do so. In the case of an XHTML document the W3C decides what elements are available to be used, so the DTD is external, in a file created by them. In an XHTML document written for HTML compatibility, the DOCTYPE line is the first line of the document (in other XML documents it usually comes second, after the XML declaration). For XHTML 1.0 Transitional it translates to the following:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Note the URI of the DTD, which is stored on the www.w3.org website. Instead of relying on this URI, authors are encouraged to download and use local copies of the DTD.
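To make the parts of a DOCTYPE line concrete, here is a minimal Python sketch that splits a declaration into root element, SYSTEM or PUBLIC keyword, FPI, URI and internal portion, and then picks a DTD location by trying the FPI first and the URI second. It is a simplification: it only handles double-quoted values and simple internal portions, and the names `parse_doctype`, `resolve_dtd` and `LOCAL_CATALOG`, as well as the local path, are assumptions made up for this example.

```python
import re
from typing import Dict, NamedTuple, Optional


class Doctype(NamedTuple):
    root: str
    keyword: Optional[str]          # "PUBLIC", "SYSTEM", or None
    fpi: Optional[str]              # only present for PUBLIC DTDs
    uri: Optional[str]
    internal_subset: Optional[str]  # text between [ and ], if any


DOCTYPE_RE = re.compile(
    r"""<!DOCTYPE\s+(?P<root>[^\s\[>]+)          # root element name
        (?:\s+(?:
            PUBLIC\s+"(?P<fpi>[^"]*)"\s+"(?P<pub_uri>[^"]*)"
          | SYSTEM\s+"(?P<sys_uri>[^"]*)"
        ))?                                      # external portion is optional
        \s*(?:\[(?P<internal>.*?)\])?            # internal portion is optional
        \s*>""",
    re.VERBOSE | re.DOTALL,
)


def parse_doctype(text: str) -> Optional[Doctype]:
    """Return the parts of the first DOCTYPE declaration in text, or None."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        return None
    if m.group("fpi") is not None:
        keyword, uri = "PUBLIC", m.group("pub_uri")
    elif m.group("sys_uri") is not None:
        keyword, uri = "SYSTEM", m.group("sys_uri")
    else:
        keyword, uri = None, None
    return Doctype(m.group("root"), keyword, m.group("fpi"), uri, m.group("internal"))


def resolve_dtd(doctype: Doctype, catalog: Dict[str, str]) -> Optional[str]:
    """Prefer a known local copy keyed by FPI; otherwise fall back to the URI."""
    if doctype.fpi is not None and doctype.fpi in catalog:
        return catalog[doctype.fpi]
    return doctype.uri


XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)

# Hypothetical catalog mapping an FPI to a downloaded local copy of the DTD.
LOCAL_CATALOG = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "dtd/xhtml1-transitional.dtd",
}

doctype = parse_doctype(XHTML_DOCTYPE)
print(doctype.root, doctype.keyword)
print(resolve_dtd(doctype, LOCAL_CATALOG))
```

With an empty catalog, `resolve_dtd` returns the www.w3.org URI instead, which mirrors the fallback behavior described for user agents.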

The html tag of an XHTML 1.0 Transitional document should have the following attribute-value pairs: