XHTML Versions and Document Format
By Justin Poirier
The W3C drafts new versions of its standards over time, often working simultaneously on successive versions to stay ahead. It declares that a version has reached "recommendation status" when the version is complete and the W3C advises all developers to use it. This does not necessarily correspond to when the version enters widespread use, as some versions are not widely used until years after they reach recommendation status.
XHTML 1.0 was the first version of XHTML. It reached recommendation status in January 2000, and "XHTML 1.0 Transitional" (one of three types of XHTML 1.0, along with "Strict" and "Frameset") is still the most widely used version. Most successful websites use XHTML 1.0 Transitional in a way that ensures backwards compatibility with HTML 4.01 (the last version of HTML that the W3C created before switching to XHTML). This is accomplished by following the W3C guidelines for such backwards compatibility (at http://www.w3.org/TR/xhtml/#guidelines), and by having web servers return XHTML 1.0 Transitional documents with the MIME type "text/html" rather than the proper MIME type for XHTML, "application/xhtml+xml". The latter practice is needed because IE, as of the 2nd beta version of IE 8, does not support the MIME type "application/xhtml+xml". Using "text/html" also causes browsers that do support "application/xhtml+xml" to treat these pages as HTML, rendering them as HTML 4.01. However, developers still write this code to be both proper HTML 4.01 and proper XHTML 1.0 Transitional, in the expectation that some day all browsers will support XHTML.
Web servers often determine the MIME type of a document by analyzing the document's actual contents and guessing the type using heuristics. In the case of XHTML 1.0 Transitional documents, web servers typically see the first few lines of code (described below), determine the page to be XHTML 1.0 Transitional, and then return it with MIME type "text/html" for the reasons we've discussed. The web servers of popular web host Netfirms determine document types in this way.
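The kind of heuristic described above can be sketched in a few lines of Python. This is only an illustration, not how Netfirms or any particular web server is implemented: the function name guess_mime_type, the number of lines inspected, and the fallback type are all assumptions made for the example.

```python
# Hypothetical sketch of content-based MIME type guessing.
# It does not reproduce the logic of any real web server.

XHTML_TRANSITIONAL_FPI = "-//W3C//DTD XHTML 1.0 Transitional//EN"


def guess_mime_type(document_text, lines_to_check=5):
    """Guess a MIME type by inspecting the first few lines of a document."""
    head = "\n".join(document_text.splitlines()[:lines_to_check])
    if XHTML_TRANSITIONAL_FPI in head:
        # Returned as HTML for compatibility, rather than as
        # "application/xhtml+xml".
        return "text/html"
    # Assumed fallback for anything the heuristic does not recognize.
    return "application/octet-stream"


sample = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">\n'
    '<head><title>Example</title></head>\n'
    '<body></body>\n'
    '</html>\n'
)

print(guess_mime_type(sample))  # text/html
```

A real server would recognize many more document types; the point here is only that the decision is made from the document's opening lines rather than from its file name.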
There have been a few subsequent versions of XHTML released since 1.0, all of which are either meant only for certain device types or not yet in widespread usage.
One way that the elements used in an XML document can be defined is by using a Document Type Definition (DTD), which is a listing of all such definitions. This listing can be internal, appearing within the XML file, external, appearing in a separate file, or split into two parts using both types. The DTD used by an XML file is specified in a line called the Document Type Declaration (or DOCTYPE) with the following format:
<!DOCTYPE root-element-of-document
    PUBLIC "FPI of the DTD, if a PUBLIC DTD is used" "URI of external DTD"
    [ internal portion of DTD contents ]>
Sections of this line referring to either the external or internal portion of the DTD can be omitted if no such portion exists. If a PUBLIC DTD is used, the user agent will first try to locate it using only the FPI, resorting to the given URI only if that fails. In the case of an XHTML document, the W3C decides what elements are available, so the DTD is external, in a file created by the W3C. The DOCTYPE line specified above must always be the first line in an XHTML document (as opposed to the second in other XML documents). For XHTML 1.0 Transitional it translates to the following:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Note the URI of the DTD, which is stored on the www.w3.org website. Instead of pointing to this copy, authors are encouraged to download and use local copies of the DTD.
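To make the parts of a Document Type Declaration concrete, the following Python sketch splits one into its root element, FPI, URI and internal portion. The pattern and the function name parse_doctype are assumptions made for this illustration; the pattern covers only the PUBLIC form described above plus an optional internal portion, not every declaration XML permits.

```python
import re

# Illustrative pattern only: handles the PUBLIC form and an optional
# internal portion in square brackets.
DOCTYPE_PATTERN = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s\[>]+)'
    r'(?:\s+PUBLIC\s+"(?P<fpi>[^"]*)"\s+"(?P<uri>[^"]*)")?'
    r'\s*(?:\[(?P<internal>.*?)\])?\s*>',
    re.DOTALL,
)


def parse_doctype(text):
    """Return the parts of the first DOCTYPE found in text, or None."""
    match = DOCTYPE_PATTERN.search(text)
    if match is None:
        return None
    # Parts that were omitted from the declaration come back as None.
    return match.groupdict()


xhtml_doctype = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)

parts = parse_doctype(xhtml_doctype)
print(parts["root"])  # html
print(parts["fpi"])   # -//W3C//DTD XHTML 1.0 Transitional//EN
print(parts["uri"])   # http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
```

For the XHTML 1.0 Transitional declaration the internal portion is absent, so parts["internal"] is None, which matches the statement that the XHTML DTD is entirely external.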
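The html tag of an XHTML 1.0 Transitional document should have the following attribute-value pairs (the language code "en" is shown only as an example, for an English-language document):
xml:lang="en"
This sets the language for user agents rendering the document as XHTML.
lang="en"
This sets the language for user agents rendering the document as HTML.
xmlns="http://www.w3.org/1999/xhtml"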
XML allows an element or attribute name to be used more than once with different semantic meanings, for example because the document was combined with another document that has similarly named element types. These meanings are called namespaces. When each occurrence of an element specifies which namespace it belongs to, and each namespace has a unique name, any software that uses the parsed XML can treat occurrences of an element differently depending on their namespaces. The namespace for an element occurrence and all of its children is specified with the xmlns attribute, set equal to the name of the namespace. In the line above, we are therefore setting the highest-level element in an XHTML document to the namespace named after the URL shown. (Namespaces are given the name of a URL because URLs are unique: by using a URL from within their own website, authors can ensure that no XML document with which theirs is combined will have a namespace of the same name. The URL need not point to anything in particular.) User agents such as browsers recognize the namespace in the line above as the XHTML namespace and render the page accordingly.
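The effect of namespaces on software that consumes parsed XML can be seen with Python's standard xml.etree.ElementTree module. In the sketch below, the furniture namespace URL and the sample markup are invented for the example; only the XHTML namespace URL is the real one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical second namespace, invented for this example.
FURNITURE_NS = "http://example.com/furniture"

# Two elements are both named "table", but the xmlns attribute places each
# one (and its children) in a different namespace.
document = (
    f'<html xmlns="{XHTML_NS}">'
    '<body>'
    '<table><tr><td>a markup table</td></tr></table>'
    f'<table xmlns="{FURNITURE_NS}"><legs>4</legs></table>'
    '</body>'
    '</html>'
)

root = ET.fromstring(document)

# ElementTree reports every tag as "{namespace}localname", so the two
# "table" elements can be told apart after parsing.
table_tags = [el.tag for el in root.iter() if el.tag.endswith("}table")]

print(root.tag)       # {http://www.w3.org/1999/xhtml}html
print(table_tags[0])  # {http://www.w3.org/1999/xhtml}table
print(table_tags[1])  # {http://example.com/furniture}table
```

Because the children of each element inherit its namespace, the legs element also belongs to the furniture namespace, while tr and td belong to the XHTML namespace.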