[OT, META] For Approval: BSD License for UCAR's VAPOR Software

David Woolley david at djwhome.demon.co.uk
Sat Aug 12 16:04:19 UTC 2006


> The "text/html" media type is suitable for XHTML1 within the
* transitional profile. See also RFC2854:

XHTML1 doesn't really support namespaces when served as text/html, and
the text/html compatible subset has to conform to some quite strict
requirements.  If it did support them, it could still not allow the
default namespace to be changed locally (not that the Word documents do
that), as that could cause an HTML browser to start honouring elements
with the same name as those in HTML, or one of its proprietary extensions.

Specifically, mixed namespaces are "not strictly conforming"
<http://www.w3.org/TR/xhtml1/#ref-xmlns>.

The W3C guidelines on the use of media types
<http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020801/> specifically
say that text/html is not suitable for mixed namespace documents:

   In particular, 'text/html' is NOT
   suitable for XHTML Family document types that adds elements and
   attributes from foreign namespaces,
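
As a rough way to check a document against that wording, here is a
minimal sketch in Python.  It assumes the document is well-formed XML,
and the function name, the sample markup and the choice to ignore the
reserved xml: namespace are mine, not anything taken from the W3C note.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# The reserved xml: prefix (xml:lang, xml:space) is not treated as foreign here.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def foreign_namespaces(document):
    """Return namespace URIs, other than XHTML's, used by elements or attributes."""
    found = set()
    for element in ET.fromstring(document).iter():
        for name in [element.tag, *element.attrib]:
            # ElementTree spells qualified names as "{namespace-uri}local-name".
            if name.startswith("{"):
                uri = name[1:].split("}", 1)[0]
                if uri not in (XHTML_NS, XML_NS):
                    found.add(uri)
    return found


# Illustrative markup in the style of a Word export (an assumption, not
# copied from the document under discussion).
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:o="urn:schemas-microsoft-com:office:office">'
    "<head><title>t</title></head>"
    "<body><p>text</p><o:p>more</o:p></body></html>"
)

print(foreign_namespaces(SAMPLE))  # {'urn:schemas-microsoft-com:office:office'}
```

A non-empty result means the document mixes namespaces, which is the
case the note says text/html is not suitable for.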

* the content. Remember that XHTML remains backward compatible with the
* previous HTML4.1 standard, provided that some precautions are taken in

XHTML, even XHTML 1.0, is most definitely not backwards
compatible.  There is a subset of XHTML 1.0, defined in appendix C
<http://www.w3.org/TR/xhtml1/#guidelines> of its specification, that is
considered safe to serve as text/html (although a lot of what purports
to be XHTML on the web is not XML and, even if XML, is not within the
safe subset).

That subset relies on certain error recovery and loose parsing in
real-life web browsers, and tries to avoid constructs that would parse
differently (e.g. TBODY must be explicit).  A truly conformant HTML
browser would throw it out, as the / in the <br /> type notation has a
different meaning in SGML.

XHTML 2.0 would be largely meaningless to an HTML browser because it
is essentially a new language.

> However, the (X)HTML document should better include the required DOCTYPE
* declaration for the core XHTML standard before the opening <html ...> tag,

Which the example didn't.

* because it defines precisely the content model for the default anonymous
* namespace (the other option is to specify the default xmlns="..." as a

When you serve XHTML as text/html you are serving it to browsers that
aren't required to understand XML or XHTML, so providing an XHTML 
DOCTYPE is no guarantee that they will actually apply XHTML rules.  IE
certainly doesn't reject not-well-formed documents.

In fact, the above mentioned W3C guideline on the use of text/html
says:

   XHTML documents served as 'text/html' will not be processed as XML
   [XML10], e.g. well-formedness errors may not be detected by user
   agents. Also be aware that HTML rules will be applied for DOM and
   style sheets (see C.11 and C.13 of [XHTML1] respectively).
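
The difference in error handling can be seen with a small Python
sketch.  The standard library's html.parser is only a stand-in for a
tolerant text/html consumer, not a model of any real browser, and the
sample markup and helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Not well formed: the <p> element is never closed.
BROKEN = "<html><body><p>unclosed paragraph</body></html>"


def accepted_as_xml(text):
    """Return True if an XML parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Record start tags; the tolerant parser raises no well-formedness error."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(text):
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


print(accepted_as_xml(BROKEN))   # False: the XML parser rejects the document
print(html_start_tags(BROKEN))   # ['html', 'body', 'p']: no error is reported
```

So the same bytes are a fatal error under XML rules and pass silently
under HTML-style parsing, which is why testing only as text/html tells
you nothing about well-formedness.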

* pseudo-attribute of the root <html> element of the document. Otherwise
* browsers will use the legacy compatibility mode and may not present the
* content according to the strict parsing rules (and this may change the
* behavior of various HTML elements, or the interpretation of some CSS
* styles, notably the style inheritance rules.

Most of the CSS variations are due to omitted opening tags in HTML,
which result in an implied element in HTML but nothing in XHTML.  The
rules that permit XHTML 1.0 to be served as text/html basically restrict
it to those cases where there is no such ambiguity.
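
The TBODY case is the usual example.  The sketch below, in Python, only
shows the XML side: an XML parser builds exactly the tree that was
written, so TR is a direct child of TABLE unless TBODY is spelled out.
An HTML parser would infer the TBODY, so a selector such as
"table > tr" would match under XML rules but not under HTML ones.  The
sample tables and the helper name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

IMPLICIT = "<table><tr><td>cell</td></tr></table>"
EXPLICIT = "<table><tbody><tr><td>cell</td></tr></tbody></table>"


def child_names(markup):
    """Return the tag names of the root element's direct children."""
    return [child.tag for child in ET.fromstring(markup)]


print(child_names(IMPLICIT))  # ['tr']: XML does not infer a tbody element
print(child_names(EXPLICIT))  # ['tbody']: same tree under both sets of rules
```

Writing the TBODY explicitly is what removes the ambiguity, since both
kinds of parser then end up with the same element structure.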

> 
> There are three flavors of XHTML 1.0 standardized: Strict, Transitional

These simply reflect the corresponding flavours of HTML 4 that they
are trying to approximate.  XHTML 1.0 attempts to be as close to
HTML 4.01 as is possible within the constraints of XML.

> See:
> http://www.w3.org/TR/xhtml1/#docconf

This only applies to documents served with an XML media type and makes
them not strictly conforming.

> for more hints about how to make conforming XHTML documents that will
* parse correctly under HTML4.

The rules are in appendix C of the specification, not in the
conformance section that you just referenced, but they result in
good behaviour under HTML 4 browsers, not in good HTML 4.  They are also
full of subtle catches that people don't notice, because they don't
actually test on XHTML compliant browsers.

For a better written explanation of why sending XHTML as text/html is a
bad thing, see "Sending XHTML as text/html Considered Harmful"
<http://hixie.ch/advocacy/xhtml>.  Hixie is one of the technical leads on
Opera and also worked for Netscape.

In the Word case it is simply embrace and extend, but most people use
XHTML 1.0 for fashion reasons, not because they have any real need to
do so.
