[META] For Approval: BSD License for UCAR's VAPOR Software

Philippe Verdy verdy_p at wanadoo.fr
Sat Aug 12 14:11:39 UTC 2006


From: "David Woolley" <david at djwhome.demon.co.uk>
>> Mr. Wilson and others involved in license-discuss:  We have buy-in to 
> 
> ....
>> Content-Type: text/html; charset="us-ascii"
>> 
>> <html xmlns:v="urn:schemas-microsoft-com:vml"
> 
> (Note xmnls: is invalid under text/html, and the namespace is a Microsoft
> proprietary one.)

It is definitely NOT invalid, because "text/html" includes XHTML which *requires* XML conformance, and so includes the support of XML namespaces. See:
http://www.w3.org/TR/xhtml1/#media
and
http://www.w3.org/TR/2002/NOTE-xhtml-media-types
The "text/html" media type is suitable for XHTML1 within the transitional profile. See also RFC2854:
http://www.rfc-editor.org/rfc/rfc2854.txt
Another requirement is that the "text/html" MIME type requires lines to end with CR+LF sequences only (not a bare CR or LF) in MIME-compatible transport layers, regardless of the character encoding ('charset') involved, even though XHTML itself allows several variants under XML parsing rules. (However, HTML and XHTML are conveniently transported via HTTP, which does not strictly conform to MIME, so HTML or XHTML documents may use other end-of-line encodings there.)
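
As a rough illustration of the CR+LF rule, here is a minimal Python sketch that normalizes mixed line endings before a body is handed to a MIME transport. The function name to_crlf and the sample string are made up for this example; they come from no standard.

```python
import re

def to_crlf(text: str) -> str:
    """Rewrite every line ending (CRLF, bare CR, or bare LF) as CRLF."""
    # The alternation tries "\r\n" first, so an existing CRLF is not doubled.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)

# Sample body mixing all three end-of-line conventions.
sample = "<html>\n<body>\r</body>\r\n</html>"
normalized = to_crlf(sample)
```

After this step every line break in normalized is a CR+LF pair, whichever convention the authoring tool used.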

And with the correct namespace declaration, the Microsoft extension is perfectly valid. Browsers are NOT required to recognize it, but they MUST still support the XHTML core and should display the rest of the content. Remember that XHTML remains backward compatible with the previous HTML 4.01 standard, provided that some precautions are taken in addition to strict XML conformance. Notably, empty elements like <br /> or <img src="..." ... /> *must* be closed with a /> in XHTML, but there must be a space before it to allow correct parsing under legacy HTML 3 or 4, where the / will be interpreted as an invalid attribute name and ignored.

However, the (X)HTML document should include the required DOCTYPE declaration for the core XHTML standard before the opening <html ...> tag, because it defines precisely the content model for the default anonymous namespace (the other option is to specify the default xmlns="..." as a pseudo-attribute of the root <html> element of the document). Otherwise browsers will use the legacy compatibility mode and may not present the content according to the strict parsing rules. This may change the behavior of various HTML elements or the interpretation of some CSS styles, notably the style inheritance rules.

Three flavors of XHTML 1.0 are standardized: Strict, Transitional and Frameset. The Transitional flavor has excellent compatibility with HTML 4. The exceptions are some unnecessary elements that were removed from the core document model, plus the requirements for lowercase element and attribute names, quotes around all attribute values, and unminimized attributes (e.g. <dl compact="compact"> instead of just <dl compact>).
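
The quoting and minimization rules follow directly from XML well-formedness, which can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper is_well_formed and the three fragments are only illustrations. It tests well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# HTML 4 style: minimized attribute, rejected by an XML parser.
minimized = "<dl compact><dt>term</dt><dd>definition</dd></dl>"
# HTML 4 style: unquoted attribute value, also rejected.
unquoted = "<dl compact=compact><dt>term</dt><dd>definition</dd></dl>"
# XHTML style: attribute written out in full and quoted.
expanded = '<dl compact="compact"><dt>term</dt><dd>definition</dd></dl>'

results = {
    "minimized": is_well_formed(minimized),
    "unquoted": is_well_formed(unquoted),
    "expanded": is_well_formed(expanded),
}
```

Only the expanded form gets through the XML parser, while an HTML 4 parser accepts all three.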

An additional requirement is that embedded scripts must protect occurrences of the < and & characters (used as operators), which XML would otherwise parse as the start of a tag or of an entity reference. This requires using <![CDATA[...]]> sections or external scripts, and it is not possible to use SGML exclusions.
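
To show what goes wrong with an unprotected script, here is a small sketch using Python's standard xml.etree.ElementTree module. The one-line script body and the variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

SCRIPT_BODY = "if (a < b) { run(); }"

# Without protection, the "<" operator is read as the start of a tag.
unprotected = f"<script>{SCRIPT_BODY}</script>"
# Inside a CDATA section, the same text is treated as plain character data.
protected = f"<script><![CDATA[{SCRIPT_BODY}]]></script>"

try:
    ET.fromstring(unprotected)
    unprotected_ok = True
except ET.ParseError:
    unprotected_ok = False

# The parser hands back the script text unchanged.
recovered = ET.fromstring(protected).text
```

The XML parser rejects the unprotected form, while the CDATA form round-trips the script text exactly.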

So conforming MIME and document headers for an XHTML 1.0 Transitional document that stays compatible with HTML4 parsers, with conforming support of the MS VML extension, would be this:

...
Content-Type: text/html; charset=US-ASCII
....

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml">
<head>
<title>...</title>
</head><body>
...
</body></html>

(Note that the initial XML declaration is optional and not required when the encoding is UTF-8 or US-ASCII: US-ASCII is a strict subset of UTF-8, the XML default version is 1.0, and the XML default encoding is UTF-8. The default charset is US-ASCII in MIME, or ISO-8859-1 for "text/*" subtypes in HTTP, and ISO-8859-1 is itself a superset of US-ASCII.)
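
The headers above can be sanity-checked with a generic XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The title text, the <p> paragraph and the <v:rect /> element are placeholders added for illustration, since the example above elides the actual content. The parser checks well-formedness and namespace handling only; it does not fetch or apply the DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
VML_NS = "urn:schemas-microsoft-com:vml"

XML_DECL = '<?xml version="1.0" encoding="US-ASCII"?>\n'
BODY = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    f'<html xmlns="{XHTML_NS}" xmlns:v="{VML_NS}">\n'
    "<head>\n<title>Sample</title>\n</head><body>\n"
    "<p>Core content</p>\n"
    "<v:rect />\n"  # placeholder VML element, only for illustration
    "</body></html>\n"
)

def parse_ascii(document: str) -> ET.Element:
    """Parse the document from US-ASCII bytes, as it would travel on the wire."""
    return ET.fromstring(document.encode("ascii"))

def tags(root: ET.Element) -> list:
    """List the namespace-qualified tag of every element, in document order."""
    return [element.tag for element in root.iter()]

with_decl = parse_ascii(XML_DECL + BODY)
without_decl = parse_ascii(BODY)
```

Both variants yield the same element tree. The root is reported in the XHTML namespace, and the v: element is kept under the VML namespace rather than rejected, which is the behavior expected of a namespace-aware consumer that does not know the extension.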

See:
http://www.w3.org/TR/xhtml1/#docconf
for more hints about how to make conforming XHTML documents that will parse correctly under HTML4.
(And forget HTML3 now!)

See also section 5.2 of the HTML 4.0 standard...



