(X)HTML5 CHARACTER ENCODING

If you have read the Web Applications 1.0 specification, you may have come away believing UTF-8 to be the sole acceptable character encoding. Likewise, after reading the WHATWG Wiki’s page on the differences between HTML and XHTML, you may have noted its claim that (X)HTML5 supports the UTF-8 character encoding only. That isn’t so.

On December 5, 2006, I sent this message:

“HTML5/XHTML5 specs _implicitly_ state UTF-8 is required. None of the specs – as far as I can see – state UTF-8 as _explicitly_ required, i.e., USE UTF-8 OR DIE.

“So,

“Is UTF-8 the sole acceptable charset for HTML5?”

Ian Hickson replied on December 5, 2006,

“No, any character set can be used. We haven’t quite defined how character encodings work yet, but basically it’ll be whatever IE7 supports today.”

That would be nearly every charset.
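
To make the point concrete, here is a minimal Python sketch, not anything taken from the spec or from Ian’s reply. It assumes a made-up snippet of markup (`SAMPLE`) and an arbitrary handful of charsets (`CHARSETS`), and the helper `round_trip` is purely illustrative. It shows that the same markup survives a trip through several encodings, provided the reader decodes with the same charset the writer used.

```python
# Hypothetical sample markup; the text is not from the original post.
SAMPLE = "<p>café</p>"

# An arbitrary selection of charsets: UTF-8 plus a few alternatives.
CHARSETS = ["utf-8", "iso-8859-1", "windows-1252", "utf-16"]


def round_trip(text: str, charset: str) -> str:
    """Encode text the way a server might, then decode it the way a
    browser might once it has been told which charset was used."""
    raw = text.encode(charset)
    return raw.decode(charset)


# Decoded text per charset, and the size of the encoded bytes per charset.
results = {cs: round_trip(SAMPLE, cs) for cs in CHARSETS}
byte_lengths = {cs: len(SAMPLE.encode(cs)) for cs in CHARSETS}

for cs in CHARSETS:
    print(cs, byte_lengths[cs], results[cs])
```

The bytes on the wire differ from charset to charset, but the decoded markup is identical each time. What matters is that the encoding is declared and honoured, not that it is UTF-8.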