theAdhocracy | Six common HTML myths

Source
Link to Original 🔗
Published
15 Nov 24
Categories
HTML & CSS, Frontend
Tags
HTML, HTML parser

A list of common misconceptions about HTML, with lots of excellent detail about how HTML parsers actually work.

(Though I'm not sure how common, or even controversial, some of them are; I'm not sure anyone is arguing for XHTML these days, are they?)

On the death of HTML4:

HTML4 is just outright dead. Browsers do not parse HTML documents as HTML4, regardless of the DOCTYPE.

On a genuinely solid reason not to use "self-closing" elements in HTML:

Lastly, because HTML parsing rules are not the same as SGML’s or XML’s, the trailing solidus carries an additional danger. If it directly follows unquoted attribute values without proper space before it, then it will be parsed as part of that attribute value.
• <img class=wide/> is an IMG with class value “wide/”.
• <img class=wide /> is an IMG with class value “wide”.

So don’t add that trailing solidus! It’s not better; it’s just more dangerous.
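To see the difference for myself, here is a minimal sketch using Python's built-in html.parser module. That module is a lenient tokenizer rather than a full spec-compliant HTML parser, so treat it as an approximation of browser behaviour; the AttrCollector class and class_of helper are names I made up for illustration.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []  # list of (tag, attrs-as-dict) tuples

    def handle_starttag(self, tag, attrs):
        # Also called for "<img ... />", via the default handle_startendtag.
        self.seen.append((tag, dict(attrs)))


def class_of(markup):
    """Return the class attribute of the first tag found in `markup`."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen[0][1].get("class")


print(repr(class_of("<img class=wide/>")))   # 'wide/'
print(repr(class_of("<img class=wide />")))  # 'wide'
```

Without the space, the solidus is swallowed into the unquoted value, which matches the two bullets quoted above.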

On how we need more HTML parsers:

The world needs more HTML parsers with different applications. Most available HTML parsers produce a DOM – a fully-parsed tree representation of the DOM the HTML represents. This is a memory-heavy operation and performs a lot of semantic cleanup to form a proper DOM document. However, lots of operations don’t need or want a DOM interface.
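As a rough illustration of an operation that doesn't need a DOM, the sketch below pulls link targets out of markup using Python's event-based html.parser, which never builds a tree. It is only an assumption-laden example, not a recommendation from the article: LinkCollector and extract_links are hypothetical names, and html.parser is not a fully spec-compliant HTML parser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags, no tree built."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # value is None for attributes written without "=value".
            if name == "href" and value is not None:
                self.links.append(value)


def extract_links(chunks):
    """Feed the markup piece by piece, as if it were arriving over a network."""
    parser = LinkCollector()
    for chunk in chunks:
        parser.feed(chunk)  # incomplete tags are buffered until more data arrives
    parser.close()
    return parser.links


# The second <a> tag is deliberately split across two chunks.
chunks = ['<p>See <a href="/one">one</a> and <a hr', 'ef="/two">two</a>.</p>']
print(extract_links(chunks))  # ['/one', '/two']
```

Only the list of links is kept in memory; the document itself is discarded as it streams past.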

On how XML parsers (and RegEx, and string functions, and any other suggested alternative) will never actually be able to parse HTML; only an HTML parser can:

There’s significant advice on the web to use an XML parser for HTML, but it’s not safe. Reach for an HTML parser to parse HTML.
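A quick way to convince yourself: perfectly valid HTML is often not well-formed XML. The sketch below feeds the same snippet to Python's xml.etree.ElementTree and to html.parser; the helper names (parses_as_xml, start_tags, TagLogger) are hypothetical, and html.parser stands in for "an HTML parser" even though it is more lenient than a browser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Valid HTML: <br> is a void element and has no closing tag.
VALID_HTML = "<p>Line one<br>line two</p>"


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagLogger(HTMLParser):
    """Hypothetical helper: records the name of each start tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags an HTML-aware tokenizer sees in the markup."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.tags


print(parses_as_xml(VALID_HTML))  # False: the XML parser rejects the unclosed <br>
print(start_tags(VALID_HTML))     # ['p', 'br']
```

The XML parser gives up at the first void element, whereas the HTML tokenizer carries on as a browser would.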

Murray Champernowne
