As I began marking up my website with microformats for IWC London 2020, I kept bumping into the same feeling of discomfort. I like the separation of data, presentation, content etc. and consider it a fundamentally important part of the way the web works. Not just because it makes debugging easier (you know where to look), but also because it makes websites simpler to understand and therefore more accessible.
For me, the class attribute is inextricably linked with the presentational layer, i.e. CSS. Alongside id, class is the main hook for CSS to style an element, but unlike id it serves no other purpose (though we'll come back to that). An element's id is at least targetable in the URL, allowing you to segment a page into linkable sections; all class does is define hooks for other languages, like JavaScript and CSS, to play off.
But microformats live in the class attribute. If you want to mark up a page using the likes of h-entry, h-feed, p-author etc. you put them in the class attribute. I almost asked about this on the IWC chat, getting as far as writing (but never posting):
A question about microformats: do you have to assign them to the class attribute, or is that just the simplest implementation?
Before bailing and doing some research by myself. It felt too obvious a question not to have been debated, especially by a group of people that formed around setting semantic standards. Turns out I was very right about that; there's a lot of info out there within the community.
My first port of call was data- attributes. These were introduced in HTML5[1], have been widely supported for over a decade, and feel like a solid candidate. The whole point of data- attributes is to allow you to pass, well, data around a website. For instance, say you have an ingredients list for your favourite recipe. You could mark up each item with a data-foodtype attribute that states if it's meat, dairy, veg, fruit, or other. Then, on the page, you could have a JS script that automatically adds relevant icons next to each type, or even style the ingredients differently using CSS[2]. You could even have a search feature which allows a user to filter based on dietary requirements that way, but I digress.
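To make the ingredients idea concrete, here is a minimal sketch of the dietary filter. It is written in Python (standard library only) rather than browser JavaScript purely so it can be run anywhere. The markup, the data-foodtype values, and the names IngredientParser and suitable_for are all assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical recipe markup; the food types are invented for this example.
RECIPE_HTML = """
<ul>
  <li data-foodtype="meat">Chicken thighs</li>
  <li data-foodtype="dairy">Butter</li>
  <li data-foodtype="veg">Onion</li>
  <li data-foodtype="fruit">Lemon</li>
</ul>
"""


class IngredientParser(HTMLParser):
    """Collects (food type, ingredient name) pairs from <li> elements."""

    def __init__(self):
        super().__init__()
        self.ingredients = []
        self._current_type = None  # food type of the <li> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # attrs arrives as a list of (name, value) tuples.
            # Items without the attribute fall back to "other".
            self._current_type = dict(attrs).get("data-foodtype", "other")

    def handle_data(self, data):
        text = data.strip()
        if self._current_type is not None and text:
            self.ingredients.append((self._current_type, text))

    def handle_endtag(self, tag):
        if tag == "li":
            self._current_type = None


def suitable_for(ingredients, excluded_types):
    """Return ingredient names whose food type is not in excluded_types."""
    return [name for food_type, name in ingredients
            if food_type not in excluded_types]


parser = IngredientParser()
parser.feed(RECIPE_HTML)
print(suitable_for(parser.ingredients, {"meat", "dairy"}))
```

In a browser the same information would be read from each element's dataset property, but the principle is identical: the attribute carries data that the site's own scripts know how to interpret.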
The point is that the data- attribute feels like a good fit for microformats. It allows an element to contain information relevant to understanding its contents or relationship to the page/website, without making that information visible to the user. To me that's a 1:1 use case with microformats, but apparently I'm just wrong 🤷‍♀️ Not only is the microformats2 spec (the latest version) clear that they should only be set in the class attribute, the WHATWG HTML5 spec expressly forbids their placement in the data- attribute:
These attributes are not intended for use by software that is not known to the administrators of the site that uses the attributes.
In other words, publicly accessible information, like microformats, cannot be considered valid data on an HTML element. Damn. Coincidentally, the microformats group agrees, as do sites like MDN and HTML5 Doctor. Guess that's a lost battle then.
Okay, so we can't use the permitted HTML5 custom attribute syntax, but microformats are a competing standard with their own vocabulary and ecosystem. They could have decided to use an entirely new attribute, such as format or even microformat. Sure, browsers wouldn't necessarily do anything with it, but that's not the point: you're creating a whole new standard, so why not create a new attribute?
Well, whilst I couldn't find any evidence that this discussion had been had, I did uncover some interesting bits and pieces about the class attribute which effectively kill my logic. The microformats FAQ has a section dedicated to "Class interactions"[3] which outlines the semantic logic behind using the class attribute, as well as another question on whether this is "...sneaking presentation back into data"[4], both of which have solid answers:
[This is] based on a misunderstanding of the way the class attribute in HTML was designed. Yes, class is very commonly, and appropriately used by web designers in conjunction with CSS to style pages... but despite this, class, according to the HTML specification "has several roles in HTML", including "for general purpose processing by user agents".
Microformats utilize this second aspect of the class (and id) attribute, and do so legitimately. It is not an abuse of the class or id attribute to use it to add semantic context to a document. Nor is the use of class in and of itself presentational - in fact, it is an important mechanism for separating presentation from structured content.
Basically, remember when I said at the top of this article that we would come back to my assertion that class isn't used for anything other than a presentational or interactive hook? Yeah, turns out that's not correct. The spec allows for a broader interpretation thanks to the inclusion of "general purpose processing", which is why the microformats group has chosen to place them within it.
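As a rough illustration of what "general purpose processing" can look like, the sketch below reads the class attribute as a list of space-separated tokens and separates microformats2-style names from everything else. The markup, the styling classes card and highlight, and the ClassTokenParser name are hypothetical. The prefix check is also deliberately loose: a real microformats2 parser applies stricter naming rules than a simple startswith test.

```python
from html.parser import HTMLParser

# Hypothetical post markup: "card" and "highlight" stand in for styling hooks.
POST_HTML = """
<article class="card h-entry">
  <h1 class="p-name highlight">Why the class attribute?</h1>
  <span class="p-author">Jane Example</span>
</article>
"""

# Prefixes used by microformats2 class names.
MF2_PREFIXES = ("h-", "p-", "u-", "dt-", "e-")


class ClassTokenParser(HTMLParser):
    """Sorts every class token into microformats names or other classes."""

    def __init__(self):
        super().__init__()
        self.microformats = []
        self.other = []

    def handle_starttag(self, tag, attrs):
        # A missing or valueless class attribute is treated as empty.
        class_value = dict(attrs).get("class") or ""
        for token in class_value.split():
            if token.startswith(MF2_PREFIXES):
                self.microformats.append(token)
            else:
                self.other.append(token)


class_parser = ClassTokenParser()
class_parser.feed(POST_HTML)
print(class_parser.microformats)
print(class_parser.other)
```

The same attribute ends up serving two consumers at once: a stylesheet can target card and highlight, while a parser picks out h-entry, p-name, and p-author and ignores the rest.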
Which is fine; that makes sense. The problem is that by this point, I was knee deep in three separate specifications and had about a dozen other tabs open. In other words, I still don't feel that the reasoning behind using the class attribute is intuitive. After all, there's the actual definition of a given element, and then there's the accepted definition and, to my mind, the accepted definition of class doesn't sit that well with it becoming a data layer.
Except, it doesn't look like the indieweb community considers microformats a data layer, but a content layer, i.e. a "human-readable" one. At this point my brain broke. The very next question on the microformats FAQ explicitly states that "human-readable" information should not be hidden, then links out to the guiding principles which claim that microformats should be designed to be human first and visible. So I'm clearly missing some point somewhere, because as far as I'm concerned microformats are designed for machine readability and are hidden by design.
After all, microformats are quirkily named variables that are at least partially opaque (the heck does h-entry denote, let alone the utterly esoteric u-key); yes they use English language terms, so you can guess that p-location is a place, but you still have to learn what the prefixes p, h, u, and e mean. Nor do humans really need that level of categorisation - but machines do. A human will see "San Francisco" and think "place"; only a machine needs a specific block of markup to ensure that is clear. And even if the claim is that people would benefit from them, they're set on an HTML attribute which is, by nature, not visible to the end user.
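For reference, the prefixes can be decoded mechanically, which rather underlines how machine-oriented they are. The sketch below is an assumption-laden summary, not a real parser: the describe function is hypothetical, and the one-line meanings are my paraphrase of the microformats2 prefix conventions (which also include dt- for dates and times, alongside the four mentioned above).

```python
# Paraphrased meanings of the microformats2 class-name prefixes.
PREFIX_MEANINGS = {
    "h": "root (starts a new microformat object)",
    "p": "plain-text property",
    "u": "URL property",
    "dt": "date/time property",
    "e": "embedded markup property",
}


def describe(class_name):
    """Return a readable description of a prefixed class name, or None."""
    prefix, separator, name = class_name.partition("-")
    if not separator or not name or prefix not in PREFIX_MEANINGS:
        return None  # not a recognised microformats-style name
    return f"{name}: {PREFIX_MEANINGS[prefix]}"


for example in ("h-entry", "u-key", "p-location", "card"):
    print(example, "->", describe(example))
```

So u-key is "a URL property called key" and p-location is "a plain-text property called location", which is easy for a parser to act on and still fairly opaque to a person reading the page source.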
Which ultimately brought me back around in a loop and left me questioning why I needed microformats in the first place. I thought I was doing this so that machines and web services could accurately interact with my digital representation - my website - and that made some sense[5], but apparently that isn't the point? Except when you look at those self-same services, like indiewebify.me, they state that microformats are useful so that:
...other people's software can understand [your profile information] and use it for [indieweb services].
Cool, that makes sense. I'm glad no one seems to agree 😂.
So, ultimately, what did I learn here? Well, I'm still not a fan of setting data-based definitions within the class attribute, but I can at least understand the logic behind that decision and admit there probably isn't a better solution. Beyond that... I'll just keep myself to myself and not worry about it so much 😉