A first-hand account of how marking individual words or short, inline phrases as a different language (even when accurate) can be a jarring and inaccessible experience for many screen reader users.
The author strongly advocates for simply never using the HTML lang attribute (outside of the root element, though even there they show scepticism about its benefit), and only breaking that rule where an entire paragraph or section of a page has been translated or quoted in a different language from the rest of the text.
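To make the recommendation concrete, here's a rough sketch of the distinction being drawn (hypothetical markup, not taken from the article itself):

```html
<!-- lang on the root element: uncontroversial baseline usage -->
<html lang="en">

<!-- The pattern the author endorses: a whole passage quoted in another language -->
<blockquote lang="de">
  <p>Dieser gesamte Absatz ist auf Deutsch.</p>
</blockquote>

<!-- The pattern the author argues against: a single inline word
     forcing a mid-sentence voice switch -->
<p>We ordered a <span lang="fr">croissant</span> at the café.</p>
```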
Coming from an actual screen reader user, the account is hard to dismiss, despite its indignant tone and heavy phrasing. That said, the article has driven a fair amount of online discourse, much of which has questioned the strength of the argument or added genuinely useful commentary and insight.
In particular, I found Daniela Kubesch's thread on Mastodon very useful. The following are some excerpts from the ensuing discussion:
A solid explanation in support of the argument from Marco Zehe (also a screen reader user):
Kerstin Probiesch has also written an article about this, although it heavily emphasizes the case of PDF documents. The problem is that, for the screen reader to switch the language, one voice has to stop mid-sentence, switch to a voice that speaks the other language, speak the word or phrase, then switch back, which can take half a second each time. It ruins the whole sentence melody; it sounds as if you were to take a breath mid-sentence, disconnecting the words in a very unnatural manner for no apparent reason. I find it quite annoying if it happens too often.
A supportive stance from Eric Eggert:
This is what we have been teaching for 10+ years. Individual words should not be marked with different languages.
An opposing view from Léonie Watson:
FWIW, the opinions expressed in that article do not reflect my own (as a screen reader user).
Some excellent points from James Scholes diving into some of the nuance behind the topic:
The experiences of users learning their first second language versus those already fluent in multiple ones will differ significantly. Similarly, people dealing with multiple languages from the same family will have different experiences compared to those who frequently switch between languages with very different alphabets.
Finally, I don't think the article fairly or accurately apportions the blame. It's mostly aimed at standards bodies, without acknowledging that screen readers and speech synthesizers don't always handle multiple languages well.
And, from Nic, a pertinent example of how usage should probably consider the specific combination of languages involved:
I haven't tested this thoroughly, but I feel like this post overlooks the existence of other languages? For example, in Chinese/Japanese/Korean, the lang attribute is needed to show the correct glyph, and I don't know how a screen reader would otherwise even attempt to pronounce it, since the characters are all pronounced completely differently in different languages.
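Nic's point maps onto Han unification: Unicode assigns a single code point to many characters shared across Chinese, Japanese, and Korean, so the lang attribute is what lets the browser select regionally appropriate fonts and glyph forms. A minimal illustration (assuming suitable CJK fonts are installed):

```html
<!-- U+76F4 (直) is one shared code point, but its conventional glyph shape
     differs between Simplified Chinese and Japanese typography;
     lang lets the browser pick the right rendering -->
<p lang="zh-Hans">直</p>
<p lang="ja">直</p>
```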
On the specific WCAG SC being debated:
Success Criterion 3.1.2 Language of Parts: The human language of each passage or phrase in the content can be programmatically determined except for proper names, technical terms, words of indeterminate language, and words or phrases that have become part of the vernacular of the immediately surrounding text.
On the core misinterpretation (as presented):
The Success Criterion says phrase or passage, not words.
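Read that way, the SC's exceptions already cover most inline cases. A hypothetical contrast (my own illustrative markup, not from the thread):

```html
<!-- "Zeitgeist" has become part of the English vernacular,
     so SC 3.1.2 exempts it: no lang needed -->
<p>The film captured the zeitgeist of the decade.</p>

<!-- A full foreign phrase is a "phrase or passage",
     so marking it up is appropriate -->
<p>She quoted Kennedy: <q lang="de">Ich bin ein Berliner</q>.</p>
```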
On a less well known piece of screen reader functionality:
First of all, modern screen readers have dictionaries in which the pronunciation of common foreign words is also specified.
On how jarring narrator switches can be mid-sentence:
This means that the actual purpose of the language change is completely missed: the person at the other end of the line did not understand the word at all, either because they cannot understand native-speaker French or because they did not cognitively register the switchover.