Making writing readable

How do you make your writing as accessible as possible? Plain text – a system of simplifying the words and phrases used to reduce overall complexity – is an "easy" solution, and I've never seen a better explanation and overview of the practice than this article.

There's a huge amount of useful information here: the history of plain text, the reasoning behind it, the methods that experts use to translate technical or lofty literature into it, and the existing attempts at automating that translation or ranking text by accessibility (seemingly tied to education level, with a focus on the US "grade" system). Spoiler: those models aren't particularly useful – who'd have thunk!

But best of all, the entire article is written in both plain text and the original "natural" language (non-plain text?). You can choose which you prefer via a top-level setting, or toggle each paragraph between the two, to see what the difference is directly. That's a really clever feature and I found the comparisons fascinating. Here's a quick example:

[Animation: a paragraph changing between plain text and academic text.]
It's not a fancy animation, but the added context for the article is brilliant and massively helped my understanding of the subject matter.

In plenty of places, I was interested to see that the information conveyed subtly differed. In some paragraphs, useful context (often causal inference) had been wholesale removed from the plain text version, whilst in others, the plain text provided additional details that made an explanation either more accurate or more specific. I also found some parts of the plain text surprisingly difficult to parse; my brain just isn't used to seeing fragmented sentences like this, I guess.

Here's an example translation, with the plain text second:

Additionally, there is a tendency to censor content for these audiences rather than explain it, which can contribute to continued disparities, like the higher rate at which people with ID experience sexual violence than nondisabled people.
Writers will censor writing for these groups. To censor something means to take out information the writer thinks is not appropriate. Taking out information can make some problems worse.

For example, people with ID experience sexual violence more than nondisabled people. But some writers think people with ID should not read about sex or sexual violence. So, readers don’t have all the information they need.

To me, it seems fairly clear that the plain text version does a much better job of explaining what the original text is implying: chiefly, that removing information about sex or sexual violence leaves people with ID (intellectual disabilities) at greater risk of sexual violence themselves. They are denied access to the information that might allow them to understand what is and isn't "normal" or even legal, so they may not realise that their treatment is unusual or problematic. However, I also find the plain text harder to read; I had to read its first paragraph several times to understand what it was saying.

Regardless, the article is a treasure trove of information and adds a huge amount of useful context to the discussion about making writing as accessible as possible 👏👏

On what plain text is and why it is important:

Writing text that can be understood by as many people as possible seems like an obvious best practice. But from news media to legal guidance to academic research, the way we write often creates barriers to who can read it. Plain language—a style of writing that uses simplified sentences, everyday vocabulary, and clear structure—aims to remove those barriers.

On the Flesch-Kincaid formula, a model for measuring the simplicity of language in a given text:

The Flesch-Kincaid formula measures two things:
  • How long words are.
  • How many words are in a sentence.
The formula says the shorter the words and sentence, the easier it is to read.
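
To make that concrete, here's a minimal Python sketch of the Flesch-Kincaid grade level calculation. The constants (0.39, 11.8, 15.59) are from the published formula, which measures word length in syllables. The `count_syllables` helper is a rough vowel-group heuristic and purely an assumption for illustration; real tools count syllables far more carefully, and the sentence splitting here is equally naive.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption): count groups of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("make"), but not "-le" ("people").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US grade level from sentence length and word length."""
    # Naive splitting: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    academic = ("Additionally, there is a tendency to censor content "
                "for these audiences rather than explain it.")
    plain = "Writers take out information. This can make problems worse."
    print("academic:", round(flesch_kincaid_grade(academic), 1))
    print("plain:   ", round(flesch_kincaid_grade(plain), 1))
```

Shorter sentences and shorter words both pull the number down, which is all the formula knows about: it has no idea whether the text actually makes sense.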

On the Dale-Chall formula:

Dale-Chall is another readability score. It measures two things:

  • How long each sentence is.
  • The number of easy or hard words.
Dale-Chall uses a list of 3,000 easy words. Dale-Chall says these are words most 4th graders know. Any other word is a hard word.

On the issues of word lists and gaming the Dale-Chall system:

The original Dale-Chall list of “familiar words” was compiled in 1948 through a survey of U.S. fourth-graders, and even the most recent update to the list in 1995 retains obsolete words like “Negro” and “homely” while omitting “computer.”

We can lower the Dale–Chall readability score even further by adding a second sentence (“Yes!”) that has just one word. This reduces the average sentence length, and so reduces the overall score.
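
The "Yes!" trick is easy to reproduce. Below is a minimal Python sketch of the Dale-Chall calculation, using the weights from the published formula (0.1579 for the percentage of hard words, 0.0496 for average sentence length, plus 3.6365 when more than 5% of words are hard). The `EASY_WORDS` set is a tiny hypothetical stand-in, not the real 3,000-word list, and the sketch matches words exactly rather than handling plurals and other word forms the way a proper implementation would.

```python
import re

# Hypothetical stand-in for the real list of roughly 3,000 familiar words.
EASY_WORDS = {"writers", "will", "writing", "for", "these", "groups", "yes"}


def dale_chall_score(text: str, easy_words: set[str] = EASY_WORDS) -> float:
    """Dale-Chall raw score: a lower score means 'easier' text."""
    # Naive splitting: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    # Any word missing from the list counts as hard.
    hard = sum(1 for w in words if w not in easy_words)
    percent_hard = 100 * hard / len(words)
    average_sentence_length = len(words) / len(sentences)

    score = 0.1579 * percent_hard + 0.0496 * average_sentence_length
    if percent_hard > 5:
        score += 3.6365  # adjustment applied to texts with many hard words
    return score


if __name__ == "__main__":
    original = "Writers will censor writing for these groups."
    gamed = original + " Yes!"
    print("original:", round(dale_chall_score(original), 2))
    print("gamed:   ", round(dale_chall_score(gamed), 2))
```

Appending "Yes!" adds nothing a reader can use, yet it halves the average sentence length and dilutes the share of hard words, so the score drops.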

On the Lexile Framework and the issues of proprietary models:

Unfortunately for us, the Lexile Framework is the intellectual property of MetaMetrics, the private company that created it, so we can only guess at the secret recipe...

On the risk of automated "translations" and statistical models:

Technology alone isn’t the answer. Even the most thoughtful algorithms and robust data sets lack context. Ultimately, the effectiveness of plain language translations comes down to engagement with your audience.

