Accessibility for content providers - language
Language has a nasty habit of tripping people up when it comes to accessibility. It’s not something that comes up that frequently on most sites, but it’s a Priority 1 checkpoint in the World Wide Web Consortium (W3C) guidelines. So if something slips through the net then you won’t even make A compliance, let alone Double-A.
The relevant checkpoint in the Web Content Accessibility Guidelines (WCAG) is 4.1:
Clearly identify changes in the natural language of a document’s text and any text equivalents (e.g., captions).
The language for the page
So the first thing you might want to make sure of is that a language has been specified for the page before thinking about your own content. After all, there’s no real point in highlighting the exceptions if you haven’t said what the rule is.
When you view the HTML source for one of your content pages, search for ‘lang=’ - you could find this at the top attached to the <html> tag, or it could be something like <meta content="xx-XX" http-equiv="content-language" />. This shouldn’t be confused with Dublin Core metadata - if you’ve got something with “DC.language” then that’s what you’re looking at. If you can’t see a language definition for the page then check with your techies & get them to add it if it’s missing.
If you view the source code for this page then you’ll see that I’ve marked it up as “en-GB” right at the top. That’s the sort of thing you’re looking for.
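As a rough sketch of what to look for, either of the two forms below declares the language for the page. The en-GB value matches the one mentioned for this page. The title and body text are placeholders invented for illustration, and your own pages may well be laid out differently.

```html
<!-- Form 1: the language is declared on the <html> tag -->
<html lang="en-GB">
  <head>
    <!-- Form 2: the language is declared in a <meta> tag -->
    <meta http-equiv="content-language" content="en-GB" />
    <title>Example page</title>
  </head>
  <body>
    <p>Page content goes here.</p>
  </body>
</html>
```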
Marking up changes
Unfortunately the standard settings for most rich text editors won’t be of much help in identifying language. To do this you will need to get into the HTML source for the content or convince your developers to add an option to the menu for you.
If you’re dealing with a few words within a sentence then you need to add a new tag around the appropriate text. So switch your editor so that you’re looking at the HTML source. Track down the content that you need to mark as a different language, put <span lang="xx"> in front of it and then </span> after it (the xx needs to be replaced with the appropriate language code, which is covered further down). These tags should be placed immediately before and after the text - there’s a rule in HTML about tags not overlapping, and if you do it this way you don’t need to know anything more about it.
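A minimal sketch of the result, assuming an English sentence that contains a few words of French (the sentence is invented for illustration):

```html
<!-- The span opens immediately before the French text and closes immediately after it -->
<p>The sign on the door read <span lang="fr">Fermé le lundi</span>, so we came back the next day.</p>
```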
Remember that the guideline mentions “text equivalents”, so if you’ve written the alt attribute for an image in another language then you need to put a <span> around the <img> tag.
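Something like the following would do it, assuming an image whose alt text is written in German. The file name and the alt text are hypothetical.

```html
<!-- The alt text is in German, so the surrounding span says so -->
<span lang="de"><img src="bahnhof.jpg" alt="Der Hauptbahnhof bei Nacht" /></span>
```

The lang attribute can also go directly on the <img> tag itself. The wrapping <span> is shown here because it keeps to the same pattern used for ordinary text.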
If there’s a bigger chunk of text, such as a paragraph, then you can add the attribute lang="xx" to the existing tag (again the xx needs to be replaced with the appropriate language code, which is covered further down). So typically you might see the <p> tag and you’ll need to make it <p lang="fr"> for a paragraph of French. If there’s more than one paragraph then you do this on each <p> tag.
You can’t put a <span> around a <p> tag (so not <span lang="fr"><p> … </p></span>), but only the HTML pedants would be upset if you did something like <p> <span lang="fr"> … </span> </p>. That will be OK & would mean that you need only remember one solution.
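As an illustration, assuming two paragraphs of French sitting between paragraphs in the main language of the page (all of the text is invented):

```html
<p>This paragraph is in the main language of the page.</p>
<!-- Each French paragraph carries its own lang attribute -->
<p lang="fr">Ce paragraphe est en français.</p>
<p lang="fr">Celui-ci aussi, donc il porte le même attribut.</p>
<p>And this one is back in the main language.</p>
```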
Now you can see why I’d suggest that if you’re doing this often you should get something added to the toolbar to make it easy for you.
What language code should you be using?
The W3C have provided some advice for selecting a language code, which basically in this context boils down to:
- use the shortest language code that is appropriate (i.e. you don’t need to specify UK English unless there’s something that is specific to the region, just saying English will be fine)
- you can find the appropriate parts of the code in the official language registry
- you’re most likely to use either the single language code (e.g. ‘es’ for Spanish, ‘fr’ for French) or the language code with region (e.g. ‘es-MX’ for Mexican Spanish, ‘es-ES’ for Spanish from Spain)
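To make the difference concrete, here is a small sketch using the Spanish codes from the list above. The phrases are invented for illustration, and whether the regional form is actually needed is a judgement call for your own content.

```html
<!-- Language only: usually all that is needed -->
<p>She greeted us with <span lang="es">Buenos días</span>.</p>

<!-- Language with region: only when something is specific to that region -->
<p>He answered the phone with <span lang="es-MX">¿Qué onda?</span></p>
```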
What should be marked up?
The answer to this sounds obvious - anything in a different language - but when it comes to the fine detail it can be a grey area. Should you mark up the names of people and places? I’d be intrigued to know what language code you’d use to get a screenreader to pronounce ‘Belvoir’ as ‘beaver’ (as in Belvoir Castle).
Bear in mind that it’s not an issue of pedantry (“Is ‘vice versa’ accepted as English or should it still be treated as Latin? Discuss.”) or cultural battles (although I could imagine the Académie française insisting that Paris should not only be marked up as French, but French as spoken in France at that). The aim is to help screenreaders by giving them sufficient information to be able to pronounce the word correctly.
So try to be consistent if nothing else - make sure that other content providers for your site understand whatever rule you come up with.
Summary
- Check that your page specifies the main language used in the content - this is different to specifying the language in Dublin Core metadata
- Mark up any changes in the language of your content
- Remember to think about the alt text for images as well
- The implementation of this guideline is pretty subjective - the main thing we’d recommend is being consistent & thinking about what will be helpful for the user
- If you’re using other languages in the titles for your content then you should alert your developers - those pieces of text will be automatically dropped into menus or headings and they need to indicate the change of language
- Not an accessibility issue, but while we’re here - if you’re going to be using characters that aren’t in Western European languages then you probably also need to check with your developers as they might need to make some adjustments