Identifying the language of a web page, as well as the language of its individual parts, helps to ensure that screen readers will correctly pronounce the content.
Techniques
Defining language in HTML
In HTML, the language of content is identified using the lang attribute, whose value is a standard BCP 47 language tag. For example, the following tag identifies the entire HTML document as English:
<html lang="en">
If a paragraph, table cell, list item, or any other block of text is in a language other than the default language of the page, that block must also be marked up with a lang attribute. For example, if our English document contains a short paragraph in French, that paragraph is marked up as follows:
<p lang="fr">Vaut mieux prévenir que guérir.</p>
Defining language in content management systems
WordPress, Drupal, and other content management systems all have a rich content editor for authoring content. Most, if not all, of these products automatically add a lang attribute to the <html> element on all pages within a website. The default language of the website can be specified in the website settings.
Unfortunately, few if any rich content editors provide a mechanism for identifying the language of parts within a page. Therefore, the only way to specify the language of parts of the page is to switch to your editor’s HTML view and add lang attributes to the outer HTML element of any foreign language content, as explained in the preceding section. After saving the page, inspect your source code to confirm that your editor preserved the code you added. If your editor strips out lang attributes after you’ve added them, talk to your website admin, as this is likely a configurable setting.
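If reading the saved source by eye is impractical, the check can be scripted. The following is a minimal sketch, assuming Python and its standard html.parser module. The names LangAttributeCollector and find_lang_attributes and the sample page are illustrative only and are not part of WordPress, Drupal, or any other content management system.

```python
from html.parser import HTMLParser


class LangAttributeCollector(HTMLParser):
    """Collect (tag, lang value) pairs for every element with a lang attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names arrive lowercased.
        for name, value in attrs:
            if name == "lang":
                self.found.append((tag, value))


def find_lang_attributes(html_source):
    """Return every (tag, lang) pair found in the given HTML source."""
    collector = LangAttributeCollector()
    collector.feed(html_source)
    collector.close()
    return collector.found


# Hypothetical saved page: replace this with your own page source.
saved_page = """
<html lang="en">
  <body>
    <p>Welcome to our website.</p>
    <p lang="fr">Vaut mieux prévenir que guérir.</p>
  </body>
</html>
"""

for tag, lang in find_lang_attributes(saved_page):
    print(f"<{tag}> lang={lang}")
```

If the French paragraph is missing from the output after you save the page, your editor has probably stripped the lang attribute you added.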