If you've ever worked on or built a multi-lingual website, you will know there are a million and one things to keep in mind. Sorting out domain names, web server configuration, URL structure, page layout and the translation of content are likely to be high on your 'to do' list. With all that keeping you busy, meeting the accessibility requirements for your website may slip to the bottom of the pile. This shouldn't be the case, as making your multi-lingual website accessible is easy to achieve.
What is a multi-lingual website?
A multi-lingual website is a website where the content is written in more than one language. The information displayed in different languages is often the same, but may be tailored for different audiences. Booking.com is an example of a multi-lingual website as its content is available in 35 different languages.
Booking.com homepage shown in English (left) and Portuguese (right)
1: Language Codes
The first thing to get right when working with multiple languages on a website is making sure the language is identified in the code of the page. The Web Content Accessibility Guidelines require under success criterion 3.1.1 Language of Page that:
The default human language of each Web page can be programmatically determined.
Assistive technologies such as screen readers and Braille devices cannot automatically identify the language being used on a website from the text alone. The language must be identified in the code of the page in order for assistive technologies to interpret it correctly. Once the language is recognized, these technologies can automatically switch to it, adjusting the accent, pitch and speaking rate of the content to suit. Modern screen readers such as JAWS and Window-Eyes are able to speak multiple languages in appropriate accents with proper pronunciation.
Setting the primary language
First of all, you will need to choose the primary language for your pages. This is the language the majority of your content is written in. For example, if your page is predominantly written in English then your primary language should be set as English. Once the primary language of your pages has been chosen, you will need to find the language code which corresponds to that language. Language codes usually consist of two letters; however, four-letter codes, made up of a language code followed by a region code, can be used to distinguish between regional varieties of a language. The two-letter language code 'en' could be used to define 'English', whereas the four-letter language code 'en-GB' could be used to distinguish British English from American English ('en-US'). Please note 'en-UK' is not a valid four-letter language code. Next we need to apply the language code to our page. To set the primary language of our page as English, we use the 'lang' attribute along with our 'en' language code and apply this to the HTML element at the beginning of each page.
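As a minimal sketch of the step just described, a complete page with English as its primary language might look like this. The title and paragraph text are placeholders, not part of any real site:

```html
<!DOCTYPE html>
<!-- lang="en" sets English as the primary language of the whole page -->
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Example page</title>
</head>
<body>
  <p>This page is written in English.</p>
</body>
</html>
```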
If you are using XHTML, you will need to apply an additional attribute to set the language used in an XML document. The 'xml:lang' attribute serves the same purpose as the 'lang' attribute and should use the same language code. Your pages will not pass the W3C HTML validation check without this attribute.
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
For a multi-lingual website consisting of English, French and German pages, the primary language should be set separately on each page so that it matches the language that page is written in.
The primary language of the page should correspond with the human language of the page
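For instance, assuming the site has a separate page for each language, the French version might be marked up along these lines. The title and paragraph text are placeholders:

```html
<!DOCTYPE html>
<!-- French version of the page: the primary language is set to "fr" -->
<html lang="fr">
<head>
  <meta charset="UTF-8">
  <title>Page d'exemple</title>
</head>
<body>
  <p>Cette page est écrite en français.</p>
</body>
</html>
```

The German version would follow the same pattern with lang="de", and the English version with lang="en".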
Using the wrong language code
If you don't provide a primary language code, or set the code incorrectly, it may be impossible for someone using a screen reader or Braille device to understand the content. If the primary language of the web page has not been identified, screen reading software will generally read out the content using the screen reader's own default language. So if your screen reader has English set as the default language, it will read out web page content in English. This isn't too much of a problem if you are only viewing web pages written in English. However, if the page uses the wrong language code, for instance if the content is written in English but identified in the code as French, the screen reader will attempt to speak the English content using a French accent and pronunciation. For those of you familiar with 'Allo Allo', your website could end up sounding something like Officer Crabtree. While amusing, this might not be very easy to understand, as the audio clip 'Using the wrong language code' shows.
2: Multiple Languages
The majority of web pages use a single language at a time; however, there may be occasions when you want to include a language other than the primary one on your pages. If this is the case, the Web Content Accessibility Guidelines success criterion 3.1.2 Language of Parts requires that:
The human language of each passage or phrase in the content can be programmatically determined except for proper names, technical terms, words of indeterminate language, and words or phrases that have become part of the vernacular of the immediately surrounding text.
If you wanted to include a passage in French on your page you would need to use the 'lang' attribute to mark the change in language. The 'lang' attribute can be used with almost every HTML element, making it very easy to change languages within a page. To include a French quotation on an English page you would simply add the lang attribute to the blockquote tag:
<blockquote lang="fr"><p>Le plus grand faible des hommes, c'est l'amour qu'ils ont de la vie.</p></blockquote>
If you are creating a multi-lingual website, you may also need to provide links to the other language versions of your site. If the page you are linking to is written in a different language to the current page, you need to let people using assistive technologies know about this. This can be done by using the 'hreflang' attribute, which tells people that the primary language of the page reached by following the link is different to that of the current page. For example, to link to a page written in French from a page written in English, you would use:
<a href="" hreflang="fr">French</a>
If you need to identify both the text of the link and the link's target as being in a different language, you need to use both the 'lang' and 'hreflang' attributes:
<a href="" lang="fr" hreflang="fr">Français</a>
Please note the 'hreflang' attribute should only be used for links.
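Putting both cases together, a language menu on an English page might be sketched as follows. The /fr/, /de/ and /pt/ URLs are hypothetical and would need to match your own URL structure:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Language links example</title>
</head>
<body>
  <ul>
    <!-- Link text is written in the target language, so lang and hreflang are both set -->
    <li><a href="/fr/" lang="fr" hreflang="fr">Français</a></li>
    <li><a href="/de/" lang="de" hreflang="de">Deutsch</a></li>
    <!-- Link text is in English, the language of this page, so only hreflang is needed -->
    <li><a href="/pt/" hreflang="pt">Portuguese</a></li>
  </ul>
</body>
</html>
```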
3: Google and language recognition
Unlike assistive technologies such as screen readers, Google does not recognise language identifiers such as 'lang' attributes in the code of the page. Google tries to work out the main languages of your pages itself. In order to make language identification easier for Google, Google recommends only using one language per page.
4: Language direction
If you are creating a multi-lingual website which caters for languages written from right-to-left rather than left-to-right, you will need to make sure the direction of text is specified correctly in the code of the page. You can set the direction of text by using the 'dir' attribute on the HTML element. For languages such as Arabic, Persian and Urdu, the 'dir' attribute should be set to rtl (right-to-left).
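A minimal sketch of a right-to-left page, assuming Arabic as the primary language, could look like this. The Arabic title and greeting are placeholder text:

```html
<!DOCTYPE html>
<!-- dir="rtl" sets right-to-left text direction; lang="ar" identifies Arabic -->
<html lang="ar" dir="rtl">
<head>
  <meta charset="UTF-8">
  <title>مثال</title>
</head>
<body>
  <p>مرحبا بالعالم</p>
</body>
</html>
```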
A 'dir' attribute is not needed for pages written using left-to-right languages such as English as this is the default direction of text. Different page layouts are often required for right-to-left languages, as most right-to-left languages should be right aligned rather than left aligned. This means the page layout will need to be adapted for these languages, essentially mirroring the layout of the left-to-right language pages. For example, the United Nations website adapts its layout for the Arabic language, which is written from right-to-left. The whole layout of the page is reversed when compared to the English language version. These types of layout changes can be achieved using CSS.
The layout of the United Nations pages changes depending on the language used
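One way such mirroring might be done in CSS is to key the layout rules off the 'dir' attribute. The sketch below is an illustration only, not the United Nations site's actual code; the 'nav' and 'content' class names and the column widths are assumptions:

```html
<!DOCTYPE html>
<html lang="ar" dir="rtl">
<head>
  <meta charset="UTF-8">
  <title>Mirrored layout example</title>
  <style>
    /* Default left-to-right layout: navigation on the left, content on the right */
    .nav     { float: left;  width: 25%; }
    .content { float: right; width: 75%; }

    /* Right-to-left pages: swap the two columns */
    [dir="rtl"] .nav     { float: right; }
    [dir="rtl"] .content { float: left; }
  </style>
</head>
<body>
  <div class="nav">Navigation</div>
  <div class="content">Content</div>
</body>
</html>
```

Removing dir="rtl" from the html element returns the same markup to the left-to-right layout, so one stylesheet can serve both versions.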
5: Character encoding
A character encoding is essentially a key that maps the bytes stored in a file to the letters and symbols of a writing system. There are many different character encodings, so it's really important to make sure you use the right one, otherwise people may not be able to read the text on your pages. Character encoding also helps computers understand your information; if you use the wrong encoding, your pages may not be found by some search engines. The most widely used character encoding on the web is UTF-8, which is an encoding of Unicode. Unicode contains characters for most languages and scripts in the world and is supported on a large number of operating systems. This means a Unicode page can display multiple languages and scripts, which makes it an excellent choice for multi-lingual websites. To specify UTF-8 for pages written in HTML 4, put the following line in the head of your pages:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
For HTML 5 use:
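HTML5 allows a shorter 'charset' form of the meta element. A minimal sketch, assuming UTF-8 as in the HTML 4 example, with placeholder title and body text:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- HTML5 short form of the character encoding declaration -->
  <meta charset="UTF-8">
  <title>Example page</title>
</head>
<body>
  <p>English, français, Deutsch, 中文, العربية</p>
</body>
</html>
```

The file itself also needs to be saved as UTF-8 for the declaration to be accurate.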
Please see 'Declaring character encoding in HTML' for more information about character encoding.
6: Font sizes
When designing your multi-lingual website, it is important to realise that the font size you choose for your default language may not be suitable for all languages. Languages such as Chinese, Japanese and Arabic might be difficult to read at font sizes that are suitable for English, French and German. For web pages in Chinese, Japanese or Arabic, the default font size will need to be increased so the text is legible on screen. There are two ways this can be achieved. The first uses the CSS 'lang' pseudo class to set different font sizes and font families depending on the value of the 'lang' attribute:
HTML
<html lang="en"> or <html lang="zh">
CSS
html:lang(en) { font-family: arial, verdana, sans-serif; }
html:lang(zh) { font-family: helvetica, verdana, sans-serif; }
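Putting the markup and styles together with font sizes added, a minimal sketch might look like the following. The 100% and 120% values are assumptions chosen for illustration, not recommended sizes, and the selectors are scoped to the html element so that the percentage is applied once rather than compounding on nested elements:

```html
<!DOCTYPE html>
<html lang="zh">
<head>
  <meta charset="UTF-8">
  <title>Font size example</title>
  <style>
    /* Applies when the page's primary language is English */
    html:lang(en) {
      font-family: arial, verdana, sans-serif;
      font-size: 100%; /* assumed value */
    }
    /* Applies when the page's primary language is Chinese */
    html:lang(zh) {
      font-family: helvetica, verdana, sans-serif;
      font-size: 120%; /* assumed value: larger text for legibility */
    }
  </style>
</head>
<body>
  <p>你好，世界</p>
</body>
</html>
```

Changing the 'lang' attribute on the html element to "en" switches the page to the English rule without any other change to the markup.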
This technique was originally supported in Firefox, Opera and Internet Explorer 8 and higher, but not in Chrome and Safari; current versions of all major browsers support it. If you need to support older WebKit browsers and earlier versions of Internet Explorer as well, the second option is to use classes on the body element for each language required:
HTML
<body class="english"> or <body class="chinese">
CSS
body.english { font-family: arial, verdana, sans-serif; }
body.chinese { font-family: helvetica, verdana, sans-serif; }
7: Length of words
The length of words varies from language to language, so content written in one language may take up more or less space on the page than the same content in another. The design of the website should cater for words of different lengths used throughout the site. Taking Amazon as an example, the length of content in the search bar of the website varies between languages. The word 'search' takes up 10 characters in French but only two characters in Japanese. The word 'basket' takes up 6 characters in English but a massive 13 characters when translated into German. Amazon have adapted the design of this area of their pages, removing the wish list button from the search bar for languages which use longer words, such as German and Italian.
Different languages take up different amounts of space on the page
Depending on the content on your multi-lingual website, it may not be possible to change the layout and design of your site in this way. You can overcome these types of problems by using shorter words to fit into the space available on your page, and by making sure you have your content translated before making essential design decisions.