Delivery of Bilingual Content
Since the passing of the Maori Language Act 1987, New Zealand has had two official languages (although since 1840 the Treaty of Waitangi has recognised the Māori language as taonga (treasure)).
Delivering content in two (or more) languages is not peculiar to New Zealand (Aotearoa), but that is where my experience lies. Another country with good bilingual content is Canada, which provides information in both French and English.
There are a few ways to deliver bilingual content on the Internet. In this document I intend to catalogue and describe them.
- All In Together / Kotahi Katoa
- Different URLs
- Content Negotiation
All In Together / Kotahi Katoa
In September 2005 I had the pleasure of meeting Dr Mark Laws from Kedri at AUT. During our talk he described how the mind thinks in a mix of languages, and that to best cater for this we should show a mix of languages on the same page. That is, not only should we ask the user to Click Here but, as close to that as possible, we should also ask them to Pāwhiria Ira.
For a demonstration of this in action, visit translator.kedri.info - the Māori-English translator.
Those of us who are less proficient in one of the languages can find it easier to have the two languages side by side, as this extends and reinforces our existing knowledge.
Note that immersion teaching programmes, eg, kohanga reo or kura kaupapa Māori, may disagree with having a different language visible.
As we tend to sprinkle te reo Māori throughout our conversations in New Zealand, this page may resemble the All In Together method. I have attempted to use fancy CSS 2 to point out Māori content in red text.
Different URLs
Another popular way of dishing up content is to provide separate URLs. Many content management systems encourage putting the language code [ISO 639] into the URL. The end result is that content ends up living in two different places, and it can be easy for them to become unsynchronised.
URLs provide many usability features, for example, giving an indication of the page's content and of its location in the information hierarchy. Slapping a language code at the root of the hierarchy could cause problems, especially if a link takes you to a page whose contents you don't understand.
An example of a fictitious news website:
- mnn.com/en/2005/NEWS/auckland/
- mnn.com/mi/2005/PANUI/tamakimakaurau/
- mnn.com/en/2005/WEATHER/cyclone/
In this example the content at URLs 1 and 2 is the same, except translated. If a search or a link takes you to URL 1, but you can't speak a word of English, how do you get to URL 2? Hopefully there is a little picture of a flag to take you to URL 2. What about URL 3? Can you guess what the URL for the Māori version of that page would be? (This is also a nightmare for the poor webmaster managing the content.)
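One way to make the jump between URL 1 and URL 2 dependable is to keep an explicit table that pairs each page with its translation. The following is a minimal Python sketch of that idea, not a description of any real site: the `TRANSLATIONS` table and the `alternate_url` helper are hypothetical names, and the paths are the fictitious ones listed above.

```python
# Hypothetical lookup table: each entry groups the URLs that carry the
# same content in different languages.
TRANSLATIONS = [
    {"en": "/en/2005/NEWS/auckland/", "mi": "/mi/2005/PANUI/tamakimakaurau/"},
    {"en": "/en/2005/WEATHER/cyclone/"},  # no Māori version recorded yet
]


def alternate_url(path, target_lang):
    """Return the URL of the same content in target_lang, or None if unknown."""
    for group in TRANSLATIONS:
        if path in group.values():
            return group.get(target_lang)
    return None


print(alternate_url("/en/2005/NEWS/auckland/", "mi"))
print(alternate_url("/en/2005/WEATHER/cyclone/", "mi"))
```

The second lookup returns `None`. That is the unsynchronised state described earlier, but at least the table makes the missing translation visible instead of leaving the reader to guess the URL.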
Some sites do this well, for example, I was impressed by the recent New Zealand Government site about the Treaty of Waitangi. It comes in two flavours, with links to switch between versions:
A site that doesn't do this so well:
- tetaurawhiri.govt.nz/maori/ [mi]
- tetaurawhiri.govt.nz/english/ [en]
How does an English-only speaker know the te reo Māori translation of "The Maori Language Commission"?
A site that could do better:
- censusatschool.org.nz/taking-part/what-questions-are-we-asking/ [en]
- tataurangakitekura.org.nz/taking-part/he-aha-nga-patai/ [mi]
There is still English in the URL.
Content Negotiation

Above you should see an image. It has been chosen by the server, based on what languages you know. How does the server know what languages you know? Because you told us which ones you know! (The server is configured to prefer English, and can ask you if it can't determine an appropriate language.)
This magic is all defined in RFC 2616, the HTTP/1.1 specification, in sections 14.4 (Accept-Language) and 12.1 (Server-driven Negotiation).
To explain this in the least technical way possible, this is how the image ended up on your screen:
- This webpage embeds rfc2616.png.
- Your browser requested the image. It said: "Give me rfc2616.png. I can accept these types of files, and the user speaks these languages."
- The server found six different versions of rfc2616.png, and based on which languages you can speak, sent back the one that was most appropriate.
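The server's side of these steps can be sketched in Python. This is only an illustration of the Accept-Language matching described in RFC 2616 section 14.4, not the code any particular server runs. The function names and the `AVAILABLE` list are assumptions made for the example, and the fallback to English mirrors the preference mentioned above.

```python
# Assumed list of language variants the server holds (hypothetical).
AVAILABLE = ["en", "en-nz", "mi", "fr", "de"]


def parse_accept_language(header):
    """Parse an Accept-Language value into (language-range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0  # q defaults to 1 when it is omitted
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unreadable q-value as "not wanted"
        prefs.append((lang, q))
    return prefs


def quality(tag, prefs):
    """q-value for tag, taken from the longest language-range that matches it."""
    best_len, best_q = -1, 0.0
    for lang_range, q in prefs:
        if lang_range == "*":
            length = 0
        elif tag == lang_range or tag.startswith(lang_range + "-"):
            length = len(lang_range)
        else:
            continue
        if length > best_len:
            best_len, best_q = length, q
    return best_q


def choose_variant(header, available, default="en"):
    """Pick the available tag with the highest q, or default if none is wanted."""
    prefs = parse_accept_language(header)
    scored = [(quality(tag, prefs), tag) for tag in available]
    best_q, best_tag = max(scored, key=lambda s: s[0], default=(0.0, default))
    return best_tag if best_q > 0 else default


print(choose_variant("mi, en-nz;q=0.8, en;q=0.5", AVAILABLE))
print(choose_variant("fr-ca, ja;q=0.5", AVAILABLE))
```

The first request prefers Māori and gets the `mi` variant. The second asks only for languages the server does not hold, so the sketch falls back to the default.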
Client Configuration
Where you configure the languages you know depends on the browser you are using.
For Safari on Mac OS X, open System Preferences, and open International. (On OS X 10.4 you can search for Language selection.) There you can edit the list, and order them by your preferred language and dialect.
For Mozilla, open Preferences. Find Languages under Navigator. Click Add, and in the Others box enter mi.
As you can see in the screenshots I prefer te reo Māori over Occers, over old Blighty, and hence I receive the Māori version of the file.
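Whatever the browser, the ordered list ends up in an Accept-Language request header. The sketch below shows roughly what that looks like, using Python's standard `urllib.request`. The `build_accept_language` helper and its scheme of lowering q by 0.1 per position are assumptions for illustration (browsers choose their own q-values), and example.org is a placeholder host.

```python
import urllib.request


def build_accept_language(languages):
    """Turn an ordered preference list into an Accept-Language value.

    The first language gets the implicit q=1; each later one gets a
    q-value 0.1 lower, never dropping below 0.1 (an assumed scheme).
    """
    parts = []
    for position, lang in enumerate(languages):
        q = max(10 - position, 1) / 10
        parts.append(lang if position == 0 else f"{lang};q={q:.1f}")
    return ", ".join(parts)


# The preference order described above: Māori, then Australian, then British English.
header = build_accept_language(["mi", "en-au", "en-gb"])

# The Request object is only constructed here; nothing is sent over the network.
request = urllib.request.Request(
    "http://example.org/rfc2616.png",
    headers={"Accept-Language": header},
)
print(header)
```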
Server Configuration
How does this work on the server? This depends on your software. With Apache 1.3 and 2.0, you need to turn on MultiViews and then read the Content-Negotiation documentation.
I have the following files in this directory:
When I made the link above, I excluded the language code, and let the server choose that:
- rfc2616.png (no language code)
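As a rough picture of what the server does when the link leaves out the language code, the Python sketch below scans a directory listing for files whose names extend the requested name with a language extension. It only approximates the idea behind MultiViews and is not Apache's implementation. The filenames, the `KNOWN_LANGUAGES` set and the `find_variants` helper are all hypothetical.

```python
# Assumed set of language extensions the server recognises (hypothetical).
KNOWN_LANGUAGES = {"en", "mi", "fr", "de"}


def find_variants(requested, filenames):
    """Map language tag -> filename for files named '<requested>.<lang>'."""
    variants = {}
    prefix = requested + "."
    for name in filenames:
        if name.startswith(prefix):
            lang = name[len(prefix):].lower()
            if lang in KNOWN_LANGUAGES:
                variants[lang] = name
    return variants


# Hypothetical directory listing, not the actual files on this site.
directory = ["rfc2616.png.en", "rfc2616.png.mi", "rfc2616.png.fr", "notes.txt"]
variants = find_variants("rfc2616.png", directory)
print(variants)
```

Each variant keeps its own filename, and therefore its own URL, while the bare name stands for "whichever one suits the reader".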
(This can also be extended to magically detect if someone has Excel, and then choose between an Excel spreadsheet and a CSV file.)
In my experience this method works well, but nobody has configured their browser to indicate that they can speak Māori.
Note that this does work well with search engines, as each different variation of the file ends up with its own URL.
References
RFC 2616
Request For Comments (RFC) 2616 was published in June 1999. It defines how clients request information from servers using HTTP (Hypertext Transfer Protocol). When you looked at this page, your browser would have made an HTTP request to retrieve it.
You can retrieve a copy from rfc-editor.org or the ISI FTP at ftp.isi.edu/in-notes. (It is available in text and PDF.)
Links above are given to the copy hosted by W3C.
RFC 3066
Request For Comments (RFC) 3066 was published in January 2001, obsoleting RFC 1766 from March 1995. Its title is Tags for the Identification of Languages.
ISO 639 or ISO 639-1
ISO 639 defines the two-letter language codes used in the browser requests, the server responses, and the filenames. There is a second part (ISO 639-2) which added three-letter codes for many languages.
There are also longer hyphenated language codes that define the main language and the dialect. Refer to RFC 3066.
A short extract from the much longer specification:
| ISO 639-1 | RFC 3066 | Language | Dialect |
|---|---|---|---|
| en | en | English | |
| en | en-nz | English | New Zealand |
| en | en-gb | English | British / United Kingdom |
| mi | mi | Māori | |
| fr | fr | French | |
| de | de | German | |
See the full list at the W3C.
Maori Language Act 1987
MAORI LANGUAGE ACT 1987
1987 No 176
An Act to declare the Maori language to be an official language of New Zealand, to confer the right to speak Maori in certain legal proceedings, and to establish Te Komihana Mo Te Reo Maori and define its functions and powers.
20 July 1987
Te Komihana Mo Te Reo Maori was renamed Te Taura Whiri I Te Reo Maori in 1991.
For the full text and amendments, visit the unlinkable www.legislation.govt.nz.