
This page is no longer maintained and may be inaccurate. For more up-to-date information, see the Internationalization Activity home page.

Bi-directional text

HTML 4.01 allows bi-directional text, i.e., text that contains both left-to-right and right-to-left parts.

Scripts such as Hebrew and Arabic, which are normally thought of as right-to-left, are in fact bi-directional, since they often use the so-called 'Arabic' numerals, i.e., the western-style 0, 1, 2, ..., which are written from left to right. It is also quite common for an English word to appear in right-to-left text, much more common than the reverse.

HTML 4.0 follows Unicode in its treatment of bi-directional text. See 'language tags and attributes.' CSS2 uses the same model, and since CSS can be applied to XML, this also makes it possible to write bidi text in XML.
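The Unicode model mentioned above assigns every character a directional class, and that class is what lets digits run left to right inside right-to-left text. The following is a minimal sketch in Python, using the standard unicodedata module; the helper name bidi_classes and the sample characters are chosen here for illustration and do not come from the HTML or Unicode specifications.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Illustrative sample: Hebrew alef, Arabic alef, a western digit,
# an Arabic-Indic digit, and a Latin letter.
SAMPLE = "\u05D0" + "\u0627" + "1" + "\u0661" + "a"

for ch, cls in bidi_classes(SAMPLE):
    # R = right-to-left, AL = Arabic letter, EN = European number,
    # AN = Arabic number, L = left-to-right
    print(f"U+{ord(ch):04X} {cls}")
```

The Hebrew and Arabic letters report right-to-left classes, while both kinds of digits report number classes, which is why a line of Hebrew or Arabic that contains a number is bi-directional.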

No 'visual order'

In e-mail, right-to-left and bi-directional text has sometimes been simulated by simply reversing the order of the characters. IANA even has an official 'charset' for visually ordered Hebrew in e-mail: iso-8859-8. In HTML, visual ordering is impossible (unless the whole document is placed in <PRE>...).
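To make the idea of visual ordering concrete, here is a rough sketch in Python of how a single line might be converted from logical order (the order in which it is read) to visual order (left to right as displayed). The function names and the sample string are hypothetical, and the second function is a deliberate simplification that only handles runs of western digits; it is not the Unicode bidirectional algorithm.

```python
import re


def naive_visual(logical):
    """Reverse every character of the line, digits included."""
    return logical[::-1]


def visual_keep_numbers(logical):
    """Reverse the line, then restore left-to-right order inside digit runs."""
    reversed_line = logical[::-1]
    return re.sub(r"[0-9]+", lambda m: m.group()[::-1], reversed_line)


# Logical order: the Hebrew word "shalom" followed by the number 123.
LOGICAL = "\u05E9\u05DC\u05D5\u05DD 123"

# Print with escapes so the result does not depend on the terminal's
# own bidi handling.
print(ascii(naive_visual(LOGICAL)))         # digits come out as 321
print(ascii(visual_keep_numbers(LOGICAL)))  # digits stay as 123
```

Plain reversal also reverses the number, so even a simulation has to treat left-to-right runs specially. Text stored this way also stops reading correctly once a line is re-wrapped at a different width, which is consistent with the remark that visual ordering only works in HTML when line breaks are fixed.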

When an HTML browser finds a document labeled as iso-8859-8, it should therefore not assume the text is visually ordered.


W3C Bert Bos, i18n coordinator
Last updated 2008/05/07 17:12:34