3. Fonts

It can seem like anarchy. There is an unknown number of fonts, each encoded with its own table and driven by arbitrary keyboard layouts and output methods. In my opinion, Tamil can seriously compete with any other language for the largest number of font tables. Adding to the commotion are the dynamic fonts for web pages, which let anyone get away with a non-standard font as long as the pages are viewable.

On top of all this comes the official Indian Script Code for Information Interchange (ISCII), the "unifying" scheme sponsored by the Government of India to bring all Indian fonts under the Devanagari umbrella. Anyone familiar with how characters are written in Tamil and in the Devanagari script will see that this approach lacks any rationale.

Needless to say, this only adds to the confusion. A good analysis of this and of Unicode for Tamil, once again written by Sivaraj, can be found at . For those not familiar with the Tamil script, a good introduction written by Sivaraj is at .

Let us ignore the anarchy for a moment and look at the frequently used font encodings. There are two main contenders, and luckily they will converge soon. The first and most popular is the Tamil Standard Code for Information Interchange (TSCII), developed by volunteers throughout the world. The other is the pair of TAmil Monolingual (TAM) and TAmil Bilingual (TAB) encodings proposed by the Tamil Nadu Government. TAM is of limited use in an OS environment, so we can safely ignore it. Almost all Linux efforts (console, KDE and GNOME localizations) use TSCII.

3.1. TSCII

TSCII is a glyph-based, 8-bit bilingual encoding. It combines the usual lower ASCII set with a unique set of Tamil glyphs: Roman letters and standard punctuation marks occupy the first 128 slots (0-127), and the Tamil glyphs occupy the upper half of the 8-bit range (slots 128-255). A good overview of the early font encoding schemes and the rationale behind the TSCII approach can be found at http://www.geocities.com/Athens/5180/tscii.html.
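
The split between the two halves of the table can be illustrated with a minimal Python sketch. It only checks which half of the 8-bit range each byte falls in and does not decode any Tamil text. The function name split_slots and the sample byte values 0xAB and 0xAC are hypothetical placeholders chosen for illustration; they are not taken from the TSCII table.

```python
def split_slots(data: bytes):
    """Group consecutive bytes by the half of the 8-bit table they fall in.

    Bytes 0-127 are labelled "roman" (the lower ASCII set) and bytes
    128-255 are labelled "tamil" (the upper half, where TSCII places its
    Tamil glyphs). No glyph lookup is attempted.
    """
    runs = []
    for b in data:
        kind = "roman" if b < 128 else "tamil"
        if runs and runs[-1][0] == kind:
            # Same half as the previous byte: extend the current run.
            runs[-1][1].append(b)
        else:
            # Different half: start a new run.
            runs.append((kind, [b]))
    return [(kind, bytes(values)) for kind, values in runs]


if __name__ == "__main__":
    # 0xAB and 0xAC are arbitrary upper-half bytes (assumption, for
    # illustration only), surrounded by plain ASCII text.
    sample = b"TSCII: " + bytes([0xAB, 0xAC]) + b" ok"
    for kind, chunk in split_slots(sample):
        print(kind, chunk)
```

Because the lower half is plain ASCII, a tool that only understands ASCII can still read the Roman portions of a TSCII text; only the upper-half bytes need a TSCII font or converter.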

The home URL for TSCII volunteers is http://www.tamil.net/tscii. This site discusses the TSCII encoding and provides tools including fonts, keyboard drivers, editors and inter-conversion tools for various platforms. The font encoding table according to TSCII-1.6 can be found at http://www.tamil.net/tscii/charset16.gif.

The current version of TSCII is 1.6. A revision, version 1.7, is expected any time now; it will fix some anomalies in how various slots are used for encoding, will be fully backward compatible with 1.6, and is expected to gain popularity. The TSCII discussion group is currently brainstorming modifications to TSCII-1.6. You can take part in the discussions by becoming a member, and you may also be able to download various beta tools from there. The font encoding table according to TSCII-1.7 (draft) can be found at .

3.2. TAB

TAB is a character-based bilingual standard proposed by the Government of Tamil Nadu. The TAB bilingual encoding table can be found at http://www.tamilnet99.org/annex4.htm. Tools for TAB encoding (mostly restricted to the Windows platform) can also be downloaded from pages in the vicinity of this one.

3.3. Miscellaneous fonts and encodings

There are too many of these, and unfortunately they are not well documented. Discussing them is beyond the scope of this document.