
I face a problem with certain HTML files only. After conversion, these HTML files remain in TSCII; nothing gets converted to Unicode. Why?

First, let us make it clear that this is absolutely no fault of our converter. Understanding why the file remained unchanged instead of being converted to Unicode requires some technical background. If you have it, read the 'Technical reason' topic below. If not, follow the steps below straightaway for a simple solution to get the file converted correctly.

Steps :
1. Open the problematic file (say a.html) in MS-Word.
2. Save it as another HTML file (say b.html).
3. Open b.html in Azhagi's converter and convert it.
4. The resulting converted file will be in Unicode.

Note : Even after following the above steps, a few characters here and there might still not get converted. The only remedy is to use Azhagi's direct typing mode to edit or insert the Unicode characters in the source of b.html.


Technical reason :
View the source of the problematic file (say a.html) in an editor. You will see a lot of character sequences like &laquo; &Aring; &ucirc; etc. A browser like IE preprocesses these sequences into "Tamil 'a' ", "Tamil 'va' " and so on (much the same way it replaces '&nbsp;' with a space character) and finally shows the Tamil characters on screen. But it is not the job of a converter to do this preprocessing before commencing the conversion process. Hence, it just reads the English characters and outputs them as they are. The simple steps suggested above serve to perform exactly this preprocessing. Follow them and you will have your file converted properly into Unicode Tamil.
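
For the technically inclined, the preprocessing described above can be sketched in a few lines of Python. This is only an illustration, not Azhagi's actual code: the names decode_latin1_entities, SAMPLE_TSCII_TO_UNICODE and convert_sample are hypothetical, the mapping table holds just the two letters mentioned above (Tamil 'a' and Tamil 'va'), and numeric references such as &#171; are not handled.

```python
import re
from html.entities import name2codepoint

# Matches named character references such as &laquo; or &Aring;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_latin1_entities(source: str) -> str:
    """Replace named entities in the range U+00A1..U+00FF with the character itself.

    Everything else (&amp;, &lt;, &nbsp;, unknown names) is left untouched
    so that the HTML markup stays valid.
    """
    def replace(match):
        codepoint = name2codepoint.get(match.group(1))
        if codepoint is not None and 0xA1 <= codepoint <= 0xFF:
            return chr(codepoint)
        return match.group(0)

    return _NAMED_ENTITY.sub(replace, source)


# Illustrative two-entry table only (assumption: a real converter has a full table).
SAMPLE_TSCII_TO_UNICODE = {
    "\u00ab": "\u0b85",  # TSCII 0xAB, shown as « in Latin-1 -> TAMIL LETTER A
    "\u00c5": "\u0bb5",  # TSCII 0xC5, shown as Å in Latin-1 -> TAMIL LETTER VA
}


def convert_sample(text: str) -> str:
    """Map each known TSCII character to Unicode Tamil; pass the rest through."""
    return "".join(SAMPLE_TSCII_TO_UNICODE.get(ch, ch) for ch in text)


raw = "<p>&laquo;&Aring;</p>"
print(convert_sample(raw))                          # unchanged: <p>&laquo;&Aring;</p>
print(convert_sample(decode_latin1_entities(raw)))  # converted: <p>அவ</p>
```

The first print shows the symptom from the question: without preprocessing, the converter sees only plain English characters and has nothing to convert. The second print shows the same text converting correctly once the entities have been decoded.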





Document version 4.0.1. Copyright 2004 - 2006 Azhagi.com

For current/updated version of this document, always visit http://www.azhagi.com/azUnicodeHelp.html