How to convert formatted text to Unicode? (OR) How to convert formatted MS-Word
documents to Unicode (preserving ALL the formatting information like color, bold,
highlight, text alignment, table structures, WordArt etc.)?
· Open your document (say a.doc) in MS-Word.
Tip: If you have formatted Tamil text (in TSCII encoding) that you have typed in
an editor (say Azhagi) but not yet saved, first save it from the editor in '.RTF'
mode, or copy it into MS-Word and save it there (say as a.rtf or a.doc).
· Click on 'File -> Save as Web Page...' and save the file in HTML format
(say a.html). Then, close the HTML file.
· Now, open the Azhagi application and click on 'File -> Convert to Unicode'.
Azhagi's converter tool will open.
· Click on 'Choose HTML file' and open 'a.html'. This will load the HTML file
and keep it ready for conversion.
· Click on the 'Convert HTML File' button.
· Your file will be converted (you can choose the name of the converted output
file) and displayed in your web browser.
· Open the converted HTML file in MS-Word and save it back in the original
format under a different name (say a-unicode.doc).
Tip: Even if you wish to keep the converted file in HTML format, you still have
to open it in MS-Word again and save it as a different file (say
a-unicode.html). Otherwise, some characters (like bullets) may not appear
properly when you view the converted HTML file in your web browser.
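
The steps above go through HTML because an HTML file keeps the formatting in its
markup, separate from the text itself. The following Python snippet is only a
minimal sketch of that idea and is not Azhagi's actual converter: it replaces
characters in the text between tags and leaves every tag (and so the color,
bold, alignment and table markup) untouched. The names LEGACY_TO_UNICODE,
convert_text and convert_html are hypothetical, and the two-entry table is a
placeholder, not the real TSCII mapping.

```python
import re

# Placeholder table: NOT the real TSCII mapping. Each key stands for a
# legacy-encoded character and each value for its Unicode replacement.
LEGACY_TO_UNICODE = {
    "\u00ab": "\u0b85",  # illustrative entry only
    "\u00b8": "\u0b95",  # illustrative entry only
}

# Matches one HTML tag; the capture group makes re.split keep the tags.
TAG_PATTERN = re.compile(r"(<[^>]*>)")


def convert_text(text, table):
    """Replace every character found in the table; keep all others as-is."""
    return "".join(table.get(ch, ch) for ch in text)


def convert_html(html, table=LEGACY_TO_UNICODE):
    """Convert only the text between tags, leaving the markup unchanged."""
    parts = TAG_PATTERN.split(html)
    # With one capture group, odd-indexed parts are the tags themselves.
    return "".join(
        part if index % 2 else convert_text(part, table)
        for index, part in enumerate(parts)
    )


if __name__ == "__main__":
    sample = '<b style="color:red">\u00ab</b>\u00b8'
    print(convert_html(sample))
```

A real converter would likely need more than this one-to-one substitution (for
example, a complete mapping table and reordering of some vowel signs), which is
one reason to use Azhagi's own converter tool as described in the steps above.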
| Document version 4.0.1 | Copyright 2004 - 2006 Azhagi.com |