My home on the web - featuring my real-life persona!

Fun with character encodings, Greek ANSI

Now that I have the Java encodings halfway under control, I have run into “Fun with character encodings” again. This time, it’s a Greek tragedy.

A couple of days ago I received a small text file with English strings. The strings are messages for a service pack and they needed to be translated into Greek. Unfortunately, that wasn’t all: the text file had to be in ANSI format because the installer, InnoSetup, requires that format. Hmm, I immediately thought that smelled like trouble, because I assumed most languages with a different codepage really need to be encoded in Unicode, since ANSI does not have enough characters. But first, let’s get it translated.

I got the translation back as a Word file and while I could probably have just asked the translator to send it as a Greek ANSI file, I thought I’d give it a shot myself. The first dumb try: open the file in Notepad++ and select “Convert to ANSI”. Of course, I got:

greek.INVALID_VERSION_MESSAGE=??t? t? pa??t? e??µ???s?? µp??e? ?a e??µe??se? µ??? t?? ??d?s? %1 ?a? ? d???? sa? e??a? %2.

So I googled to see if it is at all possible and yes, it seems you can encode Greek text in ANSI, but unlike English, which uses codepage 1252, Greek has to use codepage 1253. Well, that didn’t seem that hard, so I tried again. Still the same. OK, maybe a different text editor - nope, that didn’t work either. So I sent the UTF-8 encoded text file to the Greek translator and asked him if he could convert it into ANSI.
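Here is a minimal sketch of the difference between the two codepages, written in Python rather than done in a text editor, so it is not what I actually ran at the time. The helper name `to_ansi` and the short sample string are made up for illustration. Python swaps every Greek letter for a question mark when the target is codepage 1252; a Windows editor seems to map a few letters to look-alikes instead, which would explain the stray “p” and “µ” in my output above.

```python
# Sample text: the first three words of the Greek message.
GREEK = "Αυτό το πακέτο"


def to_ansi(text, codepage):
    """Encode text into a single-byte Windows codepage.

    Characters the codepage cannot represent become '?'.
    """
    return text.encode(codepage, errors="replace")


# Codepage 1252 (Western European) has no Greek letters at all.
western = to_ansi(GREEK, "cp1252")

# Codepage 1253 (Greek) holds every character of the sample.
greek = to_ansi(GREEK, "cp1253")

print(western)
print("1253 round trip ok:", greek.decode("cp1253") == GREEK)
```

Both results have one byte per character, so the 1253 version is still a perfectly valid “ANSI” file as far as an installer is concerned.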

While I waited, I did a little more research and stumbled over a little Microsoft tool named AppLocale. At first I misunderstood its purpose: I thought it just switched the system locale, something you can easily do through the Control Panel. But after a little more reading, I realized that this might be my solution. I can use AppLocale to open another application, and AppLocale will pretend it is running in a localized Windows environment. In my case, I needed to look at my Greek ANSI file on a Greek system, which I don’t have. Instead, I used AppLocale to open my text editor, and with this instance of the text editor I opened my Greek file. Lo and behold, all characters came out correctly.

greek.INVALID_VERSION_MESSAGE=Αυτό το πακέτο ενημέρωσης μπορεί να ενημερώσει μόνο την έκδοση %1 και η δικιά σας είναι %2.

My file was correct all along, I just couldn’t verify it on my system. I’ll make sure to keep this little application around because I have run into this in the past and usually just ended up submitting a Unicode file and letting the developers deal with it. By this time my translator had also sent the file back and sure enough, his looked just the same.
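The same check can be sketched without AppLocale by reading the file with an explicit codepage. This is an alternative I am adding for illustration, not what I did; the file name `greek_messages.txt` and the helper `read_as` are hypothetical. The bytes on disk are identical in both reads, only the codepage used to interpret them changes.

```python
import os
import tempfile

LINE = (
    "greek.INVALID_VERSION_MESSAGE=Αυτό το πακέτο ενημέρωσης μπορεί να "
    "ενημερώσει μόνο την έκδοση %1 και η δικιά σας είναι %2."
)


def read_as(path, codepage):
    """Read a text file, interpreting its bytes with the given codepage."""
    with open(path, encoding=codepage, newline="") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "greek_messages.txt")  # hypothetical file name

    # Write the message as Greek ANSI (codepage 1253).
    with open(path, "w", encoding="cp1253", newline="") as handle:
        handle.write(LINE)

    as_greek = read_as(path, "cp1253")    # what a Greek system would show
    as_western = read_as(path, "cp1252")  # what a Western system shows

print("read as 1253 matches the original:", as_greek == LINE)
print("read as 1252 matches the original:", as_western == LINE)
# ascii() keeps the output printable on any console.
print("first Greek word read as 1252:", ascii(as_western[30:34]))
```

The key and the placeholders survive either way because they are plain ASCII; only the Greek letters turn into accented Western characters when the wrong codepage is assumed.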

Translator 1, Greek ANSI File 0

OMG - It’s full of mistakes!

So, we just had one of our applications translated into Greek. It is a very big application, with a total of 13,000 words in the strings alone. Initially, we had about a month, so time was not a big issue and the translation got started on May 8th. Of course, these things change and all of a sudden, we needed it not by the beginning of June but for a show on May 20th. That meant 12 days for the translation, cleaning up the bilingual files, importing the strings, fixing truncations and other issues, testing functionality and compiling DLLs. Of course we made it!

Now I was waiting for feedback. Nothing at all from the guys who were at the show. No “good job”, no “shame on you” - nothing. After a week, I inquired and got the reply that there were “a big number of errors”. That sent a shiver down my spine. We don’t have many translations into Greek, only one other application, so I don’t know this translator very well. We don’t have any Greek reference material, but I asked and he confirmed that he knew the subject matter. And of course I can’t check anything in Greek myself.

Turns out, it wasn’t all that bad. We had issues in all languages because unless you are a printing press operator, you really can’t figure out some things. I remember asking our German guys questions and they had no clue either. Unfortunately, some of the terms that were wrong occurred 50 or even 100 times, so yeah, it looked like a lot. Correcting all the strings took me a couple of hours of manual copy/paste, which is not bad at all.

It just irks me that the only feedback I got was that there were a lot of errors (which wasn’t even true). He never acknowledged that we did the impossible by turning this around so fast and that it worked fine. Only the tester mentioned that this must have been the fastest turn-around we have had for any language, but then I am also getting a lot better at handling languages I know nothing about. The last translation like that was Russian - I am fine navigating through French, Spanish, Italian and Portuguese, but Russian and Greek are a whole different animal. If I see a truncation at runtime, I can’t just type in the text I see and search for it - I need a virtual keyboard and have to type in a keyword letter by letter to search for it. And I am amazed how nicely Trados and TagEditor handle the different character sets. I don’t think many people know what an ordeal it can be to get an application ready for non-Western character sets.

Ah well, believe it or not, I still love doing it - it’s a big girl puzzle and I am getting paid to solve it!