What to do when you got nothing to do?
OK everyone, I need your help. I am not very busy right now and I would love to do something worthwhile to keep myself occupied. I am expecting some things coming in next week, but I am going crazy right now. The occasional hundred words that come in a couple of times a day don’t keep me busy.
So far I have cleaned up files on our documentation network drive, zipped and archived old projects, done my homework, read an extra chapter in my ASP.NET book, and hassled my boss to give me stuff.
Can anyone recommend anything for me to do? It can’t involve noise, getting off my a$$ or anything that would look suspiciously like “non-work” to a person passing by. I keep hearing people talk about maintenance on their TMs but I don’t know what to maintain there. Any suggestions welcome!
This may sound like the sweet life: doing nothing all the while getting paid - let me tell you, not so much. I am ready to shoot myself if I can’t find anything worthwhile soon. I know work will come pouring in any time now, with some new software coming up and a translation into Russian that will give me plenty of in-house post-processing, but for now - nada, njet, nothing, überhaupt garnix!!!
Corporate Email Woes
I am usually a strong believer in the theory that an email that has been sent will also arrive in my inbox. Emails don’t get lost just like that, even though every now and then someone may claim it happened. In all honesty, I too have claimed to have sent an email when in reality I forgot. In the past, it was an email to my Mom to send her a photo, recipe, or something - I am pretty sure I am not the only one who has used this white lie.
Now, when it comes to business email, it’s quite different to me. The information or attachments in a business-related email are usually critical to a job or a project, and losing them can have consequences. For example, if I receive a set of strings, I usually tell my developer that he can expect them back by the end of the week. My freelancers are usually very fast and I am able to estimate how long it should take them. They will let me know if they cannot make it within a reasonable amount of time but I don’t expect them to confirm every email I send. At their own discretion, they reply with an estimated delivery date or they just return the files. Considering the previous premise that emails reach their destination, I think it works fine both ways. Both sides cut down on the chit-chat back and forth a little - it’s not like we all don’t send enough emails anyway.
Unfortunately, my system has been shattered by our new overzealous email filter system. All of a sudden, emails I send are not going out, I don’t receive emails that my freelancers send - and the notification system ranges from lacking to nonexistent. First, my translator returned a translation in TTX files on Thursday. On Monday, I carefully inquired if he had received the files, to which he replied he had delivered on Thursday. Quick check with IT: of course it got caught in the mail filter because of “inappropriate language”. Haha, that would mean that either the help system contains foul language, which it doesn’t (especially not since I was able to send the file out fine), or a word in the Spanish translation happens to match an English term which is on the index. No one knows; I was told there is no log listing which word was the offender. A few hours later, the translator received a note that the email he had sent Thursday could not be delivered.
Then all of a sudden, we cannot send or receive compressed attachments anymore - yes, a regular zip file is held back because who knows what’s in it. And again, no one gets a notification. The sender believes it went through, the receiver has no idea anything was blocked and the lonely email is sitting in quarantine. Apparently, this is now handled on a case-by-case basis and the IT department checks the emails with attachments and patches them through if appropriate - a system which apparently doesn’t work very well. The reason, by the way, is protection against viruses, I was told.
Next thing, the SDL Trados Synergy translation packages are blocked - same as with the zip files, no notification, they are just quarantined. The packages are pkzipped so the system recognizes them as zip files and blocks them. At least with those, we found a solution: because of the unique file extension, they could write a filter rule that allows stppk out and strpk in. But even one of those was blocked again recently because of offensive language.
The whole thing is a major pain now. Not only do I have to confirm all email people send to me, I also need the translators to confirm they received my emails until IT gets the out-of-control email filter configured properly. In between, I also got mocked by an IT guy for being a “troublemaker” because I insist on receiving my email. Yeah, the audacity - I insist on receiving professional business emails sent by associates. Not sure if he was kidding, but I do believe before implementing a system like this, it should be looked at a little closer. I don’t even want to know how many customer emails got lost.
I am sure some freelancers have wished for an IT department that takes care of all the computer woes. Believe me, it really doesn’t work like that - woes sometimes aren’t eliminated but created.
Client ranting about translation service
We all hear ourselves and fellow translators rant on about clients, how stupid and obnoxious they sometimes are - I guess that is a common thing in all professions. On the other hand, we rarely hear people rant about us and our service.
I ran into a blog post by Thomas Kilian who posted about the email conversation he had with a fellow translator. He got a couple of unsolicited offers (and if you read my previous post, you know how I feel about cold calls; cold emails aren’t much different). While I do not know this specific translator, I have to admit that his replies were a little, well - read for yourself: Entspammung vor dem Wochenende - the blog post is in German, sorry for that.
BTW, I don’t think either of them was right or wrong, but man, did that go wrong :p
Thomas, if you read this, most of us are actually pretty nice and competent. Feel free to ask me if you ever need a good, friendly translator - you name the topic.
Cold callers asking for call-back
Is this normal? I received an unsolicited call from a translation agency with which I have never done business. The name appeared on the caller ID so I didn’t pick up. The caller left a message, asking me to call him back. Is this normal? Why would I want to call him back? Usually, cold callers will just try again, but not ask you to call them.
I hope this doesn’t go around, and next I have Planned Parenthood, the local Police Department, the local Fire Department, Clean Water Action, the Juvenile Diabetes Research Foundation and what not call and ask me to call them back. Now, don’t get me wrong - I would like to get this message and then be able to decide whether I call back or not. And of course, if I DON’T call back, they should take the hint that I am not interested. But they don’t and just keep calling.
In recent days, this procedure and the whole “charities begging for money thing” has actually turned me away from giving money to anyone. If you give something once, they will call you every other month and ask for more. And they will not take No for an answer.
Fun with character encodings
What do ASCII, ANSI, Latin-1, Windows-1252, Unicode and UTF have in common?
They are a pain in the neck for translators - but also, they are ways to encode characters in files, even in plain text files that usually seem as “un-encoded” as possible. Most of the time, you don’t have a problem with it, you open a txt file, you don’t really know (or need to know) what character format it has. The only reason why most people even know about this is because of the “bush hid the facts” (see below) trick in Notepad. I am not going into the history and details of the various formats, at the bottom are some links to other pages that deal with that if you want to learn more. I am merely looking at the consequences it can have for me during translation.
What I care more about is the fact that it can really break your neck during translation of string files. I run into that on and off and every time it happens, I learn a little bit more about it. I have wanted to write about it for quite a while, and since the whole thing came down again earlier this week, I think it is time now.
We have a little update tool for an application that is written in Java. Java programs usually have their strings in .properties files. Those files are traditionally read as the 8-bit characters of ISO 8859-1 (aka Latin-1). Strictly speaking, Latin-1 does cover Western European characters like ü Ü é or ñ, but anything outside that set cannot be stored directly, and in practice we convert every non-ASCII character so the file survives whatever tool touches it next. Those characters are converted into Unicode escape characters, sometimes referred to as Java escape characters. I think most of us have experienced other escape characters, for example the \n for a new line, \t for a tab. Unicode escape characters are a little more involved, using a \uHHHH notation, where HHHH is the hexadecimal code of the character in the Unicode character set. So, for example, the ß in a Java properties file is encoded as \u00df. To convert those characters, I use Rainbow, which is part of the Okapi Framework. It has a handy Encoding Conversion Utility that allows you to convert files from one encoding to another.
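To make the \uHHHH notation a bit more concrete, here is a minimal Python sketch of the escaping step. It is not how Rainbow does it internally (I don't know that); the function name `to_java_escapes` is made up for illustration, and it simply assumes that every non-ASCII character should be escaped.

```python
def to_java_escapes(text: str) -> str:
    # Hypothetical helper: replace every non-ASCII character with a
    # Java-style \uHHHH escape and leave plain ASCII untouched.
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            parts.append(ch)
        elif code <= 0xFFFF:
            parts.append("\\u{:04x}".format(code))
        else:
            # Characters beyond U+FFFF need two escapes (a surrogate pair),
            # because \uHHHH only has room for four hex digits.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            parts.append("\\u{:04x}\\u{:04x}".format(high, low))
    return "".join(parts)


print(to_java_escapes("Straße"))  # Stra\u00dfe
```

Note that this only works on text that has already been decoded correctly, which is exactly where things go wrong further down.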
Sounds really easy, right? Right? Now what is this woman complaining about again? Well, it’s not that easy. The conversion tool is designed to work with 8-bit ASCII-based encodings. Now, so what IS the problem - it was just stated that Java properties files are ASCII-based encodings? Well, TagEditor takes the ASCII file and when you “Save as Target” after translation, it converts the file into UTF-8. And that is still not the problem; the problem is that it uses a UTF-8 format without a BOM (Byte Order Mark). The BOM is an (invisible) byte sequence at the beginning of a file (three bytes for UTF-8, two for UTF-16) which basically tells a program “This is a Unicode file”. Without the BOM, some programs do not recognize the encoding of the file and assume a legacy 8-bit encoding - and that is the problem with Rainbow (and also with Passolo, a program that just got bought by SDL).
If you try to convert the encoding of a BOMless Unicode file, it goes terribly wrong. As I mentioned, the correct conversion of ß will give you \u00df. Converting a BOMless file will “double escape” the extended characters, and you get \u00c3\u0178 - clearly not the same. (In UTF-8, ß is stored as the two bytes C3 9F, and each byte gets escaped as if it were a character of its own.) The “double escape” is actually a good indicator that something went wrong: if you check your file and see that your extended characters are represented by two escape sequences, you know something went wrong. Of course, that can be difficult when dealing with languages like Greek, Russian or Asian languages, simply because every single character is escaped. I usually try to find a short string and count.
Now, how do you know how a file is encoded? Right now, I use Notepad++ to check. It has a handy little Format menu and allows you to see which encoding is used and it also allows you to convert from one encoding to another. Supported formats are Windows, UNIX, Mac, ANSI, UTF-8 w/o BOM, UTF-8 and UCS-2 Big and Little Endian. Surprisingly, Windows Notepad is one of the few programs that actually manages to decipher the Unicode encoding even without a BOM: just open the BOMless file in Windows Notepad and save it without change. Unfortunately, you usually just don’t know and usually it isn’t even an issue.
I actually happened to get to talk to Yves Savourel, who is working at ENLASO and with the Okapi Framework (and about a gazillion other things related to localization), and he has been very helpful. He explained a few things to me a little better.
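The double escape is easy to reproduce outside of any translation tool. The following Python sketch is only an illustration under my own assumptions: the helper names are invented, and "cp1252" is Python's name for windows-1252. It decodes the same two bytes once correctly and once as windows-1252, then escapes both results.

```python
def escape_non_ascii(text):
    # Hypothetical helper: \uHHHH-escape everything outside ASCII
    # (enough for this demo, which stays below U+FFFF).
    return "".join(
        ch if ord(ch) < 0x80 else "\\u{:04x}".format(ord(ch)) for ch in text
    )


utf8_bytes = "ß".encode("utf-8")  # the two bytes C3 9F

# Decoded as what it really is: one character, one escape.
correct = escape_non_ascii(utf8_bytes.decode("utf-8"))
# Decoded as windows-1252: each byte becomes its own character.
doubled = escape_non_ascii(utf8_bytes.decode("cp1252"))

print(correct)  # \u00df
print(doubled)  # \u00c3\u0178


def repair_mojibake(text):
    # Undo the wrong guess: turn the characters back into the original
    # bytes, then decode those bytes as UTF-8.
    return text.encode("cp1252").decode("utf-8")


print(repair_mojibake("ÃŸ"))  # ß
```

The repair only works as long as nothing else has touched the garbled text, and a few byte values have no windows-1252 character at all, so I would treat it as a diagnostic aid rather than a fix.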
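If you would rather check from a script than open every file in an editor, a rough check looks at the first few bytes for a BOM and, failing that, tests whether the content is valid UTF-8. This is a minimal Python sketch, not what Notepad++ or Windows Notepad actually do; `sniff_encoding` and its return labels are made up for this post.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(data: bytes) -> str:
    # Hypothetical helper: return a best guess, not a guarantee.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown 8-bit (possibly windows-1252)"
    return "utf-8 without BOM (or plain ASCII)"


print(sniff_encoding(b"\xef\xbb\xbfkey=value"))  # utf-8-sig
print(sniff_encoding("ß".encode("utf-8")))       # utf-8 without BOM (or plain ASCII)
print(sniff_encoding("ß".encode("cp1252")))      # unknown 8-bit (possibly windows-1252)
```

For a real file you would pass in something like `open(path, "rb").read()`. The guess can still be wrong for short files, so it does not replace looking at the result.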
- The issue:
  - a BOMless UTF-8 file is recognized as “windows-1252” encoding
  - a UTF-8 file uses two or more bytes to encode the extended characters
  - the application thinks each of those bytes is a separate character and converts each into a Unicode escape sequence
- The solution:
  - in Rainbow, manually force the encoding of the source file to UTF-8
  - in Rainbow, use the Add/Remove BOM utility to set the BOM properly
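For anyone without Rainbow at hand, the two steps of the solution can be approximated in a few lines of Python. This is a sketch under my own assumptions and not Rainbow's implementation; both function names are hypothetical, and it assumes the translated file really is UTF-8.

```python
def add_utf8_bom(data: bytes) -> bytes:
    # "utf-8-sig" drops a BOM if one is already there and raises
    # UnicodeDecodeError if the bytes are not valid UTF-8 ...
    text = data.decode("utf-8-sig")
    # ... and on the way back out it always writes exactly one BOM.
    return text.encode("utf-8-sig")


def escape_forced_utf8(data: bytes) -> str:
    # Force the source encoding to UTF-8 instead of letting a tool guess,
    # then escape everything outside ASCII as \uHHHH.
    text = data.decode("utf-8-sig")
    return "".join(
        ch if ord(ch) < 0x80 else "\\u{:04x}".format(ord(ch)) for ch in text
    )


bomless = "gruss=Grüße".encode("utf-8")
print(add_utf8_bom(bomless)[:3])    # b'\xef\xbb\xbf'
print(escape_forced_utf8(bomless))  # gruss=Gr\u00fc\u00dfe
```

Either step on its own is enough to avoid the double escape: with the BOM in place the tool no longer has to guess, and with the encoding forced the guess never happens.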
If you got through all this stuff, you may now wonder if you’ll ever run into this issue. It is also not just about BOM or not; the whole file encoding raises issues in other applications too. To be honest, I don’t know how often freelance translators are confronted with these types of files, but here are the situations where I keep my eyes peeled:
- Java files (.properties)
This was the most recent issue that triggered this post.
- String export files (often XML files or even plain txt)
I tend to get the strings for REALBasic applications in XML files, though I believe they are created by RegexBuddy.
- Non-Windows files or Windows files that will be used on other OSs
We run into this issue with txt files that were created on a Mac and that will be used in InstallShield-type applications, for example to display the license agreement or a readme file.
- All files
Haha, very funny - I know. What I mean is, I have experienced various issues with files if I have to process them through different applications in order to get CAT-translatable files, for example if we receive a weird string file that Trados doesn’t understand and where we need to find a manageable way to extract translatable text.
Anyway, maybe this will help someone else in the situation where the client comes back and claims the files are corrupt or so. Otherwise, I apologize for boring the heck out of you. You should have stopped reading my post a long time ago.
Some interesting links with related information:
Okapi Framework
Notepad++
Bush hid the facts hoax and Bush hid the facts on Wikipedia
Mojibake
How to Determine Text File Encoding
Cast of Characters: ASCII, ANSI, UTF-8 and all that