Gettext translation resources

This page contains scripts and information about gettext translation. It is maintained by Ask Hjorth Larsen.


Python GetText Translation Toolkit

PyG3T is a collection of scripts which perform various tasks to help with gettext-based translations. PyG3T is now hosted on Launchpad. The standalone scripts podiff, poabc and gtparse are now part of PyG3T and are no longer hosted separately here.


Presentation about translation

This OpenOffice.org presentation (odp) document is in Danish. It describes how one may contribute to the free and open source community by translating applications, summarising the (very simple) procedure of downloading and editing translation files and which editors to use. The presentation is written by Kenneth Nielsen.

Miscellaneous important information


Translations on Launchpad

Every project which is to be translated must obviously make po files available to translators and provide a way for them to be integrated in the project after translation. This is project specific and requires lots of work by maintainers, for which reason it is generally a good thing when many projects join and host their translations together. For example, GNOME hosts all official GNOME modules plus several "extra applications" which are not part of a standard GNOME installation. This makes it easy for translators and maintainers since the handling of translation files is centralized.

Much Ubuntu development is done through Launchpad, which hosts, among other things, translations for all modules that are part of Ubuntu. Some modules are specific to Ubuntu, while the vast majority are used in other distributions as well. The po files provided by Launchpad may be identical to those of the original module, or they may contain Ubuntu-specific modifications. This appears to be a very convenient and practical solution, but for reasons explained below, translators have to be very careful when using it.

When a po file is integrated into Launchpad, the translation will be visible to Ubuntu users, but the changes will not propagate up to the original project. If, say, one translates a GNOME module in Launchpad, the translation will benefit those GNOME users running Ubuntu, but not those running e.g. Debian.

The only correct way of integrating a translation is through the original project page (such as the GNOME translation page). We generally call the original project pages upstream while other pages (such as Launchpad) which may host the files with possible modifications are called downstream.

Thus, translation files must be integrated upstream, or else the changes will only benefit a subset of the project's users. Some projects host their translations directly in Launchpad, and the translations for those particular projects should be done directly through Launchpad, as Launchpad is the upstream source in these cases.

Conclusion: Do not translate strings found on Launchpad unless you know for a fact that Launchpad hosts the upstream version!


Same string, different context - a common error source in translations

If a program uses the same string in two different places in the code, then by default this leads to only one translatable string, which is normally a good thing. Sometimes, however, it is a problem because the two identical English strings have different meanings, as is the case with the string "load", which can be a verb as well as a noun. The msgctxt construct in gettext deals with this problem: a programmer can specify a different message context for each of the two uses, and they will then get separate translatable strings. But msgctxt is a relatively new feature, and many older message catalogs still use special identifiers followed by a special character, like

msgid "noun|load"
msgid "verb|load"

Important! Do not include the special (pipe) character and preceding letters in the msgstr. Otherwise it will all appear in the translated application. (Of course most translators quickly discover the horrible truth, namely that programmers rarely care to specify details such as whether a word is a verb or a noun, so we have to ask them every time.)

A correct Danish translation would be, respectively,

msgstr "belastning"
msgstr "indlæs"
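
To see why the prefix must stay out of the msgstr, it may help to look at how a program might consume such strings. The following is only a minimal sketch, not the code of any particular project: the dictionary CATALOG stands in for a compiled message catalog, and the function name translate_with_prefix is hypothetical.

```python
# Hypothetical in-memory catalog standing in for a compiled message catalog.
CATALOG = {
    "noun|load": "belastning",
    "verb|load": "indlæs",
}


def translate_with_prefix(msgid, catalog=CATALOG, separator="|"):
    """Translate a prefixed msgid, falling back to the bare English text."""
    translated = catalog.get(msgid)
    if translated is not None:
        # The msgstr is returned exactly as the translator wrote it,
        # so a prefix left in the msgstr would be shown to the user.
        return translated
    # No translation: strip the identifier and the separator so that
    # only the English text after the first separator is displayed.
    return msgid.split(separator, 1)[-1]


print(translate_with_prefix("noun|load"))
print(translate_with_prefix("verb|load"))
print(translate_with_prefix("verb|save"))  # not in the catalog
```

In this sketch the program only strips the prefix from untranslated strings. A translated string is passed through untouched, which is why a msgstr such as "noun|belastning" would show up verbatim in the application.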

Note that different projects make the distinction described above in different ways. GNOME packages generally use the pipe form shown above. The Battle for Wesnoth uses the circumflex character (^) rather than the pipe character. Freeciv uses yet another syntax, which looks like

msgid "?play:Game"
msgid "?animals:Game"

So please be vigilant.
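
The three conventions mentioned above can be told apart mechanically, which may be useful when checking one's own translations. The sketch below is an illustration only: the function split_context and the style names "pipe", "caret" and "qualifier" are made up for this example, and the caret example string is invented. It also assumes that the separator does not occur in the actual text, which will not hold for every string.

```python
def split_context(msgid, style):
    """Split a msgid into (context, text) for one of three prefix styles.

    Returns (None, msgid) when the msgid carries no context prefix.
    """
    if style == "pipe":          # e.g. "noun|load"
        context, sep, text = msgid.partition("|")
    elif style == "caret":       # e.g. "noun^load"
        context, sep, text = msgid.partition("^")
    elif style == "qualifier":   # e.g. "?play:Game"
        if not msgid.startswith("?"):
            return None, msgid
        context, sep, text = msgid[1:].partition(":")
    else:
        raise ValueError("unknown style: %r" % (style,))
    if not sep:
        # Separator not found, so there is no prefix to remove.
        return None, msgid
    return context, text


for msgid, style in [("verb|load", "pipe"),
                     ("noun^load", "caret"),
                     ("?animals:Game", "qualifier")]:
    print(split_context(msgid, style))
```

Only the text part belongs in the msgstr; the context part is there to tell the translator which meaning is intended.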


Updated: Oct 16, 2009.