Localization (also called l10n) is done by using strings in a specific language in the code (typically English) and then translating them to the target language.
Using actual strings rather than resource identifiers makes the code much more convenient and readable. Typical code to create a button whose label will be translated is:
Evas_Object *button = elm_button_add(parent);
elm_object_translatable_text_set(button, "Click Here");
The messages that require translations are typically automatically extracted from the sources and put into .po files, one per language. For the example above, the “fr.po” file could contain:
#: some_file.c:43 another_file.c:41
msgid "Click Here"
msgstr "Cliquez ici"
In the example above, the program that extracts strings has found two occurrences of the same string, one in some_file.c at line 43 and another one in another_file.c at line 41. It gives the original string after “msgid” and the translation goes after “msgstr”.
Strings without translation are stored as the empty string “” in the .po file and the program will use the original strings, providing a sane fallback.
A “#, fuzzy” flag line may also be added by the tools (msgmerge in particular) on the line before “msgid”; it means the original string has changed and the translation needs review.
The most common way to use a translation involves the following APIs:
elm_object_translatable_text_set(Evas_Object *obj, const char *text)
elm_object_item_translatable_text_set(Elm_Object_Item *it, const char *text)
They set the untranslated string for the “default” part of the given Evas_Object or Elm_Object_Item and mark the string as translatable.
Similar functions are available if you wish to set the text for a part that is not “default”:
elm_object_translatable_part_text_set(Evas_Object *obj, const char *part, const char *text)
elm_object_item_translatable_part_text_set(Elm_Object_Item *it, const char *part, const char *text)
It is important to provide the untranslated string to these functions because EFL triggers the translation itself and re-translates the strings automatically should the system language change.
It is also possible to set the text and the translatable property separately. The text is set as usual, while the translatable property is set through elm_object_part_text_translatable_set().
There are also get() counterparts to the set() functions above.
The approach described in the previous section is not always applicable. For instance, it won't work if you are populating a genlist, if you need plurals in the translation or if you want to do something else with the translation than putting it in elementary widgets.
It is however possible to retrieve the translation for a given text using gettext() from <libintl.h>:
char * gettext(const char * msgid);
This function takes a string (which will be copied to an msgid field in the .po files) and returns its translation (the corresponding msgstr field).
In order to use gettext, you have to set the locale first:
setlocale(LC_ALL,"");
LC_ALL is a catch-all locale category (LC). Setting it alters all LC categories, including LC_MESSAGES and LC_CTYPE, which both matter for translation: LC_MESSAGES selects the language of translated messages and LC_CTYPE indicates the character set supported.
By setting the locale to “”, you implicitly select the user's own locale (taken from the user's LC_* or LANG environment variables). If there is no user-defined locale, the default locale “C” is used.
bindtextdomain("hello","/usr/share/locale/");
This call binds the domain name “hello” to the root directory of the message files. The program will then look for hello.mo in the /usr/share/locale/<your_language>/LC_MESSAGES/ directory, where <your_language> is taken from the user's locale, for example fr_FR. This is how you specify where your locale files are stored. The same name “hello” is used when setting the gettext domain through textdomain(), and it corresponds to the name of the file looked up in the appropriate locale directory.
The bindtextdomain() call is not mandatory; it can be omitted if you install your file in the system's default locale directory. Since that default can change from system to system, however, calling it explicitly is recommended.
textdomain("hello");
This sets the application's text domain to “hello”, as mentioned above. This makes gettext calls look for the file hello.mo in the appropriate directory. By binding various domains and setting the textdomain (or using dcgettext(), explained elsewhere) at runtime, you can switch between different domains as desired.
When giving the text for a genlist item, you could use it in a similar manner as the one below:
#include <Elementary.h>
#include <libintl.h>
#include <locale.h>
#include <string.h>

/* Common shorthand for gettext() */
#define _(str) gettext(str)

static char *
_genlist_text_get(void *data, Evas_Object *obj, const char *part)
{
   /* The genlist frees the returned string, hence the strdup() */
   return strdup(gettext("Some Text"));
   /* or, the usual way:
    * return strdup(_("Some Text"));
    */
}

EAPI_MAIN int
elm_main(int argc, char **argv)
{
   setlocale(LC_ALL, "");
   bindtextdomain("hello", "/usr/share/locale");
   textdomain("hello");

   /* ... */

   elm_run();
   elm_shutdown();

   return 0;
}
ELM_MAIN()
Plurals are handled in a similar way but through the ngettext() function. Its prototype is shown below:
char * ngettext (const char * msgid, const char * msgid_plural, unsigned long int n);
msgid is the same as before, i.e. the untranslated string
msgid_plural is the plural form of msgid
n is the number used to select the form
A matching fr.po file would contain the following lines:
msgid "%d Comment"
msgid_plural "%d Comments"
msgstr[0] "%d commentaire"
msgstr[1] "%d commentaires"
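A call matching these entries could look like the sketch below. The helper name and the buffer handling are illustrative assumptions; only ngettext() and snprintf() are standard calls.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Write "<n> Comment(s)" into buf, letting ngettext() pick the form.
 * Returns what snprintf() returns: the length of the full message. */
static int
comment_count_format(char *buf, size_t size, int n)
{
   assert(buf != NULL && size > 0 && n >= 0);

   /* Both English forms are passed; with a catalog loaded, the matching
    * msgstr[N] is returned instead. The same n is used twice: once to
    * select the form and once as the printf argument. */
   return snprintf(buf, size,
                   ngettext("%d Comment", "%d Comments", (unsigned long)n),
                   n);
}
```

Without a translation, ngettext() falls back to the English rule: msgid when n is 1, msgid_plural otherwise.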
It is even possible to have several plural forms. Polish, for instance, has three forms, so its .po file carries msgstr[0], msgstr[1] and msgstr[2] entries for each plural message.
The index values after msgstr are computed from the Plural-Forms header of the .po file. The one for Polish is given below:
"Plural-Forms: nplurals=3; plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
There are 3 forms (including singular). The index is 0 (singular) if the given integer n is 1. Then, if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)), the index is 1, and otherwise it is 2.
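To make the rule concrete, here is the same expression transcribed into a small C function. gettext evaluates the header itself, so this code is never needed in an application; the function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Plural index for Polish, transcribed from the Plural-Forms header. */
static unsigned int
polish_plural_index(unsigned long n)
{
   if (n == 1)
     return 0;
   if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
     return 1;
   return 2;
}

/* Pick one of three caller-supplied forms, the way msgstr[N] is picked. */
static const char *
polish_plural_pick(const char *const forms[3], unsigned long n)
{
   assert(forms != NULL);
   return forms[polish_plural_index(n)];
}
```

For example, 2, 3, 4, 22 and 104 select index 1, while 0, 5, 12 and 112 select index 2.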
The user can change the system language settings at any time. When that is done, Ecore Events notifies the application which can then change the language used in elementary. The widgets then receive a “language,changed” signal and can set their text again.
The first step is to handle the ecore event:
static Eina_Bool
_app_language_changed(void *data, int type, void *event)
{
   // Set the language in elementary to the current system language
   elm_language_set(setlocale(LC_ALL, NULL));

   // Let any other handler of this event run as well
   return ECORE_CALLBACK_PASS_ON;
}

int
main(int argc, char *argv[])
{
   ...
   // Get notified when the system language changes
   ecore_event_handler_add(ECORE_EVENT_LOCALE_CHANGED, _app_language_changed, NULL);
   ...
}
The call to elm_language_set() above triggers the emission of the “language,changed” signal, which can then be handled like any other smart event signal.
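The xgettext tool can extract strings to translate to a .pot file (po template) while msgmerge can maintain existing .po files. The typical workflow is to extract the strings with xgettext, merge them into the existing .po files with msgmerge, translate the new or changed entries, and finally compile the result with msgfmt.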
A typical call to xgettext looks like:
xgettext --directory=src --output-dir=res/po --keyword=_ --keyword=N_ --keyword=elm_object_translatable_text_set:2 --keyword=elm_object_item_translatable_text_set:2 --add-comments= --from-code=utf-8 --foreign-user
This will extract all strings that are used inside the “_()” function (usual optional short-hand for gettext()), use UTF-8 as the encoding and add the comments right before the strings to the output files.
A typical call to msgmerge looks like:
msgmerge --width=120 --update res/po/fr.po res/po/ref.pot
A POT (.pot) file is a Portable Object Template file. It contains pairs of lines starting with the keywords msgid and msgstr respectively. In the example above there is only one such pair: msgid comes first, followed by a string in the source language, and msgstr on the next line is followed by an empty string.
To translate the application, the POT file is copied as a PO (.po) file into each language's folder and then translated. Translating means that, for every string next to msgid, a translated string (in the local script) is written next to msgstr. For French it will look something like this:
msgid "Click Here\n"
msgstr "Cliquez ici\n"
Create an MO (.mo) file using the following command:
msgfmt helloworld.po -o helloworld.mo
As root, copy the MO file to /usr/share/locale/<LANGUAGE>/LC_MESSAGES. For French, do something like this:
cp helloworld.mo /usr/share/locale/fr_FR/LC_MESSAGES/
Don't forget to export your language here:
export LANG=fr_FR.utf8
Then compile and execute your program.
Languages vary wildly and even though you might know several of them, you shouldn't assume there is any common logic to them.
For instance, English typography puts no space before colons and semicolons (':' and ';'). French typography, however, requires an “espace fine insécable” before them, i.e. a non-breaking space (like HTML's &nbsp;) that is narrower than a regular space.
This prevents proper translation in the following construct:
snprintf(buf, some_size, "%s: %s", gettext(error), gettext(reason));
The proper way to do it is to use a single string and let the translators manage the punctuation. This means translating the format string as well:
snprintf(buf, some_size, gettext("%s: %s"), gettext(error), gettext(reason));
Of course, it might not always be doable but you should strive for this unless a specific issue arises.
Depending on the language, the translation will have a different length on screen. Some languages have shorter constructs than others in some cases while it is reversed in other cases; some languages also have a single word for a concept where others require a circumlocution (designating something by using several words).
From the example above, a translation block looks like:
#: some_file.c:43 another_file.c:41
msgid "Click Here"
msgstr "Cliquez ici"
If you insert a new line at the top of “some_file.c”, the location comment changes to:
#: some_file.c:44 another_file.c:41
Obviously, on non-trivial projects, such changes will happen often. If you use source control (you should) and commit such changes even though no actual translation change has happened, each and every commit will probably contain a change to .po files. This will hamper readability of the change history and in case several people are working in parallel and need to merge their changes, this will create huge merge conflicts each time.
Only commit changes to .po files when actual translation changes have happened, not merely because line comments have changed.
Since calling gettext() might happen very often, it is often abbreviated to _():
#define _(str) gettext(str)
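The xgettext call shown earlier also lists N_ as a keyword. A common convention, sketched here with made-up labels and a made-up helper name, defines N_() as a no-op that only marks strings for extraction, which is handy where gettext() cannot be called, such as in static initializers.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

#define _(str)  gettext(str)
/* N_() only marks a string for xgettext; it translates nothing. */
#define N_(str) (str)

/* Static initializers cannot call gettext(), so the labels are only marked. */
static const char *const menu_labels[] = { N_("Open"), N_("Close") };

/* Translate at the point of use, once the locale has been set up. */
static const char *
menu_label_get(size_t idx)
{
   assert(idx < sizeof(menu_labels) / sizeof(menu_labels[0]));
   return _(menu_labels[idx]);
}
```

Translating late also means the label follows a language change at runtime, since the lookup happens on every call.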
Quite often you will want to sort data for display. There is a string comparison tailored for that: strcoll(). It works the same as strcmp() but sorts according to the current locale settings.
int strcmp(const char *s1, const char *s2);
int strcoll(const char *s1, const char *s2);
The function prototype is a standard one and indicates how to order strings. A detailed explanation would be out of scope for this guide but chances are you will be able to provide the strcoll() function as the comparison function for sorting the data set you are using.
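As a minimal sketch, assuming the data set is a plain array of C strings, strcoll() can be plugged into the standard qsort() through a thin wrapper. The function names are illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() hands over pointers to the array elements, i.e. to char pointers. */
static int
locale_compare(const void *a, const void *b)
{
   const char *const *sa = a;
   const char *const *sb = b;

   return strcoll(*sa, *sb);
}

/* Sort labels in place according to the current LC_COLLATE setting. */
static void
labels_sort(const char **labels, size_t count)
{
   assert(labels != NULL);
   qsort(labels, count, sizeof(labels[0]), locale_compare);
}
```

The ordering depends on the locale in effect, so setlocale() has to be called beforehand; in the default “C” locale, strcoll() orders strings like strcmp().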
The system described above is a common one and will likely be known to translators, meaning that giving its name (“gettext”) might be enough to explain how to work with it. Beyond this guide, extensive documentation and many questions and answers on the topic are available on the Internet.
Don't hesitate to put comments in your code right above the strings to translate since they can be extracted along with the strings and put in the .po files for the translator to see them.
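Such a comment could look like the sketch below, which reuses the error/reason format string from earlier. The helper name is hypothetical and the “TRANSLATORS:” prefix is only a widespread convention; with an empty --add-comments= tag, as in the xgettext call above, any comment placed right before the string is extracted.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Build "<error>: <reason>" with punctuation left to the translators. */
static int
error_message_format(char *buf, size_t size, const char *error, const char *reason)
{
   assert(buf != NULL && size > 0);

   /* TRANSLATORS: first %s is a short error title, second %s is the reason.
    * Adapt the punctuation and spacing to your language. */
   return snprintf(buf, size, gettext("%s: %s"), gettext(error), gettext(reason));
}
```

Because error and reason are passed as variables here, their values still have to be marked for extraction wherever they are defined.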
See also the translating documentation on Phabricator.