Understanding Django's translation mechanism

Let's try to understand Django's translation mechanism. First, Django relies on gettext, which is widely used across open-source software, so the mechanism is not unique to Django. Many other open-source projects also use gettext as their translation machinery. The difference is just the tooling built around it. Tooling here means the scripts, the function names, and the steps you follow to get the translated text. Some setups are heavily automated, and some are quite manual.

The general ideas in gettext translation are:-

1. Mark the string as translatable. There are many ways to accomplish this. In program code, the best approach is to use a function. Why? First, it is syntactically valid code. A function can always return a value, so it makes perfect sense to return the translated string from it. But if you want to be fancy, you can use any marker you like and then run a pre-processing script over the code to generate the translation you want.

2. Parse the code and extract the strings marked as translatable. In Django, you can see it implemented here.

Basically, you just run this command:-

xgettext -d django -L Python -kgettext_lazy test_trans.py

where test_trans.py is the file you want to translate. django.po will be generated in the current directory.

3. Translate the message file (.po) and compile it into a binary message file (.mo), which makes looking up translated strings faster later on.

Now, to really understand how it works in practice, skip the guesswork and get your hands dirty in the console:-

09:41:40 {master} ~/git/kai-app$ python manage.py shell
>>> from django.utils.translation import gettext_lazy, gettext, activate
>>> gettext('About LaLoka Labs')
'About LaLoka Labs'
>>> activate('ja')
>>> gettext('About LaLoka Labs')
'About LaLoka Labs'

It's not translated, so what is the problem here? Aaah, we forgot to compile the message file.

python manage.py compilemessages
python manage.py shell
>>> from django.utils.translation import gettext_lazy, gettext, activate
>>> gettext('About LaLoka Labs')
'About LaLoka Labs'
>>> activate('ja')
>>> gettext('About LaLoka Labs')
'LaLoka Labsについて'

It works!

Now, what is the difference between gettext and gettext_lazy? The docs are here - https://docs.djangoproject.com/en/4.1/topics/i18n/translation/#lazy-translation.

They say:-

Use the lazy versions of translation functions in django.utils.translation (easily recognizable by the lazy suffix in their names) to translate strings lazily – when the value is accessed rather than when they’re called.

These functions store a lazy reference to the string – not the actual translation. The translation itself will be done when the string is used in a string context, such as in template rendering.

This is essential when calls to these functions are located in code paths that are executed at module load time.

Hmm, what does that mean?

If you try it in the console, you'll see:-

>>> gettext_lazy('About LaLoka Labs')
<django.utils.functional.lazy.<locals>.__proxy__ object at 0x7f9c34880c50>

Notice the difference. Unlike the previous example, this one does not return the translated string.

So why do you need gettext_lazy over gettext?

If you go further down the docs, they show an example of translating some model attributes. You should already know that model definitions are loaded once, when Django starts. If you use gettext there, the string gets translated at that moment, using whatever language is active at load time. But what if you access the model's attribute later at runtime, under a different active language, say after switching from en to ja? You won't get the ja string, because the translation already happened. Using gettext_lazy ensures the string is translated when it is actually displayed, not when the program is loaded into memory.
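The timing difference can be sketched in plain Python. Everything below (CATALOG, activate, gettext, gettext_lazy, LazyString) is a toy stand-in written for this illustration, not Django's actual implementation; it only mimics the idea of translating on call versus translating on use.

```python
# Toy translation state: a per-language catalog and the active language.
CATALOG = {"ja": {"About LaLoka Labs": "LaLoka Labsについて"}}
_active_language = "en"


def activate(language):
    """Switch the active language (toy version)."""
    global _active_language
    _active_language = language


def gettext(message):
    """Translate immediately, using the language active right now."""
    return CATALOG.get(_active_language, {}).get(message, message)


class LazyString:
    """Holds the original message and translates only when converted to str."""

    def __init__(self, message):
        self._message = message

    def __str__(self):
        return gettext(self._message)


def gettext_lazy(message):
    return LazyString(message)


# "Module load time": the active language is still en.
eager = gettext("About LaLoka Labs")
lazy = gettext_lazy("About LaLoka Labs")

# Later, a request comes in with Japanese as the active language.
activate("ja")

print(eager)      # translated at load time, so still English
print(str(lazy))  # translated now, so Japanese
```

The eager value was fixed while en was active, whereas the lazy one defers the lookup until str() is called, which is the behaviour the model attribute example relies on.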

We have had some cases in the past where translatable strings were not picked up by makemessages. It turned out someone had aliased the gettext_lazy function to gettext_lz. If you look at the Django source I linked above, makemessages runs this command to extract the strings:-

        elif self.domain == "django":
            args = [
                "xgettext",
                "-d",
                self.domain,
                "--language=Python",
                "--keyword=gettext_noop",
                "--keyword=gettext_lazy",
                "--keyword=ngettext_lazy:1,2",
                "--keyword=pgettext:1c,2",
                "--keyword=npgettext:1c,2,3",
                "--keyword=pgettext_lazy:1c,2",
                "--keyword=npgettext_lazy:1c,2,3",
                "--output=-",
            ]

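Neither gettext_lz nor a short alias like _lz appears in that keyword list, so xgettext skips strings marked with it. A plain _ alias still works because xgettext recognizes _ as a keyword by default for Python.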
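Why the alias goes unnoticed can be illustrated with a toy keyword-based extractor. This is only a sketch built on Python's ast module, not how xgettext works internally; extract_marked_strings and the sample source are hypothetical names made up for the example.

```python
import ast


def extract_marked_strings(source, keywords):
    """Collect the first string argument of calls to any name in keywords."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in keywords
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append(node.args[0].value)
    return found


SOURCE = '''
title = gettext_lazy("About LaLoka Labs")
help_text = _lz("This is the help text")
'''

known_keywords = {"_", "gettext_noop", "gettext_lazy"}

# Only the call through a known keyword is extracted; _lz is ignored.
without_alias = extract_marked_strings(SOURCE, known_keywords)

# Once the alias is registered as a keyword, both strings are found.
with_alias = extract_marked_strings(SOURCE, known_keywords | {"_lz"})

print(without_alias)
print(with_alias)
```

For a real project, the Django documentation describes subclassing the makemessages command and extending its xgettext_options attribute to pass extra --keyword flags, which is the equivalent of adding "_lz" to the set above.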

Some more examples to help you understand when to use gettext and gettext_lazy:-

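from django.db import models
from django.utils.translation import gettext as _, gettext_lazy as _lz

class Person(models.Model):
    # _lz is a custom alias: makemessages will not extract it unless
    # xgettext is told about the keyword. max_length is an assumed value.
    name = models.CharField(max_length=100, help_text=_lz('This is the help text'))

    def say_something(self):
        return _("Hello world")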

The name field definition above is evaluated when Django starts, either through runserver or an Apache restart, so its help text needs to be lazy. say_something() only runs when someone calls it, for example person.say_something() in the console, so it doesn't need to be lazy.

Now, can anyone tell what this means (this is from an old django project settings.py):-

gettext = lambda s: s
LANGUAGES = (
    ('en', gettext('English')),
    ('ja', gettext('日本語')),
)

Some would answer that it avoids a circular import: the translation module depends on settings, so settings.py can't import gettext from it. The identity lambda just returns the string unchanged while still marking it so the extraction step can find it.

However, in recent versions of Django this workaround is no longer needed, and django.utils.translation can be imported from settings.py. Here's what the comment in the Django source says about it:-

# Here be dragons, so a short explanation of the logic won't hurt:
# We are trying to solve two problems: (1) access settings, in particular
# settings.USE_I18N, as late as possible, so that modules can be imported
# without having to first configure Django, and (2) if some other code creates
# a reference to one of these functions, don't break that reference when we
# replace the functions with their real counterparts (once we do access the
# settings).