Developer Area/Language strings

From Mahara Wiki
Revision as of 18:46, 1 August 2013 by Aaronw (talk | contribs) (Concatenation is bad for translation)

For internationalization (i18n) and localization (l10n) reasons, all strings that are displayed to the user in Mahara are stored in "lang strings" and printed via the get_string($identifier, $section) function, rather than being placed directly in the code. This function checks whether the user has any foreign-language langpacks installed, or any custom lang files, and uses the string from those if present. Otherwise, it falls back to the core lang files.

How to use get_string() in PHP

The function get_string() takes these parameters:

  • identifier: The name of the lang string, unique within its section. The identifier must be acceptable as a PHP array key. By convention, it's often the same as the English contents of the string, in all lowercase without spaces. Alternatively, it can represent the purpose of the string rather than its exact wording.
  • section: The file that the lang string lives in. This is technically optional; if left off, it will default to "mahara", a section that contains many common core strings.
  • params: (Optional) A lang string can contain one or more sprintf() params. If the lang string contains them, the get_string() call that uses it must supply a matching value for each, and these values are substituted into the string. Params are helpful for translation into languages with different word orders.

Examples:

<?php
$yesstr = get_string('yes'); // lives in lang/en.utf8/mahara.php
$clamstr = get_string('pathtoclam', 'admin'); // lives in lang/en.utf8/admin.php
$blogstr = get_string('blog', 'artefact.blog'); // lives in artefact/blog/lang/en.utf8/artefact.blog.php
$copyrightstr = get_string('feedrights', 'artefact.blog', $USER->displayname); // Lang string with one param

// A pluralizable string. The first argument is the count of pluralizable items; get_string will use this
// to determine whether to use the singular or plural (or which plural, in non-English languages with more than one plural form)
$updatefilesstr = get_string('updatednfiles', 'mahara', 5);
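
To see why params help with differing word orders, here is a minimal sketch using plain sprintf() rather than get_string(). The two format strings are hypothetical and are not taken from any Mahara lang file; they only show that positional specifiers such as %1$s and %2$s let a translation reorder the values without any change to the calling code.

```php
<?php
// Hypothetical format strings for illustration only.
$original   = '%1$s commented on %2$s';
$translated = '%2$s: comment by %1$s';  // a translation that needs the values in the other order

// The caller passes the values in the same order either way.
$a = sprintf($original, 'Alice', 'My journal');
$b = sprintf($translated, 'Alice', 'My journal');

echo $a, "\n";  // Alice commented on My journal
echo $b, "\n";  // My journal: comment by Alice
```

Note the single quotes around the format strings: inside double quotes PHP would try to interpolate $s as a variable.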

How to use lang strings from a Dwoo template

You can also access lang strings from inside Dwoo templates, with the {str} tag:

{str tag='identifier' section='section' arg1='first param value' arg2='second param value'}

The code that makes this work is in htdocs/lib/dwoo/mahara/plugins/function.str.php

How to use lang strings from Javascript

You can even use lang strings in Javascript. First, you have to preload them into the page by providing a $pagestrings argument to the smarty() function...

<?php
// Providing a $pagestrings variable to smarty()
$pagestrings = array(
     'admin' => array(
         'discardpageedits',
         'pathtoclam'
     ),
     'mahara' => array('yes')
);

... and then, you can access them from Javascript using the Javascript get_string() function, which is similar to the PHP get_string(), except that it leaves out the "section". (NOTE: This means you can't include two language strings with the same identifier and different sections.) It also only accepts substitution params of type "%s".

// This is happening in Javascript
alert(get_string('yes'));
confirm(get_string('discardpageedits', 'first param value'));
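
One way to picture the identifier collision mentioned in the note above is the following PHP sketch. It assumes the preloaded strings end up in a single map keyed by identifier only; the function flatten_pagestrings(), the $catalog array and the 'title' identifier are all hypothetical, and Mahara's actual implementation may differ.

```php
<?php
// Hypothetical catalog of strings, keyed by section and then identifier.
$catalog = array(
    'admin'  => array('discardpageedits' => 'Discard your changes?', 'title' => 'Admin title'),
    'mahara' => array('yes' => 'Yes', 'title' => 'Site title'),
);

// Same shape as the $pagestrings argument shown above.
$pagestrings = array(
    'admin'  => array('discardpageedits', 'title'),
    'mahara' => array('yes', 'title'),
);

// Flatten to identifier => string, dropping the section (assumed behaviour).
function flatten_pagestrings(array $pagestrings, array $catalog) {
    $flat = array();
    foreach ($pagestrings as $section => $identifiers) {
        foreach ($identifiers as $identifier) {
            // A later section silently overwrites an earlier one with the same identifier.
            $flat[$identifier] = $catalog[$section][$identifier];
        }
    }
    return $flat;
}

$flat = flatten_pagestrings($pagestrings, $catalog);
echo count($flat), "\n";     // 3, not 4: the two 'title' strings collided
echo $flat['title'], "\n";   // Site title
```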

A number of Javascript strings are also hard-coded to be always available via get_string() in Javascript, if certain Javascript files are included by Dwoo. See the function "jsstrings()" in "htdocs/lib/web.php" for this list.

Anatomy of a lang file

A lang file will be named {section}.php, where "section" is the section value to pass to get_string(). It will contain a series of lines adding keys to an array called "$string". By convention it begins with a check to prevent direct access from a browser.

<?php 
defined('INTERNAL') || die();

$string['changepassworddesc'] = 'New password';
$string['changepasswordotherinterface'] = 'You may <a href="%s">change your password</a> through a different interface.';
$string['oldpasswordincorrect'] = 'This is not your current password.';

// A pluralizable string. In English, the singular should be mapped to key 0, the plural to key 1. For other languages, it depends on the
// pluralfunction defined in their langconfig.php
$string['updatednfiles'] = array(
    0 => 'You have updated %s file.',
    1 => 'You have updated %s files.',
);
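
The following sketch shows how a file of this shape could be read, and why the INTERNAL check matters: the file only runs to completion when the including script has defined INTERNAL. The load_lang_file() function is hypothetical and is not Mahara's real loader.

```php
<?php
// Hypothetical loader for illustration; Mahara's real loading code may differ.
defined('INTERNAL') || define('INTERNAL', 1);

function load_lang_file($path) {
    $string = array();   // the included file adds its keys to this local array
    include $path;
    return $string;
}

// Write a tiny lang file to a temporary location so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'lang');
file_put_contents(
    $path,
    "<?php\ndefined('INTERNAL') || die();\n\$string['oldpasswordincorrect'] = 'This is not your current password.';\n"
);

$strings = load_lang_file($path);
unlink($path);

echo $strings['oldpasswordincorrect'], "\n";  // This is not your current password.
```

If a browser requested the lang file directly, INTERNAL would not be defined and the die() call would stop the script before any string was set.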

Plural strings

There's a special syntax that should be followed for strings that have words that might be pluralized, such as "You have 3 blogs"/"You have 1 blog":

First, when calling get_string(), pass the count of items as the first param after the section:

get_string('fileattachedtoportfolioitems', 'artefact.file', $numitems);

Then, in the lang file, you make the string an array. In English, the array item with key 0 should be the singular form of the string, while the array item with key 1 should be the plural.

$string['fileattachedtoportfolioitems'] = array(
    0 => 'This file is attached to one other item in your portfolio.',
    1 => 'This file is attached to %s other items in your portfolio.',
);

This syntax allows for other languages with different pluralization structures than English. Each language's langconfig.php includes a pluralfunction, which determines what key it will return for the count number. English returns either 0 for singular, or 1 for plural. Other languages may return more or fewer keys. So, the same string in another langpack might look like this:

$string['fileattachedtoportfolioitems'] = array(
    0 => 'Ta datoteka je pripeta k %s drugemu elementu v vašem listovniku.',
    1 => 'Ta datoteka je pripeta k %s drugima elementoma v vašem listovniku.',
    2 => 'Ta datoteka je pripeta k %s drugim elementom v vašem listovniku.',
    3 => 'Ta datoteka je pripeta k %s drugim elementom v vašem listovniku.',
);

See Language Packs/Plural forms and https://bugs.launchpad.net/mahara/+bug/901051 for more information.
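
The selection step described above can be sketched as follows. The function names are hypothetical. The two-form function matches the English behaviour described on this page; the four-form function follows the rule commonly used for Slovenian in gettext, which is an assumption here and may not match the pluralfunction in any actual Mahara langpack.

```php
<?php
// English-style rule: key 0 for exactly one item, key 1 otherwise.
function plural_two_forms($n) {
    return ($n == 1) ? 0 : 1;
}

// Four-form rule (assumed, modelled on the common gettext rule for Slovenian).
function plural_four_forms($n) {
    $n = $n % 100;
    if ($n == 1) return 0;
    if ($n == 2) return 1;
    if ($n == 3 || $n == 4) return 2;
    return 3;
}

// Pick the form whose key the plural function returns, then substitute the count.
function pick_plural(array $forms, $pluralfunction, $count) {
    $key = call_user_func($pluralfunction, $count);
    return sprintf($forms[$key], $count);
}

$english = array(
    0 => 'You have updated %s file.',
    1 => 'You have updated %s files.',
);

echo pick_plural($english, 'plural_two_forms', 1), "\n";  // You have updated 1 file.
echo pick_plural($english, 'plural_two_forms', 5), "\n";  // You have updated 5 files.
echo plural_four_forms(102), "\n";                        // 1
```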

Note: Mahara contains quite a few old plural strings which don't follow this format. It'd be great if you can fix them when you find them. The corrected string should use a different name than the old one, so that translators will be prompted to translate it. Only remove the old string once you've removed all references to it from the core code.

Where the lang files are

  • Core lang files live under $cfg->dirroot/lang/en.utf8/{section}.php
  • Plugin lang files live under $cfg->dirroot/{plugintype}/{pluginname}/lang/en.utf8/{plugintype}.{pluginname}.php
    • Note that the "section" to pass to get_string() for a plugin is "{plugintype}.{pluginname}". For example: "artefact.blog", "import.leap2a", "blocktype.contactinfo"
    • Subplugins, such as a blocktype that belongs to an artefact, live under {$cfg->dirroot}/artefact/{pluginname}/blocktype/{blockname}/lang/en.utf8/blocktype.{blockname}.php and have "blocktype.{blockname}" as their section.
  • Foreign language langpacks are installed into your dataroot directory: $cfg->dataroot/langpacks/{langcode}
    • {langcode} will be the code for the language. For example "pt.utf8", "es.utf8", "en_US.utf8" etc
  • Local lang files live under $cfg->dirroot/local/lang/{langcode}/{section}.php
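
The path patterns above can be turned into small helper functions, as in this sketch. The function names are hypothetical, and subplugins and langpacks are left out because they do not follow the simple patterns handled here.

```php
<?php
// Core English lang file for a section such as "mahara" or "admin".
function core_lang_path($dirroot, $section) {
    return "$dirroot/lang/en.utf8/$section.php";
}

// Plugin English lang file for a section such as "artefact.blog".
function plugin_lang_path($dirroot, $section) {
    list($plugintype, $pluginname) = explode('.', $section, 2);
    return "$dirroot/$plugintype/$pluginname/lang/en.utf8/$section.php";
}

// Local override file for any section and language code.
function local_lang_path($dirroot, $langcode, $section) {
    return "$dirroot/local/lang/$langcode/$section.php";
}

$dirroot = '/var/www/mahara';
echo core_lang_path($dirroot, 'admin'), "\n";
// /var/www/mahara/lang/en.utf8/admin.php
echo plugin_lang_path($dirroot, 'artefact.blog'), "\n";
// /var/www/mahara/artefact/blog/lang/en.utf8/artefact.blog.php
echo local_lang_path($dirroot, 'pt.utf8', 'artefact.blog'), "\n";
// /var/www/mahara/local/lang/pt.utf8/artefact.blog.php
```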

Translations

The main point of this system is to allow for Mahara to be translated. See the "langpacks" documentation for more on that. Basically, you download a langpack from langpacks.mahara.org, unzip it, and put it in $cfg->dataroot/langpacks/. Then, users are presented with a language selection menu at the login screen.

If a particular lang string is not present in the langpack, then the English language string from Mahara core is used, unless the language specifies a parent language and the parent language's langpack is installed.
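
The fallback order just described (the user's language, then its parent language if one is installed, then English) can be sketched like this. The data, the resolve_string() function and the "pt_BR.utf8" language code are hypothetical and only illustrate the order of the lookups.

```php
<?php
// Hypothetical installed languages, each mapping identifiers to strings.
$langs = array(
    'en.utf8'    => array('yes' => 'Yes', 'no' => 'No', 'login' => 'Log in'),
    'pt.utf8'    => array('yes' => 'Sim', 'no' => 'Não'),
    'pt_BR.utf8' => array('yes' => 'Sim!'),
);
// Hypothetical parent-language declarations.
$parents = array('pt_BR.utf8' => 'pt.utf8');

function resolve_string($identifier, $lang, array $langs, array $parents) {
    $chain = array($lang);
    if (isset($parents[$lang])) {
        $chain[] = $parents[$lang];
    }
    $chain[] = 'en.utf8';  // English from core is the final fallback
    foreach ($chain as $code) {
        // A language that is not installed has no entry and is simply skipped.
        if (isset($langs[$code][$identifier])) {
            return $langs[$code][$identifier];
        }
    }
    return null;
}

echo resolve_string('yes', 'pt_BR.utf8', $langs, $parents), "\n";    // Sim!
echo resolve_string('no', 'pt_BR.utf8', $langs, $parents), "\n";     // Não
echo resolve_string('login', 'pt_BR.utf8', $langs, $parents), "\n";  // Log in
```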

langconfig.php

Languages contain a small amount of configuration data. This goes in a "langconfig.php" core lang file. The following values are the most important:

  • thislanguage: The name of the language, in the language. This is displayed in the language selection menu that users see.
  • locale: A list of computer locale strings which this language matches. See the other lang packs for an idea of what these should look like.
  • parentlanguage: (Optional) If this is supplied, then for untranslated strings Mahara will attempt to find a translation in the parent language (if it's installed).
  • pluralization:
    • See the "Plural strings" section above for the plural strings API.
    • For langpacks converted from PO Format, these values will be automatically generated if you include a "Plural-Forms:" header
    • pluralfunction: The name of a PHP function that will indicate which pluralization rule should be used for a given count. This function should take exactly one integer argument (the count of items) and will return a key which indicates which pluralization form should be used. For instance for English, the rule returns a "0" if the count is one, and a "1" if the count is anything else. All the lang files for the language will then define their plural strings as arrays, with a value for each possible key this function can return.
    • pluralrule: The Javascript equivalent of pluralfunction. This should be a snippet of Javascript that will evaluate an integer stored in the variable "n", and will return exactly the same value as the pluralfunction if it received n as its argument.
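
As a rough sketch, the values listed above might look like this for English. It assumes that langconfig.php sets them as ordinary $string entries, like any other lang file. The key names follow the list above, but the exact keys, the format of the locale list and the function name plural_en_example are assumptions; check an existing langpack for the real ones.

```php
<?php
// Illustrative values only; compare with a real langconfig.php before relying on them.
$string = array();
$string['thislanguage']   = 'English';
$string['locale']         = 'en_US.utf8,en_GB.utf8';  // assumed list format
// 'parentlanguage' is optional and is left out here.
$string['pluralfunction'] = 'plural_en_example';      // hypothetical function name
$string['pluralrule']     = '(n == 1) ? 0 : 1';       // Javascript snippet evaluating "n"

// Takes exactly one integer (the count) and returns the key of the plural form to use.
function plural_en_example($n) {
    return ($n == 1) ? 0 : 1;
}

echo call_user_func($string['pluralfunction'], 1), "\n";  // 0
echo call_user_func($string['pluralfunction'], 3), "\n";  // 1
```

The pluralrule snippet is written so that it gives the same 0 or 1 result in Javascript as the PHP function does.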

Custom lang strings in /local

Many Mahara installations may wish to overwrite only a few lang strings. The easiest way to do this is to create custom lang files under the /local directory. If present, the strings in these files will take priority over strings in the core lang files or langpack lang files.

Custom lang files don't need to translate 100% of the lang file they're overriding. They can contain as few strings as you care to override.

Example:

<?php
// This file lives under /var/www/mahara/local/lang/en.utf8/mahara.php
defined('INTERNAL') || die();

$string['yes'] = 'Yessir!';
$string['login'] = 'Sign in!';

You can place foreign language files and plugin language files under local/lang as well. These are all acceptable:

  • local/lang/en.utf8/mahara.php
  • local/lang/en.utf8/blocktype.contactinfo.php
  • local/lang/pt.utf8/artefact.blog.php
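
The effect of a partial override can be pictured with PHP's array union operator, which keeps the left-hand value whenever both arrays share a key. This only illustrates the priority rule; it is not a description of how Mahara merges the files internally, and the core values shown are made up for the example.

```php
<?php
// Made-up core strings for illustration.
$core = array(
    'yes'   => 'Yes',
    'no'    => 'No',
    'login' => 'Log in',
);

// A local file that overrides only two of them.
$local = array(
    'yes'   => 'Yessir!',
    'login' => 'Sign in!',
);

// Union: entries from $local win; everything else comes from $core.
$effective = $local + $core;

echo $effective['yes'], "\n";    // Yessir!
echo $effective['no'], "\n";     // No
echo $effective['login'], "\n";  // Sign in!
```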

Concatenation is bad for translation

Because word orders are different in different languages, when possible you should avoid concatenating lang strings together. You should either use lang string parameters, or just use multiple language strings.

BAD:

<?php
// Works okay for English, but what about Spanish, where the adjective should follow the noun?
$yellowdogstr = get_string('yellow') . " " . get_string('dog');
$greendogstr = get_string('green') . " " . get_string('dog');

BETTER:

<?php
// English: $string['coloreddog'] = '%s dog';
// Spanish: $string['coloreddog'] = 'perro %s';
$yellowdogstr = get_string('coloreddog', 'artefact.dog', get_string('yellow'));
$greendogstr = get_string('coloreddog', 'artefact.dog', get_string('green'));

BEST:

<?php
// In some languages, the word for "dog" might change when it's paired with yellow, or the word
// for yellow might change when paired with dog
// English: $string['dogyellow'] = 'yellow dog';
// Spanish: $string['dogyellow'] = 'perro amarillo';
// Australian English: $string['dogyellow'] = 'dingo';
$yellowdogstr = get_string('dogyellow');
$greendogstr = get_string('doggreen');

And on a similar note, definitely don't concatenate word parts together.

BAD:

<?php
$blogstr = get_string('blog', 'artefact.blog');
$pluralblog = get_string('blog', 'artefact.blog') . get_string('s');

GOOD:

<?php
$blogstr = get_string('blog', 'artefact.blog');
$pluralblog = get_string('blogs', 'artefact.blog');

// Or if you've got a specific number of blogs, use the plural strings API
$numblogs = get_blog_count();
$blogstr = get_string('nblogs', 'artefact.blog', $numblogs);

This approach has the downside of leading to a proliferation of language strings, but unfortunately it's the only way to achieve clean translations among different languages with wildly different grammars.

Alphabetize your lang files

From an implementation standpoint, it doesn't matter what order the lang strings are in, within a lang file. However, from a human-readability standpoint, they should be in alphabetical order by identifier.

Many developers are tempted to group them together by function, but any functional grouping scheme eventually falls apart as new strings are added which defy the classification scheme. The end result is a completely disordered file where any particular lang string could be in any place. This is how most of the lang files in Mahara core are.

So just go ahead and alphabetize the lang strings when creating a new lang file. If you want to ensure that related strings wind up next to each other, give them names that classify them by type.

BAD:

<?php
// Strings are sorta grouped together by functionality, in semi-random order
$string['baseline'] = 'Baseline';
$string['top'] = 'Top';
$string['middle'] = 'Middle';
$string['bottom'] = 'Bottom';
$string['texttop'] = 'Text top';
$string['textbottom'] = 'Text bottom';
$string['left'] = 'Left';
$string['right'] = 'Right';
$string['src'] = 'Image URL';
$string['image_list'] = 'Attached image';
$string['alt'] = 'Description';

$string['copyfull'] = 'Others will get their own copy of your %s';
$string['copyreference'] = 'Others may display your %s in their page';
$string['copynocopy'] = 'Skip this block entirely when copying the page'; 

$string['viewposts'] = 'Copied entries (%s)';
$string['postscopiedfromview'] = 'Entries copied from %s'; 

$string['youhavenoblogs'] = 'You have no journals.';
$string['youhaveoneblog'] = 'You have 1 journal.';
$string['youhaveblogs'] = 'You have %s journals.'; 

GOOD:

<?php
// Strings are in alphabetical order by identifier
// Strings with similar purposes have names that start with their purpose and then a 
//    description, so that they wind up next to each other alphabetically
$string['copyfull'] = 'Others will get their own copy of your %s';
$string['copynocopy'] = 'Skip this block entirely when copying the page';
$string['copyreference'] = 'Others may display your %s in their page';
$string['imagealt'] = 'Description';
$string['imagelist'] = 'Attached image';
$string['imagesrc'] = 'Image URL';
$string['positionbaseline'] = 'Baseline';
$string['positionbottom'] = 'Bottom';
$string['positionleft'] = 'Left';
$string['positionmiddle'] = 'Middle';
$string['positionright'] = 'Right';
$string['positiontextbottom'] = 'Text bottom';
$string['positiontexttop'] = 'Text top';
$string['positiontop'] = 'Top';
$string['postscopiedfromview'] = 'Entries copied from %s';
$string['viewposts'] = 'Copied entries (%s)';
$string['youhaveblogs'] = 'You have %s journals.';
$string['youhavenoblogs'] = 'You have no journals.';
$string['youhaveoneblog'] = 'You have 1 journal.';
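
A small check like the following could help keep a lang file alphabetized. It is a hypothetical helper, not part of Mahara: it walks the identifiers in the order they were defined and reports each one that sorts before its predecessor.

```php
<?php
// Returns the identifiers that appear out of alphabetical order.
function unsorted_identifiers(array $string) {
    $keys = array_keys($string);
    $out = array();
    for ($i = 1; $i < count($keys); $i++) {
        // Cast to string because PHP turns numeric-looking keys into integers.
        if (strcmp((string) $keys[$i - 1], (string) $keys[$i]) > 0) {
            $out[] = $keys[$i];
        }
    }
    return $out;
}

$string = array();
$string['copyfull'] = 'Others will get their own copy of your %s';
$string['copyreference'] = 'Others may display your %s in their page';
$string['copynocopy'] = 'Skip this block entirely when copying the page';

print_r(unsorted_identifiers($string));  // reports 'copynocopy'
```

An empty result means the identifiers are already in order.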