Developer Area/Language Pack Generation
From Mahara Wiki
Revision as of 17:21, 11 April 2016 by Aaronw
Two main scripts in the mahara-langpacks directory of the mahara-scripts repository push new English language strings from Mahara into Launchpad, and then pull non-English translations out of Launchpad to publish them on langpacks.mahara.org.
- update-pot.sh polls the Mahara code's htdocs/lang/en.utf8 directory for changes to strings, converts Mahara's lang PHP files into a .pot file, and pushes this updated file into a Bazaar branch in the mahara-lang project on Launchpad.
- Launchpad periodically imports the English-language .pot file from the Bazaar branch, and uses it to populate its web-based translation interface for all the other languages.
- Launchpad periodically exports the latest translation data for all languages into a separate .po file for each language, and publishes these files onto a separate Bazaar export branch.
- langpacks.sh polls the mahara-lang repositories on Launchpad, and generates the official Mahara language pack tarballs at http://langpacks.mahara.org.
mahara-scripts also has a debian/ directory, which is used to build a package called custom-site-mahara-langpacks_x.y_all.deb. This package installs both scripts and their dependencies, and sets them up to run on cron. (Currently the scripts are installed and run on the same Catalyst IT servers that host langpacks.mahara.org itself.)
The following is a general summary of what these scripts are trying to do.
- 1 Generation of .pot files for Launchpad
- 2 Launchpad's side of things
- 3 Generation of language packs
- 4 Manually update language packs
- 5 Manually create a tar ball of a language for testing
- 6 Installation of these scripts
- 7 Git-based translation branches
- 8 Combining Launchpad-based and non-Launchpad translations
Generation of .pot files for Launchpad
The update-pot.sh script runs once a day as the maharabot user (at 7:52 AM NZDT):
- Checks current branches at https://git.mahara.org/mahara/mahara.git for updates to English language files.
- If there have been changes, runs a PHP script called php-po.php to generate a single mahara.pot file (for each branch) from the lang/en.utf8 directories in the Mahara HEAD commit.
- On the master branch, it may also create .po files for existing translations, when there have been changes to Mahara strings that don't need to be translated again (e.g. typo fixes).
- Pushes updated .pot and .po files to lp:~mahara-lang/mahara-lang/<branch>, where Launchpad will import them into its web-based translation interface.
The package takes care of most of the necessary dependencies except for the maharabot user's ssh key, needed for the bzr push to Launchpad. That still needs to be installed manually on the server.
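The change check in the first two steps can be pictured with plain Git. The sketch below is an illustration only and is not taken from update-pot.sh: the function name en_strings_changed and the idea of passing in the previously processed commit are assumptions, and the default path is simply the htdocs/lang/en.utf8 directory mentioned earlier.

```shell
# Sketch only; not the actual update-pot.sh logic.
# en_strings_changed REPO OLD NEW [PATH]
# Exit status: 0 = the English strings differ between OLD and NEW,
#              1 = no difference, 2 = git itself failed.
en_strings_changed() {
    esc_repo=$1
    esc_old=$2
    esc_new=$3
    # Assumed default: the core English lang directory.
    esc_path=${4:-htdocs/lang/en.utf8}
    esc_status=0
    # "git diff --quiet" exits 0 when nothing differs and 1 when something does.
    git -C "$esc_repo" diff --quiet "$esc_old" "$esc_new" -- "$esc_path" || esc_status=$?
    case $esc_status in
        0) return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

A caller could remember the commit id it last processed for each branch and only regenerate mahara.pot when en_strings_changed succeeds for that commit and the new HEAD.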
Launchpad's side of things
Because Launchpad requires .po or .mo files for its translation interface, and Mahara itself doesn't directly use either of those formats, we use a proxy project to translate Mahara. This project is called "mahara-lang" (aka Mahara Translations). It doesn't have releases like the normal Mahara project, but it does have a separate series for each Mahara series. We basically turn Mahara's PHP lang string files into a POT file, which is the only content of this "project", and then let Launchpad's translation interface work with that.
The Launchpad translation interface lets you configure a Bazaar import branch for each series. It expects the import branch to contain one or more English-language PO or POT files. We push to this branch from update-pot.sh. Launchpad checks the branch periodically, notices our updates, and stores them on its servers, where it uses them to populate its web-based translation interface.
Human translators go to the Launchpad web interface, and translate the strings for a particular Mahara series and language. Launchpad saves these changes internally.
Launchpad also lets you configure an export branch for each series. Once a day, Launchpad automatically takes its internally stored translation data for all the languages on a series, converts it into a separate PO file for each language, and commits those files into the export branch.
Generation of language packs
The langpacks.sh script runs once per hour as the maharabot user:
- Checks all the languages in the language-repos.txt file for newly translated strings.
- The script is hard-coded to check for the latest version of this file in the mahara-scripts git repo, or to use a local version on the server. If a local version exists, it takes precedence.
- For each Mahara series, Launchpad will have an export branch in Bazaar, and each branch will contain a single .po file for each language, exported by Launchpad as described above.
- The last commit id for each language/branch is stored in the file /var/lib/mahara-langpacks/tarballs/mahara-langpacks.last (in the script's working directory). If you ever need to force regeneration of a particular language pack, you will probably need to edit that file and remove the language and/or branch.
- The script po-php.pl converts the .po file into the directory tree of PHP and HTML files required by Mahara.
- Tarballs of these directories are put into the document root of the http://langpacks.mahara.org site, and the index.html and status.html files are generated.
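Forcing a rebuild by editing mahara-langpacks.last could look something like the following. The layout of that file is not described on this page, so the sketch assumes a hypothetical format of one "<language> <branch> <commit id>" entry per line; check the real file before adapting it. The function name forget_langpack is also made up for illustration.

```shell
# Sketch only. ASSUMPTION: every line of the file looks like
#   <language> <branch> <commit id>
# The real mahara-langpacks.last may use a different layout.
#
# forget_langpack FILE LANGUAGE [BRANCH]
# Removes the remembered commit id for LANGUAGE (all branches, or only
# BRANCH if one is given). A copy of the old file is kept as FILE.bak.
forget_langpack() {
    fl_file=$1
    fl_lang=$2
    fl_branch=${3:-}
    cp -p "$fl_file" "$fl_file.bak" || return 1
    # The appended "" forces string comparison, so a branch such as
    # "16.04" is not compared as a number.
    awk -v lang="$fl_lang" -v branch="$fl_branch" '
        ($1 "") == lang && (branch == "" || ($2 "") == branch) { next }
        { print }
    ' "$fl_file.bak" > "$fl_file"
}
```

Under that assumption, running forget_langpack on the .last file with a language code and branch name, as a user who can write to the file, would make the next langpacks.sh run treat that language and branch as never having been built.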
Manually update language packs
If you are in charge of Mahara translation management, you can update the language packs manually. This is for the case where the langpack scripts cannot be run on the server.
- You should have permission to access the langpacks.mahara.org server.
Here are the instructions:
- Update environment variables in /etc/mahara-langpacks.conf
- Run the script update-pot.sh
- Run the script langpacks.sh
- Copy the directory mahara-langpacks to the server
- Use 'scp' to copy the langpacks directory from your local machine to a temporary directory on the server.
- Use 'sudo -u maharabot cp -ar ...' to copy to the langpacks directory.
Manually create a tar ball of a language for testing
You can generate the language pack for a particular language for testing. Here are the instructions:
- Get the language scripts from the mahara-scripts repository
  mkdir -p ~/code
  cd ~/code
  git clone [email protected]:scripts/mahara-scripts.git
- Get the .po file of the language from Launchpad
- Run the script po-php.pl
  cd mahara-scripts/mahara-langpacks
  perl po-php.pl /path/to/po/files/<po file> /path/to/langpacks/<language code>.utf8 <language code>.utf8
- Build the tarball
  cd /path/to/langpacks
  tar -czf <language code>-master.tar.gz <language code>.utf8
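If you build test tarballs often, the last step can be wrapped in a small helper. This is a minimal sketch, not part of mahara-scripts: the function name pack_langpack and the optional series argument are assumptions, and the file naming simply follows the <language code>-master.tar.gz pattern used above.

```shell
# Sketch only; not part of mahara-scripts.
# pack_langpack LANGPACKS_DIR LANGUAGE [SERIES]
# Creates LANGPACKS_DIR/<LANGUAGE>-<SERIES>.tar.gz from the directory
# LANGPACKS_DIR/<LANGUAGE>.utf8. SERIES defaults to "master".
pack_langpack() {
    pl_dir=$1
    pl_lang=$2
    pl_series=${3:-master}
    if [ ! -d "$pl_dir/$pl_lang.utf8" ]; then
        echo "missing directory: $pl_dir/$pl_lang.utf8" >&2
        return 1
    fi
    # -C switches into the langpacks directory first, so the archive
    # contains "<LANGUAGE>.utf8/..." rather than an absolute path.
    tar -czf "$pl_dir/$pl_lang-$pl_series.tar.gz" -C "$pl_dir" "$pl_lang.utf8"
}
```

For example, after po-php.pl has written /path/to/langpacks/de.utf8, calling pack_langpack /path/to/langpacks de would leave de-master.tar.gz next to it.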
Installation of these scripts
The scripts were initially written to run on one server, but more recently the langpacks.mahara.org site has been moved to a cluster of two web servers, each of which has a running copy of the scripts. This poses some challenges:
- The script update-pot.sh needs to run on ONE server only (currently, server 1 in the cluster).
- The script langpacks.sh should run on each server, but at different times in order to avoid overloading. The script stores its data in the "$DATA" directory (defined in the file /etc/mahara-langpacks.conf). This directory should not be shared between the two servers.
- The maharabot user must be created on both servers, and its SSH keys must be registered on Launchpad.net.
- The Bazaar client must be installed and configured on each server (
Git-based translation branches
Before we started using the Launchpad translation interface in 2010, we stored all the translations in PHP files in Git. The plan was to phase all translation branches over to Launchpad, but as of 2016 a couple of them still remain in Git, most prominently the Czech translation.
Fortunately, all the code for handling translations in Git is still present in the scripts mentioned above. The repo list file language-repos.txt indicates, for each language, either that it is stored in Launchpad or the URL of the Git repository it should come from.
update-pot.sh doesn't actually do anything for Git-based translations right now. Prior to our Launchpad switchover, it used to generate the POT files and publish them to langpacks.mahara.org/pot/, where translators could download and use them in their translation tools. Now, if a translator wants to use the POT files directly, they need to fetch them from the Bazaar branch, like so:
bzr branch lp:mahara-lang/16.04
langpacks.pl knows whether each language should be handled by Launchpad or Git, as specified in language-repos.txt. In a Git repository, it looks for branches named after each supported series (master, 15.10_STABLE, 15.04_STABLE, etc).
- Within each branch, it looks for a PO file (i.e. <lang>.po) and uses it the same way it would use a PO file from Launchpad.
- If it doesn't find a PO file, it looks for a lang/<lang>.utf8 directory, and tries to pull translation PHP strings from there. This means that Git translations, unlike Launchpad ones, can use PHP files directly. The PHP files get packaged up into the langpack without a PO conversion step.
Note that the scripts (at present) only read from Git and never write to it. So if the repository where a translation is stored allows anonymous Git read access, everything should be good to go.
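The lookup order for a Git-based language can be sketched as follows. This is an illustration, not the code in the langpack scripts: the function name translation_source is hypothetical, and the sketch assumes the PO file sits at the top level of the checked-out branch, which this page does not confirm.

```shell
# Sketch only; not the real lookup code.
# translation_source CHECKOUT LANGUAGE
# Prints which kind of translation source a checked-out branch offers:
#   "po"   if CHECKOUT/<LANGUAGE>.po exists (ASSUMED location),
#   "php"  if only CHECKOUT/lang/<LANGUAGE>.utf8/ exists,
#   "none" otherwise (and returns a non-zero status).
translation_source() {
    if [ -f "$1/$2.po" ]; then
        echo po
    elif [ -d "$1/lang/$2.utf8" ]; then
        echo php
    else
        echo none
        return 1
    fi
}
```

The PO file is checked first, so a branch that contains both a PO file and PHP files is handled through the same PO conversion path as a Launchpad export.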
Combining Launchpad-based and non-Launchpad translations
This section is theoretical.
Launchpad and offline translations
Some of our translators would prefer to use offline POT-based translation tools rather than Launchpad's translation interface (which is admittedly a little clunky). Here are some ideas about how we might allow them to do that and combine this with the translations in Launchpad.
Currently, the stuff we're importing into Launchpad is actually what Launchpad considers "templates" rather than translations. This means that it only reads in an English-language lang file, and uses that to create a list of strings for other languages to translate. See https://help.launchpad.net/Translations/YourProject/ImportingTemplates
Offline translations can also be imported, by the methods described on this page: https://help.launchpad.net/Translations/YourProject/ImportingTranslations
- Upload a tarball that contains the .po file for a language (but you can only do this for the trunk branch, so it's not very useful).
- Commit the language's .po file into the relevant import branch in Bazaar, and then use the "One-off import" command under the branch's synchronization settings page.
- Or, if you're going to have a regular offline translator, you might set the branch to automatically import translations (although Launchpad warns this might overwrite translations created via Launchpad).
Launchpad and Git translations
If we had a situation where there were some contributors using Git for a language, and others using Launchpad, then we might be able to rig up something like this:
1. Have update-pot.sh pull from the Git repository and push into the Launchpad import branch.
2. Set Launchpad to regularly import translations (and not just templates) from the import branch.
3. Have langpacks.pl export the generated PHP files into the Git repository.
You'd need to give some careful thought about how to avoid a circular over-write sequence, though.
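One generic way to dampen such a loop is to make every export a no-op when nothing has changed, so that a round trip through Launchpad and Git does not produce a fresh commit each time. The sketch below shows the idea for a single file. It is a general idempotency guard and not something the current scripts do, and the function name copy_if_changed is hypothetical.

```shell
# Sketch only; a general "write only on change" guard.
# copy_if_changed SRC DST
# Copies SRC over DST only when DST is missing or its contents differ.
# Prints "updated" or "unchanged" so a caller can decide whether a
# commit is needed at all.
copy_if_changed() {
    if [ -f "$2" ] && cmp -s "$1" "$2"; then
        echo unchanged
    else
        cp "$1" "$2" && echo updated
    fi
}
```

A caller would commit to the Git repository only if at least one file reported "updated". This does not settle which side wins when both have changed, so that question would still need its own rule.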