xgettext.pl - xgettext(1)-like interface for .tmpl strings extraction
This is an experimental script based on the modularized text-extract2.pl script. It has behaviour similar to xgettext(1), and generates gettext-compatible output files.
A gettext-like format provides the following advantages:
Translation to non-English-like languages with different word order: gettext's c-format strings can theoretically be emulated if we are able to do some analysis on the .tmpl input and treat <TMPL_VAR> in a way similar to %s.
Context for the extracted strings: the gettext format provides the filenames and line numbers where each string can be found. The translator can read the source file and see the context, in case the string by itself can mean several different things.
Place for the translator to add comments about the translations.
Gettext-compatible tools, if any, might be usable if we adopt the gettext format.
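To make the advantages listed above concrete, here is a minimal sketch of what a gettext-style entry carries. It is illustrative only: the helper names toCFormat, poQuote and formatPoEntry, the sample file name, line number and comment are all hypothetical and are not taken from xgettext.pl. The <TMPL_VAR> substitution shows the emulation idea described above, not something the script is confirmed to do.

```javascript
// Hypothetical helper: replace each <TMPL_VAR ...> tag with %s,
// mimicking gettext's c-format placeholders.
function toCFormat(text) {
  return text.replace(/<TMPL_VAR\b[^>]*>/gi, "%s");
}

// Hypothetical helper: escape a string for use inside a PO "..." literal.
function poQuote(s) {
  const escaped = s
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
  return '"' + escaped + '"';
}

// Build one PO entry: translator comment, source reference, flag, msgid, msgstr.
function formatPoEntry(entry) {
  const lines = [];
  if (entry.comment) {
    lines.push("# " + entry.comment);               // translator comment
  }
  lines.push("#: " + entry.file + ":" + entry.line); // where the string occurs
  const msgid = toCFormat(entry.text);
  if (msgid.includes("%s")) {
    lines.push("#, c-format");                      // flag for %s-style strings
  }
  lines.push("msgid " + poQuote(msgid));
  lines.push("msgstr " + poQuote(entry.translation || ""));
  return lines.join("\n");
}

const sample = formatPoEntry({
  file: "opac-main.tmpl", // hypothetical file name
  line: 12,               // hypothetical line number
  text: 'Welcome, <TMPL_VAR NAME="user">!',
  comment: "Greeting shown after login",
});
console.log(sample);
```

Because the placeholder is part of the msgid, a translator working in a language with a different word order can move %s to wherever it belongs in the translated sentence.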
This script has been in use for over a year and should be reasonably stable. Nevertheless, it is still somewhat experimental and some issues remain.
Please refer to the explanation in tmpl_process3.pl for further details.
If you want to generate GNOME-style POTFILES.in files, such files (passed to -f) can be generated thus:
    (cd ../.. && find koha-tmpl/opac-tmpl/default/en \
        -name \*.inc -o -name \*.tmpl) > opac/POTFILES.in
    (cd ../.. && find koha-tmpl/intranet-tmpl/default/en \
        -name \*.inc -o -name \*.tmpl) > intranet/POTFILES.in
This is, however, quite pointless, because the "create" and "update" actions have already been implemented in tmpl_process3.pl.
In the SCRIPT elements, the script will attempt to scan for _("string literal") patterns, and extract the string literal as a translatable string.
Note that the C-like _(...) notation is required.
The JavaScript must actually define a _ function so that the code remains correct JavaScript. One suitable definition of such a function is:
function _(s) { return s } // dummy function for gettext
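The scan described above can be approximated as follows. This is a minimal sketch under stated assumptions, not the script's actual implementation: xgettext.pl is written in Perl and its real scanner may differ. The function name extractTranslatable is hypothetical, only double-quoted literals are handled, and _( occurrences inside comments or other strings are not treated specially.

```javascript
// Hypothetical extractor: find _("...") calls in JavaScript source text
// and return the raw contents of each string literal.
function extractTranslatable(scriptSource) {
  // (?<![\w$]) keeps identifiers that merely end in _ (e.g. foo_("x"))
  // from matching. The literal part allows backslash escapes such as \".
  const pattern = /(?<![\w$])_\(\s*"((?:[^"\\]|\\.)*)"\s*\)/g;
  const found = [];
  for (const match of scriptSource.matchAll(pattern)) {
    // match[1] is the text between the quotes; escapes are left undecoded.
    found.push(match[1]);
  }
  return found;
}

const source =
  'if (confirm(_("Are you sure?"))) { remove(); } var label = "not marked";';
console.log(extractTranslatable(source));
```

Only the literal wrapped in _(...) is picked up; the unmarked "not marked" string is ignored, which is why the C-like notation is required.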
See also: tmpl_process3.pl, xgettext(1), Locale::PO(3), translator_doc.txt
There are probably some bugs. Those related to scanning of <INPUT> tags seem especially likely.
The script's diagnostics are probably too verbose.
When a <TMPL_VAR> within a JavaScript-related attribute is detected, the script currently displays no warnings at all. It might be good to display some kind of warning.
The script's sort order (-s option) seems to differ from that of the real xgettext(1). As a result, translation strings in the generated PO file move about spuriously when tmpl_process3.pl calls msgmerge(1) to update the PO file.
If a JavaScript string has leading spaces, the script generates strings with spurious leading spaces, which then fail to match when the translated files are actually generated.