
NAME

OpenInteract2::Manual::I18N - Internationalization in OpenInteract2

SYNOPSIS

This part of the manual describes the i18n efforts in OpenInteract2, how to create message bundles to distribute with your application, and how you can customize the process.

CAVEATS

I'm a newbie at i18n/l10n efforts. The main purpose is to find the path I think most web applications will tread and make that as simple as possible to navigate. The hooks in the framework to enable localization should be sufficiently unobtrusive so as not to preclude other efforts you may have in this area.

So if you have ideas about how things can be done better or more flexibly, please join the openinteract-dev mailing list and chime in. (See SEE ALSO for more info on the mailing list.)

WRITING LOCALIZED APPLICATIONS

100% localization is hard

Localizing every aspect of your application is extremely difficult. There are the easy things like translating words on the screen, date/time formats and money. Then there are the tough things: what does this shade of yellow mean in China versus Saudi Arabia? What happens if someone reads this sequence of graphics from right-to-left instead of left-to-right? And on and on for many more items you couldn't have even thought up yet.

OpenInteract won't presume to take care of all these for you. Instead we try to make the most common operations as simple as possible. Hopefully that will be sufficient for your needs.

IDENTIFYING LANGUAGE TO USE

We have ways of learning about you...

Ordered from most to least important, here's how we identify the language to use for the current request. First match wins.

Custom language identifier

OI has more hooks than your favorite rock band, and this area is no exception. During the request initialization process we identify all the languages available for this request. Normally this means all the languages for a particular user, but you can override it with GET/POST parameters or a setting in the session.

We also provide the means for you to step in and implement your own -- you could parse it from the URL, use Geo::IP, whatever. Just declare your class in the server configuration key 'language.custom_language_id_class':

 [language]
 ...
 custom_language_id_class = MyApp::I18N::LanguageId

And implement the class method 'identify_languages()', which takes a single argument of the languages identified so far. Here's a naive example:

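 package MyApp::I18N::LanguageId;
 
 use strict;
 use warnings;
 use Geo::IP;
 use OpenInteract2::Context qw( CTX );
 
 # Create the lookup object once, when the class is loaded
 my $gi = Geo::IP->new( GEOIP_STANDARD );
 
 sub identify_languages {
     my ( $class, @oi_langs ) = @_;
     my $country = $gi->country_code_by_addr( CTX->request->remote_host );
     # _some_nifty_method() is a placeholder for your own country-to-language mapping
     my @langs_from_country = $class->_some_nifty_method( $country );
     # Keep what OI already found, then add ours at lower priority
     push @oi_langs, @langs_from_country;
     return @oi_langs;
 }
 
 1;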

Note that if you return a list with entries it replaces what OI has so far identified. We took care of this above by first copying all the languages previously identified then adding to them.
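
The '_some_nifty_method()' call above is left undefined. Here is a minimal sketch of what it might do, assuming a hand-maintained country-to-language table. The table contents and both function names are hypothetical and not part of OI. Duplicates are dropped so that a language OI already identified keeps its earlier position:

```perl
use strict;
use warnings;

# Hypothetical mapping from ISO country code to preferred languages.
my %LANGS_BY_COUNTRY = (
    MX => [ 'es-MX', 'es' ],
    ES => [ 'es' ],
    HK => [ 'en-HK', 'en' ],
);

# Return the languages for a country code, or an empty list if unknown.
sub langs_for_country {
    my ( $country ) = @_;
    return () unless defined $country;
    return @{ $LANGS_BY_COUNTRY{ uc $country } || [] };
}

# Append new languages to those already identified, skipping duplicates
# so the earlier (higher-priority) position wins.
sub merge_languages {
    my ( $existing, $extra ) = @_;
    my %seen;
    return grep { !$seen{ $_ }++ } @{ $existing }, @{ $extra };
}

my @langs = merge_languages( [ 'en' ], [ langs_for_country( 'hk' ) ] );
print join( ', ', @langs ), "\n";    # prints: en, en-HK
```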

SETTING UP LOCALIZATION IN YOUR PACKAGE

Type #1: Message replacement

This is the fairly simple means of using keys to represent blocks of text. The key gets replaced by the text for whatever language the current user is associated with. Here's an example: you set up your music library search form like this:

 Artist: _____________
 
 Title:  _____________
 
 Year:   _____________
 
                 <Search>

And you'd like to localize this. Like all other problems dealing with programming you just add a layer of abstraction, associating each piece of text with a key, then associating text to that key for each language:

 {search.artist}: _____________
 
 {search.title}:  _____________
 
 {search.year}:   _____________
 
                 <{search.button}>

Now you just have sets of data for each language:

 en:
 search.artist = Artist
 search.title  = Title
 search.year   = Year
 search.button = Search
 
 es:
 search.artist = Artista
 search.title  = Título
 search.year   = Año
 search.button = Buscar
 ...

When the page is rendered these keys get replaced by the associated text. Fortunately Perl comes with libraries to make this happen fairly painlessly. And a nice side-effect is that the message files are in a sufficiently simple format that you can ship them off to someone else and just plug them in your application when they're ready.

There's more about the messages and the file format below.
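
To make the key-to-text lookup concrete, here is a minimal sketch using plain Locale::Maketext with hand-written lexicon classes. The 'Demo::Lookup' namespace and the 'search_labels()' function are made up for illustration, and the translations are only examples. In OI you ship message files rather than writing these classes yourself, so treat this as a picture of the idea and not of OI's internals:

```perl
use strict;
use warnings;

# Base class: Locale::Maketext looks for language classes beneath it.
package Demo::Lookup;
use parent 'Locale::Maketext';

# One class per language, each with the same keys in its %Lexicon.
package Demo::Lookup::en;
our @ISA     = ( 'Demo::Lookup' );
our %Lexicon = (
    'search.artist' => 'Artist',
    'search.button' => 'Search',
);

package Demo::Lookup::es;
our @ISA     = ( 'Demo::Lookup' );
our %Lexicon = (
    'search.artist' => 'Artista',
    'search.button' => 'Buscar',
);

package main;

# Return the form labels for one language.
sub search_labels {
    my ( $lang ) = @_;
    my $lh = Demo::Lookup->get_handle( $lang )
        or die "No message handle for language '$lang'";
    return map { $lh->maketext( $_ ) } qw( search.artist search.button );
}

print join( ' / ', search_labels( $_ ) ), "\n" for qw( en es );
```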

Type #2: Template negotiation

A second type of localization is template negotiation. Hopefully you won't need to use it as often because it can require more maintenance. Instead of replacing text in the template you replace the entire template wholesale.

It works in much the same way, except instead of placing text in the various language files you place template names under a particular key. (The name is in the normal 'package::template' syntax.) And just like invoking a template from your action you can do this in two ways:

  1. specify the template in your action

  2. specify the template in your action configuration

Here's a quick example of the first, passing the message key in your action generate_content() call:

 sub mytask {
     my ( $self ) = @_;
     my %params = ( ... );
     ...
     return $self->generate_content(
                     \%params, { message_key => 'mytask.template' } );
 }

And an example of the second, passing the message key in the action configuration (action.ini):

 [foo template_source]
 mytask = msg:mytask.template

In your message files you'd have:

 messages_en.msg:
 mytask.template = mypackage::mytemplatename_english
 messages_es.msg:
 mytask.template = mypackage::mytemplatename_spanish

The templates get the exact same data under the exact same variable names, but you can control the layout and text per language.

See OpenInteract2::Manual::Templates and OpenInteract2::Action for more information.
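
Conceptually, the message key is just one more lookup before the usual template lookup. The following sketch shows that idea in plain Perl. The '%MESSAGES' table and 'resolve_template()' are hypothetical stand-ins, not OI's actual implementation:

```perl
use strict;
use warnings;

# Hypothetical per-language message tables holding template names.
my %MESSAGES = (
    en => { 'mytask.template' => 'mypackage::mytemplatename_english' },
    es => { 'mytask.template' => 'mypackage::mytemplatename_spanish' },
);

# Resolve a message key to ( package, template ) using the first
# language in @langs that defines the key; empty list if none does.
sub resolve_template {
    my ( $key, @langs ) = @_;
    for my $lang ( @langs ) {
        my $by_lang = $MESSAGES{ $lang } or next;
        my $name    = $by_lang->{ $key } or next;
        return split /::/, $name, 2;
    }
    return;
}

my ( $package, $template ) = resolve_template( 'mytask.template', 'es', 'en' );
print "$package / $template\n";
```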

Significance of Message Filenames

The names of the message files we process are fairly flexible, but one aspect is not. The language must be the last distinct set of characters before the file extension. So the following are ok:

  myapp-en.msg         # lang is 'en'
  myotherapp-es-MX.dat #      ...'es-MX'
  messages_en-HK.msg   #      ...'en-HK'

The following are not:

 english-messages.msg
 messages-en-part2.msg
 messagesen.msg

If you create a message filename that does not conform to this specification, it not only won't be processed but will halt the entire localization reading process altogether.
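
If you want to check your filenames before deploying, something like the following may help. The pattern is our own guess at the rule described above (a '-' or '_', then a two-letter language with an optional two-letter region, then the extension). It is not the expression OI itself uses, and 'language_from_filename()' is a made-up name:

```perl
use strict;
use warnings;
use File::Basename qw( basename );

# Return the language tag from a message filename, or undef if the
# name does not end in "<separator><lang>.<extension>".
sub language_from_filename {
    my ( $path ) = @_;
    my $file = basename( $path );
    my ( $lang ) = $file =~ /[-_]([a-z]{2}(?:-[A-Z]{2})?)\.\w+$/;
    return $lang;
}

for my $file ( qw( myapp-en.msg myotherapp-es-MX.dat messages_en-HK.msg
                   english-messages.msg messages-en-part2.msg messagesen.msg ) ) {
    my $lang = language_from_filename( $file );
    printf "%-24s %s\n", $file, defined $lang ? $lang : '(no language)';
}
```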

Message File Format

The message file format is fairly simple: each message is a key, an '=' and a value. Here is a simple declaration for two message keys without continued values or runtime replacements:

 company.title=Welcome to MyCompany!
 company.phone   =   Call 412-555-1212 for more information.

Two things to note:

  1. The keys ('company.title' and 'company.phone') are abstract and semi-hierarchical. There's a FAQ below about why we chose opaque message IDs for the core OI packages, but you don't have to do so. The only tricky part is ensuring you don't stomp on someone else's namespace. One way to avoid this is using your package/application name as the first part of the hierarchy.

  2. The message reader will trim any whitespace around the '='.

Continued Message Values

Here's a declaration of two keys, one of which has a continued value:

 company.intro = You have decided to learn about MyCompany, a leader \
 in the maintenance of the status quo around the world. Ensure your \
 status is the one that's in quo!
 company.title = Welcome to MyCompany!

The main things here are:

  1. The '\' must be at the end of the line or the remainder of your message will get lost. (You may have whitespace between the '\' and the end of line, but that may not be the case forever.)

  2. You can have multiple continuations for a single value.

  3. The value returned will not have any embedded newlines. (TODO: This may change, speak up if you have strong feelings about it.)
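
The rules so far (key, '=', value, trimmed whitespace, '\' continuations) fit in a few lines of Perl. This is a sketch of a reader that follows them, not OI's own reader. In particular, we assume a continued line is joined to the next by dropping the '\' and anything after it, so the space before the '\' is what separates the words. 'parse_messages()' is a made-up name:

```perl
use strict;
use warnings;

# Parse message-file text into a hashref of key => value.
sub parse_messages {
    my ( $text ) = @_;
    my %messages;
    my $pending = '';
    for my $line ( split /\n/, $text ) {
        # A trailing '\' (optionally followed by whitespace) means the
        # value continues on the next line; no newline is kept.
        if ( $line =~ s/\\\s*$// ) {
            $pending .= $line;
            next;
        }
        my $full = $pending . $line;
        $pending = '';
        # Split on the first '=' and trim whitespace around key and value;
        # lines without an '=' (including blank lines) are skipped.
        ( my ( $key, $value ) = $full =~ /^\s*(.+?)\s*=\s*(.*?)\s*$/ ) or next;
        $messages{ $key } = $value;
    }
    return \%messages;
}

my $messages = parse_messages( <<'END_MSG' );
company.intro = You have decided to learn about MyCompany, a leader \
in the maintenance of the status quo around the world.
company.title=Welcome to MyCompany!
END_MSG

print "$_ => $messages->{ $_ }\n" for sort keys %{ $messages };
```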

Runtime replacements

Since we just use Locale::Maketext behind the scenes you can do anything in your message values that it allows. Here is a quick summary of the most common options.

First, you often need to embed one or more values in a message. Position is important: the translation of your message may shift around the order of the values so you cannot treat it like a sprintf. For instance, you might have:

 db.error.process = While processing the statement [_1] the database \
 returned an error [_2]

In another language this might be something like the following nonsense:

 db.error.process = La base de datos volvio un error [_2] mientras \
 que procesaba la declaracion [_1] 

When we ask for the message we need to pass in two values which will get plugged into the message at runtime:

 my ( $sth );
 eval {
     $sth = $dbh->prepare( $sql );
     $sth->execute();
 };
 if ( $@ ) {
   my $error_msg = $lh->maketext( 'db.error.process', $sql, $@ );
   # ...
 }

Since they're ordered there's no ambiguity.
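
Here is a small self-contained sketch showing the two messages above producing differently ordered output from the same arguments. It uses plain Locale::Maketext with hand-written lexicon classes. The 'Demo::Args' namespace and 'db_error_message()' are invented for the example:

```perl
use strict;
use warnings;

package Demo::Args;
use parent 'Locale::Maketext';

package Demo::Args::en;
our @ISA     = ( 'Demo::Args' );
our %Lexicon = (
    'db.error.process' =>
        'While processing the statement [_1] the database returned an error [_2]',
);

package Demo::Args::es;
our @ISA     = ( 'Demo::Args' );
our %Lexicon = (
    'db.error.process' =>
        'La base de datos volvio un error [_2] mientras que procesaba la declaracion [_1]',
);

package main;

# Same key, same argument order; each language places the values itself.
sub db_error_message {
    my ( $lang, $sql, $error ) = @_;
    my $lh = Demo::Args->get_handle( $lang )
        or die "No message handle for language '$lang'";
    return $lh->maketext( 'db.error.process', $sql, $error );
}

print db_error_message( $_, 'SELECT 1', 'timeout' ), "\n" for qw( en es );
```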

Second, you often need to plug in values that, depending on their value, may change the words around them. For instance:

 cart.numitems = You have [_1] items in your shopping cart.

Easy enough, but what happens when the number is 1? Or 0?

 You have 1 items in your shopping cart.
 You have 0 items in your shopping cart.

It's understandable, but not user-friendly. Fortunately Locale::Maketext handles this for us:

 cart.numitems = You have [quant,_1,item,items,no items] in your shopping cart.

With a '1' this will generate:

 You have 1 item in your shopping cart.

And with a '0':

 You have no items in your shopping cart.

Nifty!
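
If you want to try 'quant' outside of OI, this sketch runs the message through plain Locale::Maketext for a few counts. The 'Demo::Quant' namespace and 'cart_message()' are made up for the example:

```perl
use strict;
use warnings;

package Demo::Quant;
use parent 'Locale::Maketext';

package Demo::Quant::en;
our @ISA     = ( 'Demo::Quant' );
our %Lexicon = (
    # quant takes: value, singular, plural, and an optional zero form
    'cart.numitems' =>
        'You have [quant,_1,item,items,no items] in your shopping cart.',
);

package main;

sub cart_message {
    my ( $count ) = @_;
    my $lh = Demo::Quant->get_handle( 'en' )
        or die "No message handle for language 'en'";
    return $lh->maketext( 'cart.numitems', $count );
}

print cart_message( $_ ), "\n" for 0, 1, 5;
```

With the third form left out, a count of 0 falls back to the plural form ('0 items').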

FAQ

Why did you use opaque IDs for the message keys?

In the Locale::Maketext docs Sean Burke recommends using keys based on the base language -- that is, not using opaque message keys. His suggestion makes for very readable translation documents but I think in practice it would be extremely brittle -- if you change the key in the base language even for punctuation you'll need to change all of them. Feh. (Then again, Mr. Burke is a bona-fide superhero, so we'll see how that shakes out...)

Additionally a lot of this was inspired by the message (or 'resource') bundle technology built in to the Java 2 platform. (See SEE ALSO for more on this.) Message bundles shipped with applications built on Struts or Spring typically use the hierarchical message syntax, with different levels separated by a dot. So you might have 'myapp.search.label.firstname' which gets more specific as you traverse the key from left to right. How specific you want to get is up to you.

That said, there's nothing stopping you from using your own standard for declaring keys in your application. Use ID numbers, letters, days of the week, whatever. Just make sure your package's keys don't tread on another's.

SEE ALSO

OpenInteract2::I18N

OpenInteract2::I18N::Initializer

Locale::Maketext

openinteract-dev mailing list:

http://lists.sourceforge.net/lists/listinfo/openinteract-dev

Article published in TPJ 13 by Sean Burke about Locale::Maketext:

http://search.cpan.org/~sburke/Locale-Maketext-1.06/lib/Locale/Maketext/TPJ13.pod

Web Localization in Perl by Autrijus Tang

http://www.autrijus.org/webl10n/TABLE_OF_CONTENTS.html

Java Internationalization: Localization with ResourceBundles

http://developer.java.sun.com/developer/technicalArticles/Intl/ResourceBundles/

COPYRIGHT

Copyright (c) 2003 Chris Winters. All rights reserved.

AUTHORS

Chris Winters <chris@cwinters.com>

Generated from the OpenInteract 1.99_04 source.

