Archive for the ‘Localization’ Category

5. Strange behaviour on our new Farsi (Persian) page

3 September 2008

First of all, you don’t need to know Farsi to help out here.  I don’t understand Farsi myself, but the problem is obvious when you look at the page.  Here’s a screenshot of the problem:

The problems are in the first and last rows.  Reading from right to left (and the dir attribute on the html element instructs the page to be read from right to left), you should see the country name, the header (in Farsi), the date and then, in brackets, the source.

Here is the page's opening html tag:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fa" lang="fa" dir="rtl">

As you can plainly see, the first line is screwed up because the closing bracket for the source (IASWI) goes after the final field, which is a link to a campaign and is highlighted in yellow.

The last line is even worse – it puts the source first, the country last.  This should have nothing to do with the fact that the text is in English — because it’s tagged as being in Farsi (the link actually takes you to a page with both Farsi and English text).

Here is the PHP code for generating this line on our page:

echo ('<a href="http://www.labourstart.org/cgi-bin/show_news.pl?country=');
echo $countryen;
echo ('"><b>');
echo $country;
echo ('</b></a> ');
echo ('<a href="');
echo $url;
echo ('" title="');
echo $userid;
echo ('">');
echo $header;
echo ('</a> ');
if ($row2['actnowcampaigncode'] > 0) {
    echo (' <a href="http://www.labourstart.org/cgi-bin/solidarityforever/show_campaign.cgi?c=');
    echo $actnowcampaigncode;
    echo ('"><span style="background-color:yellow;color:black"><b>Act</b><i>NOW!</i></span></a> ');
}
echo $dd;
echo ('-');
echo $mm;
echo ('-');
echo $yyyy;
echo (' [');
echo $source;
echo ('] ');

Anyone have any ideas about how to fix this?  Thanks.
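
One direction that might be worth trying, sketched here in Python rather than the page's PHP.  The symptoms look consistent with how the Unicode bidirectional algorithm treats neutral characters such as brackets and spaces: they take their direction from the text around them, so a run of Latin text, digits and brackets can end up laid out as a single left-to-right block inside a right-to-left line.  The sketch wraps each field in its own span with an explicit dir attribute and puts a right-to-left mark (&rlm;) between the fields.  The function names, the sample values and the direction chosen for each field are assumptions for illustration only, and none of this has been tried against the real page.

```python
from html import escape

# Entity for U+200F RIGHT-TO-LEFT MARK: invisible, but strongly right-to-left.
RLM = "&rlm;"


def isolate(text, direction):
    """Wrap one field in a span that states its own base direction."""
    return '<span dir="{}">{}</span>'.format(direction, escape(text))


def render_row(country, header, date, source, header_dir="rtl"):
    """Build one news row as HTML (hypothetical helper, not the site's code).

    Each field gets its own span, and an RLM between fields is meant to stop
    neighbouring left-to-right fields from merging into one visual block.
    """
    fields = [
        isolate(country, "rtl"),
        isolate(header, header_dir),
        isolate(date, "ltr"),
        # Brackets live inside the same span as the source they enclose.
        isolate("[" + source + "]", "ltr"),
    ]
    separator = " " + RLM
    # The trailing RLM gives the last field a right-to-left neighbour as well.
    return separator.join(fields) + RLM


if __name__ == "__main__":
    # Made-up sample values: a Farsi country name with an English headline.
    print(render_row("ایران", "Example headline", "03-09-2008", "IASWI",
                     header_dir="ltr"))
```

The same markup could be produced from the existing echo statements by adding the span tags and the &rlm; entity to the strings already being printed.  Whether that cures both rows would still need checking in a browser.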

1. Conversion of text from Unicode – problem with Perl Text::Iconv

3 September 2008

LabourStart has recently converted its news links database to Unicode.  As we now work in 22 languages, it’s important that we be able to show characters correctly — including in our lists of languages displayed at the top of every page on the site.

But … most union websites don’t use Unicode.  And the JavaScript newswires we created, which now generate Unicode characters, were causing problems.  So we’ve gone into the script which creates the JavaScript every 30 minutes and told it to convert Unicode back into character encodings like iso-8859-1 (for Western languages), windows-1251 (for Russian), etc.

The problem is that, while this works like a charm on the Russian, it's not working on the Norwegian.  Or rather, it's not converting all the characters, not even all the common ones.

Here is a page showing the current Norwegian JavaScript newswire.  To see the characters correctly displaying in Unicode, go here.

We're using a Perl module, Text::Iconv, to do this.  Here is the code:

use Text::Iconv;

# Norwegian newswire: convert from UTF-8 to iso-8859-1
if ($langcode eq "no") {$converter = Text::Iconv->new("utf-8", "iso-8859-1");}

$Header = $converter->convert($Header);

Can anyone help sort this out for us?
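
A possible starting point for diagnosis, sketched in Python rather than Perl.  The Norwegian letters æ, ø and å all exist in iso-8859-1, so the target encoding itself should be able to carry them.  Two guesses (and they are only guesses) are that the headlines contain typographic characters such as en dashes or curly quotes, which iso-8859-1 lacks but windows-1252 has, or that some of the text was already mis-decoded before it reached the converter.  The helper names and the sample headline below are made up for illustration.

```python
def unrepresentable(text, encoding):
    """Return the distinct characters in text that encoding cannot express."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def looks_double_encoded(text):
    """Heuristic: True if text looks like UTF-8 bytes read as iso-8859-1.

    Such text round-trips through iso-8859-1 into valid UTF-8 that differs
    from the input, e.g. "pÃ¥" turns back into "på".
    """
    try:
        repaired = text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


if __name__ == "__main__":
    # Made-up headline with Norwegian letters, guillemets and an en dash.
    headline = "Streik på sykehus – «ny» avtale"
    print(unrepresentable(headline, "iso-8859-1"))
    print(unrepresentable(headline, "windows-1252"))
    print(looks_double_encoded("pÃ¥"))
```

If the first check turns up characters that iso-8859-1 cannot hold, converting to windows-1252 instead might be worth a try.  If the second check fires, the trouble is more likely upstream of the conversion step.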