Monday, December 08, 2008

eZtip: Character Encoding in Templates

Idea (by annais)

Recently I was working on a multilingual site and had to add some static text in a number of languages. I added the supplied text and then viewed the resulting page. Having previously lost many hours of my life to character encoding issues, I momentarily regretted getting out of bed when confronted with the mess before me.

Quickly regaining my composure, I realised that eZ Publish must be doing the transformation. With a bit of digging I discovered that, by default, the system treats templates as iso-8859-1 and converts them to utf-8 for display.
In my case eZ Publish was doing exactly what it was told: it read the template's utf-8 bytes as if they were iso-8859-1 and converted them to utf-8 a second time, garbling every non-ASCII character.
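
To make the double conversion concrete, here is a minimal Python sketch of the effect. It is not eZ Publish code; the function name convert_template and the sample word are made up for illustration, and the only assumption is that the system decodes the template with the configured charset and then re-encodes it as utf-8.

```python
# Hypothetical illustration of the double conversion; not eZ Publish code.

def convert_template(raw_bytes: bytes, assumed_charset: str) -> bytes:
    """Decode template bytes with the assumed charset, then encode as utf-8 for output."""
    return raw_bytes.decode(assumed_charset).encode("utf-8")

original = "caf\u00e9"                      # "café"
template_bytes = original.encode("utf-8")   # the template file is saved as utf-8

# Default behaviour described in the post: template assumed to be iso-8859-1.
wrong = convert_template(template_bytes, "iso-8859-1")

# Behaviour once the template charset is declared as utf-8.
right = convert_template(template_bytes, "utf-8")

print(wrong.decode("utf-8"))   # each non-ASCII character becomes two garbled characters
print(right.decode("utf-8"))   # the text survives unchanged
```

With the wrong assumption, the two bytes of the accented character are each treated as a separate iso-8859-1 character, which is the kind of mess described above.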

There are two options for telling eZ Publish the character encoding of a template. Firstly, you can tell eZ Publish that all templates are utf-8 by editing template.ini.append.php, either in the overrides for the entire site or in your specific siteaccess to limit the effect.

[CharsetSettings]
# The charset to use if no charset is specified in the template
DefaultTemplateCharset=utf-8


The other option is to mark an individual template as being encoded in utf-8. This involves adding the following line to the top of the template.

{*?template charset=utf-8?*}

In the end I used the latter method. As I was making a small change to a large site that I was not familiar with, the per-template method allowed the change to be limited to the affected template only.

I must admit I was surprised that these defaults are in place, given that eZ moved to a Unicode default some time ago. From the comments on this issue it would appear that concerns about backward compatibility are the reason that the default remains iso-8859-1 while the output is set to utf-8. I do wonder what effect this has on performance though.

4 comments:

  1. Are you sure that the correct charset is utf8 (without a hyphen)?

    I have this setting set to
    DefaultTemplateCharset=utf-8

    and it works (at least I think so)

  2. Hi Andrey

    I was basing this on the info @ http://issues.ez.no/IssueView.php?Id=13835&ProjectId=3

    I'll double check and update the post if required.

    Cheers
    Bruce

  3. Hi Andrey

    It would appear that both will work. eZCharsetInfo::aliasTable provides a lookup of character set aliases and utf8 = utf-8.

    See: http://pubsvn.ez.no/doxygen/trunk/html/ezcharsetinfo_8php-source.html#l00055

    utf-8 is the "correct" value and I've updated the post to reflect this.

    Cheers
    Bruce

  4. What version of eZ was this in? I'm finding that in 4.0.1, [siteaccess]/template.ini.append.php already has the character encoding set to utf-8.

    Maybe it depends on the design package installed?
