Scandinavian character support in URL management

Hello!

Scandinavian characters do not seem to work in the URL management (probably CUrlRule does not support them).

I have the following config:

    'urlManager'=>array(
        'class'=>'CUrlManager',
        'urlFormat'=>'path',
        'showScriptName'=>false,
        'urlSuffix'=>'.html',
        'rules'=>array(
            'product/<id:\d+>/<name>'=>'product/view',
            'product/<id:\d+>'=>'product/view',
            'category/<id:\d+>/<name>'=>'category/view',
            'category/<id:\d+>'=>'category/view',
        ),
    ),

All URLs work except when <name> contains Scandinavian characters. For example, the following URL does not work:

category/1805/J%E4%E4kiekko.html

which is decoded to:

category/1805/Jääkiekko.html

Do you know how I can fix this? I don’t want to remove or convert the Scandinavian characters, as that would not be very good for SEO…

Make sure the characters in the URL are encoded the same way as in the database, or else the record will not be found :) Instead of looking records up by name, I would advise adding a slug column that contains only URL-safe characters and using that to find the data.

@nsw:

Actually, CUrlRule should support Unicode characters. If you check the constructor code of CUrlRule, you will see that the u modifier is added to every rule regex. Maybe you can do some debugging to find out why it’s not applied correctly in your case?
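
One thing worth checking while debugging: %E4 is the ISO-8859-1 byte for ä, whereas in UTF-8 it would be %C3%A4. With the u modifier PCRE requires the subject to be valid UTF-8, so a Latin-1 encoded path may simply fail to match. Here is a minimal sketch in plain PHP, outside of Yii. The pattern is a hypothetical approximation of what a rule like 'category/<id:\d+>/<name>' could compile to, not the exact regex CUrlRule builds, and the conversion assumes the mbstring extension is available:

```php
<?php
// Hypothetical pattern for illustration only (not copied from CUrlRule).
$pattern = '/^category\/(?P<id>\d+)\/(?P<name>[^\/]+)$/u';

// The same path, percent-encoded as ISO-8859-1 and as UTF-8.
$latin1 = rawurldecode('category/1805/J%E4%E4kiekko');
$utf8   = rawurldecode('category/1805/J%C3%A4%C3%A4kiekko');

// With /u the subject must be valid UTF-8, so the Latin-1 bytes are rejected.
$latin1Result = preg_match($pattern, $latin1);   // false
$latin1Error  = preg_last_error();               // PREG_BAD_UTF8_ERROR

// The UTF-8 encoded path matches and exposes the named groups.
$utf8Result = preg_match($pattern, $utf8, $m);   // 1

// Converting the Latin-1 bytes to UTF-8 before matching makes them match too.
$fixed       = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$fixedResult = preg_match($pattern, $fixed, $m2); // 1

var_dump($latin1Result, $utf8Result, $fixedResult);
```

If this is what happens in your case, the links are probably being generated from Latin-1 data, and converting the names to UTF-8 before building the URL would be one way around it.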

Sorry for not answering sooner. Thank you very much for your replies! The encoding should have crossed my mind!

The database is very old and is unfortunately not in UTF-8. I decided to change the URLs so that they contain only valid characters: I replace Å, Ä, Ö with A, A, O, and so on…
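
In case it helps someone else, the replacement can be sketched roughly like this. The function name slugify and the exact character map are my own illustration, not anything from Yii, and the sketch assumes the name has already been converted to UTF-8 and that the source file itself is saved as UTF-8:

```php
<?php
// Hypothetical helper: turns a UTF-8 name into a URL-safe ASCII slug.
function slugify(string $name): string
{
    // Explicit map for the Scandinavian letters; extend as needed.
    $map = [
        'Å' => 'A', 'Ä' => 'A', 'Ö' => 'O',
        'å' => 'a', 'ä' => 'a', 'ö' => 'o',
        'Æ' => 'AE', 'Ø' => 'O',
        'æ' => 'ae', 'ø' => 'o',
    ];
    $ascii = strtr($name, $map);

    // Collapse every run of remaining non-alphanumeric bytes into one hyphen.
    $slug = preg_replace('/[^A-Za-z0-9]+/', '-', $ascii);

    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

echo slugify('Jääkiekko'), "\n";   // Jaakiekko
echo slugify('Åre & Öland'), "\n"; // Are-Oland
```

The result can be stored in a slug column as suggested above, or just used for the <name> part of the URL, since the record is still found by its id.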